Lambda

22 Aug 2008

PHP has been in dire need of some new features for some time. PHP 5.3 plans to add a slew of them, finally bringing lambdas (a/k/a anonymous functions) to the language.

That is… once it gets out of alpha. Then out of beta. Then onto production systems.

(PHP haters can stop sipping quite so much hater-ade now that PHP also supports namespaces, but I don’t find that new feature particularly interesting.)

What makes lambdas so interesting is how much simpler and cleaner they can make your code.

Consider the problem of wanting to sort a list where some of the items have the words ‘The’ or ‘A’ as a prefix. We want to sort the list

  • Beck
  • The Starting Line
  • Blue October
  • The All American Rejects

to be

  • The All American Rejects
  • Beck
  • Blue October
  • The Starting Line

and not

  • Beck
  • Blue October
  • The All American Rejects
  • The Starting Line

Without the use of lambdas, we have to do something like this in old PHP:

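function stem_the($a)
{
    // Strip a leading article ("a" or "the") only when whitespace follows it,
    // so names like "Air" or "Therapy?" are left alone.
    return preg_replace( '/^(?:a|the)\s+/i', '', $a );
}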

function cmp_the($a,$b)
{
    $a = stem_the($a);
    $b = stem_the($b);
    return strcmp($a,$b);
}

$bands = array( 'Beck', 'The Starting Line',
        'Blue October',
        'The All American Rejects' );

usort( $bands, 'cmp_the' );

usort()’s second argument is a string naming the comparison function. This is really ugly. Old PHP didn’t have functions as first-class citizens, so that was the only way we could do something like this. It’s ugly and slow.

Moreover, we’re forced to create two functions stem_the() and cmp_the() (my names are always this creative). We’ve cluttered the global function namespace for our very trivial task.

Enter stage left, Dr. Martin Luther Lambda. His dream is promoting functions to first-class citizens…

With lambdas, we can rewrite that entire mess to be the clean, simple, elegant mess that follows:

$bands = array( 'Beck', 'The Starting Line',
        'Blue October',
        'The All American Rejects' );

usort( $bands, function($a,$b)
{
    // Strip a leading "a" or "the" (plus whitespace) from both names.
    list($a,$b) = array_map(function($s) {
        return preg_replace('/^(?:a|the)\s+/i', '', $s);
    }, array($a,$b));
    return strcmp($a,$b);
});

Notice that this code doesn’t create any new functions in the global namespace. It’s slightly harder to read initially (mostly because of the gratuitous use of the list() language construct), but it is vastly cleaner and, in a word, elegant.

(Aside: calling super geeky things elegant is not even close to passé yet.)

We can also now perform currying: partially instantiating a function with multiple parameters (strictly speaking, this is partial application). Let’s say we want to add 7 to each item in a list. PHP has a built-in array_map() function, used above, that applies a function to every item in an array, but its callback argument is again a string, and we still have to create a global-namespace function to accomplish this.

Let’s just say we have a function add() such as

function add($x,$y) { return $x + $y; }

and we want to create a function add7($x) that returns $x + 7. We could simply create it:

function add7($x){ return $x + 7; }

Now what we want to do is partially instantiate our add() function to have a sort of ‘hard-coded’ first argument 7. In this way, we would define add7() to be

    function add7($x){ return add(7,$x); }

We just had to create another function in the global namespace, bloating our memory footprint and creating an obtrusive disaster of our codebase to accomplish a trivial task.

With currying, we could simply do this:

$add  = function( $x, $y ) { return $x + $y; };
$add7 = curry($add,7);

The punchline is that curry() returns a function. It’s a robot that gives birth to new robots. Elegant?

So we can increment every item in an array by 7 very simply:

$nums = array( 1, 2, 3, 4 );
$add  = function( $x, $y ) { return $x + $y; };
$nums = array_map( curry($add,7), $nums );

(This example is very contrived, of course, since there was no reason to use currying here. We could have simply had

$nums = array( 1, 2, 3, 4 );
$nums = array_map( function($x){ return $x + 7; }, $nums );

but cut me some slack.)

Unfortunately, curry() is not yet part of the PHP standard library. It’s also not inherently obvious how to implement it, since the necessary language constructs were only introduced a mere three weeks ago today.

Fortunately, I have implemented curry() for you. It’s a bit ugly, but it works. It’s actually fairly fast as well (although my initial tests of it were not exhaustive).

function curry($fn, $arg)
{
    return function() use ($fn, $arg)
    {
        $args = func_get_args();
        array_unshift( $args, $arg );
        return call_user_func_array( $fn, $args );
    };
}

It’s cool because you can chain it to itself. So if you wanted a function that always returned 7, you could simply say:

$return7 = curry(curry($add,3),4);

Or if you had a function that took three arguments and you wanted a new function with the first two ‘hard-coded’ (imagine g(x) = f(1,3,x) from math), you could simply “curry f()” twice:

$g = curry(curry($f,1),3);
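
The nesting order is easy to get backwards, so here is a small sketch that makes it visible. It reuses curry() exactly as defined above; the three-argument $f is a hypothetical stand-in that only reports the arguments it received.

```php
<?php
// curry() exactly as defined in the post.
function curry($fn, $arg)
{
    return function() use ($fn, $arg)
    {
        $args = func_get_args();
        array_unshift( $args, $arg );
        return call_user_func_array( $fn, $args );
    };
}

// Hypothetical three-argument function that just reports its arguments.
$f = function($x, $y, $z) { return "f($x,$y,$z)"; };

$g = curry(curry($f, 1), 3);   // the innermost curry() supplies the first argument
echo $g('x'), "\n";            // f(1,3,x)

$h = curry(curry($f, 3), 1);   // swapping the nesting swaps the bound arguments
echo $h('x'), "\n";            // f(3,1,x)
```

The innermost call binds the first parameter, and each enclosing call binds the next one.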

A better implementation would be for curry() to take a variable number of arguments and to curry all of them in:

$g = curry($f,1,3);

I’ll leave this as ‘an exercise for the reader’, but it’s not too difficult to implement.
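
For the impatient, here is one way the exercise might go. It is a minimal sketch, not a definitive answer: the name curry_n() is made up for illustration (so it does not clash with curry() above), and it assumes the extra arguments should be bound left to right, ahead of whatever the caller passes later.

```php
<?php
// Hypothetical variadic variant of curry(); the name curry_n() is an
// assumption for illustration and is not defined anywhere in PHP itself.
function curry_n($fn /*, $arg1, $arg2, ... */)
{
    // Everything after $fn is an argument to bind, in call order.
    $bound = array_slice(func_get_args(), 1);

    return function() use ($fn, $bound)
    {
        // Bound arguments come first, then whatever the caller supplies.
        $args = array_merge($bound, func_get_args());
        return call_user_func_array($fn, $args);
    };
}

// Hypothetical three-argument function that just reports its arguments.
$f = function($x, $y, $z) { return "f($x,$y,$z)"; };

$g = curry_n($f, 1, 3);   // behaves like g(x) = f(1, 3, x)
echo $g('x'), "\n";       // f(1,3,x)
```

Binding left to right keeps curry_n($f, 1, 3) equivalent to curry(curry($f, 1), 3), which seemed the least surprising choice.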

Awesome, n’est-ce pas?

You, too, can have some PHP 5.3 alpha goodness by running the following nonsense on your command line:

mkdir -p "$HOME/src/php53alpha/build"
cd "$HOME/src/php53alpha"
wget http://downloads.php.net/johannes/php-5.3.0alpha1.tar.gz
tar zxf php-5.3.0alpha1.tar.gz
cd php-5.3.0alpha1
./configure --prefix="$PWD/../build"
make
make install

This installs the php binary in ~/src/php53alpha/build/bin, so you can run your .php scripts by simply calling, e.g.,

~/src/php53alpha/build/bin/php my_script_file.php

Enjoy.

I recently had the wonderful experience of configuring UW’s Pubcookie for use with Ubuntu’s package-manager version of Apache2. I took only modest notes, and the work was spread out over several weeks whenever I could find the time, so treat this as a few installation notes rather than a guide. These are the spots that gave me the most trouble.

This accompanies the Pubcookie Apache Module installation guide at http://pubcookie.org/docs/install-mod_pubcookie-3.3.html and the UW-specific guide notes at http://www.washington.edu/computing/pubcookie/uwash-install.html.

(As usual, lines beginning with $ are commands run as a limited user, while lines beginning with # are run as root.)

  1. Ensure Apache2 is installed and configured with OpenSSL:

    # apt-get install apache2-threaded-dev apache2 openssl
    # mkdir -p /etc/apache2/ssl
  2. Obtain Weblogin Server Registration for your hostname at https://server-reg.cac.washington.edu/pubcookie/

  3. Get the UW Root Cert from http://certs.cac.washington.edu/?req=svpem

    I put this file at /etc/apache2/ssl/server.pem. This is the server’s public key.

  4. Get the CA Bundle from http://www.washington.edu/computing/pubcookie/ca-bundle.crt

    I put this file at /etc/apache2/ssl/ca-bundle.crt

    This file allows the server to verify peers’ certificates and is used by keyclient.

  5. Generate your cert’s private key and have it signed by your CA.

    Information on how to generate a private key and a certificate signing request is probably documented on whatever site is signing your certificate. The UW CA’s Technical Information can be found at https://www.washington.edu/computing/ca/infra/.

    Generating a request for the UW CA (and probably all other CAs as well) is simply a matter of:

    # cd /etc/apache2/ssl
    # openssl req -nodes -newkey 1024 \
        -keyout key.pem -out req.pem

    When I went through the request process, the CA gave me the following values to fill in to the request UI:

    Country (C)         US
    State (ST)          WA or Washington
    Organization (O)    Optional
    Organizational Unit (OU)    Optional
    Common Name (CN)    Your host's fully qualified domain name
  6. Put your private key at /etc/apache2/ssl/key.pem and your CA-signed certificate at /etc/apache2/ssl/cert.pem.

    (Note that the above step already writes the key to key.pem in this directory. The request goes to req.pem; the signed certificate the CA returns is what belongs at cert.pem.)

  7. Working under $HOME, get the Pubcookie tarball and unpack it:

    $ mkdir -p $HOME/pubcookie
    $ cd $HOME/pubcookie
    $ wget http://www.pubcookie.org/downloads/pubcookie-3.3.3.tar.gz
    $ tar xzf pubcookie-3.3.3.tar.gz
  8. Modify the configure script so it knows where Apache’s PREFIX is. This problem seems to come from the fact that Apache isn’t built from source locally when using aptitude.

    The diff for this modification is

    3783c3783
     <   APACHE_PREFIX=`$APXS -q PREFIX`
     ---
     >   APACHE_PREFIX="/usr/share/apache2" #`$APXS -q PREFIX`

    This fix comes from a message on the Pubcookie mailing list.

  9. Configure, compile, and install:

    $ cd $HOME/pubcookie/pubcookie-3.3.3/
    $ ./configure   \
        --enable-apache  \
        --prefix=/usr/local/pubcookie  \
        --with-apxs=/usr/bin/apxs2
    $ make
    $ sudo make install
  10. Based on information from the installation guide, the following serves as a good checkpoint:

    $ ls -F /usr/local/pubcookie
    keyclient*      keys/
  11. Here is my keyclient configuration file, /usr/local/pubcookie/config

    # ssl config
    ssl_key_file: /etc/apache2/ssl/key.pem
    ssl_cert_file: /etc/apache2/ssl/cert.pem
    
    # keyclient-specific config
    keymgt_uri: https://weblogin.washington.edu:2222
    ssl_ca_file: /etc/apache2/ssl/ca-bundle.crt
  12. Run keyclient to request a new key and to download the “granting” certificate:

    # cd /usr/local/pubcookie
    # ./keyclient
    # ./keyclient -G keys/pubcookie_granting.cert
  13. Create a pubcookie load file so we can continue to use Ubuntu’s methodology for managing Apache extensions (e.g. using a2enmod and a2dismod, which really only create/modify symlinks in /etc/apache2/mods-enabled but are sometimes reportedly used by other installation scripts):

    # echo 'LoadModule pubcookie_module /usr/lib/apache2/modules/mod_pubcookie.so' \
    > /etc/apache2/mods-available/pubcookie.load
  14. Stop Apache and load the pubcookie module:

    # apache2ctl stop
    # a2enmod pubcookie
  15. Set Pubcookie directives in /etc/apache2/httpd.conf:

    PubcookieGrantingCertFile /usr/local/pubcookie/keys/pubcookie_granting.cert
    PubcookieSessionKeyFile /etc/apache2/ssl/key.pem
    PubcookieSessionCertFile /etc/apache2/ssl/cert.pem
    PubcookieLogin https://weblogin.washington.edu/
    PubcookieLoginMethod POST
    PubcookieDomain .washington.edu
    PubcookieKeyDir /usr/local/pubcookie/keys/
    PubcookieAuthTypeNames UWNetID null SecurID

    Note that Ubuntu’s Apache likes to have lots of configuration files. The main configuration happens in /etc/apache2/apache2.conf while “user” modifications happen in the httpd.conf file, as above. You will also need to have Apache listen for SSL requests on port 443 by modifying /etc/apache2/ports.conf to include

    Listen *:443
  16. Enable SSL on your default site. This can usually be done by modifying /etc/apache2/sites-available/default to include

    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl/cert.pem
    SSLCertificateKeyFile /etc/apache2/ssl/key.pem

    You might be able to get away with only enabling SSL on select virtual hosts if your environment is such that you have multiple host names pointing to the same Apache instance. I was able to accomplish this to some degree but am still working out a few ambiguities that Apache isn’t telling me about.

  17. Restart Apache and you’re good to go:

    # apache2ctl -k start

You now have the usual .htaccess directives at your disposal. E.g.:

AuthType UWNetID
Require valid-user
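
A quick way to check that the protection works is to drop a small script into the protected directory. The sketch below assumes mod_pubcookie reports the logged-in user the way Apache authentication modules conventionally do, through the REMOTE_USER server variable; the helper name authenticated_user() is made up for illustration.

```php
<?php
// Hypothetical helper: returns the authenticated user name, or null if the
// request was not authenticated. Assumes the web server exposes the user
// as REMOTE_USER, as Apache auth modules conventionally do.
function authenticated_user(array $server)
{
    if (isset($server['REMOTE_USER']) && $server['REMOTE_USER'] !== '') {
        return $server['REMOTE_USER'];
    }
    return null;
}

$user = authenticated_user($_SERVER);
echo $user === null
    ? "Not authenticated\n"
    : 'Hello, ' . htmlspecialchars($user) . "\n";
```

If the page greets you by UW NetID after the weblogin redirect, the module and the .htaccess directives are doing their job.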

Have some sorbet?

An End to WWW

09 Sep 2007

One of the problems I’ve always had with URLs is, well, they’re hard to say or hard to remember. Every domain name I’ve registered has had as its basic premise either ease of typing or memorability. I hate that my school (and work) domain is something as horridly long as “washington.edu”, and I’m driven to tears every time I have to type in a subdomain of it. I have even purchased domain names for some of the pages on that domain because I hate typing them out so much.

Anything that contributes to this madness drives me to insanity.

This is 2007. We have CNAMEs and mod_rewrite and all sorts of fun things that help users with URLs. You do not need the “www” in front of a URL. It’s almost as absurd as the “http://”. There are occasional exceptions, but 99% of the time, the sites are set up to automatically redirect to the proper page, and the user never even sees it.

So please, do your users (and me) a favor, and stop advertising your site with the darn w-w-w…