Memoization - Optimize your function calls

Memoization is an optimization technique in which a function caches the results it computes for each set of inputs, so that subsequent invocations with the same inputs can return the value from the cache instead of re-computing it. It is a simple space-for-time trade-off: processing time is reduced at the expense of increased memory use. I figured I'd implement an automatic memoization mechanism in JavaScript that works well under certain well-defined constraints. Here's what I came up with:

function memoize( fn ) {
    //
    // check if this function has already been memoized
    //
    if( typeof( fn.__fn ) == "undefined" ) {
        //
        // the cache where the results are to be stored
        //
        fn.__cache = {};

        //
        // memoized version of the function
        //
        fn.__fn = function() {
            //
            // build the key that represents the given input set;
            // note that this thing works only so long as the operation
            // "toString" returns a meaningful result on all of the
            // parameters
            //
            var key = "";
            for( var i = 0 ; i < arguments.length ; ++i )
                key += arguments[i].toString() + "-";

            //
            // if the result for the current parameter set already exists
            // in the cache then return that; otherwise call the original
            // routine and store the result
            //
            if( typeof( fn.__cache[key] ) == "undefined" )
                fn.__cache[key] = fn.apply( null, arguments );
            return fn.__cache[key];
        };
    }

    return fn.__fn;
}

Imagine that you are tasked with writing a JavaScript function that returns the factorial of a given number. If you wished to write a memoized version of the function, here's what you'd do:

var fac = memoize( function( n ) {
    if( n <= 0 )
        return 1;
    return ( n * fac( n - 1 ) );
} );

If you invoked fac with 5, for instance, and traced the control flow, here's the sequence of calls you'd see:

. __fn(5)
.. fac(5)
... __fn(4)
.... fac(4)
..... __fn(3)
...... fac(3)
....... __fn(2)
........ fac(2)
......... __fn(1)
.......... fac(1)
........... __fn(0)
............ fac(0)

The number of dots to the left of the function names indicates the stack depth, and here it keeps increasing because fac is recursive and calls itself until the termination criterion is met. This seems like a drawback: where you'd have had a sequence of calls to just fac, you now see it interspersed with calls to __fn as well, effectively doubling the number of function calls that need to be made. The benefit, however, is realized when we observe what happens when fac is invoked a second time with 5.

. __fn(5)

As you can see, this time around there was only a single function call. No further computation was needed because the required result was already available in the cache, and only a simple look-up was performed. Observe what happens when we pass 3 next.

. __fn(3)

This, too, was resolved via a cache look-up, because the first call with 5 had recursively invoked fac with 3 as well. Similarly, passing 7 produces the following call sequence:

. __fn(7)
.. fac(7)
... __fn(6)
.... fac(6)
..... __fn(5)

Automatic memoization ensures that the original routine is called only the bare minimum number of times; in this case, only for the values 7 and 6. The implementation of memoize given above is useful only so long as the following hold true:

  1. Each parameter passed to the actual function has a toString method that returns a meaningful value; if you pass a custom object, for instance, you're going to have to define a toString method on it (see the sketch after this list).
  2. The CPU overhead of constructing a unique key for a given parameter set and doing a hash-table look-up should be lower (preferably significantly lower) than the cost of simply performing the actual work.
  3. For obvious reasons, the function should have at least 1 parameter!
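
Here's a minimal sketch of what point 1 means in practice; the Point type and the distanceFromOrigin function are made up purely for illustration:

function Point( x, y ) {
    this.x = x;
    this.y = y;
}

//
// without this, every Point would stringify to "[object Object]" and
// distinct points would collide on the same cache key
//
Point.prototype.toString = function() {
    return "Point(" + this.x + "," + this.y + ")";
};

var distanceFromOrigin = memoize( function( p ) {
    return Math.sqrt( p.x * p.x + p.y * p.y );
} );

distanceFromOrigin( new Point( 3, 4 ) );    // computed: 5
distanceFromOrigin( new Point( 3, 4 ) );    // served from the cache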

If you've been following this blog, you know that I am trying to learn the Common Lisp programming language. I figured I'd try to implement memoize in Common Lisp as well, to see if I'd learned enough of it to be able to do this. After looking up the Common Lisp HyperSpec a few times, I was in fact able to come up with an implementation that does essentially the same thing as the JavaScript version above. Here goes:

(defun make-key (lst)
  (format nil "~{~A-~}" lst))

(defun memoize (fn)
  (let ((cache (make-hash-table :test 'equal)))
    #'(lambda (&rest p)
        (let ((key (make-key p)))
          (if (not (gethash key cache))
              (setf (gethash key cache) (apply fn p)))
          (gethash key cache)))))
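
The ~{~A-~} directive in make-key loops over the argument list, printing each element followed by a hyphen, which mirrors the key-building loop in the JavaScript version. A quick check of what I'd expect it to produce:

(make-key '(5))        ; => "5-"
(make-key '(3 "x" 2))  ; => "3-x-2-"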

(defparameter fac (memoize #'(lambda (n)
                               (if (<= n 0) 1
                                   (* n (funcall fac (- n 1)))))))
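
Since fac holds a function object rather than naming a function, it has to be invoked with funcall. A quick sanity check, with results shown as I'd expect them given the definitions above:

(funcall fac 5)   ; => 120, computed and cached for 0 through 5
(funcall fac 5)   ; => 120, this time straight from the cache
(funcall fac 7)   ; => 5040, only 7 and 6 are computed afresh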