Memoization in JavaScript

Posted in: javascript
While reading Jason Hickey's Introduction to Objectve Caml I ran into some memoization examples that I found pretty interesting, and I wondered how memoization could be used/implemented in JavaScript. What is memoization? Memoization is a technique that uses side-effects to improve -at runtime- the performance of a function without altering its behavior. Roughly speaking, you can do memoization by storing function values that were returned in previous calls, and returning these values in further calls to the function instead of actually calling the function. Where can I use memoization? Not all functions can be memoized. In particular, pure functions can be memoized. A function is said to be pure when it behaves in a predictable way, in the sense that for each x, the function will always return the same associated y value (i.e a single-valued map). An example of a function that can be memoized is:
function square(x) {
     return x * x;
}
An example of a function that can't be memoized could be:
var index = 1;
function not_mem(x) {
     index = index + 1;
     return x + index;
}
By introducing side-effects we alter the inner state of the function, having different return values for the same input. In this particular case, one could say that not_mem(0) != not_mem(0) is always true! A somewhat generic way to do memoization In OCaml we can define a higher order function memo that takes a function f as parameter and returns a memoized function that is perhaps faster than the input function.
   let memo f =
      let table = ref [] in
      let rec find_or_apply entries x =
         match entries with
            (x', y) :: _ when x' = x -> y
          | _ :: entries -> find_or_apply entries x
          | [] ->
             let y = f x in
             table := (x, y) :: !table;
             y
      in
      (fun x -> find_or_apply !table x)
We can see that the memo function has a table structure where it stores the f function results as (x, y) pairs. For each call to the memoized f function, memo will search in the table structure for an (x, y) pair that matches in x the input value. If found, it will return the associated y value:
         match entries with
            (x', y) :: _ when x' = x -> y
If not found, it will partially apply x to the original f function, and store the result in table as an (x, y) pair, to be used on further calls. It will finally return the computed value just as the orginial f function would have done:
             let y = f x in
             table := (x, y) :: !table;
             y
A simple timing shows the performance improvements on further calls for the memoized fibonacci memo_fib function:
#  time memo_fib 40;;
Elapsed time: 14.581937 seconds
- : int = 102334155
# time memo_fib 40;;
Elapsed time: 0.000009 seconds
- : int = 102334155
Achieving the same thing in JavaScript For unary functions a simple definition of memo could be:
function memo(f) {
  return function (x) {
      f.memo = f.memo || {};
      return (x in f.memo)? f.memo[x] : f.memo[x] = f(x); 
  }; 
}
A couple of differences to notice for this version and the OCaml version are: Lets do some profiling. For this I'll be using the console.time and console.timeEnd firebug methods:
function memo(f) {
  return function (x) {
      f.memo = f.memo || {};
      return (x in f.memo)? f.memo[x] : f.memo[x] = f(x); 
  }; 
}

function fib(x) {
    if(x < 2) return 1; else return fib(x-1) + fib(x-2);
}

var memo_fib = memo(fib);
//first call
console.time("first call");
console.log(memo_fib(30));
console.timeEnd("first call");

//console will output:
//first call: 17264ms
//1346269

//second call (memoized)
console.time("memoized call");
console.log(memo_fib(30));
console.timeEnd("memoized call");

//console will output:
//memoized call: 4ms
//1346269
Beyond unary functions memoization in JavaScript The Ocaml and JavaScript versions of memo lack support for (n>1)-ary functions. If the OCaml memo function is applied to a binary function, it will only memoize the first partial application for this function. Lets define a function sum_fib and memo_sum_fib:
# let sum_fib a b = (fib a) + (fib b);;
<em>val sum_fib : int -> int -> int = &lt;fun></em>
# let memo_sum_fib = memo sum_fib;;
<em>val memo_sum_fib : int -> int -> int = &lt;fun></em>
Timing the memoized function might lead to some unexpected results:
# time memo_sum_fib 30 40;;
Elapsed time: 18.753172 seconds
- : int = 166926410
# time memo_sum_fib 30 40;;
Elapsed time: 18.753172 seconds
- : int = 166926410
This problem happens because the memoization happens at the first partial application level. That means that the table structure will hold (int, int -> int) elements, as opposed of a mapping from a pair of formal parameters to the returned value: (int * int, int). In JavaScript we have a bigger problem. Since partial application is not a "natural" feature of the language and the memo function is not designed to handle n-ary functions, this will lead to an error or unexpected results. At least the OCaml version returned the expected values :P. In JavaScript we can solve this problem by using the arguments object as the key to our f.memo hashtable. Our new memo function would now look like this:
function memo(f) {
  return function () {
      var args = Array.prototype.slice.call(arguments);
      f.memo = f.memo || {};
      return (args in f.memo)? f.memo[args] : 
                     f.memo[args] = f.apply(this, args); 
  }; 
}
Not only this function supports n-ary functions to be memoized, but also it performs a correct memoization, in the sense that it will store all formal parameters in f.memo, as the key to the value returned by this function. Lets do some profiling!
function memo(f) {
  return function () {
      var args = Array.prototype.slice.call(arguments);
      f.memo = f.memo || {};
      return (args in f.memo)? f.memo[args] :
                     f.memo[args] = f.apply(this, args);
  };
}

function fib(x) {
    if(x < 2) return 1; else return fib(x-1) + fib(x-2);
}

function sum_fib(a, b) {
    return fib(a) + fib(b);
}

var memo_sum_fib = memo(sum_fib);
console.time("first call");
console.log(memo_sum_fib(20, 30));
console.timeEnd("first call");

//console will output:
//first call: 17165ms
//1357215

console.time("memoized call");
console.log(memo_sum_fib(20, 30));
console.timeEnd("memoized call");

//console will output:
//memoized call: 5ms
//1357215
Finally, a nice trick you can do is to call the memoized function the same a the function passed in as a formal parameter. Repeating the last example will show a lot of improvements!
function memo(f) {
  return function (x) {
      f.memo = f.memo || {};
      return (x in f.memo)? f.memo[x] : f.memo[x] = f(x);
  };
}

function fib(x) {
    if(x < 2) return 1; else return fib(x-1) + fib(x-2);
}

var fib = memo(fib);
//first call
console.time("first call");
console.log(fib(30));
console.timeEnd("first call");

//console will output:
//first call: 7ms
//1346269

//second call (memoized)
console.time("memoized call");
console.log(fib(30));
console.timeEnd("memoized call");

//console will output:
//memoized call: 5ms
//1346269
Any critique or comment will be well appreciated. Bye!.
Comments
blog comments powered by Disqus