Hi! I'm Nicolas and I'm interested in information visualization, JavaScript and web standards.
I currently work as a Data Visualization Scientist at Twitter. I wrote PhiloGL, the JavaScript InfoVis Toolkit and V8-GL.
While reading Jason Hickey's Introduction to Objectve Caml I ran into some memoization examples that I found pretty interesting, and I wondered how memoization could be used/implemented in JavaScript.
What is memoization?
Memoization is a technique that uses side-effects to improve -at runtime- the performance of a function without altering its behavior. Roughly speaking, you can do memoization by storing function values that were returned in previous calls, and returning these values in further calls to the function instead of actually calling the function.
Where can I use memoization?
Not all functions can be memoized. In particular, pure functions can be memoized.
A function is said to be pure when it behaves in a predictable way, in the sense that for each x, the function will always return the same associated y value (i.e a single-valued map).
An example of a function that can be memoized is:
functionsquare(x){returnx*x;}
An example of a function that can't be memoized could be:
By introducing side-effects we alter the inner state of the function, having different return values for the same input. In this particular case, one could say that not_mem(0) != not_mem(0) is always true!
A somewhat generic way to do memoization
In OCaml we can define a higher order function memo that takes a function f as parameter and returns a memoized function that is perhaps faster than the input function.
We can see that the memo function has a table structure where it stores the f function results as (x, y) pairs. For each call to the memoized f function, memo will search in the table structure for an (x, y) pair that matches in x the input value. If found, it will return the associated y value:
matchentrieswith(x',y)::_whenx'=x->y
If not found, it will partially apply x to the original f function, and store the result in table as an (x, y) pair, to be used on further calls. It will finally return the computed value just as the orginial f function would have done:
lety=fxintable:=(x,y)::!table;y
A simple timing shows the performance improvements on further calls for the memoized fibonacci memo_fib function:
A couple of differences to notice for this version and the OCaml version are:
I'm not using a list of (x, y) pairs. Instead, I'm using a hashtable (or object) {}. This way, response time for the memoized function will not be proportional to the f function domain (as opposed to the OCaml memo function implementation).
I'm not using a local table variable to store previous calls. Instead, I'm using a property of the f function, f.memo.
In this particular case, the JS memo function behaves like the OCaml memo function for unary functions, since let y = f x for unary functions will evaluate f as opposed to currying the function. However, for (n > 1)-ary functions the OCaml version will return a curried function when called with less formal parameters than "expected".
Lets do some profiling. For this I'll be using the console.time and console.timeEnd firebug methods:
functionmemo(f){returnfunction(x){f.memo=f.memo||{};return(xinf.memo)?f.memo[x]:f.memo[x]=f(x);};}functionfib(x){if(x<2)return1;elsereturnfib(x-1)+fib(x-2);}varmemo_fib=memo(fib);//first callconsole.time("first call");console.log(memo_fib(30));console.timeEnd("first call");//console will output://first call: 17264ms//1346269//second call (memoized)console.time("memoized call");console.log(memo_fib(30));console.timeEnd("memoized call");//console will output://memoized call: 4ms//1346269
Beyond unary functions memoization in JavaScript
The Ocaml and JavaScript versions of memo lack support for (n>1)-ary functions.
If the OCaml memo function is applied to a binary function, it will only memoize the first partial application for this function.
Lets define a function sum_fib and memo_sum_fib:
This problem happens because the memoization happens at the first partial application level. That means that the table structure will hold (int, int -> int) elements, as opposed of a mapping from a pair of formal parameters to the returned value: (int * int, int).
In JavaScript we have a bigger problem. Since partial application is not a "natural" feature of the language and the memo function is not designed to handle n-ary functions, this will lead to an error or unexpected results. At least the OCaml version returned the expected values :P.
In JavaScript we can solve this problem by using the arguments object as the key to our f.memo hashtable. Our new memo function would now look like this:
Not only this function supports n-ary functions to be memoized, but also it performs a correct memoization, in the sense that it will store all formal parameters in f.memo, as the key to the value returned by this function.
Lets do some profiling!
functionmemo(f){returnfunction(){varargs=Array.prototype.slice.call(arguments);f.memo=f.memo||{};return(argsinf.memo)?f.memo[args]:f.memo[args]=f.apply(this,args);};}functionfib(x){if(x<2)return1;elsereturnfib(x-1)+fib(x-2);}functionsum_fib(a,b){returnfib(a)+fib(b);}varmemo_sum_fib=memo(sum_fib);console.time("first call");console.log(memo_sum_fib(20,30));console.timeEnd("first call");//console will output://first call: 17165ms//1357215console.time("memoized call");console.log(memo_sum_fib(20,30));console.timeEnd("memoized call");//console will output://memoized call: 5ms//1357215
Finally, a nice trick you can do is to call the memoized function the same a the function passed in as a formal parameter. Repeating the last example will show a lot of improvements!
functionmemo(f){returnfunction(x){f.memo=f.memo||{};return(xinf.memo)?f.memo[x]:f.memo[x]=f(x);};}functionfib(x){if(x<2)return1;elsereturnfib(x-1)+fib(x-2);}varfib=memo(fib);//first callconsole.time("first call");console.log(fib(30));console.timeEnd("first call");//console will output://first call: 7ms//1346269//second call (memoized)console.time("memoized call");console.log(fib(30));console.timeEnd("memoized call");//console will output://memoized call: 5ms//1346269
Any critique or comment will be well appreciated.
Bye!.