Hi! I'm Nicolas and I'm interested in information visualization, JavaScript and web standards.
I currently work as a Data Visualization Scientist at Twitter. I wrote PhiloGL, the JavaScript InfoVis Toolkit and V8-GL.
I'm currently working on three new visualizations for the JavaScript InfoVis Toolkit that will be added in version 1.2.
I started the development of these new visualizations a couple of weeks ago, and while trying to incorporate these new visualizations harmoniously into the Toolkit I've found myself considering new design and code abstractions I haven't thought about before. That's good for the Toolkit, and also good for my brain :)
One of the visualizations I'll be incorporating in the next major release is a Force-Directed visualization. I first want to thank Marcus Cobden for donating a lot of code related to this algorithm that you can find here.
My code is based in his donation and also in a nice overview of Force-Directed layout algorithms that I've found here.
So what are Force-Directed Layout Algorithms?
Force-Directed Layout algorithms are graph drawing algorithms based only on information contained within the structure of the graph itself rather than relying on contextual information. The most straightforward Force-Directed algorithm uses repulsive forces between nodes and attractive forces between adjacent nodes. This physical model produces some local minima in which the graph, well, gets to a stable configuration and is drawn in an aesthetically pleasing way.
The main algorithm consists on a main loop that simulates the system for some iterations and then plots the graph. The system will consist on repulsive forces between all graph nodes and attractive forces between adjacent nodes. The force models considered correspond to Hooke's law and Coulomb's law.
Here's an example with some JSON data I also use for other examples at the demos page:
There are lots of refinements and different methods for making Force-Directed Layouts. Some are just based on the model I described before, others are based on what's called Graph Theoretic Distances: If for two adjacent nodes their ideal distance is set as d, then for nodes with shortest path distance of n their ideal layout distance should be n * d. Attractive and repulsive forces are used to make a graph system with random nodes positions get to a local minima based on these rules.
I implemented the first model I described. For each iteration the computePositionStep method is called to update all nodes positions based on attractive and repulsive forces. The formal parameters passed to the method are property which is an array of node properties which contain positions to be updated and opt which is an object containing layout options like generic repulsive and attractive force functions. $C creates a new Complex Number.
computePositionStep:function(property,opt){vargraph=this.graph,GUtil=Graph.Util;varmin=Math.min,max=Math.max;vardpos=$C(0,0);//calculate repulsive forcesGUtil.eachNode(graph,function(v){//initialize disp$each(property,function(p){v.disp[p].x=0;v.disp[p].y=0;});GUtil.eachNode(graph,function(u){if(u.id!=v.id){$each(property,function(p){varvp=v.getPos(p),up=u.getPos(p);dpos.x=vp.x-up.x;dpos.y=vp.y-up.y;varnorm=dpos.norm()||1;v.disp[p].$add(dpos.$scale(opt.nodef(norm)/norm));});}});});//calculate attractive forcesvarT=!!graph.getNode(this.root).visited;GUtil.eachNode(graph,function(node){GUtil.eachAdjacency(node,function(adj){varnodeTo=adj.nodeTo;if(!!nodeTo.visited===T){$each(property,function(p){varvp=node.getPos(p),up=nodeTo.getPos(p);dpos.x=vp.x-up.x;dpos.y=vp.y-up.y;varnorm=dpos.norm()||1;node.disp[p].$add(dpos.$scale(-opt.edgef(norm)/norm));nodeTo.disp[p].$add(dpos.$scale(-1));});}});node.visited=!T;});//arrange positions to fit the canvasvart=opt.t,w2=opt.width/2,h2=opt.height/2;GUtil.eachNode(graph,function(u){$each(property,function(p){vardisp=u.disp[p];varnorm=disp.norm()||1;varp=u.getPos(p);p.$add($C(disp.x*min(Math.abs(disp.x),t)/norm,disp.y*min(Math.abs(disp.y),t)/norm));p.x=min(w2,max(-w2,p.x));p.y=min(h2,max(-h2,p.y));});});}
You can find more information about this algorithm in this overview of Force-Directed algorithms.
This method is called multiple times to provide a final layout, for example like this:
Where times corresponds to the number of iterations of the simulation, and is generally > 50 and t can be thought as the global temperature of a system that gets cooler with each iteration.
Performance
Although there are a couple of methods to reduce the complexity of this algorithm, this implementation runs like O(V^3), which can be considered as really bad performance.
This is not desirable for real-time interactive visualizations, in which everlasting computation algorithms can lead to blocking browser popups like "This Script is Taking too long..." or things like that.
One way you can avoid these popups is by making incremental computations. This can be done by splitting the main algorithm that has for example 100 iterations into smaller pieces of 20 iterations each. The main iteration loop could then be changed into something like this:
This method could be called for example like this:
//compute positions incrementally and animate.fd.computePositions(property,opt,{iter:20,onStep:function(perc){Log.write(perc+'% loaded...');},onComplete:function(){Log.write('done');fd.animate();}});
And the main computation would be split into smaller parts calling on each step of the computation the onStep callback.
Graph Operations
For graph operations (Adding/Removing nodes and edges, Graph Sum and Graph Morphing) I make all iterations in the first layout and then only apply 20 iterations after some operation has been completed. Here's a video:
When (*not*) to use Force-Directed Layouts
Force-Directed Layouts can be a good choice when you're drawing general graphs and you don't have domain specific (or any other topological) information about them.
Tree structures however are a special case of graphs: this means that tree structures have more constrains and therefore there's more contextual information for drawing a Tree than for drawing a general graph.
Also, Trees are easier to grasp for the human eye than "general graphs". There's that intrinsic notion of hierarchy that can be used to draw aesthetically pleasing graph layouts. Think about the "general graph" layouts that start with the drawing of a spanning tree and then add the "extra edges" to complete the graph structure (for example here and here). The Multitrees method I mentioned a couple of posts ago was also developed to try to extract more contextual information about a graph and draw it based on a notion of partial hierarchy.
Because of this, in my opinion, it wouldn't be wise to plot Trees that have many nodes with this Force-Directed algorithm: this extra-information about trees isn't used in this layout. However, if you're planning on drawing general graphs and you don't have any other information about them, then this algorithm can detect interesting symmetries and make aesthetically pleasing drawings.
I'll keep you updated with the other visualizations I'm working on in the next posts.