Chris Harrison – ClusterBall

My friend Luke posted this on his blog, but I have to repost it. Chris Harrison has what he thinks may be a novel way to do visual clustering for Wikipedia category pages, called ClusterBall. It looks like he first arranges a bunch of nodes around a circle, then arranges the nodes that can float inside the circle, with a parent node fixed at the center. (It reminds me a little of a method that Matt Hurst referenced, where you try to find a minimum spanning tree, or backbone, to get from one node to another. You lay out areas of the graph locally first, then arrange those local areas on your spanning tree.)


This visualization shows the structure of three levels of Wikipedia category pages and their interconnections. Centered in the graph is a parent node. Pages that are linked from this parent node are rendered inside the ball. Finally, pages that are linked to the latter (secondary) nodes are rendered on the outer ring. Links between category pages are illustrated by edges, which are color coded to represent their depth from the parent node. Nodes are clustered such that edge lengths are minimized. This forces highly connected groups of pages to clump together, essentially forming topical groups. The center acts as an anchor while the ring provides a fixed perimeter. This allows the secondary, super-categories to “float” above clusters.
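To make the "anchor plus fixed perimeter" idea concrete, here is a rough Python sketch of how I imagine such a layout could work. This is my own guess, not Chris Harrison's actual algorithm: the function name `clusterball_layout` and its parameters are made up, the ring order is simply taken as given (the real thing presumably also orders the ring to shorten edges), and I swap in plain neighbor averaging, which minimizes the sum of squared edge lengths for the floating nodes rather than whatever measure ClusterBall really uses.

```python
import math

def clusterball_layout(parent, children, ring, edges, iterations=200):
    """Hypothetical sketch: pin the parent at the center, pin ring nodes
    on the unit circle, and let the child nodes float between them."""
    pos = {parent: (0.0, 0.0)}              # the anchor
    for i, node in enumerate(ring):         # the fixed perimeter
        angle = 2 * math.pi * i / len(ring)
        pos[node] = (math.cos(angle), math.sin(angle))
    for node in children:                   # floating nodes start at the center
        pos[node] = (0.0, 0.0)

    # Build an undirected adjacency list from the edge pairs.
    neighbors = {node: [] for node in pos}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Repeatedly move each floating node to the centroid of its neighbors.
    # With the anchor and ring held fixed, this pulls each child toward
    # the part of the ring it is most connected to.
    for _ in range(iterations):
        for node in children:
            nbrs = neighbors[node]
            if not nbrs:
                continue
            x = sum(pos[n][0] for n in nbrs) / len(nbrs)
            y = sum(pos[n][1] for n in nbrs) / len(nbrs)
            pos[node] = (x, y)
    return pos
```

Calling it with a parent, a list of second-level categories, a list of third-level categories, and a list of (page, page) link pairs returns a dictionary of x, y coordinates. A child linked mostly to one stretch of the ring ends up hovering near that stretch, which is roughly the "float above clusters" behavior described above.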

