Visualizing Organizational Structure From Communications Data

This is a teaching example that illustrates an intersection of physics, math, and computer science. It explains the concepts behind a force-directed graph, then uses one to generate an illustration of organizational structure based on the communications within the organization. It shows how significant amounts of information can be extracted from existing collections of data, sometimes without much additional effort, and it can easily segue into a discussion of privacy and ethics.

Email and other communications archiving is increasingly widespread. This presents an underappreciated opportunity for data mining and visualization. Drawing an analogy between communications patterns and organizational structure yields a quick and interesting visualization. To create this visualization, we map the communications patterns to a physical model, then simulate the physical model using SVG and JavaScript.

The Model

Start with the assumption that the more communication there is between two people, the closer they are within an organization, or at least the more they informally influence each other. Model this physically by putting a spring between the two people: the more communications there are, the stronger the spring.

But we don't want the model to collapse, which is what it would do if we had only springs. The next component, then, is to put an electric charge on each person. Remember that like charges repel, so this spreads out the model. The springs pull people together, and the charges push them apart.

Finally, we add friction so the model will stop eventually.

The Math

The conceptual model is easily translated into a mathematical model, which is in turn translated into a computational algorithm.

The Springs

Springs are governed by Hooke's law, $F = -kr$, where $r$ is the distance between the two parties. In this model, $k$ is the number of messages exchanged between the two parties.

Springs pull the parties together, with equal and opposite forces on each.

The total force on each party due to all the other parties is the sum of the spring forces from all the other parties: $F_i = -k \sum_{j \ne i} \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2}$, where the location of party $j$ is $(x_j, y_j)$.
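The sum above gives a magnitude; a simulation also needs a direction for each spring. The following is a minimal sketch, assuming each party is an object with `x` and `y` fields and that a lookup function `k(i, j)` returns the message count for a pair. Both names are illustrative and not taken from the original code, and `k` is looked up per pair, as the definition of k suggests.

```javascript
// Hypothetical layout: parties[i] = { x, y }; k(i, j) returns the number of
// messages exchanged between parties i and j (0 if there were none).
function springForce(parties, k, i) {
  let fx = 0;
  let fy = 0;
  for (let j = 0; j < parties.length; j++) {
    if (j === i) continue;
    // Displacement from party i toward party j.
    const dx = parties[j].x - parties[i].x;
    const dy = parties[j].y - parties[i].y;
    // Hooke's law with zero rest length: magnitude k*r along the unit vector
    // (dx/r, dy/r), which simplifies to (k*dx, k*dy). The force on party i
    // points toward party j, so the spring is attractive.
    fx += k(i, j) * dx;
    fy += k(i, j) * dy;
  }
  return { fx, fy };
}
```

Because the force on party i from party j is the exact negative of the force on party j from party i, the "equal and opposite" property falls out of the displacement sign.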

The Charges

The force between two charges is governed by Coulomb's law, $F = k_e \frac{q_1 q_2}{r^2}$, where $k_e$ is Coulomb's constant.

Charges push the parties apart, with equal and opposite forces on each.

The electric force on each party is the sum of the forces due to each of the other parties: $F_i = k_e \sum_{j \ne i} \frac{q_i q_j}{(x_j - x_i)^2 + (y_j - y_i)^2}$
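As with the springs, the repulsion needs a direction in code. Here is a sketch under the assumption that each party carries a charge field `q` and that `ke` is simply a tunable constant rather than the physical Coulomb constant. The function and field names are hypothetical.

```javascript
// Hypothetical layout: parties[i] = { x, y, q }; ke is a tunable constant.
function electricForce(parties, ke, i) {
  let fx = 0;
  let fy = 0;
  for (let j = 0; j < parties.length; j++) {
    if (j === i) continue;
    // Displacement pointing away from party j, since like charges repel.
    const dx = parties[i].x - parties[j].x;
    const dy = parties[i].y - parties[j].y;
    const r2 = dx * dx + dy * dy;
    // Two parties at exactly the same point have no defined direction; skip.
    if (r2 === 0) continue;
    const r = Math.sqrt(r2);
    // Coulomb's law: magnitude ke * qi * qj / r^2 along the unit vector.
    const magnitude = (ke * parties[i].q * parties[j].q) / r2;
    fx += (magnitude * dx) / r;
    fy += (magnitude * dy) / r;
  }
  return { fx, fy };
}
```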

The Total Force

The total force on each party is the sum of these two forces. The figure below shows this schematically, with curves for the electric force, the spring force, and, in red, the sum of the two. It is the fundamentally different shape of these curves that makes force-directed graphs work. At a large distance the spring force dominates and pulls the parties together. At a short distance the electric force dominates and pushes the parties apart. The neutral position, where the red line crosses the x axis and the parties can come to rest, lies at some intermediate distance.

The total force on a party is the sum of the spring and electric forces.
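For a single isolated pair, the neutral position can be worked out directly: setting the spring pull $kr$ equal to the electric push $k_e q_1 q_2 / r^2$ gives $r = \sqrt[3]{k_e q_1 q_2 / k}$. The sketch below assumes a zero rest-length spring and only two parties; the function names are illustrative.

```javascript
// Net radial force between two parties at distance r.
// Positive means they are pushed apart, negative means pulled together.
function netForce(r, k, ke, q1, q2) {
  return (ke * q1 * q2) / (r * r) - k * r;
}

// Distance at which the spring pull and the electric push cancel,
// found by solving k*r = ke*q1*q2 / r^2 for r.
function restDistance(k, ke, q1, q2) {
  return Math.cbrt((ke * q1 * q2) / k);
}
```

Since k sits in the denominator, a pair that exchanges more messages settles at a shorter distance, which is the behavior the visualization relies on.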


We add one more force, friction, which acts only while the parties are in motion: $F = -bv$. This force points in the direction opposite to the motion and slows the parties down. Without it, the parties would remain in motion forever and never settle into a stationary graph.
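To show how friction enters the simulation, here is a sketch of a single time step using semi-implicit Euler integration. The integration scheme, the `mass` and `dt` parameters, and the `vx`/`vy` field names are assumptions made for illustration; the original code may advance the system differently.

```javascript
// Advance one party by one time step. `force` is the combined spring and
// electric force { fx, fy }; b is the friction coefficient from F = -bv.
function step(party, force, b, mass, dt) {
  // Friction opposes the current velocity.
  const ax = (force.fx - b * party.vx) / mass;
  const ay = (force.fy - b * party.vy) / mass;
  // Semi-implicit Euler: update the velocity first, then move with it.
  party.vx += ax * dt;
  party.vy += ay * dt;
  party.x += party.vx * dt;
  party.y += party.vy * dt;
  return party;
}
```

With no other force acting, each call shrinks the velocity by a factor of (1 - b*dt/mass), so a moving party coasts to a stop.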

A Live Example

Let's try a live example so we can see all this in action. Start with a group of people, and the count of messages they have sent to each other. Message counts less than 10 have been dropped for clarity.

Person | Person | Message Count
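A table like this can be turned into springs with a small amount of code. The sketch below uses made-up records (only the name Mary appears in the text; the other names and all counts are invented for illustration) and assumes the threshold of 10 applies to each pair's combined total, which the original does not specify.

```javascript
// Hypothetical records; the counts are invented for illustration only.
const messages = [
  { from: "Mary", to: "Alice", count: 42 },
  { from: "Alice", to: "Mary", count: 17 },
  { from: "Bob", to: "Alice", count: 4 },
];

// Combine both directions of each pair into one spring and drop weak pairs.
// Assumes names never contain the "|" separator used for the map key.
function buildSprings(records, minCount = 10) {
  const totals = new Map();
  for (const { from, to, count } of records) {
    // Sort the names so A->B and B->A share a key.
    const key = [from, to].sort().join("|");
    totals.set(key, (totals.get(key) || 0) + count);
  }
  const springs = [];
  for (const [key, k] of totals) {
    if (k < minCount) continue; // dropped for clarity
    const [a, b] = key.split("|");
    springs.push({ a, b, k });
  }
  return springs;
}
```

Each resulting `{ a, b, k }` entry describes one spring, with `k` serving as the spring constant.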

Each person is positioned randomly, and we see them move under the influence of the springs and charges, eventually coming to a stop due to friction.

An example force-directed graph providing a visualization of the dynamics of the system as it settles into a stable configuration.
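The settling behavior can be reproduced without any drawing code. The following sketch runs the whole loop: random placement, spring and electric forces, friction, and a stopping test. Every default value (`ke`, `q`, `friction`, `dt`, `size`, `minDist`, `eps`, `maxSteps`), the unit mass, and the distance clamp are assumptions chosen for illustration, not values from the original.

```javascript
// springs: array of { a, b, k } where a and b are names and k is the
// message count. Returns the parties with their final positions.
function simulate(names, springs, opts = {}) {
  const {
    ke = 1000, q = 1, friction = 2, dt = 0.01, size = 100,
    minDist = 1, eps = 1e-4, maxSteps = 100000, random = Math.random,
  } = opts;
  const index = new Map(names.map((name, i) => [name, i]));
  // Random starting positions, everyone initially at rest.
  const parties = names.map((name) => ({
    name, x: random() * size, y: random() * size, vx: 0, vy: 0,
  }));
  for (let n = 0; n < maxSteps; n++) {
    const fx = new Array(parties.length).fill(0);
    const fy = new Array(parties.length).fill(0);
    // Springs pull each connected pair together (equal and opposite).
    for (const { a, b, k } of springs) {
      const i = index.get(a);
      const j = index.get(b);
      const dx = parties[j].x - parties[i].x;
      const dy = parties[j].y - parties[i].y;
      fx[i] += k * dx; fy[i] += k * dy;
      fx[j] -= k * dx; fy[j] -= k * dy;
    }
    // Charges push every pair apart (equal and opposite).
    for (let i = 0; i < parties.length; i++) {
      for (let j = i + 1; j < parties.length; j++) {
        const dx = parties[i].x - parties[j].x;
        const dy = parties[i].y - parties[j].y;
        // Clamp the distance so near-coincident parties do not blow up;
        // below minDist the push is softened rather than exact.
        const r = Math.max(Math.hypot(dx, dy), minDist);
        const push = (ke * q * q) / (r * r);
        fx[i] += (push * dx) / r; fy[i] += (push * dy) / r;
        fx[j] -= (push * dx) / r; fy[j] -= (push * dy) / r;
      }
    }
    // Apply friction and take one semi-implicit Euler step (unit mass).
    let busiest = 0;
    parties.forEach((p, i) => {
      p.vx += (fx[i] - friction * p.vx) * dt;
      p.vy += (fy[i] - friction * p.vy) * dt;
      p.x += p.vx * dt;
      p.y += p.vy * dt;
      busiest = Math.max(
        busiest, Math.hypot(p.vx, p.vy), Math.hypot(fx[i], fy[i]));
    });
    // Stop once nothing is moving and no net force remains.
    if (busiest < eps) break;
  }
  return parties;
}
```

A call such as `simulate(["Mary", "Alice"], [{ a: "Mary", b: "Alice", k: 10 }])` returns the two parties at their resting positions, which could then be drawn as SVG circles and lines. Very large message counts would need a smaller `dt` to keep this simple integrator stable.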

When the animation comes to a stop, the layout of the parties is determined by the level of communications. Closer parties are more closely associated with each other. The presence of the charges ensures that the graph is spread out and easy to view and interpret.

We quickly see one person at the center, Mary, the director of development. If we look at the code we see that she has been given a pair of red shoes. The ends of the barbell distribution are two separate development groups under the director.