Logarithms on the Web: Using Mathematics to Display a Tag Cloud

A Logarithm in Creation

Although I am a computer programmer, I also studied Mathematics in college. Naturally, I was excited when a computer program I was working on involved some “complex” math (don’t worry, no integrals or infinite series ahead).

The Problem

The tags using a basic linear scale. The other tags are so small you can’t see them.

I was trying to display a tag cloud on my company’s website. However, there was a problem: the tag “Ozone” appeared about 100 times more often than almost any other tag. When I tried to display them using a linear scale, the results looked like this:

What I tried

A linear transformation. The “Ozone” tag still clobbers all the other tags.

My first thought was to adjust the linear scale. To make the other tags visible, I could just increase the slope of the equation, so instead of using something like y = 1x + 2, I would use y = 5x + 2. Here is what I got:
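The outcome is easy to check numerically. The sketch below uses made-up tag counts (only the roughly 100-to-1 ratio comes from this post) and a hypothetical helper, relative_sizes, that sizes each tag with y = slope*x + intercept and then divides by the largest size. A steeper slope scales every tag by about the same factor, so the small tags stay just as tiny relative to “Ozone”.

```python
def relative_sizes(counts, slope, intercept=2):
    """Size each tag with y = slope*x + intercept, relative to the largest tag."""
    sizes = {tag: slope * n + intercept for tag, n in counts.items()}
    largest = max(sizes.values())
    return {tag: size / largest for tag, size in sizes.items()}


# Hypothetical counts for illustration; not the real data from the site.
counts = {"Ozone": 500, "Climate": 5, "Energy": 4}

for slope in (1, 5):
    print(slope, relative_sizes(counts, slope))
```

With these numbers the smaller tags end up at under 2% of the size of the largest tag for both slopes. The steeper slope actually makes them slightly smaller in relative terms, because the intercept matters less.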

The Solutions

Notice the data fits a power curve pretty well.

Once I realized the linear transformation wouldn’t work, my next thought was logarithms. When I graphed the data, it fit a power curve quite nicely (R² of 0.917). Since the data fit a power curve, I was pretty sure I could make logarithms work. However, I had a few other constraints. In order to keep the tag cloud a consistent size, I needed the maximum of the equation to be 1 and the minimum to be 1/3.

Starting Equation: y = log(x)

Adjusting the Maximum

To bring the maximum value down to 1, I just divided by the log of the biggest value in the data set:

y = log(x) / log(x_max)
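Here is a minimal sketch of that normalization in Python. The function name log_normalized and the sample counts are my own choices for illustration. It assumes every count is at least 1 and the biggest count is greater than 1, so that neither logarithm causes trouble.

```python
import math


def log_normalized(count, max_count):
    """Log of the count divided by the log of the biggest count.

    The base of the logarithm does not matter here, because it cancels
    out in the ratio.
    """
    return math.log(count) / math.log(max_count)


# Hypothetical counts: the biggest tag and a much smaller one.
print(log_normalized(500, 500))
print(log_normalized(5, 500))
```

The biggest tag comes out at exactly 1, and a tag used only once comes out at 0, since log(1) = 0. A tag with 5 uses out of a maximum of 500 lands at about 0.26, compared with 0.01 on a purely proportional scale.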

Adjusting the Minimum

Tag Cloud

The final result; much better than before!

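To adjust the minimum, I just performed a simple linear transformation on my previous result, which I’ll call f(x). It is 0 for a tag used once (since log(1) = 0) and 1 for the most-used tag, so dividing by 1.5 and adding 1/3 maps that range onto 1/3 to 1: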

y = f(x)/1.5 + 1/3

This gave me a final equation of

y = (log(x) / log(x_max)) / 1.5 + 1/3

The results were exactly what I was looking for :)
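For anyone who wants to try this, the whole transformation can be sketched in a few lines of Python. The names tag_weight and min_weight, the sample counts, and the 36px maximum font size are all assumptions for illustration, not what my site actually runs. I wrote the last step as min_weight + (1 - min_weight) * f, which for min_weight = 1/3 is the same as f/1.5 + 1/3 but makes it easier to pick a different minimum.

```python
import math


def tag_weight(count, max_count, min_weight=1/3):
    """Map a tag count (assumed >= 1) to a weight between min_weight and 1."""
    if max_count <= 1:
        # Assumption: if no tag is used more than once, log(max_count) is 0,
        # so just give every tag the full weight instead of dividing by zero.
        return 1.0
    f = math.log(count) / math.log(max_count)  # 0 for count 1, 1 for the max
    return min_weight + (1 - min_weight) * f


# Hypothetical counts and a hypothetical maximum font size in pixels.
counts = {"Ozone": 500, "Climate": 5, "Energy": 1}
max_count = max(counts.values())
for tag, n in counts.items():
    print(tag, round(36 * tag_weight(n, max_count)))
```

The most-used tag gets a weight of 1, a tag used once gets 1/3, and everything else falls in between.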

If you would like to see the data for yourself, here is a spreadsheet in ODF format that contains the data, the equations, and the graph: Log-Transformations.ods (you can download a viewer here).

2 Responses to Logarithms on the Web: Using Mathematics to Display a Tag Cloud

  1. Ben Babcock says:

    Awesome post. I found this extremely helpful while pondering how to generate a tag cloud of my own. I suspected that other mathematicians like me had already tackled the problem and turned to Google before spending too much time fitting my tags to a distribution.

    I saw some excruciatingly tortured examples of how to get a weighting. Your solution is both easy to understand and elegant, and very easy to customize given different constraints.

    Thanks for sharing it!

