Although I am a computer programmer, I also studied Mathematics in college. Naturally, I was excited when a computer program I was working on involved some “complex” math (don’t worry, no integrals or infinite series ahead).
I was trying to display a tag cloud on my company’s website. However, there was a problem: the tag “Ozone” appeared about 100 times more often than almost any other tag. When I tried to display the tags using a linear scale, the results looked like this:
What I tried
My first thought was to adjust the linear scale. To make the other tags visible, I could just increase the slope of the equation, so instead of using something like y = 1x + 2, I would use y = 5x + 2. Here is what I got:
Once I realized the linear transformation wouldn’t work, my next thought was logarithms. When I graphed the data, it fit a power curve quite nicely (R² of 0.917). Since the data fit a power curve, I was pretty sure I could make logarithms work. However, I had a few other constraints. To keep the tag cloud a consistent size, I needed the maximum of the equation to be 1 and the minimum to be 1/3.
Starting Equation: y = log(x)
Adjusting the Maximum
To bring the maximum value down to 1, I just divided by the log of the biggest value in the data set:
y = log(x)/log(xmax)
Adjusting the Minimum
To adjust the minimum, I just performed a simple linear transformation on my previous result, which I’ll call f(x):
y = f(x)/1.5 + 1/3
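As a quick sanity check, here are the two endpoints plugged in. This assumes the least-used tag appears exactly once, so that log(x) = 0 and f(x) = 0 at the bottom of the range. If the smallest count were larger, the minimum would land somewhat above 1/3.

```latex
\begin{align*}
x = x_{\max}: \quad & y = \frac{1}{1.5} + \frac{1}{3} = \frac{2}{3} + \frac{1}{3} = 1 \\
x = 1: \quad & y = \frac{0}{1.5} + \frac{1}{3} = \frac{1}{3}
\end{align*}
```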
This gave me a final equation of
y = (log(x)/log(xmax))/1.5 + 1/3
The results were exactly what I was looking for.
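For anyone who wants to try this, the final equation translates to code roughly as follows. This is a minimal sketch in Python, not the code from the actual site. The function name `tag_weight` and the sample counts are made up for illustration, and it assumes every tag appears at least once and the most common tag appears more than once.

```python
import math


def tag_weight(count, max_count):
    """Scale a tag's usage count into the range [1/3, 1].

    Implements y = log(x)/log(xmax)/1.5 + 1/3.
    Assumes count >= 1 and max_count > 1 so both logarithms are usable.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if max_count <= 1:
        raise ValueError("max_count must be greater than 1")
    return math.log(count) / math.log(max_count) / 1.5 + 1 / 3


# Hypothetical tag counts, chosen only to illustrate the scaling.
counts = {"Ozone": 1000, "Smog": 10, "Rain": 1}
max_count = max(counts.values())

weights = {tag: tag_weight(c, max_count) for tag, c in counts.items()}

for tag, weight in weights.items():
    print(f"{tag}: {weight:.3f}")
```

The weight can then be multiplied by whatever the largest font size should be. Since the formula is a ratio of two logarithms, the base of the logarithm does not matter.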