Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Using axis lines for good or evil, published by dynomight on March 6, 2024 on LessWrong.
Say you want to plot some data. You could just plot it by itself:
Or you could put lines on the left and bottom:
Or you could put lines everywhere:
Or you could be weird:
Which is right? Many people treat this as an aesthetic choice. But I'd like to suggest an unambiguous rule.
Principles
First, try to accept that all axis lines are optional. I promise that readers will recognize a plot even without lines around it.
So consider these plots:
Which is better? I claim this depends on what you're plotting. To answer, mentally picture these arrows:
Now, ask yourself, are the lengths of these arrows meaningful? When you draw that horizontal line, you invite people to compare those lengths.
You use the same principle for deciding if you should draw a y-axis line. Ask yourself if people should be comparing the lengths:
Years vs. GDP
Suppose your data is how the GDP of some country changed over time, so the x-axis is years and the y-axis is GDP.
You could draw either axis or not. So which of these four plots is acceptable?
Got your answers? Here's a key:
Why?
GDP is an absolute quantity. If GDP doubles, then that means something. So readers should be thinking about the distance between the curve and the x-axis.
But 1980 is arbitrary. When comparing 2020 to 2000, all that matters is that they're 20 years apart. No one cares that "2020 is twice as far from 1980 as 2000" because time did not start in 1980.
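To make the asymmetry concrete, here is a small arithmetic sketch. The helper `distance_ratio`, the alternative starting year 1970, and the two GDP values are all made up for illustration. It shows that "how many times farther from the axis" changes when you move the starting year, while the gap between years does not, and that the same ratio against zero GDP is a real statement about doubling.

```python
def distance_ratio(a, b, origin):
    """How many times farther a is from origin than b is."""
    return (a - origin) / (b - origin)


# Years: the ratio depends entirely on where the plot happens to start.
ratio_from_1980 = distance_ratio(2020, 2000, origin=1980)  # 40 / 20
ratio_from_1970 = distance_ratio(2020, 2000, origin=1970)  # 50 / 30 (hypothetical start)

# The gap between the two years is the same whatever the starting year is.
gap_years = 2020 - 2000

# GDP: zero is a real baseline. These two GDP values are hypothetical.
gdp_2000, gdp_2020 = 1.5e12, 3.0e12
gdp_ratio = distance_ratio(gdp_2020, gdp_2000, origin=0)

print(ratio_from_1980, ratio_from_1970, gap_years, gdp_ratio)
```

The year ratio is an artifact of the chosen origin, so a y-axis line at 1980 would invite a comparison that means nothing. The GDP ratio survives because its origin is not a choice.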
Years vs. GDP again
Say you have years and GDP again, except all the GDP numbers are much larger - instead of varying between 0 and $3T, they vary between $50T and $53T.
What to do? In principle you could stretch the y-axis all the way down to zero.
But that doesn't seem like a good idea - you can barely see anything.
Sometimes you need to start the y-axis at $50T. That's fine. (As long as you're not using a bar chart.) But then, the right answer changes.
The difference is that $50T isn't a meaningful baseline. You don't want people comparing things like (GDP in 1980 - $50T) vs. (GDP in 2000 - $50T) because that ratio doesn't mean anything.
Years vs. temperature
What if the y-axis were temperature? Should you draw a line along the x-axis at zero?
If the temperature is in Kelvin, then probably yes.
If the temperature is in Fahrenheit, then no. No one cares about the difference between the current temperature and the freezing point of some brine that Daniel Fahrenheit may or may not have made.
If the temperature is in Celsius, then maybe. Do it if the difference from the freezing point of water is important.
Of course, if the freezing point of water is critical and you're using Fahrenheit, then draw a line at 32°F. Zero and one are the most common useful baselines, but use whatever is meaningful.
(Rant about philosophical meaning of "0" and "1" and identity elements in mathematical rings redacted at strenuous insistence of test reader.)
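A quick worked example of why the unit matters, as a sketch with two hypothetical readings (20°C and 10°C). The conversion formulas are the standard ones; the helper names are made up. The same pair of temperatures gives a different "ratio to zero" in each unit, and only the Kelvin one compares distances from a physically meaningful zero.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


def c_to_k(celsius):
    """Convert Celsius to Kelvin."""
    return celsius + 273.15


# Two hypothetical readings of the same quantity.
warm_c, cool_c = 20.0, 10.0

ratios = {
    "celsius": warm_c / cool_c,                     # distance from water's freezing point
    "fahrenheit": c_to_f(warm_c) / c_to_f(cool_c),  # distance from 0°F, an arbitrary point
    "kelvin": c_to_k(warm_c) / c_to_k(cool_c),      # distance from absolute zero
}

for unit, ratio in ratios.items():
    print(f"{unit}: {ratio:.4f}")
```

A line at zero invites readers to eyeball exactly these ratios, so it only belongs on the plot when the zero of your unit is something they should care about.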
Homeowners vs. cannabis
Sometimes you should put lines at the ends of axes, too. Say the x-axis is the fraction of homeowners in different counties, and the y-axis is support for legal cannabis:
Should you draw axis lines? Well, comparisons to 0% are meaningful along both axes. So it's probably good to add these lines:
But comparisons to 100% are also meaningful. So in this case, you probably want a full box around the plot.
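The rule so far can be written down as a tiny helper. This is only a sketch of the idea, not a plotting recipe: the `Axis` class and `frame_sides` function are hypothetical names, and the axis limits are assumptions taken loosely from the examples above. Each axis lists the values readers should compare against, and a side of the frame gets drawn only when the plot's edge sits exactly on one of those values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axis:
    lo: float              # lower limit of the plotted range
    hi: float              # upper limit of the plotted range
    baselines: tuple = ()  # values that distances should be compared against


def frame_sides(x, y):
    """Return the set of frame sides worth drawing for the given axes."""
    sides = set()
    # Horizontal lines sit at a y value, so they depend on the y-axis baselines.
    if y.lo in y.baselines:
        sides.add("bottom")
    if y.hi in y.baselines:
        sides.add("top")
    # Vertical lines sit at an x value, so they depend on the x-axis baselines.
    if x.lo in x.baselines:
        sides.add("left")
    if x.hi in x.baselines:
        sides.add("right")
    return sides


years = Axis(1980, 2020)  # no year is a meaningful baseline

# GDP from 0 to $3T: zero is meaningful and sits on the bottom edge.
gdp = frame_sides(years, Axis(0, 3e12, baselines=(0,)))

# GDP from $50T to $53T: zero is still meaningful, but it is off the plot.
truncated = frame_sides(years, Axis(50e12, 53e12, baselines=(0,)))

# Fractions from 0 to 1: both 0% and 100% are meaningful on both axes.
fraction = Axis(0, 1, baselines=(0, 1))
homeowners = frame_sides(fraction, fraction)

print(sorted(gdp), sorted(truncated), sorted(homeowners))
```

If you happen to use matplotlib, the result could be applied by looping over the four spine names and calling `ax.spines[side].set_visible(side in sides)`. A meaningful value that falls inside the plotted range, like 32°F, is not a frame side at all and would need its own reference line.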
Lines can also be used for evil
Lots of people hate the Myers-Briggs personality test and suggest that you should use a test created by academic psychologists, like the Big Five, instead. I've long held that this is misguided: if you take the Myers-Briggs scores (without discretizing them into categories), they're almost equivalent to the Big Five without neuroticism, or a "Big Four".
So I was excited to see some recent research that tests t...