
scatter and pcolormesh: Motivating examples

Olga Botvinnik edited this page Aug 21, 2013 · 5 revisions

scatter

The default matplotlib color cycle is not pretty to look at. What's even worse is that if you just make a scatter plot, it doesn't cycle through the colors at all, so every group comes out in the same color

import matplotlib.pyplot as mpl_plt
import numpy as np
# Set the random seed for consistency
np.random.seed(12)

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    ax.scatter(x, y, label=str(i))
ax.legend()
    
ax.set_title('prettyplotlib `scatter` example\nshowing default matplotlib `scatter`')
fig.savefig('scatter_matplotlib_default.png')

Matplotlib default scatterplot
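Whether `scatter` cycles through colors depends on the installed matplotlib: newer releases do follow the color cycle, while the version this page was written against did not. A minimal sketch for checking your own install (the `Agg` backend, the `RandomState` generator, and the three small groups are choices made for this illustration, not part of the example above):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the check runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(12)
fig, ax = plt.subplots(1)

face_colors = []
for i in range(3):
    x = rng.normal(loc=i, size=50)
    y = rng.normal(loc=i, size=50)
    points = ax.scatter(x, y, label=str(i))
    # get_facecolor() returns an array of RGBA rows; keep the first row
    face_colors.append(tuple(points.get_facecolor()[0]))

# One distinct color means scatter is not cycling; several mean it is.
n_distinct = len(set(face_colors))
print("matplotlib", matplotlib.__version__, "->", n_distinct, "distinct color(s)")
```

If this reports a single distinct color, the explicit per-group colors used in the rest of this page are needed to tell the groups apart.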

Before prettyplotlib: how to make nice plots

Now I'm going to take you through ALL the steps I used to take to make nice looking plots.

First, change the colors with brewer2mpl:

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors
...
color = set2[i]
ax.scatter(x, y, label=str(i), color=color)

The full code is,

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), color=color)
    
fig.savefig('scatter_matplotlib_improved_01_changed_colors.png')

Matplotlib scatter improved 01: changed colors

This looks nice, almost like an impressionist painting, but it's still hard to see overlaps here. So let's fill the symbols with 0.5 opacity using alpha=0.5.

ax.scatter(x, y, label=str(i), color=color, alpha=0.5)

The full code is,

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), color=color, alpha=0.5)
    
fig.savefig('scatter_matplotlib_improved_02_added_alpha.png')

Matplotlib scatter improved 02: added alpha
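To see why `alpha=0.5` makes overlaps visible, it helps to work through the blending arithmetic. The sketch below assumes the renderer uses the standard "over" compositing rule on an opaque white background; the `blend` helper and the approximate green are illustrative, not taken from matplotlib.

```python
def blend(foreground, background, alpha):
    """Composite one RGB color over an opaque one: alpha*fg + (1 - alpha)*bg."""
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, background))

white = (1.0, 1.0, 1.0)
green = (0.4, 0.76, 0.65)  # illustrative, roughly the first Set2 color

# A single half-transparent dot on a white background
one_dot = blend(green, white, 0.5)

# A second half-transparent dot drawn on top of the first
two_dots = blend(green, one_dot, 0.5)

print("one dot: ", one_dot)
print("two dots:", two_dots)
```

Each extra overlapping dot moves the result halfway toward the full color, so dense regions look more saturated than sparse ones.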

This is still pretty lovely and impressionist-y, but I still don't like that it's hard to see where the dots overlap. So let's add a black outline, and specify that color is just the facecolor:

ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor='black',
           facecolor=color)

The full code is,

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor='black', facecolor=color)
    
fig.savefig('scatter_matplotlib_improved_03_added_outline.png')

Matplotlib scatter improved 03: added black outline

Ack, but those lines are too thick ... let's thin them down to linewidth=0.15

ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor='black',
           facecolor=color, linewidth=0.15)

The full code is,

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor='black', facecolor=color, linewidth=0.15)
fig.savefig('scatter_matplotlib_improved_04_thinned_outline.png')

Matplotlib scatter improved 04: thinned out black outline

Now we're getting somewhere. This looks very lovely. Don't you want to just cuddle up with that cute plot?

What are those top and right axes lines really doing for us? They're boxing the data in, but we can do that with our eyes from the other axis lines. So let's remove the top and right axis lines using ax.spines:

# Remove top and right axes lines ("spines")
spines_to_remove = ['top', 'right']
for spine in spines_to_remove:
    ax.spines[spine].set_visible(False)

The full code is,

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor='black', facecolor=color, linewidth=0.15)

# Remove top and right axes lines ("spines")
spines_to_remove = ['top', 'right']
for spine in spines_to_remove:
    ax.spines[spine].set_visible(False)
fig.savefig('scatter_matplotlib_improved_05_removed_top_right_spines.png')

Matplotlib scatter improved 05: removed top and right axis lines
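Since the spine loop gets repeated in every figure, it can be pulled into a small function. This is only a sketch: `hide_spines` is a hypothetical helper name, not something matplotlib provides, and the `Agg` backend is chosen so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the snippet runs without a display
import matplotlib.pyplot as plt


def hide_spines(ax, sides=("top", "right")):
    """Hide the named spines on ax (hypothetical helper, not part of matplotlib)."""
    for side in sides:
        ax.spines[side].set_visible(False)
    return ax


fig, ax = plt.subplots(1)
hide_spines(ax)

# Record which of the four spines are still drawn
visible = {side: spine.get_visible() for side, spine in ax.spines.items()}
print(visible)
```

Passing a different `sides` tuple hides other spines with the same call.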

Oops, but we still have the ticks on the top and right axes. We'll need to get rid of them. Actually, why don't we just get rid of all the ticks altogether? The position of each number already tells us where its value sits, so we don't need an additional tick.

# Get rid of ticks. The position of the numbers is informative enough of
# the position of the value.
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')

Here's the full code:

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor='black', facecolor=color, linewidth=0.15)

# Remove top and right axes lines ("spines")
spines_to_remove = ['top', 'right']
for spine in spines_to_remove:
    ax.spines[spine].set_visible(False)

# Get rid of ticks. The position of the numbers is informative enough of
# the position of the value.
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')
fig.savefig('scatter_matplotlib_improved_06_removed_ticks.png')

Matplotlib scatter improved 06: removed tick marks

Ahh, much better. But we won't stop there. Now we'll tweak the remaining pieces of the figure. For the rest of the spines, let's thin the line down to 0.5 points instead of the default 1.0 points. Also, we'll change it from pure black to a slightly lighter dark grey. Here they are side by side:

# set_facecolor replaces set_axis_bgcolor, which newer matplotlib releases removed
fig, axes = mpl_plt.subplots(2)
axes[0].set_facecolor('black')
axes[0].text(0.5, 0.5, 'black', color='white', fontsize=24, va='center', ha='center')
axes[1].set_facecolor('#262626')
axes[1].text(0.5, 0.5, 'almost black', fontsize=24, color='white', va='center', ha='center')
fig.savefig('black_vs_almost_black.png')

Black versus almost black

So not a huge difference, and the dark grey still looks pretty black, but it's [a little more pleasant on the eyes](http://ianstormtaylor.com/design-tip-never-use-black/) to use a dark grey instead of black. There are very few things in nature that are truly black. Just look at shadows! They're just dark grey, or blue, or red, or purple. But I digress. Back to plotting libraries...
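To put a number on how close '#262626' is to black, the hex code can be decoded by hand. A small sketch follows; `hex_to_rgb` is an illustrative helper written for this page, not a matplotlib function.

```python
def hex_to_rgb(hex_color):
    """Convert '#rrggbb' to a tuple of three integers in 0..255."""
    digits = hex_color.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


almost_black = hex_to_rgb("#262626")   # (38, 38, 38)
black = hex_to_rgb("#000000")          # (0, 0, 0)

# Brightness as a fraction of pure white (255)
grey_fraction = almost_black[0] / 255.0
print(almost_black, "is about %.0f%% of the way from black to white" % (100 * grey_fraction))
```

So "almost black" sits roughly 15% of the way from black to white, which is why the two panels look so similar.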

To change the $x$-axis and $y$-axis line colors, and the outlines of the scatter symbols from black to dark grey, we'll do:

# For remaining spines, thin out their line and change the black to a slightly off-black dark grey
almost_black = '#262626'
...
ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor=almost_black, facecolor=color, linewidth=0.15)
...
spines_to_keep = ['bottom', 'left']
for spine in spines_to_keep:
    ax.spines[spine].set_linewidth(0.5)
    ax.spines[spine].set_color(almost_black)

The full code is,

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

# Save a nice dark grey as a variable
almost_black = '#262626'

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor=almost_black, facecolor=color, linewidth=0.15)

# Remove top and right axes lines ("spines")
spines_to_remove = ['top', 'right']
for spine in spines_to_remove:
    ax.spines[spine].set_visible(False)

# Get rid of ticks. The position of the numbers is informative enough of
# the position of the value.
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')

# For remaining spines, thin out their line and change the black to a slightly off-black dark grey
spines_to_keep = ['bottom', 'left']
for spine in spines_to_keep:
    ax.spines[spine].set_linewidth(0.5)
    ax.spines[spine].set_color(almost_black)
fig.savefig('scatter_matplotlib_improved_07_axis_black_to_almost_black.png')

Matplotlib scatter improved 07: changed axis lines from black to almost black

This is nice. But if you look closely, the text on the axes is still black :( Each piece of text has to be changed separately. For the axis labels (the tick labels are yet another setting), that's done using

# Change the labels to the off-black
ax.xaxis.label.set_color(almost_black)
ax.yaxis.label.set_color(almost_black)

And while we're at it, let's add a title and make it dark grey too.

# Change the axis title to off-black
ax.title.set_color(almost_black)

ax.set_title('prettyplotlib `scatter` example\nshowing improved matplotlib `scatter`')

The full code is,

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

# Save a nice dark grey as a variable
almost_black = '#262626'

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor=almost_black, facecolor=color, linewidth=0.15)

# Remove top and right axes lines ("spines")
spines_to_remove = ['top', 'right']
for spine in spines_to_remove:
    ax.spines[spine].set_visible(False)

# Get rid of ticks. The position of the numbers is informative enough of
# the position of the value.
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')

# For remaining spines, thin out their line and change the black to a slightly off-black dark grey
spines_to_keep = ['bottom', 'left']
for spine in spines_to_keep:
    ax.spines[spine].set_linewidth(0.5)
    ax.spines[spine].set_color(almost_black)

# Change the labels to the off-black
ax.xaxis.label.set_color(almost_black)
ax.yaxis.label.set_color(almost_black)

# Change the axis title to off-black
ax.title.set_color(almost_black)

ax.set_title('prettyplotlib `scatter` example\nshowing improved matplotlib `scatter`')
fig.savefig('scatter_matplotlib_improved_08_labels_black_to_almost_black.png')

Matplotlib scatter improved 08: changed labels to almost black
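Note that `ax.xaxis.label` and `ax.yaxis.label` are the axis labels (the text set with `set_xlabel` and `set_ylabel`), not the numbers along the axes. A minimal sketch of recoloring both kinds of text, assuming a matplotlib recent enough to have `Axes.tick_params` with `labelcolor` and `matplotlib.colors.to_hex`; the plotted data and the label strings are placeholders:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the snippet runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

almost_black = "#262626"

fig, ax = plt.subplots(1)
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_xlabel("x")
ax.set_ylabel("y")

# Axis labels: the text set by set_xlabel / set_ylabel
ax.xaxis.label.set_color(almost_black)
ax.yaxis.label.set_color(almost_black)

# Tick labels: the numbers along each axis, styled separately
ax.tick_params(axis="both", labelcolor=almost_black)

fig.canvas.draw()
tick_labels = ax.get_xticklabels() + ax.get_yticklabels()
tick_label_colors = {to_hex(label.get_color()) for label in tick_labels}
print(tick_label_colors)
```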

If you remember, the original example also had a legend, added using

ax.legend()

So let's add it to this code, too.

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

# Save a nice dark grey as a variable
almost_black = '#262626'

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor=almost_black, facecolor=color, linewidth=0.15)

# Remove top and right axes lines ("spines")
spines_to_remove = ['top', 'right']
for spine in spines_to_remove:
    ax.spines[spine].set_visible(False)

# Get rid of ticks. The position of the numbers is informative enough of
# the position of the value.
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')

# For remaining spines, thin out their line and change the black to a slightly off-black dark grey
spines_to_keep = ['bottom', 'left']
for spine in spines_to_keep:
    ax.spines[spine].set_linewidth(0.5)
    ax.spines[spine].set_color(almost_black)

# Change the labels to the off-black
ax.xaxis.label.set_color(almost_black)
ax.yaxis.label.set_color(almost_black)

# Change the axis title to off-black
ax.title.set_color(almost_black)

ax.legend()
    
ax.set_title('prettyplotlib `scatter` example\nshowing improved matplotlib `scatter`')
fig.savefig('scatter_matplotlib_improved_09_ugly_legend.png')

Matplotlib scatter improved 09: added ugly legend

There are many things I don't like about this legend.

  1. First of all, why does it have such a thick border line? What does that really add to our interpretation of the legend? The black line is so thick that it distracts from what we're trying to portray - which label goes with which color.
  2. Why does it show three points? Does this legend think I'm dumb and can't figure out which symbol goes with which label after one iteration, so it does it three times?
  3. Finally, the legend labels are pure black. Maybe you notice it too, after comparing to $x$-axis and $y$-axis lines and labels.

We'll fix these three things using this code:

# Remove the line around the legend box, and instead fill it with a light grey
# Also only use one point for the scatterplot legend because the user will
# get the idea after just one, they don't need three.
light_grey = np.array([float(248)/float(255)]*3)
legend = ax.legend(frameon=True, scatterpoints=1)
rect = legend.get_frame()
rect.set_facecolor(light_grey)
rect.set_linewidth(0.0)

# Change the legend label colors to almost black, too
texts = legend.texts
for t in texts:
    t.set_color(almost_black)

Now our code is pretty huge ...

import matplotlib.pyplot as mpl_plt
import brewer2mpl
import numpy as np

# Get "Set2" colors from ColorBrewer (all colorbrewer scales: http://bl.ocks.org/mbostock/5577023)
set2 = brewer2mpl.get_map('Set2', 'qualitative', 8).mpl_colors

# Set the random seed for consistency
np.random.seed(12)

# Save a nice dark grey as a variable
almost_black = '#262626'

fig, ax = mpl_plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    color = set2[i]
    ax.scatter(x, y, label=str(i), alpha=0.5, edgecolor=almost_black, facecolor=color, linewidth=0.15)

# Remove top and right axes lines ("spines")
spines_to_remove = ['top', 'right']
for spine in spines_to_remove:
    ax.spines[spine].set_visible(False)

# Get rid of ticks. The position of the numbers is informative enough of
# the position of the value.
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')

# For remaining spines, thin out their line and change the black to a slightly off-black dark grey
spines_to_keep = ['bottom', 'left']
for spine in spines_to_keep:
    ax.spines[spine].set_linewidth(0.5)
    ax.spines[spine].set_color(almost_black)

# Change the labels to the off-black
ax.xaxis.label.set_color(almost_black)
ax.yaxis.label.set_color(almost_black)

# Change the axis title to off-black
ax.title.set_color(almost_black)

# Remove the line around the legend box, and instead fill it with a light grey
# Also only use one point for the scatterplot legend because the user will 
# get the idea after just one, they don't need three.
light_grey = np.array([float(248)/float(255)]*3)
legend = ax.legend(frameon=True, scatterpoints=1)
rect = legend.get_frame()
rect.set_facecolor(light_grey)
rect.set_linewidth(0.0)

# Change the legend label colors to almost black, too
texts = legend.texts
for t in texts:
    t.set_color(almost_black)

    
ax.set_title('prettyplotlib `scatter` example\nshowing improved matplotlib `scatter`')
fig.savefig('scatter_matplotlib_improved_10_pretty_legend.png')

Matplotlib scatter improved 10: beautiful legend
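The legend tweaks can be wrapped in a function so they apply in one call. A sketch under stated assumptions: `pretty_legend` and its parameter names are made up for illustration, and it uses the public `legend.get_texts()` accessor instead of the `legend.texts` attribute.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the snippet runs without a display
import matplotlib.pyplot as plt


def pretty_legend(ax, text_color="#262626", face_grey=248 / 255.0):
    """Add a borderless, light-grey legend to ax (hypothetical helper)."""
    legend = ax.legend(frameon=True, scatterpoints=1)
    frame = legend.get_frame()
    frame.set_facecolor((face_grey,) * 3)  # light grey fill
    frame.set_linewidth(0.0)               # no border line
    for text in legend.get_texts():
        text.set_color(text_color)         # off-black labels
    return legend


fig, ax = plt.subplots(1)
ax.scatter([0, 1], [0, 1], label="a")
ax.scatter([1, 2], [1, 0], label="b")
legend = pretty_legend(ax)

labels = [text.get_text() for text in legend.get_texts()]
label_colors = {text.get_color() for text in legend.get_texts()}
print(labels, label_colors)
```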

Aaaaaaaaaaand I got tired of doing all those steps, EVERY time. So I wrote prettyplotlib. Here's an illustrative example of how awesome prettyplotlib is, and how it will save all the time you spent agonizing over making your matplotlib plots beautiful.

import numpy as np
import prettyplotlib as ppl

# This is "import matplotlib.pyplot as plt" from the prettyplotlib library
from prettyplotlib import plt

# This is "import matplotlib as mpl" from the prettyplotlib library
from prettyplotlib import mpl

# Set the random seed for consistency
np.random.seed(12)

fig, ax = plt.subplots(1)

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    ppl.scatter(ax, x, y, label=str(i))
    
ppl.legend(ax)
    
ax.set_title('prettyplotlib `scatter` example\nshowing default color cycle and scatter params')
fig.savefig('scatter_prettyplotlib_default.png')

prettyplotlib scatter: default color cycle and scatter parameters

The only commands that were different from the very first example with matplotlib are:

ppl.scatter(ax, x, y, label=str(i))

instead of:

ax.scatter(x, y, label=str(i))

And a different legend command:

ppl.legend(ax)

instead of:

ax.legend()

If you really want to get the original matplotlib style back in prettyplotlib, you can do:

import numpy as np
import prettyplotlib as ppl
from prettyplotlib import plt
from prettyplotlib import mpl
from prettyplotlib import brewer2mpl

# Set the random seed for consistency
np.random.seed(12)

fig, ax = plt.subplots(1)

#mpl.rcParams['axis.color_cycle'] = ['blue']

# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    ax.scatter(x, y, label=str(i), facecolor='blue', edgecolor='black', linewidth=1)
    
# Get back the top and right axes lines ("spines")
spines_to_restore = ['top', 'right']
for spine in spines_to_restore:
    ax.spines[spine].set_visible(True)

# Get back the ticks on all four sides of the plot
ax.xaxis.set_ticks_position('both')
ax.yaxis.set_ticks_position('both')

# For all the spines, make their line thicker and return them to be black
all_spines = ['top', 'left', 'bottom', 'right']
for spine in all_spines:
    ax.spines[spine].set_linewidth(1.0)
    ax.spines[spine].set_color('black')

# Change the labels back to black
ax.xaxis.label.set_color('black')
ax.yaxis.label.set_color('black')

# Change the axis title also back to black
ax.title.set_color('black')

# Use the default matplotlib legend, with its border and repeated points
ax.legend()
    
ax.set_title('prettyplotlib `scatter` example\nrevert everything back to default matplotlib parameters')
fig.savefig('scatter_prettyplotlib_back_to_matplotlib_default.png')

prettyplotlib scatter: reverted to matplotlib defaults

Notice that plain calls to ax.scatter and ax.legend still do the usual matplotlib thing. This is important: to get the prettyplotlib styling, you need a syntax that's different from the usual matplotlib one, ppl.scatter(ax, x, y, ...) instead of ax.scatter(x, y, ...)
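The axes-first calling convention can be illustrated with a small wrapper that applies the styling steps from this page and then delegates to `ax.scatter`. This is a sketch only: `styled_scatter` is a hypothetical function, not prettyplotlib's actual implementation, and the defaults are simply the values used in the walkthrough above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the snippet runs without a display
import matplotlib.pyplot as plt
import numpy as np

ALMOST_BLACK = "#262626"


def styled_scatter(ax, x, y, **kwargs):
    """Style ax as in this walkthrough, then call ax.scatter (hypothetical)."""
    # Only fill in defaults the caller did not set explicitly
    kwargs.setdefault("alpha", 0.5)
    kwargs.setdefault("edgecolor", ALMOST_BLACK)
    kwargs.setdefault("linewidth", 0.15)

    # Drop the top/right spines, thin and soften the remaining ones
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    for side in ("bottom", "left"):
        ax.spines[side].set_linewidth(0.5)
        ax.spines[side].set_color(ALMOST_BLACK)

    # Remove tick marks but keep the numbers
    ax.xaxis.set_ticks_position("none")
    ax.yaxis.set_ticks_position("none")

    return ax.scatter(x, y, **kwargs)


rng = np.random.RandomState(12)
fig, ax = plt.subplots(1)
points = styled_scatter(ax, rng.normal(size=100), rng.normal(size=100), label="0")
```

Because the axes object is an explicit argument, a plain `ax.scatter(...)` call on the same axes is left untouched, which matches the behavior described above.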

pcolormesh

That's all, folks!

That's my introduction to prettyplotlib and why you need it. There are similar examples for the other functions, but these ones for ppl.scatter and ppl.pcolormesh are the most extensive.
