Basic (low-level) library for plotting: matplotlib
Higher-level interfaces:
import numpy as np
import matplotlib.pyplot as plt
x = np.array([0, 1, 2, 3])
y1 = x*2
y2 = x**2
plt.plot(x, y1)
plt.plot(x, y2)
In Jupyter plots are shown automatically
In a regular terminal / program:
plt.show()
results:
We'll create a plot that shows the sine and cosine functions in the interval from 0 to 2π
x = np.linspace(0, 2*np.pi, 200)
plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x))
Create a Python function that plots a gaussian function based on its parameters mu and sigma:
plot_gaussian_function(mu, sigma)
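One possible solution for this exercise is sketched below. The helper name gaussian and the plotting range of mu ± 4 sigma are assumptions made for this sketch; the exercise only fixes the name plot_gaussian_function and its two parameters.

```python
import matplotlib.pyplot as plt
import numpy as np


def gaussian(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


def plot_gaussian_function(mu, sigma):
    # the range mu +/- 4 sigma is an arbitrary choice that covers almost all of the curve
    x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 200)
    y = gaussian(x, mu, sigma)
    plt.plot(x, y, label=f"mu={mu}, sigma={sigma}")
    # returning the data is optional; it makes the function easier to check
    return x, y


x, y = plot_gaussian_function(0, 1)
plt.legend()
```

Calling the function several times with different parameters draws all curves into the same axes, so they can be compared directly.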
Predefined stylesheets are available via:
plt.style.use("stylename")
see plt.style.available for a list of available styles
graph styling example:
plt.plot(x, y, color="C0", marker="X", linestyle="dashed")
specifying colors:
(1, 0.7, 0) (RGB tuple, values from 0 to 1)
line style:
"none" or ""
"solid" or "-"
"dashed" or "--"
"dotted" or ":"
"dashdot" or "-."
marker:
"" (none)
"." (small dot)
"o" (large dot)
"s" (square)
"X"
"+"
"," (pixel)
important parameters:
color
linestyle
linewidth
marker
markersize
long form:
plt.plot(x, y, color="C0", marker="X", linestyle="dashed")
short form (less flexible):
plt.plot(x, y, "C0X--")
plt.title("Trigonometric functions")
plt.xlabel("x (radians)")
plt.ylabel("y")
labelling individual graphs:
plt.plot(x, np.sin(x), label='sin(x)')
plt.plot(x, np.cos(x), label='cos(x)')
plt.legend()
disabling axes:
plt.axis("off")
Fit axes (without gaps):
plt.axis("tight")
Show a specific region:
plt.axis([-1, 1, -1, 1])
Show a specific region of one axis:
plt.xlim(-1, 1)
Equal distances on both axes:
plt.axis("equal")
Equal distances on both axes, restricting plot area to used data ranges:
plt.axis("scaled")
plt.grid(True)
plt.yticks([-1, 0, 1])
plt.xticks(np.linspace(0, 2*np.pi, 5))
possible result:
import matplotlib.pyplot as plt
import numpy as np
plt.style.use("seaborn-v0_8")  # this style was named "seaborn" before matplotlib 3.6
x = np.linspace(0, 2*np.pi, 100)
sin = np.sin(x)
cos = np.cos(x)
plt.plot(x, sin, "C0--", label="sin(x)")
plt.plot(x, cos, "C1:", label="cos(x)")
pi_multiples = np.array([0, 0.5, 1, 1.5, 2]) * np.pi
sin_points = np.sin(pi_multiples)
cos_points = np.cos(pi_multiples)
plt.plot(pi_multiples, sin_points, "C0o")
plt.plot(pi_multiples, cos_points, "C1o")
plt.title("Trigonometric functions")
plt.xlabel("x (radians)")
plt.xticks(np.linspace(0, 2*np.pi, 5))
plt.legend()
plt.axis("scaled")
plt.style.use("./mystyle.mplstyle")
# general configuration
axes.facecolor: EAEAF2
# line plot configuration
lines.linewidth: 1.5
lines.marker: o
lines.markersize: 4
# box plot configuration
boxplot.whiskers: 0, 100
custom theme colors (available as C0, C1, ... in plots):
axes.prop_cycle: cycler('color', ['4C72B0', '55A868', 'C44E52', '8172B2', 'CCB974', '64B5CD'])
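The same settings can also be applied from Python without a separate file. The sketch below assumes that a plain dict of rcParams is acceptable for your use case; the name my_style is made up for illustration. In Python the colors are written with a leading "#", which a stylesheet file omits because "#" starts a comment there.

```python
import matplotlib.pyplot as plt
from cycler import cycler

# same settings as in the stylesheet above, written as a dict of rcParams
my_style = {
    "axes.facecolor": "#EAEAF2",
    "lines.linewidth": 1.5,
    "lines.marker": "o",
    "lines.markersize": 4,
    "boxplot.whiskers": (0, 100),
    "axes.prop_cycle": cycler("color", ["#4C72B0", "#55A868", "#C44E52"]),
}

# style.context applies the settings only inside the with block
with plt.style.context(my_style):
    fig, ax = plt.subplots()
    (line,) = ax.plot([0, 1, 2], [0, 1, 0])
```

Outside the with block the previous settings are active again, which is handy for trying out a style without changing all later plots.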
see:
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html
plt.plot(x, y) / plt.plot(y)
plt.bar(x, y)
plt.plot(x, y, ".") / plt.scatter(x, y, size, color)
plt.hist(x)
plt.boxplot(x)
plt.pie(x, labels=...)
Graph of associated values of x and y:
plt.plot(x, y)
Graph with automatic x (0, 1, ...):
plt.plot(y)
plt.bar(x, y, width=0.6)
plt.bar(x, y, width=1, align="edge")
plt.bar(
[0, 1, 2],
[9.6, 17, 9.8],
tick_label=["China", "Russia", "USA"]
)
# horizontal
plt.barh([0, 1, 2], [9.6, 17, 9.8])
Shows data points that each have two (or more) values: one on the x-axis and one on the y-axis; further values can be shown via marker size and color
simple:
plt.plot(x, y, ".")
advanced:
plt.scatter(x, y, s=sizes, c=colors)
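A small sketch of the advanced form, using random data that is made up purely for illustration. Here s is the marker area and c is a numeric value per point that gets mapped to a color via a colormap.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed, for reproducibility only
x = rng.random(50)
y = rng.random(50)
sizes = 20 + 200 * rng.random(50)  # marker area in points squared
colors = x + y  # one number per point, mapped to a color by the colormap

points = plt.scatter(x, y, s=sizes, c=colors, cmap="viridis")
plt.colorbar(points, label="x + y")  # legend for the color values
```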
Counts the occurrence of certain values / ranges
plt.hist(many_simulated_dice_rolls_with_10_dice)
plt.hist(
many_simulated_dice_rolls_with_10_dice,
bins=[13, 18, 23, 28, 33, 38, 43, 48, 53, 58]
)
plt.hist(
many_simulated_dice_rolls_with_10_dice,
density=True
)
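The array many_simulated_dice_rolls_with_10_dice is not defined in these notes. One way it could be generated is sketched below; the number of experiments (10 000) and the random seed are arbitrary assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed, for reproducibility only
# each row holds the 10 dice of one experiment (values 1 to 6)
rolls = rng.integers(1, 7, size=(10_000, 10))
many_simulated_dice_rolls_with_10_dice = rolls.sum(axis=1)

# one bin per possible sum: bin edges at 9.5, 10.5, ..., 60.5
counts, edges, _ = plt.hist(
    many_simulated_dice_rolls_with_10_dice,
    bins=np.arange(9.5, 61.5, 1),
)
```

plt.hist returns the counts per bin and the bin edges, so the numbers behind the plot can be inspected as well.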
Visualization of statistical data of a distribution (minimum, median, maximum, ...)
plt.boxplot(dice_simulation_1, whis=(0, 100))
plt.boxplot(
[dice_simulation_1, dice_simulation_2],
# in matplotlib 3.9 and later this parameter is named tick_labels
labels=["simulation 1", "simulation 2"]
)
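A runnable sketch of such a comparison. The two simulations (sums of 10 dice and of 2 dice) and their sizes are assumptions chosen for illustration. The tick labels are set via plt.xticks here, which sidesteps the differently named label parameter across matplotlib versions.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed, for reproducibility only
dice_simulation_1 = rng.integers(1, 7, size=(1000, 10)).sum(axis=1)  # sums of 10 dice
dice_simulation_2 = rng.integers(1, 7, size=(1000, 2)).sum(axis=1)  # sums of 2 dice

# whis=(0, 100): whiskers reach from the minimum to the maximum
result = plt.boxplot([dice_simulation_1, dice_simulation_2], whis=(0, 100))
# boxes are drawn at x positions 1, 2, ...
plt.xticks([1, 2], ["simulation 1", "simulation 2"])
```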
plt.pie([3, 10, 17, 9], labels=["a", "b", "c", "d"])
plt.pie([3, 10, 17, 9], explode=[0, 0, 0, 0.1])
plt.pie([3, 10, 17, 9], startangle=90, counterclock=False)
Iris data set: measurements of 150 Iris flowers (50 of type iris setosa, 50 of type iris versicolor and 50 of type iris virginica)
data entries: sepal length, sepal width, petal length, petal width
import pandas as pd
iris = pd.read_csv(
"https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv"
)
# get all rows and the first four columns as numpy data
iris = iris.iloc[:,:4].to_numpy()
sepal_length = iris[:,0]
sepal_width = iris[:,1]
petal_length = iris[:,2]
petal_width = iris[:,3]
scatter plot of petal_length and petal_width
plot the first 50, the second 50 and the third 50 data points separately (in separate colors)
scatter plot of all four iris properties
use the color and size to visualize sepal length and sepal width
histogram of the petal length
boxplot of all four measurements
plt.plot(petal_length[:50], petal_width[:50], ".",
label="setosa")
plt.plot(petal_length[50:100], petal_width[50:100], ".",
label="versicolor")
plt.plot(petal_length[100:150], petal_width[100:150], ".",
label="virginica")
plt.legend()
plt.scatter(petal_length, petal_width,
s=sepal_length*10, c=sepal_width)
plt.hist(
petal_length,
bins=np.arange(0.5, 7.5, 0.5)
)
plt.boxplot(
[petal_length, petal_width, sepal_length, sepal_width],
labels=["petal length", "petal width", "sepal length",
"sepal width"],
whis=(0, 100)
)
Figure = entire drawing
Axes = coordinate system that can display data
A figure can contain multiple axes objects next to one another
Every drawing in pyplot is created via a figure object (the figure is usually created automatically when plotting)
Manually creating a figure of size 800 x 600 px (assuming 100 dpi):
fig = plt.figure(
figsize=(8, 6),
facecolor="#eeeeee"
)
This will automatically become the active figure.
exporting the active figure:
plt.savefig("myplot.png")
plt.savefig("myplot.svg")
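The relation between figsize, dpi and the exported pixel size can be checked with a short sketch. Writing into an in-memory buffer instead of a file and selecting the Agg backend are choices made only for this sketch; a file name such as "myplot.png" works the same way.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, suitable for scripts without a display
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 6), facecolor="#eeeeee")
plt.plot([0, 1, 2], [0, 1, 0])

buffer = io.BytesIO()
plt.savefig(buffer, format="png", dpi=100)  # 8 x 6 inches at 100 dpi
png_bytes = buffer.getvalue()

# a PNG file stores its width and height in bytes 16 to 23 of the header
width = int.from_bytes(png_bytes[16:20], "big")
height = int.from_bytes(png_bytes[20:24], "big")
print(width, height)
```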
Creating and activating new axes objects:
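# plt.axes expects a single tuple: (left, bottom, width, height),
# given as fractions of the figure size
# axes in the bottom left
ax1 = plt.axes((0, 0, 0.5, 0.5))
plt.plot([0, 1, 2], [0, 1, 0])
# axes in the top right
ax2 = plt.axes((0.5, 0.5, 0.5, 0.5))
plt.plot([0, 1, 2], [0, 1, 0])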
getting the current active axes object:
# gca = get current axes
active_axes = plt.gca()
making an axes object the active axes:
# sca = set current axes
plt.sca(ax1)
creating axes with the same x axis and a new y axis:
ax2 = ax1.twinx()
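A typical use of twinx is plotting two quantities with very different value ranges over the same x values. The data in this sketch (x squared and an exponential) is made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 50)

fig, ax1 = plt.subplots()
ax1.plot(x, x**2, color="C0")
ax1.set_ylabel("x squared", color="C0")  # left y axis

ax2 = ax1.twinx()  # shares the x axis with ax1, gets its own y axis on the right
ax2.plot(x, np.exp(x), color="C1")
ax2.set_ylabel("exp(x)", color="C1")
```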
automatic creation of multiple Axes objects in a grid (here: 2 rows, 3 columns):
fig, ax = plt.subplots(2, 3)
ax0 = ax[0, 0]
ax1 = ax[0, 1]
ax5 = ax[1, 2]
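The returned array of Axes objects can also be processed in a loop. The sketch below fills all six axes with sine curves of increasing frequency; the data and the figure size are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# ax is a 2 x 3 NumPy array of Axes objects
fig, ax = plt.subplots(2, 3, sharex=True, figsize=(9, 5))

# ax.flat iterates over all axes row by row
for k, current_axes in enumerate(ax.flat, start=1):
    current_axes.plot(x, np.sin(k * x))
    current_axes.set_title(f"sin({k}x)")

fig.tight_layout()  # reduces overlap between titles and tick labels
```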
naming to keep in mind:
plt.axis: e.g. for setting scaling
plt.axes: for creating a new coordinate system
actual meaning (from Latin): axis = singular, axes = plural
The functions of plt that we've seen so far call methods of the active Axes object in the background:
ax.plot(...)
ax.set_title(...)
ax.set_xlabel(...)
ax.legend()
ax.set_aspect("equal")
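As an illustration, the trigonometric plot from earlier could be written with these Axes methods directly. This is a sketch of the object-based style, not a required way of working.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()  # one figure with a single Axes object
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")
ax.set_title("Trigonometric functions")
ax.set_xlabel("x (radians)")
ax.set_aspect("equal")
legend = ax.legend()
```

Working with an explicit ax avoids depending on which axes are currently active, which matters once a figure contains several of them.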
Extra function in pandas: scatter matrix
creates several scatter plots in a grid
if there are 4 data series it will create 4x4=16 plots (scatter plots and histograms)
from pandas.plotting import scatter_matrix
# scatter_matrix expects a DataFrame; iris was converted to a NumPy array above
scatter_matrix(pd.DataFrame(iris))
see Python Data Science Handbook: Histograms, Binnings, and Density
a grayscale image with 3x3 pixels:
image = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
plt.imshow(image, cmap="gray")
an RGB image with 2x2 pixels:
colors = np.array([[[255, 0, 0], [0, 255, 0]],
[[0, 0, 255], [0, 0, 0]]])
plt.imshow(colors)
Artist = base class for elements in a figure
example: creating an artist (rectangle) explicitly
r = plt.Rectangle((0.25, 0.75), 0.1, 0.1, color="C0")
ax = plt.gca()
ax.add_artist(r)
examples: creating primitive artists:
r = plt.Rectangle((0.25, 0.75), 0.1, 0.1, color="C0")
c = plt.Circle((0.75, 0.75), 0.1, color="C1")
p = plt.Polygon(
[[0.2, 0.2], [0.5, 0.1], [0.8, 0.2]],
color="C2",
)
l = plt.Line2D(
(0.5, 0.5), # x coordinates
(0.25, 0.75), # y coordinates
color="C3",
)
t = plt.Text(0, 0, "Hello!", size=40, color="C4")
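Creating an artist does not display it yet; it has to be added to an Axes object first. The sketch below adds all of the primitives from above to one set of axes. The variable name line replaces l for readability.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

r = plt.Rectangle((0.25, 0.75), 0.1, 0.1, color="C0")
c = plt.Circle((0.75, 0.75), 0.1, color="C1")
p = plt.Polygon([[0.2, 0.2], [0.5, 0.1], [0.8, 0.2]], color="C2")
line = plt.Line2D((0.5, 0.5), (0.25, 0.75), color="C3")
t = plt.Text(0, 0, "Hello!", size=40, color="C4")

# add_artist attaches each element to the axes so that it gets drawn
for artist in (r, c, p, line, t):
    ax.add_artist(artist)

ax.set_aspect("equal")  # keeps the circle round
```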