Version 2.2.2.post1701+g250c33e83

Using histograms to plot a cumulative distribution

This shows how to plot a cumulative, normalized histogram as a step function in order to visualize the empirical cumulative distribution function (CDF) of a sample. We also show the theoretical CDF.

A couple of other options to the hist function are demonstrated. Namely, we use the density parameter to normalize the histogram and a couple of different options to the cumulative parameter. The density parameter takes a boolean value. When True, the bin heights are scaled such that the total area of the histogram is 1. The cumulative keyword argument is a little more nuanced. Like density, you can pass it True or False, but you can also pass it -1 to reverse the distribution.
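To make the effect of these two options more concrete, the sketch below reproduces the arithmetic with plain NumPy. It is an illustration of the idea, not Matplotlib's actual implementation, and the variable names are ours.

```python
import numpy as np

# Same sample parameters as the example on this page.
rng = np.random.RandomState(19680801)
x = rng.normal(200, 25, size=100)

# Density-normalized heights: sum(height * width) over all bins is 1.
density, edges = np.histogram(x, bins=50, density=True)
areas = density * np.diff(edges)

# Counterpart of cumulative=True: running sum of the bin areas from the
# left, so the last bin reaches 1.
cumulative = np.cumsum(areas)

# Counterpart of cumulative=-1: running sum from the right, so the first
# bin starts at 1 and the curve decreases.
reversed_cumulative = np.cumsum(areas[::-1])[::-1]

print(cumulative[-1], reversed_cumulative[0])
```

Both printed values should be 1 up to floating-point rounding.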

Since we're showing a normalized and cumulative histogram, these curves are effectively the cumulative distribution functions (CDFs) of the samples. In engineering, empirical CDFs are sometimes called "non-exceedance" curves. In other words, you can look at the y-value for a given x-value to get the probability of an observation from the sample not exceeding that x-value. For example, the value of 225 on the x-axis corresponds to about 0.85 on the y-axis, so there's an 85% chance that an observation in the sample does not exceed 225. Conversely, setting cumulative to -1, as is done in the last series for this example, creates an "exceedance" curve.
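The non-exceedance and exceedance probabilities can also be computed directly from the sample, without any binning. The following is a minimal sketch; the threshold variable is a hypothetical name chosen for illustration, and values read off the binned curves may differ slightly from these exact fractions.

```python
import numpy as np

# Same sample parameters as the example on this page.
rng = np.random.RandomState(19680801)
x = rng.normal(200, 25, size=100)

threshold = 225  # hypothetical x-value of interest

# Fraction of observations that do not exceed the threshold.
non_exceedance = np.mean(x <= threshold)

# Fraction of observations that exceed it.
exceedance = np.mean(x > threshold)

print(non_exceedance, exceedance)
```

The two fractions always sum to 1, which is why the reversed curve mirrors the cumulative one.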

Selecting different bin counts and sizes can significantly affect the shape of a histogram. The Astropy docs have a great section on how to select these parameters: http://docs.astropy.org/en/stable/visualization/histogram.html
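As a small illustration of how much the bin count can vary, the sketch below asks NumPy for bin edges under a few of its built-in selection rules. It assumes NumPy 1.15 or later, where np.histogram_bin_edges is available. These rules are NumPy's own and are not necessarily the ones described in the Astropy docs.

```python
import numpy as np

# Same sample parameters as the example on this page.
rng = np.random.RandomState(19680801)
x = rng.normal(200, 25, size=100)

# Number of bins suggested by a few of NumPy's built-in rules.
bin_counts = {}
for rule in ("sturges", "fd", "auto"):
    edges = np.histogram_bin_edges(x, bins=rule)
    bin_counts[rule] = len(edges) - 1

print(bin_counts)
```

The resulting edges can be passed to hist through its bins argument, in the same way the example on this page reuses bins for the reversed histogram.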

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(19680801)

mu = 200
sigma = 25
n_bins = 50
x = np.random.normal(mu, sigma, size=100)

fig, ax = plt.subplots(figsize=(8, 4))

# plot the cumulative histogram
n, bins, patches = ax.hist(x, n_bins, density=True, histtype='step',
                           cumulative=True, label='Empirical')

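# Add a line showing the expected distribution. The normal CDF is
# approximated by evaluating the PDF at the bin edges, taking its
# cumulative sum, and rescaling so that the curve ends at 1.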
y = ((1 / (np.sqrt(2 * np.pi) * sigma)) *
     np.exp(-0.5 * (1 / sigma * (bins - mu))**2))
y = y.cumsum()
y /= y[-1]

ax.plot(bins, y, 'k--', linewidth=1.5, label='Theoretical')

# Overlay a reversed cumulative histogram.
ax.hist(x, bins=bins, density=True, histtype='step', cumulative=-1,
        label='Reversed emp.')

# tidy up the figure
ax.grid(True)
ax.legend(loc='right')
ax.set_title('Cumulative step histograms')
ax.set_xlabel('Annual rainfall (mm)')
ax.set_ylabel('Likelihood of occurrence')

plt.show()
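The theoretical curve above is a numerical approximation built from a cumulative sum of the PDF. As an alternative, the normal CDF has a closed form in terms of the error function. The sketch below compares the two on an evenly spaced grid. The grid is a hypothetical stand-in for the bin edges, and math.erf from the standard library is used so that no extra dependency is assumed.

```python
import math

import numpy as np

mu, sigma = 200, 25

# Hypothetical grid standing in for the histogram bin edges.
grid = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 51)

# Closed form: CDF(x) = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2)))).
exact = np.array([0.5 * (1 + math.erf((v - mu) / (sigma * math.sqrt(2))))
                  for v in grid])

# Approximation in the style of the example: cumulative sum of the PDF,
# rescaled so that the last value is 1.
pdf = np.exp(-0.5 * ((grid - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
approx = pdf.cumsum()
approx /= approx[-1]

# Largest gap between the two curves on this grid.
print(np.max(np.abs(exact - approx)))
```

The gap shrinks as the grid gets finer, so the cumulative-sum approach is reasonable for a plot with many bins, while the closed form does not depend on the bin width.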
