{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n# The Lifecycle of a Plot\n\n\nThis tutorial aims to show the beginning, middle, and end of a single\nvisualization using Matplotlib. We'll begin with some raw data and\nend by saving a figure of a customized visualization. Along the way we try\nto highlight some neat features and best-practices using Matplotlib.\n\n.. currentmodule:: matplotlib\n\n

#### Note

This tutorial is based on\n `this excellent blog post `_\n by Chris Moffitt. It was transformed into this tutorial by Chris Holdgraf.

\n\nA note on the Object-Oriented API vs. Pyplot\n============================================\n\nMatplotlib has two interfaces. The first is an object-oriented (OO)\ninterface. In this case, we utilize an instance of :class:`axes.Axes`\nin order to render visualizations on an instance of :class:`figure.Figure`.\n\nThe second is based on MATLAB and uses a state-based interface. This is\nencapsulated in the :mod:`.pyplot` module. See the :doc:`pyplot tutorials\n` for a more in-depth look at the pyplot\ninterface.\n\nMost of the terms are straightforward but the main thing to remember\nis that:\n\n* The Figure is the final image that may contain 1 or more Axes.\n* The Axes represent an individual plot (don't confuse this with the word\n \"axis\", which refers to the x/y axis of a plot).\n\nWe call methods that do the plotting directly from the Axes, which gives\nus much more flexibility and power in customizing our plot.\n\n

#### Note

In general, try to use the object-oriented interface over the pyplot\n interface.
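
As a rough illustration of the difference, the sketch below draws the same line twice: once through the state-based pyplot interface and once through the OO interface. The data and the names ``xs``, ``ys``, ``fig_oo`` and ``ax_oo`` are made up for this comparison and are not used elsewhere in the tutorial.

```python
import matplotlib.pyplot as plt

# Made-up data, used only for this comparison.
xs = [0, 1, 2, 3]
ys = [0, 1, 4, 9]

# pyplot (state-based): each call acts on the implicit current Figure/Axes.
plt.figure()
plt.plot(xs, ys)
plt.title('pyplot interface')

# Object-oriented: keep explicit references and call methods on them.
fig_oo, ax_oo = plt.subplots()
ax_oo.plot(xs, ys)
ax_oo.set_title('OO interface')
```

With explicit ``fig_oo`` and ``ax_oo`` references there is no ambiguity about which plot a later call modifies, which is the main reason the OO style is preferred here.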

\n\nOur data\n========\n\nWe'll use the data from the post from which this tutorial was derived.\nIt contains sales information for a number of companies.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\nimport matplotlib.pyplot as plt\n\n\ndata = {'Barton LLC': 109438.50,\n 'Frami, Hills and Schmidt': 103569.59,\n 'Fritsch, Russel and Anderson': 112214.71,\n 'Jerde-Hilpert': 112591.43,\n 'Keeling LLC': 100934.30,\n 'Koepp Ltd': 103660.54,\n 'Kulas Inc': 137351.96,\n 'Trantow-Barrows': 123381.38,\n 'White-Trantow': 135841.99,\n 'Will LLC': 104437.60}\ngroup_data = list(data.values())\ngroup_names = list(data.keys())\ngroup_mean = np.mean(group_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Getting started\n===============\n\nThis data is naturally visualized as a barplot, with one bar per\ngroup. To do this with the object-oriented approach, we first generate\nan instance of :class:`figure.Figure` and\n:class:`axes.Axes`. The Figure is like a canvas, and the Axes\nis a part of that canvas on which we will make a particular visualization.\n\n

#### Note

Figures can have multiple axes on them. For information on how to do this,\n see the :doc:`Tight Layout tutorial\n `.
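
A minimal sketch of a Figure holding two Axes, assuming a simple 1 x 2 grid is all that is needed. The names ``fig_multi``, ``ax_left`` and ``ax_right`` and the plotted values are hypothetical and chosen only for illustration.

```python
import matplotlib.pyplot as plt

# One Figure holding two Axes, arranged as 1 row by 2 columns.
fig_multi, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))

# Each Axes is plotted on independently (values are made up).
ax_left.barh(['A', 'B', 'C'], [3, 5, 2])
ax_right.plot([0, 1, 2], [2, 5, 3])
```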

\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have an Axes instance, we can plot on top of it.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots()\nax.barh(group_names, group_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Controlling the style\n=====================\n\nThere are many styles available in Matplotlib in order to let you tailor\nyour visualization to your needs. To see a list of styles, we can use\n:mod:`.style`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "print(plt.style.available)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can activate a style with the following:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plt.style.use('fivethirtyeight')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's remake the above plot to see how it looks:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots()\nax.barh(group_names, group_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The style controls many things, such as color, linewidths, backgrounds,\netc.\n\nCustomizing the plot\n====================\n\nNow we've got a plot with the general look that we want, so let's fine-tune\nit so that it's ready for print. First let's rotate the labels on the x-axis\nso that they show up more clearly. 
We can gain access to these labels\nwith the :meth:`axes.Axes.get_xticklabels` method:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots()\nax.barh(group_names, group_data)\nlabels = ax.get_xticklabels()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we'd like to set the property of many items at once, it's useful to use\nthe :func:`pyplot.setp` function. This will take a list (or many lists) of\nMatplotlib objects, and attempt to set some style element of each one.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots()\nax.barh(group_names, group_data)\nlabels = ax.get_xticklabels()\nplt.setp(labels, rotation=45, horizontalalignment='right')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It looks like this cut off some of the labels on the bottom. We can\ntell Matplotlib to automatically make room for elements in the figures\nthat we create. To do this we set the ``autolayout`` value of our\nrcParams. For more information on controlling the style, layout, and\nother features of plots with rcParams, see\n:doc:`/tutorials/introductory/customizing`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plt.rcParams.update({'figure.autolayout': True})\n\nfig, ax = plt.subplots()\nax.barh(group_names, group_data)\nlabels = ax.get_xticklabels()\nplt.setp(labels, rotation=45, horizontalalignment='right')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we add labels to the plot. 
To do this with the OO interface,\nwe can use the :meth:`.Artist.set` method to set properties of this\nAxes object.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots()\nax.barh(group_names, group_data)\nlabels = ax.get_xticklabels()\nplt.setp(labels, rotation=45, horizontalalignment='right')\nax.set(xlim=[-10000, 140000], xlabel='Total Revenue', ylabel='Company',\n title='Company Revenue')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also adjust the size of this plot using the :func:`pyplot.subplots`\nfunction. We can do this with the ``figsize`` kwarg.\n\n

#### Note

While indexing in NumPy follows the form (row, column), the figsize\n kwarg follows the form (width, height). This follows conventions in\n visualization, which unfortunately are different from those of linear\n algebra.
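
The two orderings can be compared side by side in a short sketch. The names ``fig_wide``, ``ax_wide`` and ``grid`` are hypothetical; the sizes are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

# figsize is (width, height) in inches: this Figure is wider than it is tall.
fig_wide, ax_wide = plt.subplots(figsize=(8, 4))
width, height = fig_wide.get_size_inches()

# A NumPy shape is (rows, columns): this array has more columns than rows.
grid = np.zeros((4, 8))
n_rows, n_cols = grid.shape
```

Both objects are 8 wide and 4 tall, yet the numbers appear in opposite order in the two calls.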

\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=(8, 4))\nax.barh(group_names, group_data)\nlabels = ax.get_xticklabels()\nplt.setp(labels, rotation=45, horizontalalignment='right')\nax.set(xlim=[-10000, 140000], xlabel='Total Revenue', ylabel='Company',\n title='Company Revenue')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For labels, we can specify custom formatting guidelines in the form of\nfunctions. Below we define a function that takes a tick value as input and\nreturns a string as output. When passed to `.Axis.set_major_formatter` or\n`.Axis.set_minor_formatter`, the function is automatically wrapped in a\n:class:`ticker.FuncFormatter`.\n\nFor this function, the ``x`` argument is the tick value and ``pos``\nis the tick position. We only use ``x`` here, but both arguments are\nrequired.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def currency(x, pos):\n    \"\"\"The two args are the value and tick position\"\"\"\n    if x >= 1e6:\n        s = '${:1.1f}M'.format(x*1e-6)\n    else:\n        s = '${:1.0f}K'.format(x*1e-3)\n    return s" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then apply this function to the labels on our plot. To do this,\nwe use the ``xaxis`` attribute of our axes. 
This lets you perform\nactions on a specific axis on our plot.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=(6, 8))\nax.barh(group_names, group_data)\nlabels = ax.get_xticklabels()\nplt.setp(labels, rotation=45, horizontalalignment='right')\n\nax.set(xlim=[-10000, 140000], xlabel='Total Revenue', ylabel='Company',\n title='Company Revenue')\nax.xaxis.set_major_formatter(currency)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Combining multiple visualizations\n=================================\n\nIt is possible to draw multiple plot elements on the same instance of\n:class:`axes.Axes`. To do this we simply need to call another one of\nthe plot methods on that axes object.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=(8, 8))\nax.barh(group_names, group_data)\nlabels = ax.get_xticklabels()\nplt.setp(labels, rotation=45, horizontalalignment='right')\n\n# Add a vertical line, here we set the style in the function call\nax.axvline(group_mean, ls='--', color='r')\n\n# Annotate new companies\nfor group in [3, 5, 8]:\n ax.text(145000, group, \"New Company\", fontsize=10,\n verticalalignment=\"center\")\n\n# Now we move our title up since it's getting a little cramped\nax.title.set(y=1.05)\n\nax.set(xlim=[-10000, 140000], xlabel='Total Revenue', ylabel='Company',\n title='Company Revenue')\nax.xaxis.set_major_formatter(currency)\nax.set_xticks([0, 25e3, 50e3, 75e3, 100e3, 125e3])\nfig.subplots_adjust(right=.1)\n\nplt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Saving our plot\n===============\n\nNow that we're happy with the outcome of our plot, we want to save it to\ndisk. There are many file formats we can save to in Matplotlib. 
To see\na list of available options, use:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "print(fig.canvas.get_supported_filetypes())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then use :meth:`figure.Figure.savefig` to save the figure\nto disk. The call below shows several useful flags:\n\n* ``transparent=True`` makes the background of the saved figure transparent\n if the format supports it.\n* ``dpi=80`` controls the resolution (dots per inch) of the output.\n* ``bbox_inches=\"tight\"`` fits the bounds of the figure to our plot.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Uncomment this line to save the figure.\n# fig.savefig('sales.png', transparent=False, dpi=80, bbox_inches=\"tight\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.1" } }, "nbformat": 4, "nbformat_minor": 0 }