PLSC - Political Science Poli.Analysis & Data w/ Python¶

Summer 2023¶

Module 4¶

Using Python for Visualization¶

Image Source: https://www.practicalpythonfordatascience.com

Python Libraries¶

Python has great visualization libraries:

  • Matplotlib
  • Seaborn
  • Plotly
  • Bokeh
  • Altair
  • Folium

  • Matplotlib is the main visualization library in Python.
  • It is the most popular and widely used plotting package in Python.
  • You can customize almost anything on a graph using Matplotlib.
  • Matplotlib may require a lot of code to produce a good graph. I find this a drawback, but it is also what gives you the flexibility. Once you know the basics, you can learn how to customize.


  • Seaborn is built on top of Matplotlib.
  • It is a great library for statistical visualization.

  • Plotly has become increasingly popular.
  • It is designed for interactive visualization, but you can create static graphs as well.

Altair and Bokeh are also Python libraries used for interactive graphs.


My recommendation:

  • Start with Matplotlib. It is impossible to learn every Matplotlib function and capability, and you do not need to. Learning the logic of Matplotlib is what matters.
  • Then, learn seaborn and plotly. They are great libraries.

I have never found it necessary to use Altair or Bokeh, but some people use them and like them. We will not go over Altair and Bokeh in this course.

Geospatial visualization¶

You can use the libraries above for geospatial visualization. Remember the pandas DataFrame, created with pd.DataFrame(). If we want our data to include geospatial features, then we need to use a library like geopandas.

Plotly or Altair can also help you create maps.

Agenda¶

We will first look at how to use Matplotlib, Seaborn and Plotly, in that order. Then, we will focus on geospatial data visualization with Geopandas and Plotly.

This module will only scratch the surface of Python's visualization capabilities. The rest will depend on what you want your graph to be.

  • Do you just want to explore the data?
  • Do you want your graph to be interactive or static?
  • Will you present the graph in a paper or a presentation?
  • Are you writing a book about how to make astonishing graphs using Python?
  • Do you need several graphs on the same canvas?
  • Do you need to create a graph that has no axes, no title, no spines?
  • Do you need to adjust fonts, tick sizes, title size, label rotation, etc.?

All of this requires learning how to ask or search for the right question on online forums and documentation websites. But first you have to get the basic idea of how to create a basic graph. After that, you can work on your code and customize it based on your needs.

Matplotlib¶

We import only the pyplot module from matplotlib (not the whole library) and call it plt.

In [1]:
import matplotlib.pyplot as plt
In [2]:
import numpy as np
x = np.linspace(0, 15, 10)
y = x ** 3
z= x/2
In [3]:
x
Out[3]:
array([ 0.        ,  1.66666667,  3.33333333,  5.        ,  6.66666667,
        8.33333333, 10.        , 11.66666667, 13.33333333, 15.        ])
In [4]:
y
Out[4]:
array([   0.        ,    4.62962963,   37.03703704,  125.        ,
        296.2962963 ,  578.7037037 , 1000.        , 1587.96296296,
       2370.37037037, 3375.        ])
In [5]:
z
Out[5]:
array([0.        , 0.83333333, 1.66666667, 2.5       , 3.33333333,
       4.16666667, 5.        , 5.83333333, 6.66666667, 7.5       ])
In [6]:
plt.plot(x, y, 'b') # 'b' is for blue color
plt.xlabel('X Axis Here')
plt.ylabel('Y Axis Here')
plt.title('Title Here')
plt.show()

Figure: Properties of a graph (source: Matplotlib)
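
The parts of a graph named above can also be reached from code. The following is a minimal sketch, not part of the original lecture cells: the data, the label texts and the `_demo` variable names are placeholders chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x_demo = np.linspace(0, 15, 10)  # placeholder data

plt.figure()                                    # the canvas
plt.plot(x_demo, x_demo ** 3, label="y = x^3")  # a line drawn on the axes
plt.title("Title")                              # text above the plotting area
plt.xlabel("X label")                           # label of the horizontal axis
plt.ylabel("Y label")                           # label of the vertical axis
plt.legend()                                    # uses the label= given to plot()

ax_demo = plt.gca()  # the Axes that pyplot is currently drawing on

# The four border lines of the plotting area are called spines
spine_names = sorted(ax_demo.spines.keys())
print(spine_names)
print(ax_demo.get_title(), ax_demo.get_xlabel(), ax_demo.get_ylabel())
```

In a notebook the figure appears at the end of the cell; in a script, add plt.show() at the end.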

In [7]:
# plt.subplot(nrows, ncols, plot_number)
plt.subplot(1,3,1)
plt.plot(x, y, 'rx-') # More on color options later
plt.subplot(1,3,2)
plt.plot(y, x, 'g*-')
plt.subplot(1,3,3)
plt.plot(z, x, 'k.-');
In [8]:
# create histogram
plt.hist(y)

plt.show()
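
plt.hist() also returns the numbers behind the bars, which helps when you want to check what the histogram is showing. A minimal sketch, assuming the same values as y above; the bin count of 5 is an arbitrary choice for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

y_demo = np.linspace(0, 15, 10) ** 3  # same values as y above

plt.figure()
# bins=5 is an arbitrary choice for illustration; the default is 10 bins
counts, edges, patches = plt.hist(y_demo, bins=5, edgecolor="black")

print(counts)  # number of observations in each bin
print(edges)   # bin boundaries: one more value than there are bins
```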
In [9]:
a = [15, 7, 8, 7, 2, 17, 2, 6, 4, 11, 22, 3, 6]

b = [90, 60, 70, 88, 110, 86, 103, 87, 90, 71, 75, 82, 86]

plt.scatter(a, b, c="blue")

plt.show()

Object Oriented Method¶

I suggest you learn this method and stick with it.

Let's first look closely at what plt.figure() and add_axes() do.

In [10]:
fig = plt.figure()

fig #empty canvas
Out[10]:
<Figure size 640x480 with 0 Axes>
In [11]:
ax=fig.add_axes([0,0,1,1])
ax #we add axes on the canvas
Out[11]:
<Axes: >
In [12]:
fig #lets see our figure again
Out[12]:
In [13]:
#Putting all these together.

fig = plt.figure() #canvas
ax=fig.add_axes([0,0,1,1]) #axes

# drawing on the canvas with the specified axes
ax.plot(x,y, c='r')

plt.show() #show the plot
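
The list passed to add_axes() is [left, bottom, width, height], each given as a fraction of the figure's width or height. A small sketch with values picked only for illustration:

```python
import matplotlib.pyplot as plt

fig_demo = plt.figure()

# [left, bottom, width, height] in figure fractions
# (0 = left/bottom edge of the canvas, 1 = right/top edge)
ax_full = fig_demo.add_axes([0, 0, 1, 1])           # fills the whole canvas
ax_small = fig_demo.add_axes([0.6, 0.6, 0.3, 0.3])  # small box near the top-right corner

# get_position() reports the same rectangle back
bounds_small = ax_small.get_position().bounds
print(bounds_small)
```

With [0, 0, 1, 1] the axes cover the entire canvas, so tick labels can fall outside the figure area when it is saved; starting at 0.1 instead leaves some room for them.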

Another example:

In [14]:
# Creates blank canvas
fig = plt.figure()

ax1 = fig.add_axes([0.1, 0.1, 0.9, 0.9]) # main axes
ax2 = fig.add_axes([0.2, 0.5, 0.3, 0.4]) # second axes
#ax3= ..........

# Larger Figure Axes 1
ax1.plot(x, y, 'r')
ax1.set_xlabel('X_axes1')
ax1.set_ylabel('Y_axes1')
ax1.set_title('Title 1')

# Insert Figure Axes 2
ax2.plot(y, x, 'k')
ax2.set_xlabel('X_axes2')
ax2.set_ylabel('Y_axes2')
ax2.set_title('Title 2');

plt.subplots()¶

plt.figure() is fine to use, but another function, plt.subplots(), offers a better way to manage several axes.

You can add as many plots as you want with subplots.

There are many ways to create graphs in matplotlib. However, I personally use plt.subplots() because it is more customizable.

This is the first line of code we write when we start making a graph:

fig, ax= plt.subplots()

In [15]:
# check what plt.subplots() returns
# a figure and axes
plt.subplots()
Out[15]:
(<Figure size 640x480 with 1 Axes>, <Axes: >)
In [16]:
# We can assign figure to fig and axes to ax
fig, ax = plt.subplots()
In [17]:
fig
Out[17]:
In [18]:
ax
Out[18]:
<Axes: >

So instead of writing this:

plt.plot()
plt.hist()
plt.scatter()
plt.
plt.
plt.

We will start by creating the figure object, which is only an empty canvas, and then add axes to it.

fig = plt.figure()
ax = fig.add_axes([left, bottom, width, height])

or

fig, ax = plt.subplots()  # recommended
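
As a rough guide, most plt.something() calls have an ax.something() or ax.set_something() counterpart. The sketch below redraws the first line plot of this module in the object-oriented style; the `_oo` variable names are placeholders used only to avoid overwriting x and y.

```python
import matplotlib.pyplot as plt
import numpy as np

x_oo = np.linspace(0, 15, 10)
y_oo = x_oo ** 3

fig_oo, ax_oo = plt.subplots()   # canvas and axes in one step

ax_oo.plot(x_oo, y_oo, 'b')      # instead of plt.plot(x, y, 'b')
ax_oo.set_xlabel('X Axis Here')  # instead of plt.xlabel(...)
ax_oo.set_ylabel('Y Axis Here')  # instead of plt.ylabel(...)
ax_oo.set_title('Title Here')    # instead of plt.title(...)
```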

When we work with several subplots, we specify each one by its [row, column] position.

Source: Matplotlib
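
What plt.subplots() returns in ax depends on the grid you ask for. A minimal sketch (the grid sizes are arbitrary) showing when [row, column] indexing applies:

```python
import matplotlib.pyplot as plt

# One plot: ax is a single Axes object, no indexing needed
fig_a, ax_a = plt.subplots()

# One row, three columns: ax is a 1-D array, indexed with one number
fig_b, ax_b = plt.subplots(1, 3)
print(ax_b.shape)

# Two rows, two columns: ax is a 2-D array, indexed with [row, column]
fig_c, ax_c = plt.subplots(2, 2)
print(ax_c.shape)
ax_c[0, 1].set_title("row 0, column 1")  # top-right panel

# squeeze=False always gives a 2-D array, whatever the grid size
fig_d, ax_d = plt.subplots(1, 3, squeeze=False)
print(ax_d.shape)
```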

Let's have a look at an example:

In [19]:
fig, ax = plt.subplots(nrows=2, ncols=2) # or simply (2,2)

ax[0,0].plot(x, c='red')
ax[1,0].plot(-x, c='black')
ax[0,1].plot(z*-2, c='blue')
ax[1,1].plot(y, c='green')


ax[0,0].set_xlabel('x label')
ax[0,1].set_xlabel('x label');
In [20]:
# more customization

fig, ax = plt.subplots(2, 2, figsize=(6,8))

# plots
ax[0,0].plot(x, c='red')
ax[0,0].plot(z, c="blue") # plotted on the same axes

ax[1,0].plot(-x, c='black')
ax[0,1].plot(z, c='blue')
ax[1,1].plot(y, c='green')

# #x axes
ax[0,0].set_xlabel('x1 label')
ax[0,1].set_xlabel('x2 label')
ax[1,0].set_xlabel('x3 label')
ax[1,1].set_xlabel('x4 label')

# #y axes
ax[0,0].set_ylabel('y1 label')
ax[0,1].set_ylabel('y2 label')
ax[1,0].set_ylabel('y3 label')
ax[1,1].set_ylabel('y4 label')

ax[1,0].set_title('Figure 3: Look where I am',
                  fontsize=12, family='serif')


# You can present latex code as well.
# Do not forget r in the beginning --> (r"$latex code$")
ax[1,0].annotate(r'$\prod_{i=1}^n \frac{1}{\sqrt{2 \pi \sigma^2}}$', xy=(4.2, -7.5), xytext=(6, -2),
            arrowprops=dict(facecolor='black', shrink=0.05))

ax[0,0].annotate("A simple point", xy=(9, 7.5), xytext=(6, 2.5),
            arrowprops=dict(facecolor='black', shrink=0.05))

ax[0,0].legend(['red', 'blue'])

plt.suptitle('I am the main title',family='serif', size=14)

plt.tight_layout()
plt.show()

Matplotlib styles¶

Matplotlib has many themes (style sheets) you can choose from. For the full list, see the style sheets reference in the Matplotlib documentation. These are a few of them:

Source: Matplotlib

You can check the available themes with: print(plt.style.available)

You have to specify the theme before the plot is drawn. Put this at the beginning of your plotting code: plt.style.use("name_of_theme")

I like the "bmh" theme. This is how we would use it:

plt.style.use("bmh")
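
plt.style.use() changes the theme for every later plot in the session. If you want a theme for one graph only, a style context is one option. A minimal sketch with made-up data:

```python
import matplotlib.pyplot as plt
import numpy as np

t_demo = np.linspace(0, 10, 50)  # placeholder data

print("bmh" in plt.style.available)  # check that the theme exists

# The theme applies only to figures created inside the with-block
with plt.style.context("bmh"):
    fig_s, ax_s = plt.subplots()
    ax_s.plot(t_demo, np.sin(t_demo))
    ax_s.set_title("Drawn with the bmh theme")

# Figures created here use whatever theme was active before the block
fig_t, ax_t = plt.subplots()
ax_t.plot(t_demo, np.cos(t_demo))
```

To go back to Matplotlib's standard look after calling plt.style.use(), you can run plt.style.use("default").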

In [21]:
plt.style.use("bmh")

fig, ax = plt.subplots(2, 2, figsize=(10,8))

# plots
ax[0,0].plot(x, c='red')
ax[0,0].plot(z, c="blue") # plotted on the same axes

ax[1,0].plot(-x, c='black')
ax[0,1].plot(z, c='blue')
ax[1,1].plot(y, c='green')

# #x axes
ax[0,0].set_xlabel('x1 label')
ax[0,1].set_xlabel('x2 label')
ax[1,0].set_xlabel('x3 label')
ax[1,1].set_xlabel('x4 label')

# #y axes
ax[0,0].set_ylabel('y1 label')
ax[0,1].set_ylabel('y2 label')
ax[1,0].set_ylabel('y3 label')
ax[1,1].set_ylabel('y4 label')

ax[1,0].set_title('Figure 3: Look where I am',
                  fontsize=12, family='serif')


# You can present latex code as well.
# Do not forget r in the beginning --> (r"$latex code$")
ax[1,0].annotate(r'$\prod_{i=1}^n \frac{1}{\sqrt{2 \pi \sigma^2}}$', xy=(4.2, -7.5), xytext=(6, -2),
            arrowprops=dict(facecolor='black', shrink=0.05))

ax[0,0].annotate("A simple point", xy=(9, 7.5), xytext=(6, 2.5),
            arrowprops=dict(facecolor='black', shrink=0.05))

ax[0,0].legend(['red', 'blue'])
ax[0,0].grid(False)
ax[1,0].grid(color='r', alpha=0.5, linestyle='dashed', linewidth=0.5)


ax[0,1].spines['bottom'].set_color('blue')
ax[1,1].spines['right'].set_color("none")
ax[0,1].spines['top'].set_color('red')

plt.suptitle('I am the main title',family='serif', size=14)

plt.tight_layout()
plt.show()
In [22]:
#iteration on axes
plt.style.use("ggplot")

fig, ax = plt.subplots(nrows=1, ncols=2)

for panel in ax:  # ax is a 1-D array holding the two Axes
    panel.plot(x, y, 'b')
    panel.set_xlabel('xlabel')
    panel.set_ylabel('ylabel')
    panel.set_title('title')

plt.tight_layout()
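
Looping directly over ax works for a single row or a single column. For a grid, ax is a 2-D array and the loop would hand you whole rows instead of single axes; ax.flat is one way around this. A short sketch with placeholder curves:

```python
import matplotlib.pyplot as plt
import numpy as np

x_demo = np.linspace(0, 15, 10)  # placeholder data

fig_g, ax_g = plt.subplots(nrows=2, ncols=2)

# ax_g.flat visits the four panels one by one, row by row
for power, panel in enumerate(ax_g.flat, start=1):
    panel.plot(x_demo, x_demo ** power, 'b')
    panel.set_title(f'x to the power {power}')

fig_g.tight_layout()
```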
In [23]:
fig, ax = plt.subplots(figsize=(12,6))

ax.plot(x, x+1, color="red", linewidth=0.25)
ax.plot(x, x+2, color="red", linewidth=0.50)
ax.plot(x, x+3, color="red", linewidth=1.00)
ax.plot(x, x+4, color="red", linewidth=2.00)

# possible linestyle options: '-', '--', '-.', ':' ('steps' is a drawstyle, not a linestyle)
ax.plot(x, x+5, color="green", lw=3, linestyle='-')
ax.plot(x, x+6, color="green", lw=3, ls='-.')
ax.plot(x, x+7, color="green", lw=3, ls=':')

# custom dash
line, = ax.plot(x, x+8, color="black", lw=1.50)
line.set_dashes([5, 10, 15, 10]) # format: line length, space length, ...

# possible marker symbols: marker = '+', 'o', '*', 's', ',', '.', '1', '2', '3', '4', ...
ax.plot(x, x+ 9, color="blue", lw=3, ls='-', marker='+')
ax.plot(x, x+10, color="blue", lw=3, ls='--', marker='o')
ax.plot(x, x+11, color="blue", lw=3, ls='-', marker='s')
ax.plot(x, x+12, color="blue", lw=3, ls='--', marker='1')

# marker size and color
ax.plot(x, x+13, color="purple", lw=1, ls='-', marker='o', markersize=2)
ax.plot(x, x+14, color="purple", lw=1, ls='-', marker='o', markersize=4)
ax.plot(x, x+15, color="purple", lw=1, ls='-', marker='o', markersize=8, markerfacecolor="red")
ax.plot(x, x+16, color="purple", lw=1, ls='-', marker='s', markersize=8,
        markerfacecolor="yellow", markeredgewidth=3, markeredgecolor="green");
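
The short strings used earlier, such as 'rx-' or 'g*-', pack color, marker and line style into one argument. A sketch of the equivalence, with placeholder data:

```python
import matplotlib.pyplot as plt
import numpy as np

x_demo = np.linspace(0, 15, 10)  # placeholder data

fig_f, ax_f = plt.subplots()

# Format string: color 'r' (red), marker 'x', line style '-' (solid)
line_short, = ax_f.plot(x_demo, x_demo + 1, 'rx-')

# The same settings written out as keyword arguments
line_long, = ax_f.plot(x_demo, x_demo + 2, color='r', marker='x', linestyle='-')

print(line_short.get_color(), line_short.get_marker(), line_short.get_linestyle())
print(line_long.get_color(), line_long.get_marker(), line_long.get_linestyle())
```

The keyword form is longer but easier to read once you start adding options like markersize or markerfacecolor.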
In [24]:
# data from https://allisonhorst.github.io/palmerpenguins/

#Example from matplotlib's website

species = ("Adelie", "Chinstrap", "Gentoo")
penguin_means = {
    'Bill Depth': (18.35, 18.43, 14.98),
    'Bill Length': (38.79, 48.83, 47.50),
    'Flipper Length': (189.95, 195.82, 217.19),
}

x = np.arange(len(species))  # the label locations
width = 0.25  # the width of the bars
multiplier = 0

fig, ax = plt.subplots(layout='constrained')

for attribute, measurement in penguin_means.items():
    offset = width * multiplier
    rects = ax.bar(x + offset, measurement, width, label=attribute)
    ax.bar_label(rects, padding=3)
    multiplier += 1

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('Length (mm)')
ax.set_title('Penguin attributes by species')
ax.set_xticks(x + width, species)
ax.legend(loc='upper left', ncols=3)
ax.set_ylim(0, 250)

plt.show()
In [25]:
x= np.linspace(0,10,20)

fig, ax = plt.subplots(figsize=(10, 4))

ax.plot(x, x**2, x, x**3, lw=2)

ax.set_xticks([2, 4, 6, 8, 10])
ax.set_xticklabels([r'$\alpha$', r'$\beta$', r'$\gamma$', r'$\delta$', r'$\epsilon$'], fontsize=18);
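
Font size and rotation of tick labels, one of the questions raised in the agenda, can be set with tick_params(). A minimal sketch; the tick positions, label texts and sizes are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x_demo = np.linspace(0, 10, 20)  # placeholder data

fig_k, ax_k = plt.subplots(figsize=(10, 4))
ax_k.plot(x_demo, x_demo ** 2)

# Choose where the ticks go and what they say
ax_k.set_xticks([2, 4, 6, 8, 10])
ax_k.set_xticklabels(['two', 'four', 'six', 'eight', 'ten'])

# Rotate and resize the x tick labels; shrink the y tick labels
ax_k.tick_params(axis='x', labelrotation=45, labelsize=14)
ax_k.tick_params(axis='y', labelsize=10)
```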
In [26]:
#Another example from matplotlib's website

import numpy as np
import matplotlib.pyplot as plt

# LOADING DATA TO WORK WITH
######################################################
import matplotlib.cbook as cbook
# Load a numpy record array from yahoo csv data with fields date, open, high,
# low, close, volume, adj_close from the mpl-data/sample_data directory. The
# record array stores the date as an np.datetime64 with a day unit ('D') in
# the date column.
price_data = (cbook.get_sample_data('goog.npz', np_load=True)['price_data']
              .view(np.recarray))
price_data = price_data[-250:]  # get the most recent 250 trading days

delta1 = np.diff(price_data.adj_close) / price_data.adj_close[:-1]

# Marker size in units of points^2
volume = (15 * price_data.volume[:-2] / price_data.volume[0])**2
close = 0.003 * price_data.close[:-2] / 0.003 * price_data.open[:-2]
###############################################################

fig, ax = plt.subplots()
ax.scatter(delta1[:-1], delta1[1:], c=close, s=volume, alpha=0.5)

ax.set_xlabel(r'$\Delta_i$', fontsize=15)
ax.set_ylabel(r'$\Delta_{i+1}$', fontsize=15)
ax.set_title('Volume and percent change')

ax.grid(True)
fig.tight_layout()

fig.savefig("filename.png")
# fig.savefig("filename.pdf")
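
savefig() picks the file format from the file extension. The sketch below shows two commonly used options, dpi and bbox_inches; the file names and the temporary folder are placeholders, not part of the example above.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

x_demo = np.linspace(0, 10, 20)  # placeholder data

fig_v, ax_v = plt.subplots()
ax_v.plot(x_demo, x_demo ** 2)
ax_v.set_title('Saved figure')

out_dir = tempfile.mkdtemp()  # placeholder folder; use your project folder instead
png_path = os.path.join(out_dir, 'example_plot.png')
pdf_path = os.path.join(out_dir, 'example_plot.pdf')

# dpi sets the resolution of raster formats such as PNG
# bbox_inches='tight' trims the empty margins around the graph
fig_v.savefig(png_path, dpi=300, bbox_inches='tight')
fig_v.savefig(pdf_path, bbox_inches='tight')  # PDF is a vector format

print(os.path.exists(png_path), os.path.exists(pdf_path))
```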
In [27]:
##########################################
alpha = 0.7
phi_ext = 2 * np.pi * 0.5

def flux_qubit_potential(phi_m, phi_p):
    return 2 + alpha - 2 * np.cos(phi_p) * np.cos(phi_m) - alpha * np.cos(phi_ext - 2*phi_p)

phi_p = np.linspace(0, 2*np.pi, 100)
phi_m = np.linspace(0, 2*np.pi, 100)

X,Y = np.meshgrid(phi_p, phi_m)
Z = flux_qubit_potential(X, Y).T
###########################################

from mpl_toolkits.mplot3d.axes3d import Axes3D

fig = plt.figure(figsize=(14,6))

ax = fig.add_subplot(1, 2, 1, projection='3d')

p = ax.plot_surface(X, Y, Z, rstride=4, cstride=4, linewidth=0)

# surface_plot with color grading and color bar
ax = fig.add_subplot(1, 2, 2, projection='3d')
p = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap= "viridis", linewidth=0, antialiased=False)
cb = fig.colorbar(p, shrink=0.5)

The goal here is not to provide a catalog of matplotlib visualizations, but to understand how matplotlib works.

A graph and its code from one of my projects using Python

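import seaborn as sns  # used below for set_palette

# cross_tab is a pandas DataFrame from my project (not defined in this notebook)
plt.style.use('seaborn-v0_8-white')  # this style was named 'seaborn-white' before Matplotlib 3.6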
plt.rcParams.update({'font.size': 18}) #to set font size for all properties of plot
plt.rcParams.update({'font.sans-serif':'Times New Roman'}) #or 'helvetica'

fig, ax = plt.subplots(figsize=(14,4))

my_palette = ["#E5E4E4", "#BFBFBF",'#6D6D6D', "#030303"]

sns.set_palette(palette=my_palette)  # sets the color cycle; it returns None, so there is nothing to assign

cross_tab.plot(ax=ax, kind='bar', stacked=True)  # the bars pick up the palette set above

ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)

#ax.set_title('Support for Minority Rights by City',
#              pad=35,fontdict={'fontsize':20})
#ax.set_xlabel('City',fontdict={'fontsize':20})
ax.set_ylabel('Number of Participants',fontdict={'fontsize':20})

plt.legend(['No Support','Socio-Cultural', 'Decentralization',
            'All Political Support'],frameon = False,
            title= 'Support Minority Rights',
            fontsize='small',fancybox=True, title_fontsize=r'small')

ax.set_xticklabels(cross_tab.adm_1, size=16);
ax.axis()
ax.grid(False)

#plt.axis('off')
fig.tight_layout()
ax.set_facecolor("w")
fig.set_facecolor('w')
fig.savefig('group_rights_demands_by_province_2011.png',bbox_inches='tight');
fig.savefig('group_rights_demands_by_province_2011.pdf',bbox_inches='tight');
plt.show()
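
The code above depends on a cross_tab table that is not defined in this notebook. To try the same styling steps, the sketch below builds a stacked bar chart with plain Matplotlib. The city names and counts are invented for illustration only and are not the project data; the `_sb` variable names are placeholders as well.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up counts for illustration only (not the project data)
cities = ['City A', 'City B', 'City C']
counts_by_answer = {
    'No Support': np.array([30, 45, 20]),
    'Socio-Cultural': np.array([25, 20, 35]),
    'Decentralization': np.array([15, 10, 25]),
    'All Political Support': np.array([10, 5, 20]),
}
grey_palette = ["#E5E4E4", "#BFBFBF", "#6D6D6D", "#030303"]

fig_sb, ax_sb = plt.subplots(figsize=(14, 4))

bottom = np.zeros(len(cities))  # running height of each stacked bar
for (answer, counts), color in zip(counts_by_answer.items(), grey_palette):
    ax_sb.bar(cities, counts, bottom=bottom, label=answer, color=color)
    bottom += counts  # the next group starts where this one ended

# Same clean-up steps as in the project code
ax_sb.spines['right'].set_visible(False)
ax_sb.spines['top'].set_visible(False)
ax_sb.set_ylabel('Number of Participants')
ax_sb.legend(title='Support Minority Rights', frameon=False, fontsize='small')
ax_sb.grid(False)
fig_sb.tight_layout()
```

Stacking by hand with bottom= is more typing than the pandas one-liner, but it makes visible what stacked=True does behind the scenes.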
  • Do not try to memorize anything from here. You cannot! Learn the logic.

  • Learn how to read the documentation. Being able to read the documentation is key here.

  • The Matplotlib website has a large gallery of example visualizations, so have a look at them. Find what you need and adjust the code for your own purpose.

  • Do not stop exploring Matplotlib! It takes a lot of work to create a visually appealing graph with matplotlib, but it is worth it because of its flexibility.

  • Also, you will not need very advanced graphs all the time. Make them only when necessary, say for your final project or a presentation.