Graphing & Plotting Recipes, Revisited

CMP 464/788:
Topics Course: Data Science

Spring 2017

Original version by Katherine St. John.

Earlier in the course we used matplotlib.pyplot to compare y = log(x) and y = √x. The first section is repeated from that day; try redoing it with list comprehensions before moving on to the numpy section.

Graphing Mathematical Functions:

The pyplot module of matplotlib provides lots of useful ways to plot data to the screen. Let's use it to answer the question, which grows faster:
y = log(x) or y = √x?

To test out this question, we will write a program that:

  1. Uses the math and plotting libraries.
  2. Sets up a list of numbers (x-values) for our functions.
  3. Computes the y-values of our numbers for our functions.
  4. Creates plots of the two functions.
  5. Shows the plots in a separate graphics window.
Let's add in the Python code for each of these steps:
  1. Uses the math and plotting libraries.
    import math	
    import matplotlib.pyplot as plt
    	
    Since it's unwieldy to type "matplotlib.pyplot" before every function we'd like to use from that library, we'll use the common abbreviation "plt" instead. With this, we can write plt.plot() instead of matplotlib.pyplot.plot().
  2. Sets up a list of numbers (x-values) for our functions.
    x = range(1,101)
    
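    Remember: range(1,101) starts at 1 and goes up to, but does not include, 101. So, this gives the numbers 1,2,...,100. (In Python 3, range produces a range object rather than a list, but you can loop over it in the same way.)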
  3. Computes the y-values of our numbers for our functions.
    y1 = []                 # y-values for log(x)
    for i in x:
        y = math.log(i)     # natural logarithm of i
        y1.append(y)
    y2 = []                 # y-values for sqrt(x)
    for i in x:
        y = math.sqrt(i)    # square root of i
        y2.append(y)
    
    We need two separate lists since we have two separate functions to graph.

    How could you rewrite the above using list comprehensions?

  4. Creates plots of the two functions.
    plt.plot(x,y1,label='y1 = log(x)')
    plt.plot(x,y2,label='y2 = sqrt(x)')
    plt.legend()
    
    These lines build the plot in memory but do not display it until told to (see the next step).
  5. Shows the plots in a separate graphics window.
    plt.show()
    
    This line pops up the new graphics window to display the plots.
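
One possible answer to the list comprehension question in step 3 is sketched below. It only computes the y-values (no plotting) and reuses the names x, y1, and y2 from the steps above; the print lines are additions here, just to check the results.

```python
import math

x = range(1, 101)                  # the numbers 1, 2, ..., 100

# Each comprehension replaces one of the for-loops from step 3.
y1 = [math.log(i) for i in x]      # y-values for log(x)
y2 = [math.sqrt(i) for i in x]     # y-values for sqrt(x)

print(len(y1), len(y2))            # both lists have one entry per x-value
print(y1[0], y2[0])                # log(1) and sqrt(1)
print(y2[-1])                      # sqrt(100)
```

The lists can then be passed to plt.plot() exactly as in step 4.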

From your plots, which do you think grows faster: log(x) or sqrt(x)?

The numpy Module

numpy is a module for numerical computing and is one of the core packages of the scipy family of scientific Python libraries. It is distributed with anaconda, so it is already on your machines! Its main focus is on arrays and linear algebra (manipulating vectors and matrices). Differential equation solvers and most numerical integration routines are in the companion scipy package.

It is commonly imported under the abbreviation np:

import numpy as np
import matplotlib.pyplot as plt

numpy's main object is a homogeneous multidimensional array: a grid of elements that all have the same type, in almost any number of dimensions. You access the elements with tuples of nonnegative integers, counting from 0. For example, a 10 x 10 grid of numbers could be stored in a 2-dimensional array, and each element could be accessed by specifying the row and column using a tuple: (1,2) gives the element at row index 1 and column index 2 (the second row and third column, since counting starts at 0). The number of dimensions is called the rank of the array, and the dimensions are often called axes.
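
To make the 10 x 10 example concrete, here is a minimal sketch. The name grid and the use of np.arange and reshape to fill it are assumptions made for illustration; any 10 x 10 array would do.

```python
import numpy as np

# A 10 x 10 grid holding the numbers 0..99, filled row by row.
# (np.arange and reshape are just one convenient way to build it.)
grid = np.arange(100).reshape(10, 10)

print(grid.ndim)     # number of dimensions (the rank): 2
print(grid.shape)    # length along each axis: (10, 10)
print(grid[1, 2])    # element at row index 1, column index 2: 12
```

Since each row holds 10 numbers, the element at (1,2) is 1*10 + 2 = 12.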

Try the following at the prompt:

import numpy as np
import matplotlib.pyplot as plt
a = np.array([2,3,4])		#Sets up a 1-dimensional array with 3 elements
print(a)
b = np.array(range(100))	#Not common but does work
print(b)
c = np.linspace(0,99,100)	#More common, linearly space 100 numbers between 0 and 99, inclusive
print(c)
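
The arrays b and c print a little differently, even though they hold the same numbers. A short sketch of why, using the same definitions as above (the checks at the end are additions for illustration):

```python
import numpy as np

b = np.array(range(100))       # built from integers, so it stores integers
c = np.linspace(0, 99, 100)    # linspace always produces floating-point numbers

print(b.dtype)                 # an integer type (exact name depends on your machine)
print(c.dtype)                 # float64
print(c[0], c[-1])             # both endpoints are included: 0.0 and 99.0
print(np.array_equal(b, c))    # same values, different types: True
```

For plotting, either one works; linspace is handy because you say how many points you want and it works out the spacing.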

Let's use these commands to create the simple plots from above:

x = np.linspace(1,100,100)
y1 = np.log(x)
y2 = np.sqrt(x)
We need to use numpy's log and square root functions since they can handle arrays as inputs (the ones in the math library expect single numbers, not numpy arrays).
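
You can see the difference for yourself with the small sketch below. The names math_accepts_arrays and y1_loop are made up for this check and are not part of the lab program.

```python
import math
import numpy as np

x = np.linspace(1, 100, 100)

# math.log expects a single number, so handing it a whole array fails.
try:
    math.log(x)
    math_accepts_arrays = True
except TypeError:
    math_accepts_arrays = False
print(math_accepts_arrays)

# np.log works element by element and returns a new array.
y1 = np.log(x)

# The same values, computed one at a time with the math library.
y1_loop = [math.log(v) for v in x]
print(np.allclose(y1, y1_loop))
```

The numpy version does the loop for you, which is both shorter to write and usually faster on large arrays.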

We can plot these in the same way as before:

plt.plot(x,y1)
plt.plot(x,y2)
plt.show()

Side Note: As we just demonstrated, the matplotlib library accepts both lists of numbers and arrays of numbers from the numpy library.

Challenges

Using the Python program you wrote above, try the following: