
PyBCSession09

Katy Huff edited this page Jan 29, 2012 · 9 revisions


Working With Python

Installing Modules and Packages

Now that you've learned a little about how to use python, I thought it was important that you learn how to install new packages. As you saw on day 2, without external packages, python can't do very much for us in the area of scientific computing. Luckily, if you installed Enthought you already have a lot of the packages you'll need. However, as you use python more and more you'll find that there are other packages you'll want to install. There is no single answer to how you install packages: it depends on your operating system and environment. For example:

  • On Windows, you can usually just search online for the package you want to install. There is usually an easy-to-install executable available.
  • On Linux, you can find most of the popular software using your package management system.

However, you may find yourself in a situation where none of these options are appealing. In that case, there are a few standard solutions.

Installing Software with Easy Install

If you followed the Enthought installation instructions, you should have a program available that you can run from the command line, called easy_install. This program is designed to do exactly what the name suggests. It knows about most of the popular python software and makes the process of installing it a breeze. For example, let's use easy_install to install trac, the software that is used to run the Hacker Within web site. First, open a terminal. On Windows, you need to open the Start menu, click Run, and enter the command "cmd". On Macs, just enter "terminal" in Spotlight. On Linux...well, you should know how to do this already.

Note: I am assuming you followed our installation instructions! If you did, you will have easy_install available and this example should work. Otherwise, the instructions may not work!

Once the terminal is open, just enter:

::
easy_install trac

If you get an error about permissions, you may have to enter "sudo easy_install trac" and enter your password. Easy_install will go on the Internet, download trac, and install it.

Installing Less Popular Software with setup.py

Sometimes, easy_install won't know how to access the software you want. This is especially true of less popular software. At this point you must manually download the source code for the software and install it yourself. I'm going to walk you through a simple example now. First, let's download the CodeExample macro for trac [http://hackerwithin.org/cgi-bin/hackerwithin.fcgi/raw-attachment/wiki/Session09/codeexample.zip here]. Unzip the download to somewhere you can access it. I am going to assume you unzipped it to your Desktop.

If you look in the codeexample directory which you just unzipped, you'll notice that there is a file called setup.py. This file contains instructions that tell python how to install the software. So, python uses python scripts called setup.py to install software. If you write your own python application and want to package it in a way that is easy to install, you'll have to learn how to write a setup.py script.

Now, open a terminal and navigate to the directory where the codeexample directory lives. If you installed it to the Desktop, you can just open a terminal and enter:

::
cd Desktop/codeexample

If you hit "dir" on Windows, or "ls" on anything else, you should see a listing of the files in the codeexample directory, setup.py among them. Now we are going to install the software by entering:

::
python setup.py install

If you didn't see any error messages, then you should be good to go. If you saw something related to permissions, you may have to enter "sudo python setup.py install". That's it! Open an ipython shell. Enter the commands:

::
#!python
import trac
import codeexample

If you don't see any error messages then everything installed correctly.
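The same check can be scripted, which is handy once several packages are involved. The sketch below is written for Python 3 (unlike the Python 2 examples elsewhere on this page) and relies on the standard library function importlib.util.find_spec. The helper name missing_modules is hypothetical, and the module names are simply the two from the example above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that Python cannot find."""
    # find_spec returns None when a top-level module is not importable.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["trac", "codeexample"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Both modules were found.")
```

Note that find_spec only locates a module; it does not import it, so a package that is present but broken would still need the plain import test shown above.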

That seemed a little too easy

That's because it was. I chose some really easy to install software, because I wanted to avoid problems. However, as you install more complicated software, you will run into some problems. As you'll see in the breakout sessions, python can be pretty slow. We can speed it up by writing the slow parts of our software in C or FORTRAN. Then we can write some special code to allow us to use that software in python. The drawback is that in order to install the software, you now need to compile something, which introduces a host of additional problems. The codeexample macro is written in 100% pure python, so the installation was really simple. This won't always be the case.

Furthermore, you'll often encounter a situation where you want to install a package, but are unable to because you lack some python module that the package depends on. Now you have to worry about installing even more software to resolve the dependencies. Needless to say, this can be tedious and frustrating.

Debugging in Python

Presented by: Kurt Smith

Why use pdb? Aren't print statements sufficient?

The hardy print statement can get you pretty far in debugging. It is simple and always available, and does exactly what you want with no complications. Unfortunately, print does ''only'' what you want, and nothing more. If you realize that you should be printing the contents of something else, you have to go in to your code, modify it and re-run it. It is easy to lose discipline with print statements, and suddenly you've got too much output to be meaningful.

Suppose you have no idea what's wrong, other than something's breaking inside a certain function. If it takes a full minute for the program to run and get to the print statements, then every debugging iteration will take a full minute -- hardly an enjoyable debugging session. The Python standard library module pdb was made just for this -- to give you an interactive code explorer wherever you want it. You can drop into an interactive pdb session at the interesting spot, look around, find out what the different variables are set to, figure out what's going on, and step through the code.

Once you get used to pdb it will save you much time in sticky debugging situations. With just six commands or so, you can be much more effective at debugging your code as compared to print statements.
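To give a feel for what a few of those commands do before the walkthrough below, here is a small Python 3 sketch that drives pdb from a script instead of the keyboard. It is only an illustration: the function average and the helper scripted_session are hypothetical names, while pdb.Pdb (with its stdin, stdout and readrc arguments) and its runcall method come from the standard library.

```python
import io
import pdb


def average(values):
    total = sum(values)
    return total / len(values)


def scripted_session(commands, func, *args):
    """Run func under pdb, feeding it `commands` instead of keyboard input."""
    output = io.StringIO()
    debugger = pdb.Pdb(stdin=io.StringIO("\n".join(commands) + "\n"),
                       stdout=output, readrc=False)
    # runcall stops at the first line of func and then reads commands.
    result = debugger.runcall(func, *args)
    return result, output.getvalue()


if __name__ == "__main__":
    # "p values" prints a variable, "next" runs one line,
    # "p total" prints the new variable, "continue" resumes normally.
    result, log = scripted_session(
        ["p values", "next", "p total", "continue"], average, [1, 2, 3])
    print(log)
    print("result:", result)
```

The printed log is what you would see on screen in an interactive session: the current location, each (Pdb) prompt, and the values printed by p.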

Example:

Let's take the code to plot the logistic map (http://en.wikipedia.org/wiki/Logistic_map) and insert an artificial bug in it:

#!python
import pylab as pl
import numpy as np

def logistic(r):
    return lambda x: r * x * (1.0 - x)

def sine(r):
    from math import sin, pi
    return lambda x: r * sin(pi * x)

def iterate_limit(func, x0, niter, nresults):
    for _ in xrange(niter):
        x0 = func(x0)
    results = np.zeros((nresults,))
    results = set()
    for i in xrange(nresults):
        x0 = func(x0)
        results.add(x0)
    # artificial bug here...
    return nresults

def driver(func, lower, upper, N=400):
    X = np.linspace(lower, upper, N)
    nresults, niter = 1000, 1000
    for x in X:
        results = iterate_limit(func(x), 0.5, niter, nresults)
        pl.plot([x]*len(results), list(results), 'k,')

if __name__ == '__main__':
    pl.figure()
    driver(logistic, 0.0, 4.0)
    pl.xlabel('r')
    pl.ylabel('X limit')
    pl.title('Logistic Map')
    pl.figure()
    driver(sine, 0.0, 1.0)
    pl.xlabel('r')
    pl.ylabel('X limit')
    pl.title('Sine Map')
    pl.show()

Running this code leads to an exception:

$ python chaos.py
Traceback (most recent call last):
  File "chaos.py", line 31, in <module>
    driver(logistic, 0.0, 4.0)
  File "chaos.py", line 27, in driver
    pl.plot([x]*len(results), list(results), 'k,')
TypeError: object of type 'int' has no len()

This traceback lets us know that something is wrong with the results variable, since it's an int and not a set.

Instead of the usual hunt-and-peck with print statements, let's debug this with the pdb module.

First, we'll illustrate the use of pdb.set_trace(). We make the following change in the code, at line 26 in [attachment:chaos.py]:

#!python
def driver(func, lower, upper, N=400):
    X = np.linspace(lower, upper, N)
    nresults, niter = 1000, 1000
    for x in X:
        import pdb; pdb.set_trace()
        results = iterate_limit(func(x), 0.5, niter, nresults)
        pl.plot([x]*len(results), list(results), 'k,')

We'll save this modified version as [attachment:chaos_set_trace.py].

When running this code, the pdb.set_trace() call will drop us into the interactive pdb debugger.

$ python chaos_set_trace.py
> /home/ksmith/THW/PYBC/debugging/chaos_set_trace.py(27)driver()
-> results = iterate_limit(func(x), 0.5, niter, nresults)
(Pdb)

pdb tells us what line of code will be executed next (with the -> indicator). We have a number of commands at our disposal. Let's type help to get some help:

(Pdb) help

Documented commands (type help <topic>):


EOF    bt         cont      enable  jump  pp       run      unt
a      c          continue  exit    l     q        s        until
alias  cl         d         h       list  quit     step     up
args   clear      debug     help    n     r        tbreak   w
b      commands   disable   ignore  next  restart  u        whatis
break  condition  down      j       p     return   unalias  where

Miscellaneous help topics:


exec  pdb

Undocumented commands:


retval  rv

(Pdb)

pdb tells us the commands and their abbreviations that we can use. Let's try list:

(Pdb) list
 22     def driver(func, lower, upper, N=400):
 23         X = np.linspace(lower, upper, N)
 24         nresults, niter = 1000, 1000
 25         for x in X:
 26             import pdb; pdb.set_trace()
 27  ->         results = iterate_limit(func(x), 0.5, niter, nresults)
 28             pl.plot([x]*len(results), list(results), 'k,')
 29
 30     if __name__ == '__main__':
 31         pl.figure()
 32         driver(logistic, 0.0, 4.0)
(Pdb)

This produces a code listing, with an arrow indicating the next line to be executed. It's possible to abbreviate list to a single l (that's an 'ell').

We can print out the values of variables in the code at this point, with the print command (or its p abbreviation):

(Pdb) p x, niter, nresults
(0.0, 1000, 1000)
(Pdb)

If we hit return again, pdb will execute the most recent command again, no matter what it is (useful to save some typing).

We can step through the code by issuing the next and step commands. next runs the current line to completion, including any function calls it makes, and stops at the next line of the current function. step also executes the current line, but stops at the first opportunity, which may be inside a function that the line calls, so it lets you follow the execution of every function call. Let's call step:

#!python
(Pdb) s
--Call--
> /home/ksmith/THW/PYBC/debugging/chaos_set_trace.py(4)logistic()
-> def logistic(r):
(Pdb) l
  1     import pylab as pl
  2     import numpy as np
  3
  4  -> def logistic(r):
  5         return lambda x: r * x * (1.0 - x)
  6
  7     def sine(r):
  8         from math import sin, pi
  9         return lambda x: r * sin(pi * x)
 10
 11     def iterate_limit(func, x0, niter, nresults):

We see that step takes us to the logistic function. Issuing the next command and a few returns will move through the function, and we end up in the iterate_limit function:

(Pdb) n
> /home/ksmith/THW/PYBC/debugging/chaos_set_trace.py(12)iterate_limit()
-> for _ in xrange(niter):
(Pdb) l
  7     def sine(r):
  8         from math import sin, pi
  9         return lambda x: r * sin(pi * x)
 10
 11     def iterate_limit(func, x0, niter, nresults):
 12  ->     for _ in xrange(niter):
 13             x0 = func(x0)
 14         results = np.zeros((nresults,))
 15         results = set()
 16         for i in xrange(nresults):
 17             x0 = func(x0)

If we hit n for next, we'll cycle through the for loop niter times, which is 1000 times more than we want to. The until command (type help until for documentation) will move us beyond this loop:

(Pdb) unt
> /home/ksmith/THW/PYBC/debugging/chaos_set_trace.py(14)iterate_limit()
-> results = np.zeros((nresults,))
(Pdb) l
  9         return lambda x: r * sin(pi * x)
 10
 11     def iterate_limit(func, x0, niter, nresults):
 12         for _ in xrange(niter):
 13             x0 = func(x0)
 14  ->     results = np.zeros((nresults,))
 15         results = set()
 16         for i in xrange(nresults):
 17             x0 = func(x0)
 18             results.add(x0)
 19         # artificial bug here...

Stepping through things with next and using the until command to get over the loop, we come to the artificial bug line.

The bug should be clear -- the line should be return results rather than return nresults. We can exit the debugger with continue to continue running the remaining code, or we can simply quit.
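For reference, here is a sketch of the corrected function. It is rewritten for Python 3 (range instead of xrange) and drops the unused NumPy array so that it runs without any third-party packages; apart from the one-line fix described above, nothing about the original behavior is assumed to change.

```python
def logistic(r):
    """Return the logistic map x -> r * x * (1 - x) for a fixed r."""
    return lambda x: r * x * (1.0 - x)


def iterate_limit(func, x0, niter, nresults):
    """Discard niter warm-up iterates, then collect the next nresults values."""
    for _ in range(niter):
        x0 = func(x0)
    results = set()
    for _ in range(nresults):
        x0 = func(x0)
        results.add(x0)
    # The fix: return the set of values, not the integer nresults.
    return results


if __name__ == "__main__":
    limit_points = iterate_limit(logistic(2.0), 0.5, 1000, 1000)
    print(len(limit_points), limit_points)
```

With r = 2.0 and a starting value of 0.5 the map sits exactly on its fixed point, so the returned set holds the single value 0.5, and len(results) in driver now works as intended.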

A few other useful commands:

  • `up` and `down` -- these take you up and down the call stack, allowing exploration of the variables in the calling frame, for example. Very useful if you want to know what's going on in the ''calling'' function.
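  • `pdb.pm()` -- lets you do post-mortem debugging, i.e. after an exception has been triggered. This is a function in the pdb module rather than a command typed at the (Pdb) prompt; call it from the interactive interpreter after an uncaught exception. See the documentation.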
  • `return` -- takes you to the return statement of the current function/method; nice for skipping over an uninteresting function body.
  • `break <lineno>` -- sets a break immediately before the specified line number, equivalent to putting import pdb; pdb.set_trace() in the code at this point.
  • `continue` -- executes code until a break or a set_trace() is encountered. If you care about what's going on in only 2 places in the code, put a pdb.set_trace() at these spots, and issue the continue command to go back and forth between them.
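Post-mortem debugging can also be started from within a script. The following is a minimal Python 3 sketch and not part of the boot camp materials: the wrapper run_with_post_mortem, its interactive flag, and the function broken_length are hypothetical names, while pdb.post_mortem, sys.exc_info and traceback.print_exc are standard library calls.

```python
import pdb
import sys
import traceback


def run_with_post_mortem(func, *args, interactive=True):
    """Call func(*args); on an exception, optionally open pdb at the failure."""
    try:
        return func(*args)
    except Exception:
        traceback.print_exc()
        if interactive:
            # Opens the debugger in the frame where the exception was raised.
            pdb.post_mortem(sys.exc_info()[2])
        raise


def broken_length(nresults):
    # Mimics the artificial bug: len() of an int raises TypeError.
    return len(nresults)
```

Calling run_with_post_mortem(broken_length, 1000) prints the traceback and leaves you at a (Pdb) prompt inside broken_length, where p nresults shows the offending value. Passing interactive=False skips the debugger, which is useful for scripts that run unattended.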


Python Editors/IDEs

You now know the basics of writing python and installing python software. Now it is time to make your workflow more efficient by picking the python editor that is best ''for you''. This is all about personal preference, so the only way to find out what works for you is to try a few options.

The minimalist way

All you really need to develop python is a text editor and the python interpreter. For example, in Windows, you could write your python scripts with notepad, then run them in the terminal. However, this isn't a very good option because notepad is not a good text editor.

Traditional Code Editors

Many of the organizers of THWPYBC2010 use more traditional code editors such as vi or emacs to develop python. These editors are very powerful if you know how to use them. They have a lot of good features and are highly customizable. However, the learning curve may be a little steep. One of the great features of these tools is that they are not designed specifically for python. They can be used to develop in any language, which is a big plus.

Using IDLE

Throughout the boot camp, we have used IDLE as our development environment of choice. IDLE has some pretty nice features. It highlights your text and has a built-in interpreter. You can run scripts by hitting F5. IDLE also supports some tab completion and is more graphical, so if you like using your mouse, it isn't a bad tool.

Using Fancy IDEs

Those of you coming from a MATLAB background may be used to something a little more substantial. As a disclaimer, I don't really use fancy IDEs, but I tried to learn a little about two popular ones: [http://code.google.com/p/spyderlib/ spyder] and [http://eric-ide.python-projects.org/ eric].

I think that spyder is trying to be a tool that MATLAB users can be comfortable using. It has all of the things that IDLE has, such as an interactive terminal and syntax highlighting, but much more. For example, when you type functions in the console, spyder will automatically pull up the documentation for you. It also has a cool feature where it will display local variables in a list so you can view them. The one thing that I found lacking was a good, graphical integrated debugger. It has pdb built in, but the debugger isn't integrated into the code you're typing. So you can't, for example, right click a line and set a break point.

Eric is a little more heavy duty. I would describe it as more like Visual Studio or Eclipse than like MATLAB. The main advantage over spyder is that it has a nicely integrated debugger. Another popular IDE is Eclipse. You can install a plugin which allows python development. I encourage you to try a few IDEs and find the one that works best for you.