SlopPy: An error-tolerant Python interpreter that facilitates sloppy programming

What is SlopPy?

SlopPy (Sloppy Python) is a modified Python interpreter that ensures your scripts will never crash.

Whenever SlopPy encounters an uncaught exception, instead of crashing the script, it will create a special NA ("Not Available") object, make that the result of the current expression, and continue executing normally. Whenever an NA object appears in an expression, SlopPy propagates it according to special rules. For example, all unary and binary operations involving NA will return NA.

SlopPy allows imperfect scripts to finish executing and produce partial results (and a log of all exceptions), which can be more informative than simply crashing at the first uncaught exception. SlopPy is a drop-in replacement for the Python 2.6 interpreter, so it should work seamlessly with all of your existing scripts and 3rd-party libraries with no run-time slowdown.

How can SlopPy be useful for me?

If you've written Python scripts that run for at least a few minutes, then you've probably encountered the following annoyance:

  1. You start executing a long-running script on your machine.
  2. You switch to working on another task or go home for the evening.
  3. When you return to check on your script, you see that it crashed at the first uncaught exception without producing any useful results.

Now you need to edit your script to fix that bug and then re-execute. It might take a few minutes to hours before your script gets past the point where it originally crashed, and then it will likely crash again with another exception. It might take a few rounds of debugging and re-executing before the script successfully finishes running and produces results.

SlopPy allows your buggy script to finish running on the first attempt, produce partial results, and show you all uncaught exceptions (not just the first one). You can always gain more insights from partial results than from no results, and you can also try to patch up all exceptions in one round of edits rather than addressing one at a time.

In sum, SlopPy allows you to write sloppy scripts in a 'quick-and-dirty' manner without worrying about error handling, which can speed up your iteration cycle when prototyping.

Can you show me a quick demo?

Sure, this 6-minute screencast demonstrates SlopPy's basic capabilities:

How can I learn more about SlopPy?

This workshop paper reflects the state of SlopPy circa late-2010:

Sloppy Python: Using Dynamic Analysis to Automatically Add Error Tolerance to Ad-Hoc Data Processing Scripts. Philip J. Guo. International Workshop on Dynamic Analysis (WODA), July 2011.


How can I download and install SlopPy?

SlopPy is a modified version of the Python 2.6.3 interpreter. I want to make it easy for people to start using SlopPy, but I haven't yet had time to create reliable one-click installers for all supported operating systems. Currently, the only way to install SlopPy is to download and compile its source code. If you don't want to go through this hassle, please email me and I will try my best to compile a custom version for your computer and to guide you through the setup process.

The SlopPy source code resides in a public GitHub code repository. You can check out the latest copy and compile using these commands:

git clone git://github.com/pgbovine/SlopPy.git
cd SlopPy
./configure
make

Dependencies

Mac OS X: If you install the 'Xcode developer tools' and 'X11' packages from your installation DVD, then you should have most of the software required to compile SlopPy. It's also a good idea to install the GNU readline library before compiling SlopPy, so that your Python interactive prompt is more pleasant to use.

Linux: The software needed to compile SlopPy might already come pre-installed, but in case it isn't, here are some useful packages to install (these names are for Debian-based distros, but it should be easy to look up the corresponding names in other package management systems):

sudo apt-get install libc6-dev g++ gcc libreadline-dev

It's normal for warning messages like this one to appear when you're compiling Python:

Failed to find the necessary bits to build these modules:
_bsddb             bsddb185           dbm
dl                 gdbm               imageop
sunaudiodev
To find the necessary bits, look in setup.py in detect_modules() for the module's name.

It just means that certain Python modules cannot be compiled for your machine, but as long as you see an executable named python (or python.exe on Mac OS X) in the SlopPy directory, the build was successful.

Running SlopPy for the first time

After a successful compile, there should be an executable named python (or python.exe on Mac OS X) in the SlopPy directory. When you execute that program, you should see an interactive Python prompt like the following:

$ ./python.exe 
SlopPy: A Python interpreter that facilitates sloppy, error-tolerant data analysis
Created by Philip Guo (pg@cs.stanford.edu)

Python 2.6.3 (r263:75183, Sep  5 2010, 13:56:04) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

Working with 3rd-party libraries

SlopPy is designed to work seamlessly with all 3rd-party libraries, extensions, and tools (e.g., NumPy, SciPy, matplotlib, IPython), as long as they are compatible with Python 2.6. You shouldn't need to re-compile any libraries or extension code.

All you need to do is to set the PYTHONPATH environment variable so that SlopPy knows where your libraries and extensions are installed (alternatively, you can prepend the path onto the sys.path variable from within your Python script).
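
As a minimal sketch of the sys.path alternative, the snippet below prepends a library directory at the top of a script. The path in LIBRARY_DIR is a hypothetical placeholder, not a location SlopPy requires; substitute wherever your libraries are actually installed.

```python
import sys

# Hypothetical location (an assumption for illustration); replace it with the
# directory that holds your installed libraries and extensions.
LIBRARY_DIR = "/path/to/site-packages"

# Prepend so this directory is searched before the interpreter's defaults.
if LIBRARY_DIR not in sys.path:
    sys.path.insert(0, LIBRARY_DIR)
```

Any import statement that runs after these lines will search LIBRARY_DIR first.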

You can install 3rd-party libraries in a variety of ways, but if you're affiliated with a university, I highly recommend downloading a free academic version of the Enthought Python Distribution. It's a fantastic one-click installer containing Python 2.6 and over 75 useful libraries.

After installing the Enthought Python Distribution on my Mac OS X 10.6 computer, I can give SlopPy access to all of its installed libraries by setting PYTHONPATH to the appropriate location and then starting up SlopPy:

export PYTHONPATH=/Library/Frameworks/Python.framework/Versions/6.1/lib/python2.6/site-packages
~/SlopPy/python.exe

If you're on a 64-bit machine and want to compile a 32-bit x86 SlopPy binary (e.g., to interoperate with already-installed 32-bit 3rd-party libraries), you can run this modified configure command before compiling:

# for Mac OS X:
./configure CC="gcc -arch i386" CXX="g++ -arch i386"

# for Linux:
./configure CC="gcc -m32" CXX="g++ -m32"

Please let me know if you have trouble getting 3rd-party libraries working with SlopPy.

How does SlopPy work?

The best way to show how SlopPy works is to demonstrate using the Python interactive prompt.

Creating NA objects from uncaught exceptions

When a regular Python interpreter encounters an uncaught exception, it will crash with a traceback message. For example, let's try to execute x = 1 / 0:

$ python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> x = 1 / 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero

When SlopPy executes that same statement, it will create a special NA object to represent the ZeroDivisionError and assign it to x:

$ ~/SlopPy/python.exe 
SlopPy: A Python interpreter that facilitates sloppy, error-tolerant data analysis
Created by Philip Guo (pg@cs.stanford.edu)

Python 2.6.3 (r263:75183, Sep  5 2010, 13:56:04) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> x = 1 / 0
>>> print x
<NA>

NA stands for "Not Available", which is a term statisticians use to indicate missing data.
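
To make the idea concrete, here is a rough emulation that runs on a stock Python interpreter. It is only a sketch: SlopPy does this inside the interpreter for every expression, whereas the hypothetical helper attempt() below must be called explicitly. The names NAType, NA, attempt, and exception_log are all invented for illustration.

```python
import traceback


class NAType(object):
    """Hypothetical stand-in for SlopPy's NA object (illustration only)."""

    def __repr__(self):
        return "<NA>"


NA = NAType()
exception_log = []  # hypothetical in-memory log; SlopPy writes log files instead


def attempt(thunk):
    """Run thunk(); on any exception, record it and return NA instead of raising."""
    try:
        return thunk()
    except Exception as exc:
        exception_log.append((type(exc).__name__, str(exc), traceback.format_exc()))
        return NA


x = attempt(lambda: 1 / 0)                 # fails with ZeroDivisionError
y = attempt(lambda: int("not a number"))   # fails with ValueError
z = attempt(lambda: 2 + 3)                 # succeeds

print(x)                   # <NA>
print(z)                   # 5
print(len(exception_log))  # 2
```

The script keeps going after both failures, and exception_log holds a record of each one rather than only the first.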

Treatment of NA values

When an NA value appears in most types of expressions, SlopPy simply propagates it to the expression's result. The intuition here is that if an operand has an unknown (NA) value, then the result should also be unknown. For example:

>>> x = 1 / 0
>>> print x
<NA>
>>> y = 5   
>>> z = x + y
>>> print z
<NA>

Pretty much anything you do to an NA value just propagates it. Here are some more examples:

>>> x = 1 / 0
>>> x
<NA>
# Unary operation:
>>> -x
<NA>
# Binary operation:
>>> x + 5
<NA>
# Element indexing:
>>> x[1]
<NA>
>>> lst = [1, 2, 3]
>>> lst[x]
<NA>
# Field access:
>>> x.field
<NA>
# Function call:
>>> x(1, 2, 3)
<NA>
# Method call:
>>> x.method('a', 'b', 'c')
<NA>
# Comparison:
>>> x > 5
<NA>
>>> x < 5
<NA>
>>> x == 5
<NA>
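
Most of these rules can be approximated in ordinary Python with special methods. The class below is a sketch under that assumption, not SlopPy's actual implementation; NAType, NA, and _propagate are hypothetical names.

```python
def _propagate(self, *args, **kwargs):
    """Ignore the operands and return the NA value itself."""
    return self


class NAType(object):
    """Hypothetical stand-in for SlopPy's NA object (illustration only)."""

    def __repr__(self):
        return "<NA>"

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, e.g. x.field.
        if name.startswith("__"):
            raise AttributeError(name)  # do not pretend to support every protocol
        return self

    __call__ = _propagate      # x(1, 2, 3)
    __getitem__ = _propagate   # x[1]
    __neg__ = _propagate       # -x
    __pos__ = _propagate       # +x
    __invert__ = _propagate    # ~x


# Binary arithmetic (plus reflected forms such as 5 + x) and comparisons.
for _op in ("add", "sub", "mul", "truediv", "floordiv", "mod", "pow"):
    setattr(NAType, "__%s__" % _op, _propagate)
    setattr(NAType, "__r%s__" % _op, _propagate)
for _op in ("lt", "le", "gt", "ge", "eq", "ne"):
    setattr(NAType, "__%s__" % _op, _propagate)

NA = NAType()
x = NA
results = [-x, x + 5, 5 + x, x[1], x.field, x(1, 2, 3),
           x.method("a", "b", "c"), x > 5, x < 5, x == 5]
print(results)  # every entry prints as <NA>
```

One case this sketch cannot reproduce is lst[x]: a stock interpreter raises TypeError when a list is indexed with a non-integer, which is why SlopPy handles propagation inside the interpreter rather than in a library class.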

Iterating over an NA value terminates instantly (to prevent infinite loops):

>>> for line in open(NONEXISTENT_FILE):
...     print line
... 
>>>

In the above example, the name NONEXISTENT_FILE is unbound, so its value is NA; the NA propagates as the result of the open() call, and iterating over it terminates instantly.

Mutating an NA value does nothing:

>>> x[1] = 5
>>> x.field = 5
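
Both behaviors (empty iteration and no-op mutation) can be imitated in plain Python, as in this sketch. NAType and NA are hypothetical names, and the class only illustrates the visible effect, not how SlopPy implements it.

```python
class NAType(object):
    """Hypothetical stand-in for SlopPy's NA object (illustration only)."""

    def __repr__(self):
        return "<NA>"

    def __iter__(self):
        # Iterating over NA yields nothing, so loops end immediately.
        return iter(())

    def __setitem__(self, key, value):
        # x[1] = 5 is silently ignored.
        pass

    def __setattr__(self, name, value):
        # x.field = 5 is silently ignored.
        pass


NA = NAType()

visited = []
for item in NA:
    visited.append(item)

NA[1] = 5
NA.field = 5

print(visited)    # []
print(vars(NA))   # {}
```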

Iterators and generators skip over NA values rather than yielding them. The intuition here is that when a mostly-correct script is executing, most elements in collections are legal values. Thus, the results of aggregate statistics computed via iteration (e.g., summation) should not be corrupted by NA values. For example:

>>> x = 1 / 0
>>> x
<NA>
>>> lst = [1, 2, 3]
>>> lst.append(x)
>>> lst.append(4)
>>> lst.append(x)
>>> lst
[1, 2, 3, <NA>, 4, <NA>]
>>> for e in lst:
...     print e
... 
1
2
3
4
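
On a stock interpreter, the same effect needs an explicit filter. The sketch below uses a hypothetical skip_na() generator and a placeholder NA object to show why skipping keeps aggregates such as a sum usable.

```python
class NAType(object):
    """Hypothetical stand-in for SlopPy's NA object (illustration only)."""

    def __repr__(self):
        return "<NA>"


NA = NAType()


def skip_na(iterable):
    """Yield every element except NA, imitating how SlopPy's iterators behave."""
    for element in iterable:
        if element is not NA:
            yield element


lst = [1, 2, 3, NA, 4, NA]
print(list(skip_na(lst)))  # [1, 2, 3, 4]
print(sum(skip_na(lst)))   # 10
```

Without the filter, sum(lst) on a stock interpreter would raise a TypeError at the first NA element.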

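Conjunctions and disjunctions get special treatment: when the other operand alone determines the result (False for and, True for or), SlopPy returns that result instead of propagating NA: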

>>> x and False
False
>>> False and x
False
>>> x or True
True
>>> True or x
True
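
These four cases match three-valued logic, which can be sketched with two hypothetical helper functions, na_and() and na_or(). The four results shown in the session are reproduced exactly; returning NA when the other operand does not settle the answer (e.g., NA and True) is an assumption of this sketch, not something the examples confirm.

```python
class NAType(object):
    """Hypothetical stand-in for SlopPy's NA object (illustration only)."""

    def __repr__(self):
        return "<NA>"


NA = NAType()


def na_and(a, b):
    # A definitely false operand decides the conjunction on its own.
    if (a is not NA and not a) or (b is not NA and not b):
        return False
    if a is NA or b is NA:
        return NA  # assumption: result is unknown when NA could change it
    return a and b


def na_or(a, b):
    # A definitely true operand decides the disjunction on its own.
    if (a is not NA and a) or (b is not NA and b):
        return True
    if a is NA or b is NA:
        return NA  # assumption: result is unknown when NA could change it
    return a or b


print(na_and(NA, False))  # False
print(na_and(False, NA))  # False
print(na_or(NA, True))    # True
print(na_or(True, NA))    # True
```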

If a branch condition is NA, then SlopPy takes an arbitrary side (technically, neither is correct). In most circumstances, it takes the 'true' side of the branch:

>>> x = 1 / 0
>>> if x:
...     print "TRUE BRANCH"
... else:
...     print "FALSE BRANCH"
... 
TRUE BRANCH

Warning logs

While the target script is executing, SlopPy outputs two warning logs to the current working directory: slop_verbose.log (human-readable text) and slop_binary.log (pickled Python objects for consumption by other scripts). These logs contain the context of each exception (e.g., stack traceback and local variable values) and an indication of how NA values propagate throughout execution. Please email me if you have questions about how to understand or use these logs.
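
As a starting point for consuming the binary log from another script, the sketch below reads every object from a pickle file. It assumes the file holds one or more objects written back-to-back with pickle.dump; that layout and the record fields in the demo are assumptions, not a documented format. The demo writes its own stand-in file so that it runs anywhere.

```python
import os
import pickle
import tempfile


def read_pickled_records(path):
    """Return every object stored back-to-back in a pickle file."""
    records = []
    with open(path, "rb") as log_file:
        while True:
            try:
                records.append(pickle.load(log_file))
            except EOFError:
                break  # no more pickled objects in the file
    return records


# Stand-in file with made-up records; real slop_binary.log contents may differ.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_binary.log")
with open(demo_path, "wb") as demo_file:
    pickle.dump({"exception": "ZeroDivisionError", "line": 1}, demo_file)
    pickle.dump({"exception": "NameError", "line": 7}, demo_file)

records = read_pickled_records(demo_path)
print(len(records))  # 2
```

To try it on a real log, pass the path to slop_binary.log instead of demo_path. Only unpickle files you trust, and consider reading the log with SlopPy itself or another Python 2.6 interpreter to avoid cross-version pickle differences.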


SlopPy is created and maintained by Philip Guo

Last updated: 2010-09-30