SlopPy (Sloppy Python) is a modified Python interpreter that ensures your scripts will never crash.
Whenever SlopPy encounters an uncaught exception, instead of crashing the script, it will create a special NA ("Not Available") object, make that the result of the current expression, and continue executing normally. Whenever an NA object appears in an expression, SlopPy propagates it according to special rules. For example, all unary and binary operations involving NA will return NA.
SlopPy allows imperfect scripts to finish executing and produce partial results (and a log of all exceptions), which can be more informative than simply crashing at the first uncaught exception. SlopPy is a drop-in replacement for the Python 2.6 interpreter, so it should work seamlessly with all of your existing scripts and 3rd-party libraries with no run-time slowdown.
If you've written Python scripts that run for at least a few minutes, then you've probably encountered the following annoyance: the script runs for a while and then crashes with an uncaught exception.
Now you need to edit your script to fix that bug and then re-execute. It might take a few minutes to hours before your script gets past the point where it originally crashed, and then it will likely crash again with another exception. It might take a few rounds of debugging and re-executing before the script successfully finishes running and produces results.
SlopPy allows your buggy script to finish running on the first attempt, produce partial results, and show you all uncaught exceptions (not just the first one). You can always gain more insights from partial results than from no results, and you can also try to patch up all exceptions in one round of edits rather than addressing one at a time.
In sum, SlopPy allows you to write sloppy scripts in a 'quick-and-dirty' manner without worrying about error handling, which can speed up your iteration cycle when prototyping.
A 6-minute screencast demonstrates SlopPy's basic capabilities.
This workshop paper reflects the state of SlopPy circa late-2010:
Sloppy Python: Using Dynamic Analysis to Automatically Add Error Tolerance to Ad-Hoc Data Processing Scripts.
Philip J. Guo.
International Workshop on Dynamic Analysis (WODA), July 2011.
SlopPy is a modified version of the Python 2.6.3 interpreter. I want to make it easy for people to start using SlopPy, but I haven't yet had time to create reliable one-click installers for all supported operating systems. Currently, the only way to install SlopPy is to download and compile its source code. If you don't want to go through this hassle, please email me, and I will try my best to compile a custom version for your computer and guide you through the setup process.
The SlopPy source code resides in a public GitHub code repository. You can check out the latest copy and compile using these commands:
    git clone git://github.com/pgbovine/SlopPy.git
    cd SlopPy
    ./configure
    make
Mac OS X: If you install the 'Xcode developer tools' and 'X11' packages from your installation DVD, then you should have most of the software required to compile SlopPy. It's also a good idea to install the GNU readline library before compiling SlopPy, so that your Python interactive prompt supports line editing and command history.
Linux: The software needed to compile SlopPy might already be installed, but in case it isn't, here are some useful packages to install (these names are for Debian-based distros, but it should be easy to look up the corresponding names in other package management systems):
sudo apt-get install libc6-dev g++ gcc libreadline-dev
It's normal for warning messages like this one to appear when you're compiling Python:
    Failed to find the necessary bits to build these modules:
    _bsddb  bsddb185  dbm  dl  gdbm  imageop  sunaudiodev
    To find the necessary bits, look in setup.py in detect_modules() for the module's name.
It just means that certain Python modules cannot be compiled for your machine, but as long as you see an executable named python (or python.exe on Mac OS X) in the SlopPy directory, the build was successful.
After a successful compile, there should be an executable named python (or python.exe on Mac OS X) in the SlopPy directory. When you execute that program, you should see an interactive Python prompt like the following:
    $ ./python.exe
    SlopPy: A Python interpreter that facilitates sloppy, error-tolerant data analysis
    Created by Philip Guo (firstname.lastname@example.org)
    Python 2.6.3 (r263:75183, Sep 5 2010, 13:56:04)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
SlopPy is designed to work seamlessly with all 3rd-party libraries, extensions, and tools (e.g., NumPy, SciPy, matplotlib, IPython), as long as they are compatible with Python 2.6. You shouldn't need to re-compile any libraries or extension code.
All you need to do is to set the PYTHONPATH environment variable so that SlopPy knows where your libraries and extensions are installed (alternatively, you can prepend the path onto the sys.path variable from within your Python script).
You can install 3rd-party libraries in a variety of ways, but if you're affiliated with a university, I highly recommend downloading a free academic version of the Enthought Python Distribution. It's a fantastic one-click installer containing Python 2.6 and over 75 useful libraries.
After installing the Enthought Python Distribution on my Mac OS X 10.6 computer, I can give SlopPy access to all of its installed libraries by setting PYTHONPATH to the appropriate location and then starting up SlopPy:
    export PYTHONPATH=/Library/Frameworks/Python.framework/Versions/6.1/lib/python2.6/site-packages
    ~/SlopPy/python.exe
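The alternative mentioned earlier, prepending the library location onto sys.path from within a script, might look like the following minimal sketch. The path is the Enthought location from the example above and is only illustrative; the constant name SITE_PACKAGES is a made-up name for this sketch.

```python
import sys

# Illustrative location (the Enthought site-packages directory used above);
# replace it with wherever your libraries are actually installed.
SITE_PACKAGES = "/Library/Frameworks/Python.framework/Versions/6.1/lib/python2.6/site-packages"

# Prepend so that modules in this directory are found first.
if SITE_PACKAGES not in sys.path:
    sys.path.insert(0, SITE_PACKAGES)
```

This has to run before the first import of any library that lives in that directory.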
If you're on a 64-bit machine and want to compile a 32-bit x86 SlopPy binary (e.g., to interoperate with already-installed 32-bit 3rd-party libraries), you can run this modified configure command before compiling:
# for Mac OS X: ./configure CC="gcc -arch i386" CXX="g++ -arch i386" # for Linux: ./configure CC="gcc -m32" CXX="g++ -m32"
Please let me know if you have trouble getting 3rd-party libraries working with SlopPy.
The best way to show how SlopPy works is to demonstrate using the Python interactive prompt.
When a regular Python interpreter encounters an uncaught exception, it will crash with a traceback message. For example, let's try to execute x = 1 / 0:
    $ python
    Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = 1 / 0
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: integer division or modulo by zero
When SlopPy executes that same statement, it will create a special NA object to represent the ZeroDivisionError and assign it to x:
    $ ~/SlopPy/python.exe
    SlopPy: A Python interpreter that facilitates sloppy, error-tolerant data analysis
    Created by Philip Guo (email@example.com)
    Python 2.6.3 (r263:75183, Sep 5 2010, 13:56:04)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = 1 / 0
    >>> print x
    <NA>
NA stands for "Not Available", which is a term statisticians use to indicate missing data.
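To get a feel for this behavior without building SlopPy, here is a rough emulation in plain Python. It is only a sketch: SlopPy does this inside the interpreter for every expression, whereas this version needs an explicit wrapper call. The names NAType, tolerate, and exception_log are hypothetical and exist only for this illustration.

```python
import traceback


class NAType(object):
    """Hypothetical stand-in for SlopPy's NA value."""

    def __repr__(self):
        return "<NA>"


NA = NAType()
exception_log = []  # hypothetical in-memory log, not SlopPy's log files


def tolerate(func, *args, **kwargs):
    """Call func; on any exception, record it and return NA instead of raising."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        exception_log.append((type(exc).__name__, traceback.format_exc()))
        return NA


x = tolerate(lambda: 1 / 0)
print(x)                    # <NA>
print(exception_log[0][0])  # ZeroDivisionError
```

The wrapper makes the trade-off visible: execution continues, and the exception is kept in a log rather than lost.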
When an NA value appears in most types of expressions, SlopPy simply propagates it to the expression's result. The intuition here is that if an operand has an unknown (NA) value, then the result should also be unknown. For example:
    >>> x = 1 / 0
    >>> print x
    <NA>
    >>> y = 5
    >>> z = x + y
    >>> print z
    <NA>
Pretty much anything you do to an NA value just propagates it. Here are some more examples:
    >>> x = 1 / 0
    >>> x
    <NA>

    # Unary operation:
    >>> -x
    <NA>

    # Binary operation:
    >>> x + 5
    <NA>

    # Element indexing:
    >>> x
    <NA>
    >>> lst = [1, 2, 3]
    >>> lst[x]
    <NA>

    # Field access:
    >>> x.field
    <NA>

    # Function call:
    >>> x(1, 2, 3)
    <NA>

    # Method call:
    >>> x.method('a', 'b', 'c')
    <NA>

    # Comparison:
    >>> x > 5
    <NA>
    >>> x < 5
    <NA>
    >>> x == 5
    <NA>
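Most of these propagation rules can be approximated in plain Python with operator overloading. The sketch below is an emulation for building intuition, not SlopPy's implementation (SlopPy modifies the interpreter itself); NAType and _propagate are hypothetical names.

```python
class NAType(object):
    """Hypothetical stand-in for NA: almost every operation returns NA again."""

    def __repr__(self):
        return "<NA>"

    def _propagate(self, *args, **kwargs):
        return self

    # Unary and binary arithmetic
    __neg__ = __pos__ = __abs__ = _propagate
    __add__ = __radd__ = __sub__ = __rsub__ = _propagate
    __mul__ = __rmul__ = __truediv__ = __rtruediv__ = _propagate
    # Comparisons
    __lt__ = __le__ = __gt__ = __ge__ = __eq__ = __ne__ = _propagate
    # Element indexing and calls
    __getitem__ = __call__ = _propagate

    def __getattr__(self, name):
        # Only reached for attributes that do not otherwise exist.
        return self

    # Defining __eq__ disables hashing by default; restore identity hashing.
    __hash__ = object.__hash__


NA = NAType()

print(-NA)                        # <NA>
print(NA + 5)                     # <NA>
print(5 + NA)                     # <NA>
print(NA.field)                   # <NA>
print(NA.method('a', 'b', 'c'))   # <NA>
print(NA > 5)                     # <NA>
```

One case this sketch cannot cover is lst[x]: in a stock interpreter, indexing a built-in list with such an object raises TypeError, so that rule needs the interpreter-level support that SlopPy provides.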
Iterating over an NA value terminates instantly (to prevent infinite loops):
    >>> for line in open(NONEXISTENT_FILE):
    ...     print line
    ...
    >>>
In the above example, the name NONEXISTENT_FILE is unbound, so its value is NA; the NA propagates as the result of the open() call, and iterating over it terminates instantly.
Mutating an NA value does nothing:
    >>> x = 1 / 0
    >>> x.field = 5
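The iteration and mutation rules can be emulated in plain Python as well. This is a sketch with a hypothetical NAType class, not SlopPy's own code: iterating yields nothing, and attribute or item assignment is silently ignored.

```python
class NAType(object):
    """Hypothetical stand-in for NA: empty when iterated, inert when mutated."""

    def __repr__(self):
        return "<NA>"

    def __iter__(self):
        # An empty iterator, so a for loop over NA ends immediately.
        return iter(())

    def __setattr__(self, name, value):
        # Attribute assignment is a no-op.
        pass

    def __setitem__(self, key, value):
        # Item assignment is a no-op.
        pass


NA = NAType()

for line in NA:
    print(line)      # never reached

NA.field = 5         # silently ignored
NA[0] = 5            # silently ignored
print(list(NA))      # []
```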
Iterators and generators skip over NA values rather than yielding them. The intuition here is that when a mostly-correct script is executing, most elements in collections are legal values. Thus, the results of aggregate statistics computed via iteration (e.g., summation) should not be corrupted by NA values. For example:
    >>> x = 1 / 0
    >>> x
    <NA>
    >>> lst = [1, 2, 3]
    >>> lst.append(x)
    >>> lst.append(4)
    >>> lst.append(x)
    >>> lst
    [1, 2, 3, <NA>, 4, <NA>]
    >>> for e in lst:
    ...     print e
    ...
    1
    2
    3
    4
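In a stock interpreter, the same effect requires an explicit filter. The sketch below uses a hypothetical skip_na generator and a hypothetical NA sentinel to show why skipping keeps aggregates such as sums meaningful; SlopPy does the skipping automatically inside its iterators.

```python
class NAType(object):
    """Hypothetical stand-in for SlopPy's NA value."""

    def __repr__(self):
        return "<NA>"


NA = NAType()


def skip_na(iterable):
    """Yield every element of iterable except NA values."""
    for element in iterable:
        if element is not NA:
            yield element


lst = [1, 2, 3, NA, 4, NA]
print(lst)                 # [1, 2, 3, <NA>, 4, <NA>]
print(list(skip_na(lst)))  # [1, 2, 3, 4]
print(sum(skip_na(lst)))   # 10
```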
Special treatment for conjunctions and disjunctions:
    >>> x and False
    False
    >>> False and x
    False
    >>> x or True
    True
    >>> True or x
    True
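These four cases match ordinary three-valued logic: a definite False decides a conjunction and a definite True decides a disjunction, whatever the other operand is. The sketch below encodes that idea with the hypothetical helpers na_and and na_or. The remaining cases (for example, NA combined with True in a conjunction) are not specified above; returning NA for them is an assumption of this sketch, not documented SlopPy behavior.

```python
class NAType(object):
    """Hypothetical stand-in for SlopPy's NA value."""

    def __repr__(self):
        return "<NA>"


NA = NAType()


def na_and(a, b):
    # A definite false operand decides the result, whatever the other side is.
    if (a is not NA and not a) or (b is not NA and not b):
        return False
    # Assumption: otherwise an NA operand leaves the result unknown.
    if a is NA or b is NA:
        return NA
    return True


def na_or(a, b):
    # A definite true operand decides the result, whatever the other side is.
    if (a is not NA and a) or (b is not NA and b):
        return True
    # Assumption: otherwise an NA operand leaves the result unknown.
    if a is NA or b is NA:
        return NA
    return False


print(na_and(NA, False))  # False
print(na_and(False, NA))  # False
print(na_or(NA, True))    # True
print(na_or(True, NA))    # True
```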
If a branch condition is NA, then SlopPy takes an arbitrary side (technically, neither is correct). In most circumstances, it takes the 'true' side of the branch:
    >>> x = 1 / 0
    >>> if x:
    ...     print "TRUE BRANCH"
    ... else:
    ...     print "FALSE BRANCH"
    ...
    TRUE BRANCH
While the target script is executing, SlopPy outputs two warning logs to the current working directory: slop_verbose.log (human-readable text) and slop_binary.log (pickled Python objects for consumption by other scripts). These logs contain the context of each exception (e.g., stack traceback and local variable values) and an indication of how NA values propagate throughout execution. Please email me if you have questions about how to understand or use these logs.
SlopPy is created and maintained by Philip Guo
Last updated: 2010-09-30