Philip Guo (Phil Guo, Philip J. Guo, Philip Jia Guo, pgbovine)

Why Python is a great language for teaching beginners in introductory programming classes

This article presents reasons why I feel that Python is a great first language to teach students about computer programming (see this related article for more of my views on introductory programming courses). I've developed and refined some of these hunches through my experiences in teaching Python to beginners.

In a nutshell, all of my assertions reinforce my belief that most beginner programmers don't care at all about Computer Science or programming language theory, but rather simply want to get the computer to do what they say with as little overhead and boilerplate code as possible. The less code they need to write, the fewer errors and bugs they might encounter; the fewer errors and bugs they encounter, the less likely they are to become frustrated in those crucial early days and give up on programming altogether.

I think that Python is superior to many other languages for teaching introductory programming because it allows the student to focus less on details such as types, compilers, and writing boilerplate code and more on the algorithms and data structures relevant to performing their intended tasks (similar arguments could be made for languages such as Ruby). Here are some specific Python features that make it great for teaching beginners to program:

No need to declare the types of variables

When a student first learns to program in a statically-typed language such as C++ or Java, usually the first lines that he/she must write are variable declarations like int count or List<String> seq. These lines of code do nothing except add visual clutter and obscure the statements that perform real actions. In the small programs that beginners often write, declarations can take up a quarter to a third of all lines of code! Veteran programmers find declaring variables and types second nature, but declarations can be really annoying to beginners who have to constantly grapple with a compiler yelling at them for type errors. (You don't have to declare types in some statically-typed languages, such as ML and other languages with type inference, but teaching students ML as a first language is like giving them a swift hard kick to the nose.)
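Here is a minimal sketch of what the declaration-free style looks like, written for Python 3. The variable names and sample data are made up for illustration. Every line does real work, and Python keeps track of each value's type on its own:

```python
# No declarations: assigning a value is all it takes to create a variable.
count = 0
seq = ["apple", "banana", "cherry"]

for item in seq:
    count = count + 1

# Python still knows the type of every value at run time.
count_type = type(count).__name__
seq_type = type(seq).__name__

print(count, count_type, seq_type)
```

The types are still there (an int and a list); the student just never has to spell them out.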

Run-time type errors rather than compile-time type errors

Compile-time type errors annoy students to no end. It's better to at least let their program run and do SOMETHING rather than fail to compile and do NOTHING. In production-quality mission-critical code, a compiler with strict static type checking is vital in order to reduce the incidence of run-time type errors, but students aren't writing code for nuclear reactors—they're writing 20-line toy programs. At least let their programs run and have them figure out what went wrong at run-time! Sometimes execution can get pretty far before a run-time type error occurs, and the student is likely to have a decent intuition about the source of the error.
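To make this concrete, here is a small sketch in Python 3 of a toy program with a type mistake in its data. The scores list and variable names are hypothetical. The try/except is only there so the sketch can show how far execution got; a beginner would normally just read the traceback that Python prints:

```python
scores = [120, 140, "165"]  # the last score was typed as a string by mistake

running_total = 0
processed = []
message = None

try:
    for score in scores:
        running_total = running_total + score  # fails on the third item
        processed.append(score)
except TypeError as error:
    # Caught here only so the sketch can report the message afterwards.
    message = str(error)

print("Processed before the error:", processed)
print("Running total so far:", running_total)
print("Error message:", message)
```

The first two scores are added before anything goes wrong, and the error points at the exact operation that failed, which gives the student a concrete place to start looking.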

It's demoralizing to be a beginner who just wants his code to run and do something (even if it doesn't behave perfectly), only to find out that this mysterious tyrant called a 'compiler' refuses to turn his code into an executable program. Ideally, students should be able to run their programs as frequently as possible—the more iterations of the edit-debug cycle they can squeeze in during a programming session, the more motivated they will be to learn.

(The article Strong Typing vs. Strong Testing by Bruce Eckel discusses the advantages of a dynamically-typed language in producing higher-quality code even when used by veteran programmers.)

The print statement is polymorphic and automatically inserts spaces and newlines

Beginner students will often make heavy use of the print statement, both for producing program output and for debugging (it might be too overwhelming to teach them to use the Python debugger from Day 1). Instead of requiring students to learn obscure C-style printf syntax, the Python print statement can correctly print out variables of all types, both primitive and compound. Rather than writing something wacko like:

printf("%s lives at %d %s\n",
       person, street_number, street_name);

the student can write something much more sensible:

print person, "lives at", street_number, street_name

The above Python print statement actually reads like it should print out a line that looks like "John Doe lives at 135 Elm Street" whereas the C printf version only looks sane to, well, C programmers.
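For readers on Python 3, where print became a function rather than a statement, a rough equivalent is sketched below. The sample values are made up, and the io.StringIO buffer is only there to capture one line of output so it can be inspected:

```python
import io

person = "John Doe"
street_number = 135
street_name = "Elm Street"

# print accepts values of any type and separates them with spaces.
print(person, "lives at", street_number, street_name)

# Compound values such as lists and dictionaries print without extra work.
buffer = io.StringIO()
print([120, 140, 165], {"games": 3}, file=buffer)
captured = buffer.getvalue()
```

The only change from the statement form is the pair of parentheses; the automatic spaces and trailing newline work the same way.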

Strings, lists, and dictionaries are built-ins that operate with fairly uniform syntax

You can get very far as a beginner programmer by simply using strings (to store textual data), lists (to store sequences), and dictionaries (to store associations). Python makes these three data structures extremely easy to use with a uniform syntax: e.g., the bracket operator [ ... ] for element access, the for ... in ... loop for iteration, and the if ... in ... test for membership.
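A short sketch of that uniformity, written for Python 3 with made-up sample data, applies the same three pieces of syntax to a string, a list, and a dictionary:

```python
title = "Weight in kilograms"        # string
scores = [120, 140, 165]             # list
ages = {"Alice": 30, "Bob": 25}      # dictionary

# The bracket operator accesses an element in all three.
first_letter = title[0]
first_score = scores[0]
alice_age = ages["Alice"]

# The for ... in ... loop iterates over all three.
letter_count = 0
for letter in title:
    letter_count += 1

total = 0
for score in scores:
    total += score

names = []
for name in ages:                    # iterating over a dictionary yields its keys
    names.append(name)

# The in test checks membership in all three.
has_kilo = "kilo" in title
has_140 = 140 in scores
has_carol = "Carol" in ages
```

Once a student has learned these three forms for one data structure, they transfer to the other two almost unchanged.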

Dynamic typing makes it easy to avoid awkwardly long type declarations for lists and dictionaries, such as something ugly like Map<String, List<Integer>>, and simple initializers like x = [1, 2, 3] avoid the need to make repeated method calls to insert elements. For example, if you want to create a dictionary with keys as strings and values as lists of integers, simply throw those types of data inside; no verbose declarations are necessary:

d = {}
d["my bowling scores"] = [120, 140, 165]

Compare with the equivalent Java 1.5 code:

Map<String, List<Integer>> d = new HashMap<String, List<Integer>>();
List<Integer> lst = new LinkedList<Integer>();
lst.add(120);
lst.add(140);
lst.add(165);
d.put("my bowling scores", lst);

I have nothing against Java (poor Java gets picked on by programming language snobs all the time), but holy cow! Just look at that verbosity! So much more typing to achieve the same goal of slapping some data into a dictionary.

Yeah, yeah, I know that for an experienced programmer working on a large piece of software, a few lines more typing isn't gonna be a killer, but for a beginner, think about which version will likely be easier to comprehend! I can't imagine trying to explain the above Java code to a beginner. Beginners don't care about type declarations, Java generics, parametric polymorphism, or any other weird programming language jargon—they just want their code to work like they intended with as little typing as possible!

Beautiful, clean-looking syntax that's concise enough to not produce visual clutter but not so concise as to cause confusion

Python code is widely regarded as clean and beautiful, and clean-looking code is easier to write, easier to read, and easier to understand. However, Python code isn't so concise as to be confusing, which is what often happens with Perl code. It's a great balance between conciseness and clarity.

Something that initially attracted me to Python was that I felt it was fun to write Python code because it just looked pretty. Although this might sound corny, I feel that the pleasant visceral feelings evoked by looking at Python code can improve the programmer's productivity by making him/her feel that the code is neater and easier to comprehend. If you need to stare at code for several hours each day, it's better if it looks prettier (same with most things in the world).

An interactive command prompt and the ability to dynamically reload modules without restarting an interactive session

Whenever beginner programmers have a question about exactly how a function or language feature works, the best way to answer it is through the scientific method—FIRST TRY IT OUT AND SEE WHAT HAPPENS, and then improve and refine your understanding and re-experiment. The best way to experiment is via an interactive command prompt, because there is no need to save a source file, recompile it, re-edit to get rid of stupid type errors in the code you've just modified, recompile, and then re-execute. Using Python, you can just create data structures in an interactive prompt and test them out in real-time.

Even better, when you want to test out functions you've written in source files (called 'modules'), you can use the reload function from an interactive prompt to test your updates without exiting. This becomes immensely powerful if you need to set up a lot of state before you run your experimental module, and it takes lots of work to set up that state again after shutting down your Python session. A quick edit-debug cycle benefits students greatly: because the cost of experimentation is so low, they feel freer to experiment. And with increased experimentation comes better understanding and an increased motivation to learn further.
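Here is a rough sketch of that reload workflow, written for Python 3, where the function lives at importlib.reload rather than being a built-in as it was in Python 2. The module name scores_demo and its average function are hypothetical, and the script writes the module file itself only to stand in for a student editing it in a text editor:

```python
import importlib
import os
import sys
import tempfile

# Skip bytecode caching so the rewritten source is always re-read in this demo.
sys.dont_write_bytecode = True

# Create a throwaway module file (hypothetical name: scores_demo.py).
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "scores_demo.py")
with open(module_path, "w") as f:
    f.write("def average(nums):\n    return sum(nums) / len(nums)\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()
scores_demo = importlib.import_module("scores_demo")

# State that would be tedious to rebuild after restarting the session.
scores = [120, 140, 165]
before = scores_demo.average(scores)

# "Edit" the module on disk, then reload it without leaving the session.
with open(module_path, "w") as f:
    f.write("def average(nums):\n    return round(sum(nums) / len(nums))\n")

importlib.reload(scores_demo)
after = scores_demo.average(scores)

print(before, after)
```

The scores list survives the reload untouched, so the updated function can be tried on the same data right away.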

Functional operations that return new values rather than mutating existing objects

Many operations in the Python standard library are written in a functional style, meaning that they return a new value rather than mutating the original object. In teaching beginners to program using the Python interactive prompt, I've found this functional property to be extremely useful because it allows one to keep on re-experimenting on the same object without mutating it.

For example, when I was teaching students about string slicing (with the purpose of stripping off surrounding double quotes), I first had them create a string x set to "Weight in kilograms" (with surrounding quotes). In order to demonstrate the string slicing syntax, I had them try out slicing with various indices to see the results for themselves:

>>> print x
"Weight in kilograms"
>>> print x[1:]
Weight in kilograms"
>>> print x[2:]
eight in kilograms"
>>> print x[3:]
ight in kilograms"
>>> print x[:-1]
"Weight in kilograms
>>> print x[:-2]
"Weight in kilogram
>>> print x[:-3]
"Weight in kilogra
>>> print x[1:-1]
Weight in kilograms

The final line achieves the goal of stripping off leading and trailing quotes on x. Because the slicing operation returns a new string every time, the original x is unmodified and can thus be sliced multiple times. If instead the operation had modified x each time, it would be more cumbersome to experiment with slicing using different indices.

One of the best ways for beginners to learn is via direct experimentation, and the functional style facilitates such experimentation.
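The difference can be sketched side by side in Python 3. The sample values are made up; sorted and list.sort are used here as one pair of operations that do the same job in the two styles:

```python
x = '"Weight in kilograms"'

# Functional style: each operation returns a new value and leaves x alone.
stripped = x[1:-1]
shouted = x.upper()

scores = [165, 120, 140]
ordered = sorted(scores)            # returns a new list
scores_after_sorted = list(scores)  # snapshot: still in the original order

# Mutating style: list.sort() changes the list in place and returns None.
sort_result = scores.sort()

print(x, stripped, ordered, scores, sort_result)
```

After the functional calls, x and scores are exactly as they were, so the student can keep experimenting on them; after scores.sort(), the original ordering is gone.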

Comprehensive standard library and plenty of available free third-party libraries

Most introductory programming courses teach students to write math-related programs - e.g., calculating Fibonacci or prime numbers, making a four-function calculator, calculating the average of a set of numbers - because these can be easily implemented without the help of external libraries (after all, mathematical operators are primitive operations in most programming languages because they are primitive operations in most machine languages).

I personally think that programming would be a lot more fun for students if they had a real-life application in mind, such as making scripts to manage their digital photos, writing extensions to their favorite instant messaging client, or automatically downloading their favorite images and photos from the Internet. All of these application domains require extensive use of libraries, and fortunately Python has libraries for all of these purposes. Without the help of libraries, students can never move beyond writing toy programs into creating useful pieces of software that can motivate them to learn further.
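As one small illustration of the photo-management idea, here is a sketch in Python 3 that uses only the standard library. The function name group_photos_by_extension, the list of extensions, and the file names are all hypothetical, and the temporary folder stands in for a student's real photo directory:

```python
import tempfile
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}

def group_photos_by_extension(folder):
    """Return a dict mapping each image extension to a sorted list of file names."""
    groups = {}
    for path in Path(folder).iterdir():
        extension = path.suffix.lower()
        if path.is_file() and extension in IMAGE_EXTENSIONS:
            groups.setdefault(extension, []).append(path.name)
    for names in groups.values():
        names.sort()
    return groups

# Try it on a temporary folder with a few empty placeholder files.
with tempfile.TemporaryDirectory() as folder:
    for name in ["beach.jpg", "party.JPG", "logo.png", "notes.txt"]:
        (Path(folder) / name).touch()
    photo_groups = group_photos_by_extension(folder)

print(photo_groups)
```

Walking a directory, inspecting file names, and creating temporary folders all come from the standard library here, so the student's own code stays short and focused on the task.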

The right operators are overloaded to an appropriate degree

Operator overloading can make code appear more concise and readable, but can also obfuscate code when used recklessly. The extent to which a language allows its operators to be overloaded (and which ones are actually overloaded in the standard library) determines code readability. On one extreme, in C or Java, almost no operators are overloaded, so verbose function/method calls are necessary to perform most operations. On the other extreme, in Perl, many operators are overloaded and have different meanings in different contexts, making the code look cryptic. I believe that Python strikes a good balance in overloading operators like + and [].

For instance, the + operator is a great example of tasteful operator overloading: mathematical addition for numbers and concatenation for strings, lists, and tuples. If you ask beginners what they expect + to do for the aforementioned types, they could easily guess the correct functionality. On the other hand, if you ask beginners what they expect + to do for two dictionaries, then they would be confused. There's no intuitively obvious thing to do to combine two dictionaries. Perhaps you could combine their mappings into one larger dictionary, but if both input dictionaries have entries with the same key, which value would you keep? This lack of a clear meaning for + on dictionaries is probably why it's not overloaded to work on that type.
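The behavior described above can be checked with a short sketch in Python 3. The sample values are made up, and the try/except is only there so the sketch can record the error instead of stopping:

```python
number_sum = 2 + 3                      # mathematical addition
text = "bowling " + "scores"            # string concatenation
combined_list = [120, 140] + [165]      # list concatenation
combined_tuple = (1, 2) + (3,)          # tuple concatenation

# + is not defined for dictionaries, so this raises a TypeError.
dict_error = None
try:
    merged = {"a": 1} + {"a": 2}
except TypeError as error:
    dict_error = str(error)

print(number_sum, text, combined_list, combined_tuple)
print("dict + dict:", dict_error)
```

Python 3.9 later added a separate | operator for merging dictionaries, in which the right-hand dictionary's value wins on duplicate keys, but + itself remains undefined for dictionaries.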

Created: 2007-05-18
Last modified: 2008-07-02