Philip Guo (Phil Guo, Philip J. Guo, Philip Jia Guo, pgbovine)

The power (but enormous setup cost) of scripting

Earlier today ...

My friend asked me whether I knew of any photo editing programs that he could use to easily resize a batch of digital photos. Since he had hundreds of photos, he didn't want to open a photo editing program such as Adobe Photoshop and manually resize and save all of them. The first thought that popped into my head was ImageMagick, a suite of free command-line photo manipulation tools. I knew that ImageMagick contained a program that could resize photos, and that I could just write a script to use that program to resize all the photos in his batch. I told my friend that if he could get ImageMagick installed on his Windows machine, I could quickly write him a short script that could do what he wanted.

Command-line programs versus GUI programs

Before I proceed with my story, I want to make sure that you, the reader, understand the distinction between a GUI program and a command-line program. The vast majority of computer users today interact solely with GUI programs on a daily basis, usually on a Windows-based operating system. GUI stands for Graphical User Interface, and a GUI program is one in which the user interacts with the program by using the mouse to click around on graphical elements on the screen and occasionally using the keyboard to type in some text. Pretty much every program you probably use is a GUI program, from your web browser to your email client to your word processor. Most people think that GUI programs are the only type of computer programs that exist, but there is another type: command-line programs.

A command-line program does not contain any graphical elements that a user can interact with; instead, a user runs a command-line program by typing the program's name along with its inputs into a text window called a terminal (or a 'command prompt') and then hitting the Enter key. The user must then wait until the program is finished, at which point he can enter in another program to launch. (There are exceptions to the non-interactive nature of command-line programs—some command-line programs such as text-based email clients are actually interactive, and you can think of them as GUI programs except that they execute within the confines of a terminal, but in this article, I will only consider non-interactive command-line programs.) At first glance, this non-interactive, command-based technique seems like an esoteric and inefficient way to use computers, but command-line programs have certain strengths over GUI programs (and vice versa, of course); most notably, command-line programs can be easily combined together via user-written scripts in order to make the computer perform repetitive tasks rather than making humans do them. I'll demonstrate with the following example:

The convert program in the ImageMagick suite takes 3 inputs: the name of a source image file, a transformation to apply to that source file, and the name of an output image file. Its purpose is to apply the transformation on the source file to create an output image file. For example, if I have a file named big_bunny.jpg and I want to resize it to 25% of its original size and save the transformed version as little_bunny.jpg, I could type the following into the terminal and press Enter:

  convert big_bunny.jpg -resize 25% little_bunny.jpg

The first word I type, convert, is the name of the program. The rest of the words on this line are the inputs to the convert program: big_bunny.jpg is the name of the source image file, -resize 25% is the transformation to apply to that file, and little_bunny.jpg is the name of the output file that should be created when the program is finished. When I hit Enter and run this program, it will resize the image named by big_bunny.jpg to 25% of its original size and save it as little_bunny.jpg, just as I specified. This program will usually complete its task in less than half a second on a modern-day computer.
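To see how the terminal hands those inputs over to a program, here is a minimal sketch. The function show_inputs is a made-up stand-in for convert (it does not touch any image); all it does is print each input it receives, one per line.

```shell
# show_inputs is a hypothetical stand-in for a command-line program.
# It prints each input (argument) it was given, numbered from 1.
show_inputs() {
  n=1
  for arg in "$@"
  do
    printf 'input %d: %s\n' "$n" "$arg"
    n=$((n + 1))
  done
}

# The same inputs that were given to convert above.
show_inputs big_bunny.jpg -resize 25% little_bunny.jpg
```

Running this prints four lines, because the terminal splits the line at each space. The convert program itself reads -resize and 25% together as the single transformation described above.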

So this is actually kind of cool, because in order to do the same thing with a GUI program like Adobe Photoshop, I would need to perform the following steps:

  1. Open Photoshop, which takes anywhere from 10 to 30 seconds to fully launch on my computer.

  2. Open a file by using the mouse to click on the "File" menu, selecting "Open..." and browsing around until you find the file you want to open: big_bunny.jpg. This can take another 15-30 seconds.

  3. Resize it to 25% by using the mouse to click on the "Image" menu, selecting "Image Size...", choosing "percent" in the pop-up dialog box, and typing in "25" into the text box, then finally clicking on "OK". Phew, that was at least 20 seconds of work.

  4. Save the resized file by using the mouse to click on the "File" menu, selecting "Save As...", and typing in a new filename, little_bunny.jpg, and clicking "OK". 15-30 seconds more ... but you're finally done!

In total, it could take anywhere from 1 to 2 minutes to resize one photo, depending on how fast you are with the mouse. The reason it takes so long is that Photoshop can do much more than simply resizing photos, so it takes time to navigate around its menus to select just the photo resizing functionality. In contrast, it took approximately 10 seconds to type in the following command

  convert big_bunny.jpg -resize 25% little_bunny.jpg

hit the Enter key, and wait half a second for the program to complete. The advantage of the command-line program over the GUI program in this application is that it is much faster at accomplishing a single specific non-interactive task such as resizing an image. On the flip side, the GUI program is far better for interactive tasks such as making line drawings on an image with the mouse. (Try doing THAT by just typing in textual commands in a terminal!)

I hope I've convinced you that, for at least this particular image opening, resizing, and saving task, using a command-line program saves you some time over using a GUI program (command-line takes approximately 10 seconds whereas GUI takes approximately 90 seconds). But if you only want to resize one image, who the heck cares? 90 seconds isn't unbearably long. And if you already have Photoshop open, it might take you only 60 seconds to resize each subsequent image. It might not be worth the hassle to learn how to use command-line programs; you might as well just stick with GUI programs like Photoshop for these tasks. After all, it's much more intuitive and less error-prone to select menu items using the mouse than to remember to type in a sequence of commands in the terminal in EXACTLY the right order and manner. Hmmm, there must be some other advantage of command-line programs that makes their lack of user friendliness worth it ...

The power of scripting

Now, back to my friend's problem: He didn't just have one image that he wanted to resize; he had HUNDREDS. How long would it take to manually open each file in Photoshop, click around the menus to find the resize dialog box, type in "25" for 25%, click "OK", and save the modified file under a new name? It's simple multiplication. Let's say that you're a real speed demon and can click around so fast that you can open, resize, and save one image in only 30 seconds. For 100 photos, this would take 50 minutes (3,000 seconds), for 500 photos, this would take over 4 hours (15,000 seconds), and for 1,000 photos, this would take over 8 hours (30,000 seconds) of SITTING IN FRONT OF THE COMPUTER REPEATING THE SAME SERIES OF MUNDANE MOUSE AND KEYBOARD ACTIONS NON-STOP! Wow, no wonder he didn't want to do all of this by hand in Photoshop!
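The multiplication above can be double-checked with a few lines of shell arithmetic. This is only a sketch: the helper name manual_seconds is made up for illustration, and the 30-second figure is the per-photo estimate from the paragraph above.

```shell
# manual_seconds is a hypothetical helper: total seconds needed to
# process a number of photos by hand at a fixed time per photo.
manual_seconds() {
  photos=$1
  per_photo=$2
  echo $((photos * per_photo))
}

# 30 seconds per photo is the "speed demon" estimate from the text.
for count in 100 500 1000
do
  total=$(manual_seconds "$count" 30)
  # Integer division: whole minutes, rounded down.
  echo "$count photos: $total seconds (about $((total / 60)) minutes)"
done
```

Changing the second input from 30 to some other per-photo time gives the estimate for a slower or faster worker.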

So what would have happened if my friend had instead used the command-line program convert to resize his images? Wouldn't he have to do essentially the same series of hundreds of repetitive actions, once for each image to be resized, except that instead of clicking around using the mouse, he would need to type out everything using the keyboard in the terminal? Let's say he had a directory with all sorts of animal pictures. Wouldn't he need to type the following to resize all of his photos, one line for each photo?

  convert big_bunny.jpg -resize 25% little_bunny.jpg
  convert big_horse.jpg -resize 25% little_horse.jpg
  convert big_sheep.jpg -resize 25% little_sheep.jpg
  convert big_goat.jpg -resize 25% little_goat.jpg
  ...

Let's say that each line takes him 10 seconds to type. For 100 photos, this would take almost 17 minutes (1,000 seconds), for 500 photos, this would take about 1.4 hours (5,000 seconds), and for 1,000 photos, this would take almost 3 hours (10,000 seconds). Sure, it's about 3 times faster than using a GUI program like Photoshop because he doesn't have to navigate through menus using the mouse, but it's still mind-numbingly tedious and repetitive enough that nobody would want to do it! Hmm, but look at those commands again, and the astute reader will recognize that even though each of the calls to convert above looks different, they all follow a similar pattern ... how could we take advantage of that textual pattern?

OK, now time for the kicker: you don't need to type out every single command in the terminal for hours on end. You can write a small program called a script to automatically do the typing for you! And herein lies the true power of command-line programs: the ability to write scripts that connect these programs together in order to automate mundane tasks, something that is almost impossible to do in GUI programs due to their inherently interactive nature.

Here is the script that I wrote for my friend to resize his batch of photos:

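  # Create the output sub-directory if it doesn't already exist.
  if [ ! -d resized/ ]
  then
    mkdir resized
  fi

  # Resize every JPEG in the current directory; the quotes keep
  # filenames that contain spaces intact.
  for img in *.jpg
  do
    convert "$img" -resize 25% "resized/$img"
  done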

In English, this script does the following: creates a sub-directory named resized if it doesn't already exist, then calls the convert program on all JPEG images in the current directory (files with the .jpg extension), resizing each one to 25% of its original size, and saving the result as the same filename except in the resized sub-directory.

This script was written in a programming language called Bash, but the choice of language is irrelevant ... the key power of scripting comes from the fact that it can leverage the power of existing command-line programs (in this case, the convert program) to automate mundane, repetitive tasks. Let's focus on the interesting part of this script, which is called a 'for loop':

  for img in *.jpg
  do
    # The quotes keep filenames that contain spaces intact.
    convert "$img" -resize 25% "resized/$img"
  done

What this block of code tells the computer to do is to find all the files in the current directory with the .jpg extension (these are JPEG image files), and for each file found, execute the program sandwiched between the do and done lines, with the variable $img set to the name of a JPEG image file. The program happens to be convert, with the variable $img set to the source filename and resized/$img set to the output filename. Thus, if you run this script in a directory with 5 JPEG files called one.jpg, two.jpg, three.jpg, four.jpg, five.jpg, then this 'for loop' will execute the convert program 5 times, once with $img set to the name of each file. This is equivalent to MANUALLY typing in the following 5 commands in the terminal:

  convert one.jpg -resize 25% resized/one.jpg
  convert two.jpg -resize 25% resized/two.jpg
  convert three.jpg -resize 25% resized/three.jpg
  convert four.jpg -resize 25% resized/four.jpg
  convert five.jpg -resize 25% resized/five.jpg

Cool, eh? The 'for loop' runs 5 times because there are 5 JPEG files, and each time it runs, it substitutes the name of one of the files where the variable $img appears in the code. Fine, that just saved me 50 seconds or so of typing. But here comes the real clincher: this script works the same way regardless of how many images are in the directory. So if you have 100, 500, 1,000 or even more images in the directory, invoking the SAME script is equivalent to typing all 100, 500, 1,000 or more commands by hand, respectively! You can write the script once, which takes about 5 minutes, and re-use it to save yourself endless hours of mundane typing. Your computer will never get tired or bored from executing all of these mundane, almost-identical tasks, because that is what it was designed to do! That's the real power of scripting :)
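One cautious habit before letting a loop like this loose on hundreds of files is a dry run: put echo in front of the command so that the loop only prints what it would do. The sketch below tries this in a throwaway directory containing five empty placeholder files. The helper name preview_resize and the placeholder files are assumptions made for the demonstration; convert never actually runs, so ImageMagick is not needed to try it.

```shell
# preview_resize is a hypothetical helper: it prints the convert
# command for every .jpg file in the current directory without
# running it (a "dry run").
preview_resize() {
  for img in *.jpg
  do
    echo convert "$img" -resize 25% "resized/$img"
  done
}

# Work in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Five empty placeholder files standing in for real photos.
touch one.jpg two.jpg three.jpg four.jpg five.jpg

preview_resize
```

This prints five convert commands, one per file. The shell visits the files in alphabetical order, so the command for five.jpg comes first. Once the printed commands look right, removing the word echo makes the loop run them for real.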

It took me 5 minutes to write and test that script, and when my friend runs it on his directory of 500 photos, it will save him over 4 hours of mundane clicking and typing to perform the equivalent resizing task on a GUI program such as Photoshop. Now THAT is an enormous improvement in productivity. He can actually spend those 4 hours doing something fun with his life rather than mimicking a machine. He can let the computer do all the boring work!
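The script in this article keeps each filename and puts the resized copies in a sub-directory. If you would rather follow the big_/little_ naming pattern from the earlier examples, the shell can rewrite the names for you. The sketch below is a dry run that only prints the commands. The helper name little_name is hypothetical, while ${1#big_} is standard shell syntax that removes a leading big_ from a name.

```shell
# little_name is a hypothetical helper: it turns big_X.jpg into
# little_X.jpg. ${1#big_} removes a leading "big_" from the name.
little_name() {
  echo "little_${1#big_}"
}

# Dry run over the example names from earlier in the article:
# echo prints each command instead of executing it.
for img in big_bunny.jpg big_horse.jpg big_sheep.jpg big_goat.jpg
do
  echo convert "$img" -resize 25% "$(little_name "$img")"
done
```

Replacing the list of names with big_*.jpg and removing echo would, under the same assumptions, run the real commands on every matching file in the current directory.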

The enormous setup cost of scripting

Sounds great! So why doesn't everyone use command-line programs and write scripts to automate these mundane tasks and save themselves countless hours of time? Well, I think the reason is that there is an ENORMOUS setup cost that must be paid before people can gain the productivity benefits of scripting. Here are some components of this setup cost:

  • First of all, you need to install a command-line environment. If you are like most computer users, you use Windows, which doesn't have a very good built-in environment. You will most likely need to download and install Cygwin. This isn't easy if you're not too computer-savvy. In fact, if you tell someone that he/she needs to install Cygwin as a prerequisite to automating a task such as photo resizing, that's probably sufficiently difficult for the person to simply give up.

  • Next, you need to familiarize yourself with how to maneuver in a text-only command-line environment by typing commands for changing directories, listing the contents of directories, creating new directories, copying and moving files, etc. These skills are far less intuitive than simply clicking around a GUI with a mouse and dragging-and-dropping icons of files to organize your hard drive. After all, GUIs were invented largely to help facilitate people's interaction with directories and files on their computers.

  • You need to figure out how to install command-line programs such as the ImageMagick suite (which contains the convert program) that I used in this article. Command-line programs are usually more difficult to install than GUI programs that come pre-packaged with user-friendly graphical 'Click here to install me' installers. At best, you will need to know how to use a software package management system (e.g., Cygwin, apt-get, Fink) to directly install the program binaries, and at worst, you will need to manually compile the program from source code (omg, wtf???).

  • You need to learn at least the basics of one programming language, probably a scripting language such as Bash, Perl, or Python, and become familiar with concepts from computer programming, which most people are not going to invest the effort to learn.

  • The final kick in the face comes from the difficulty of learning just what the heck you need to type to get the command-line program to do exactly what you want. If you know the exact syntax of a particular command and its options, it's easy to type, but if you don't, then it is very difficult to just 'figure it out' by educated guessing. With command-line programs, you rely on 'recall' of an invisible list of command names rather than 'recognition' of visible options on the screen, which is much more difficult for beginners. On top of the plethora of available command-line options, some options only take effect when used in conjunction with other options, and these dependencies are often not well-documented, causing no end of frustration for confused users wondering why their programs didn't work as expected.

  • In contrast, with a GUI program, the user can visually see what options are available and use the mouse to click around and explore in a 2-D visual space, often being able to make educated guesses as to how to achieve certain tasks. For example, if you are trying to use the convert program and forget the exact syntax of the -resize 25% portion, it's VERY hard to simply guess that exact wording. If you mis-type by just one character and instead write resize 25% (without the leading hyphen), then the program simply won't work. In contrast, with a GUI program, you don't need to remember the exact name of the menu items. You can visually scan all the choices, find one named "Image", click on it, and find an entry called "Image size ..." that suits your needs. You don't need to remember the exact name "Image size ...". It could be named "Size of image ..." or "Image Size" or "Image->Size" and you would still be fine. No danger of mis-typing. That's the main usability advantage of GUI programs.
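Part of this setup cost can at least be checked up front. As a small sketch, the standard shell built-in command -v reports whether a program can be found on your system. The helper name have_program is hypothetical, and the two programs checked are simply ones mentioned in this article.

```shell
# have_program is a hypothetical helper: it succeeds (exit status 0)
# if the named program can be found, and fails otherwise.
have_program() {
  command -v "$1" >/dev/null 2>&1
}

# Check for the shell itself and for a program named convert.
for prog in sh convert
do
  if have_program "$prog"
  then
    echo "$prog: found"
  else
    echo "$prog: not found, needs to be installed"
  fi
done
```

One caveat: this only checks the name. Finding a program called convert does not prove that it is the one from ImageMagick, since other software can install a program with the same name.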

No silver bullet

All that I have written so far should be obvious and unsurprising for anyone with a decent amount of programming experience. Scripting is a powerful tool to automate mundane tasks. Duh! Programmers have already paid the aforementioned setup cost by familiarizing themselves with a command-line environment and programming over years of work experience. But the challenge is how to lower this enormous setup cost that needs to be paid by ordinary computer users before they can take advantage of the enormous productivity benefits of scripting. There is ongoing research in topics such as 'end-user programming' that attempts to bridge this gap between obscure command-line programs and user-friendly GUI programs.

Unfortunately, I still see no easy solution in sight, no silver bullet to shatter this barrier to entry. I don't expect my parents or non-programmer friends to be able to easily automate mundane tasks on the computer, and as far as I can tell, they are still going to have to spend hours repeatedly clicking and dragging until they get fed up with their computers and punch a hole through the monitor. Even more pessimistically, I don't think that most people are even aware that these tasks can be automated, so they don't even know that they can ask experts for help, simply because they have not been exposed to ideas from computer programming.

My hope for the future of programming education

I personally didn't write my first scripts until my senior year of college, four years after I had started programming for school and work, even though I was a Computer Science major! And why not? Because nobody taught me about the power of scripting in any class I had taken, so I was simply unaware of the possibilities and thus unable and unwilling to pay that enormous setup cost. Most people, myself included, learn to program in an isolated, sanitized academic environment where they use programming to solve contrived toy problems presented for the sake of pedagogy. Students learn how to calculate Fibonacci numbers, to sort lists of names, and to calculate simple statistics (all great exercises for learning about algorithms and programming techniques, but totally useless in real life), but they don't learn to use scripting to perform useful computing tasks that they could apply outside of the classroom. In other words, students learn programming solely as a professional skill to use in the office, not as a means of assisting them with their own computing tasks at home. One of my hopes for higher education is that programming courses start teaching scripting as part of the curriculum, with an emphasis on real-world productivity-enhancing applications such as the solution to my friend's problem: how to quickly resize a large batch of hundreds of photos.

Created: 2007-05-13
Last modified: 2007-06-17