Philip Guo (Phil Guo, Philip J. Guo, Philip Jia Guo, pgbovine)

Java and Software Engineering Notes

For MIT 6.170 - Laboratory in Software Engineering
Summary
For the Spring 2006 semester, I was a TA for 6.170, the software engineering laboratory course at MIT. This is a collection of notes about software engineering concepts and Java programming that I have compiled from emails and feedback that I have sent to students in my recitation.

General Java 1.5 Coding Style Tips

  • Always check pre-conditions if it's not too much of a hassle to do so, and abort with an IllegalArgumentException (or some other kind of exception) if a pre-condition is violated. This strategy, called 'failing fast', is useful for preventing errors from propagating. For example:
    /** @requires rt.getExpt() >= 0
        @effects Constructs a new Poly equal to "rt".
        If rt.getCoeff() is zero, constructs a "0" polynomial.
    */
    public RatPoly(RatTerm rt) {
      if (!(rt.getExpt() >= 0)) {
        throw new IllegalArgumentException();
      }
      ... your real code ...
    }
    
    Notice how I just surrounded the pre-condition (@requires) with parentheses and a NOT symbol (!) instead of messing around with >= and <, etc... When you have a complex boolean expression, the simplest and clearest way to negate it is to simply wrap it in parens and put a ! in front of it. Some of you had slight bugs caused by not inverting complex boolean conditions correctly. Those could have easily been avoided by using the technique I just described.
  • Always make explicit checks for special cases even if you know that the right thing is going to happen. It's better to have conditions stated in code than simply in comments, because code compiles but comments don't. :) For example:

    /** Division operation.
        @requires arg != null
        @return a RatTerm equal to (this / arg).
         If arg is zero, or if either argument is NaN,
         then returns NaN.
    */
    public RatTerm div(RatTerm arg) {
      if (isNaN() || arg.isNaN() || arg.isZero()) {
        return RatTerm.NaN;
      }
      else {
        return new RatTerm(coeff.div(arg.coeff), expt-arg.expt);
      }
    }
    
    It actually turns out that this code will do the right thing even if you don't check for the special NaN and zero cases, but it's not immediately obvious why (until you look at the specs. for RatNum). It's so easy to make those special case checks, and they make the code's intentions much clearer.
  • Don't use == and != to compare objects. Always use the equals() method. == returns true iff two variables refer to the SAME object in memory. Often, this is not what you want. For example:

      if (rt.getCoeff() != RatNum.ZERO)
    
    tests if a coefficient of a RatTerm is the SAME object in memory as RatNum.ZERO. It's entirely possible to construct another RatNum object which is equivalent to RatNum.ZERO but resides elsewhere in memory, in which case this code doesn't work as intended. Instead, it is preferred to use the equals() method, although it's slightly more verbose:
      if (!(rt.getCoeff().equals(RatNum.ZERO)))
    
    In the worst case, if a particular class doesn't override equals(), it will act just like the == operator, so there's no reason to use == and != on objects.
  • Use the foreach loop whenever possible, which should be whenever you want to iterate over all elements in a list but not to mutate the list itself (i.e., by adding or removing elements). It makes your code much cleaner.

    Old-school for loop:
    
    for (int i = 0; i < terms.size(); i++) {
      RatTerm t = terms.get(i);
      ... use t to your heart's delight ...
    }
    
    New Java 1.5 foreach loop:
    
    for (RatTerm t : terms) {
      ... use t to your heart's delight ...
    }
    
    With the foreach loop, you don't have to worry about off-by-1 errors or other common mistakes with traditional for loops.
  • Don't duplicate code. Whenever you catch yourself copying and pasting an expression several times, think about a way to re-factor your code to avoid duplication (e.g., by placing the values of these expressions into variables). Duplicated code is often a pain to maintain because any bug fix to one copy must be appropriately applied to all other copies. Encapsulate common functionality in private helper methods and make calls to those.

  • Don't use clone(). If you don't know what it is, good; otherwise, read Effective Java to see why you shouldn't use it.
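
To make the == versus equals() tip above concrete, here is a minimal sketch. Frac is a hypothetical stand-in for RatNum that I made up for this example (the real RatNum has more to it); the only point is that two objects can hold the same value while living at different places in memory.

```java
// Hypothetical stand-in for RatNum; only the parts needed for this demo.
final class Frac {
  static final Frac ZERO = new Frac(0, 1);

  private final int numer;
  private final int denom;

  // Assumption for this sketch: denom > 0 and the fraction is already reduced.
  Frac(int numer, int denom) {
    if (!(denom > 0)) {
      throw new IllegalArgumentException("denom must be positive");
    }
    this.numer = numer;
    this.denom = denom;
  }

  @Override
  public boolean equals(Object obj) {
    if (!(obj instanceof Frac)) {
      return false;
    }
    Frac other = (Frac) obj;
    return numer == other.numer && denom == other.denom;
  }

  // Equal objects must have equal hash codes, so override both together.
  @Override
  public int hashCode() {
    return 31 * numer + denom;
  }
}

class EqualsDemo {
  public static void main(String[] args) {
    Frac z = new Frac(0, 1);                 // same value as ZERO, different object
    System.out.println(z == Frac.ZERO);      // false: not the same object in memory
    System.out.println(z.equals(Frac.ZERO)); // true: same value
  }
}
```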

Java 1.5 Standard Library Tips

Don't re-invent the wheel. Java has a great standard library that provides tons of functionality. Many of you implemented your own versions of common operations, especially on lists. Before you do that, read the Java 1.5 Standard Library API for the classes of interest (and their super-classes); chances are, there is some method or combination of methods that accomplishes what you want to do. Well-written code should contain lots of calls to the Java Standard Library instead of to your own hacked-up versions of methods.

  • Especially don't re-invent the wheel for list operations. Read the Java List API very carefully! It is probably the data structure that you will end up using the most in Java programming. I have seen many students re-implement basic list operations (e.g., copying, appending, removing, finding an element in a list, seeing if a list contains an element) from scratch when there are simple method calls available to accomplish the same task. Before you write any private helper method that manipulates lists, read that API and make sure that there is no method or simple combination of several methods that does the same task.

  • On a related note, the Java Collections class contains many useful utility methods that can be used to manipulate lists and other collections (including sort, max, min, binary search, and Collections.unmodifiableList() for preventing rep. exposure through iterators). Get to know that API intimately as well.

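  • Use the List set() method to replace elements of a list in place. Reassigning the loop variable inside a foreach loop changes only that local variable, and a plain Iterator offers no way to replace an element (although a ListIterator does provide its own set() method).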

  • When implementing equals() for a class that contains a List of elements, many of you manually iterated through the list in both objects being compared and called equals() on each element. You can accomplish the same thing (using much less code) by simply calling equals() on the two lists being compared. The specs. for the List interface's equals() method state that it does an iteration and element-wise comparison. (You can do a similar thing for hashCode() as well.)
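
Here is a small sketch that exercises a few of the library calls mentioned in the tips above: set() for in-place updates, the copy constructor, equals() and hashCode() on whole lists, and a couple of Collections utilities. The class and method names (ListTipsDemo, doubleAll) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ListTipsDemo {
  // Doubles every element in place using List.set().
  static void doubleAll(List<Integer> nums) {
    for (int i = 0; i < nums.size(); i++) {
      nums.set(i, nums.get(i) * 2);
    }
  }

  public static void main(String[] args) {
    List<Integer> a = new ArrayList<Integer>(Arrays.asList(3, 1, 2));

    // Reassigning the foreach variable changes only the local variable.
    for (Integer n : a) {
      n = n * 2;
    }
    System.out.println(a);                  // [3, 1, 2]

    doubleAll(a);
    System.out.println(a);                  // [6, 2, 4]

    // Copying, comparing, and hashing without any hand-written loops.
    List<Integer> b = new ArrayList<Integer>(a);
    System.out.println(a.equals(b));                  // true
    System.out.println(a.hashCode() == b.hashCode()); // true

    Collections.sort(b);
    System.out.println(b);                  // [2, 4, 6]
    System.out.println(Collections.max(a)); // 6
    System.out.println(b.contains(4));      // true
  }
}
```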

Abstraction Functions

Your abstraction function must explicitly mention ALL of your specfields! It can't just be some informal description of what the class is supposed to represent. Informal descriptions belong in regular comments or in the class overview, not in abstraction functions.

The easiest two-step way to write an abstraction function:

  1. Write down "AF(r) = ADT x such that:", but replace "ADT" with some description of your ADT.
  2. For every specfield, write one line that maps some combination of your object r's concrete fields (e.g., r.foo, r.bar, etc...) to that specfield's value. That is, how could I combine concrete fields of r to get the specfield?

That's it! e.g., for a class Money, let's say:

@specfield amount : real

private int dollars;
private int cents;

Then here is an example of an abstraction function:

AF(r) = Money m such that:
  m.amount = r.dollars + (r.cents / 100)
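
As a rough sketch of how this abstraction function might sit in actual code, here is one possible Money class. The constructor precondition and the getAmount() observer are my own assumptions for the example; they are not part of the spec above.

```java
/**
 * A sketch of a Money class.
 *
 * @specfield amount : real
 *
 * Abstraction function:
 *   AF(r) = Money m such that:
 *     m.amount = r.dollars + (r.cents / 100)
 */
final class Money {
  private final int dollars;
  private final int cents;

  /** @requires dollars >= 0 && 0 <= cents && cents < 100 (assumed for this sketch) */
  Money(int dollars, int cents) {
    if (!(dollars >= 0 && 0 <= cents && cents < 100)) {
      throw new IllegalArgumentException();
    }
    this.dollars = dollars;
    this.cents = cents;
  }

  /** @return this.amount */
  double getAmount() {
    // Divide by 100.0, not 100: Java integer division would throw away the cents.
    return dollars + (cents / 100.0);
  }
}

class MoneyDemo {
  public static void main(String[] args) {
    System.out.println(new Money(3, 25).getAmount()); // 3.25
  }
}
```

Note that the division in the abstraction function is over the reals. A literal translation into Java with int operands would silently compute r.cents / 100 as zero.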

Even if you write your abstraction function in English, make sure to properly disambiguate between concrete fields and specfields. Remember that the abstraction function provides a mapping between combinations of concrete fields to all specfields. Don't leave out any specfields in your abstraction function!

e.g., for GeoPoint, the specfields are:

@specfield  latitude  : real
@specfield  longitude : real

and the concrete fields are:

private int latitude;
private int longitude;

This is an ambiguous and vague abstraction function:

 latitude represents latitude measured in degrees;
 longitude represents longitude measured in degrees

It is not clear which latitude or longitude one is referring to, and it's not even clear that it's a function. Here is a better abstraction function:

AF(r) = some point p on the Earth such that:
  r.latitude represents p.latitude measured in millionths of degrees
  r.longitude represents p.longitude measured in millionths of degrees

Even though this is also written in pseudo-English, it is now clearly a function from a concrete representation to an abstract state.

Representation Invariants

Be very nitpicky when writing your rep. invariants. Here is my suggested way for doing it:

Examine EVERY concrete field and try to pick out what properties MUST be true about each field and its relationships with other fields. For fields that are not primitives, do they have to be non-null? If so, that's part of the rep. invariant. You don't need to repeat the rep. invariant of a field whose type already has a rep. invariant written for it (ahh, the power of induction).

You should write down rep. invariants even for states that are disallowed by the preconditions of the public constructors. For instance, recall GeoSegment:

public class GeoSegment {
  ...
  private final String name;
  private final GeoPoint p1;
  private final GeoPoint p2;

  /**
   * @requires name != null && p1 != null && p2 != null
   * ...
   **/
  public GeoSegment(String name, GeoPoint p1, GeoPoint p2) {
    ...
  }
}

A reasonable rep. invariant is that name, p1, and p2 cannot be null, but some of you purposely didn't write that down because you figured that any valid GeoSegment must be created with those fields non-null because that's the precondition (@requires) of its only public constructor. However, simply stating in comments that something cannot be null doesn't mean that the code will honor that promise. The whole point of a rep. invariant and accompanying checkRep() is to catch errors in your code. What if your constructor had a bug in it that didn't properly initialize name, p1, and p2? Then the rep. invariant should be violated, and the checkRep() should fail.
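
Here is a minimal sketch of what that could look like for GeoSegment. The GeoPoint class below is an empty stand-in so that the example compiles on its own, and throwing IllegalStateException from checkRep() is just my choice for the sketch. In the demo the null comes from a caller that violates the precondition, but a bug inside the constructor would be caught in exactly the same way.

```java
// Empty stand-in for the real GeoPoint, just so this sketch compiles.
final class GeoPoint {
}

final class GeoSegment {
  private final String name;
  private final GeoPoint p1;
  private final GeoPoint p2;

  // Rep invariant: name != null && p1 != null && p2 != null

  /** @requires name != null && p1 != null && p2 != null */
  GeoSegment(String name, GeoPoint p1, GeoPoint p2) {
    this.name = name;
    this.p1 = p1;
    this.p2 = p2;
    checkRep(); // catches a bad rep. no matter how it came about
  }

  String getName() {
    checkRep();
    return name;
  }

  /** Throws IllegalStateException if the rep. invariant is violated. */
  private void checkRep() {
    if (name == null) {
      throw new IllegalStateException("name is null");
    }
    if (p1 == null) {
      throw new IllegalStateException("p1 is null");
    }
    if (p2 == null) {
      throw new IllegalStateException("p2 is null");
    }
  }
}

class CheckRepDemo {
  public static void main(String[] args) {
    try {
      new GeoSegment("Mass Ave", new GeoPoint(), null);
    } catch (IllegalStateException e) {
      System.out.println("checkRep caught: " + e.getMessage());
    }
  }
}
```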

Tips for writing and using the checkRep() method to check rep. invariants:

  • You should place calls to checkRep() at the beginning and end of every public method and at the end of every public constructor. Many of you didn't want to place the calls at the beginning of (sometimes trivial) observer methods, but what if the rep. of your object was messed up prior to calling an observer? It's always good to be able to catch errors early.
  • If you are afraid that checkRep() calls may slow down your program significantly, a good idea is to declare a static final boolean field and wrap all checkRep() calls in if statements guarded by that field. For example:

      static final boolean debug = true;
      ...
      if (debug) {checkRep();}
      ...
      if (debug) {checkRep();}
      ...
    
    Now whenever you want to demo your code in some performance-critical context, simply set the value of the debug field to false. Not only will checkRep() not be called, but the Java compiler can omit the guarded if statement from the compiled code altogether, because debug is a compile-time constant: it is final and initialized to false, so it can never become true. (This safely and elegantly simulates the popular #ifdef/#endif pre-processor conditional compilation construct familiar to C programmers.)
  • Don't simply ignore checking a particular rep. invariant inside of your checkRep() method just because it might be too inefficient (chances are, you haven't actually tried to profile your program so you don't know whether there will actually be a noticeable slowdown). The whole point of checkRep() is to enforce ALL of your rep. invariants, not just some of them. If your program really grinds to a halt, you can disable the checkRep() calls by commenting them out or guarding them with a debug field set to false.
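
Putting the checkRep() tips above together, here is a sketch of a small class whose rep. invariant takes a full pass over the data to check. SortedInts is a made-up class for illustration. Every public method is bracketed by guarded checkRep() calls, and checkRep() enforces the whole invariant, including the linear-time ordering check.

```java
import java.util.ArrayList;
import java.util.List;

final class SortedInts {
  // Set to false to disable all checkRep() calls.
  private static final boolean debug = true;

  // Rep invariant: elems != null, no element is null,
  // and elements appear in ascending order.
  private final List<Integer> elems = new ArrayList<Integer>();

  SortedInts() {
    if (debug) {checkRep();}
  }

  /** @effects inserts x, keeping the elements in ascending order */
  void add(int x) {
    if (debug) {checkRep();}
    int i = 0;
    while (i < elems.size() && elems.get(i) <= x) {
      i++;
    }
    elems.add(i, x);
    if (debug) {checkRep();}
  }

  /** @requires 0 <= index && index < size() */
  int get(int index) {
    if (debug) {checkRep();}
    int result = elems.get(index);
    if (debug) {checkRep();}
    return result;
  }

  int size() {
    if (debug) {checkRep();}
    int n = elems.size();
    if (debug) {checkRep();}
    return n;
  }

  /** Checks ALL parts of the rep. invariant, even the O(n) one. */
  private void checkRep() {
    if (elems == null) {
      throw new IllegalStateException("elems is null");
    }
    for (int i = 0; i < elems.size(); i++) {
      if (elems.get(i) == null) {
        throw new IllegalStateException("null element at index " + i);
      }
      if (i > 0 && elems.get(i - 1) > elems.get(i)) {
        throw new IllegalStateException("elements out of order at index " + i);
      }
    }
  }
}

class SortedIntsDemo {
  public static void main(String[] args) {
    SortedInts s = new SortedInts();
    s.add(3);
    s.add(1);
    s.add(2);
    System.out.println(s.get(0) + " " + s.get(1) + " " + s.get(2)); // 1 2 3
  }
}
```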

Avoiding Representation Exposure

  • If all of your fields are private final primitives, then you've got no problems with rep. exposure :)
  • If your fields are private final references to immutable objects, then you still don't have problems with rep. exposure :) You don't need to worry about making defensive copies of immutable objects.

  • You need to be careful about rep. exposure whenever you have any fields that are mutable object references (private and final only protect the references, not the actual objects, from being mutated). Make a defensive copy of the mutable object whenever you take an instance in from the outside world in a constructor and whenever you return it to the outside world in an observer method.

  • If you call iterator() to return an iterator to a private field that is a List of some sort (even a list that holds immutable objects), then you have most likely caused rep. exposure because a party outside of your class can now call remove() using that iterator to remove elements from your private list. One solution is to simply make a copy of the list and return an iterator to that copy. The preferred Java way to do it is to use the Collections.unmodifiableList() method to create an unmodifiable view of the list. When an iterator is obtained for that view, the remove() method will be disabled so there is no way to mutate the original list through that view.
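
The last two tips above can be sketched as follows. Roster is a made-up class for illustration: it makes a defensive copy of the incoming list in its constructor and hands out iterators only over an unmodifiable view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

final class Roster {
  private final List<String> names;

  /** @requires names != null */
  Roster(List<String> names) {
    // Defensive copy: later changes to the caller's list don't affect us.
    this.names = new ArrayList<String>(names);
  }

  /** @return an iterator whose remove() cannot mutate this roster */
  Iterator<String> iterator() {
    return Collections.unmodifiableList(names).iterator();
  }

  int size() {
    return names.size();
  }
}

class RepExposureDemo {
  public static void main(String[] args) {
    List<String> input = new ArrayList<String>();
    input.add("Alice");
    input.add("Bob");

    Roster r = new Roster(input);
    input.clear();                // mutates only the caller's list
    System.out.println(r.size()); // 2

    Iterator<String> it = r.iterator();
    it.next();
    try {
      it.remove();                // disabled on the unmodifiable view
    } catch (UnsupportedOperationException e) {
      System.out.println("remove() is disabled");
    }
    System.out.println(r.size()); // 2
  }
}
```

Strings are immutable, so copying the list itself is enough here. If the elements were mutable, you would have to think about copying them as well.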

Code Efficiency

The main focus of 6.170 (and of most real-world software engineering) is on building reliable, maintainable, robust software systems, not squeezing out every ounce of performance possible. As my fellow TA Matthew puts it, computers double in speed every 18 months, but humans don't ever double in speed. What he means is that programmer time is much more valuable than machine time. If you have to make a piece of code more complicated and thus more likely to be buggy just to (hopefully) get some performance gains, then it's not worth it.

The only time you should think about optimizations is after you've already built up a working system along with a test suite. Then you can profile your system to see where it's spending most of its time and direct your optimization efforts to those areas. You will have the advantage of both having an already-working version and also a test suite to serve as your safety net. That being said, in later problem sets, you will sometimes have to consider efficiency, but in those cases we will always tell you that speed is one of the requirements. Otherwise, you should strive to write the clearest code possible.

A general rule of thumb is to never make a design decision that makes your code more complicated for the sole purpose of improving run-time performance.

Created: 2006-03-07
Last modified: 2006-03-07