MindView Inc.
[ Viewing Hints ] [ 2nd Edition ] [ Free Newsletter ]
[ Seminars ] [ Seminars on CD ROM ] [ Consulting ]

Thinking in Java, 1st edition

©1998 by Bruce Eckel

[ Previous Chapter ] [ Short TOC ] [ Table of Contents ] [ Index ] [ Next Chapter ]

8: Holding
your objects

It’s a fairly simple program that has only a fixed quantity of objects with known lifetimes.

In general, your programs will always be creating new objects based on some criteria that will be known only at the time the program is running. You won’t know until run-time the quantity or even the exact type of the objects you need. To solve the general programming problem, you need to create any number of objects, anytime, anywhere. So you can’t rely on creating a named handle to hold each one of your objects:

MyObject myHandle;

since you’ll never know how many of these things you’ll actually need.

To solve this rather essential problem, Java has several ways to hold objects (or rather, handles to objects). The built-in type is the array, which has been discussed before and will get additional coverage in this chapter. Also, the Java utilities library has some collection classes (also known as container classes, but that term is used by the AWT so “collection” will be used here) that provide more sophisticated ways to hold and even manipulate your objects. This will comprise the remainder of this chapter.

Arrays

Most of the necessary introduction to arrays is in the last section of Chapter 4, which shows how you define and initialize an array. Holding objects is the focus of this chapter, and an array is just one way to hold objects. But there are a number of other ways to hold objects, so what makes an array special?

There are two issues that distinguish arrays from other types of collections: efficiency and type. The array is the most efficient way that Java provides to store and access a sequence of objects (actually, object handles). The array is a simple linear sequence, which makes element access fast, but you pay for this speed: when you create an array object, its size is fixed and cannot be changed for the lifetime of that array object. You might suggest creating an array of a particular size and then, if you run out of space, creating a new one and moving all the handles from the old one to the new one. This is the behavior of the Vector class, which will be studied later in the chapter. However, because of the overhead of this size flexibility, a Vector is measurably less efficient than an array.

The vector class in C++ does know the type of objects it holds, but it has a different drawback when compared with arrays in Java: the C++ vector’s operator[] doesn’t do bounds checking, so you can run past the end. (It’s possible, however, to ask how big the vector is, and the at( ) method does perform bounds checking.) In Java, you get bounds checking regardless of whether you’re using an array or a collection – you’ll get a RuntimeException if you exceed the bounds. As you’ll learn in Chapter 9, this type of exception indicates a programmer error and thus you don’t need to check for it in your code. As an aside, the reason the C++ vector doesn’t check bounds with every access is speed – in Java you have the constant performance overhead of bounds checking all the time for both arrays and collections.

The other generic collection classes that will be studied in this chapter, Vector, Stack, and Hashtable, all deal with objects as if they had no specific type. That is, they treat them as type Object, the root class of all classes in Java. This works fine from one standpoint: you need to build only one collection, and any Java object will go into that collection. (Except for primitives – these can be placed in collections as constants using the Java primitive wrapper classes, or as changeable values by wrapping in your own class.) This is the second place where an array is superior to the generic collections: when you create an array, you create it to hold a specific type. This means that you get compile-time type checking to prevent you from putting the wrong type in, or mistaking the type that you’re extracting. Of course, Java will prevent you from sending an inappropriate message to an object, either at compile-time or at run-time. So it’s not as if it’s riskier one way or the other, it’s just nicer if the compiler points it out to you, faster at run-time, and there’s less likelihood that the end user will get surprised by an exception.

For efficiency and type checking it’s always worth trying to use an array if you can. However, when you’re trying to solve a more general problem arrays can be too restrictive. After looking at arrays, the rest of this chapter will be devoted to the collection classes provided by Java.

Arrays are first-class objects

Regardless of what type of array you’re working with, the array identifier is actually a handle to a true object that’s created on the heap. The heap object can be created either implicitly, as part of the array initialization syntax, or explicitly with a new expression. Part of the heap object (in fact, the only field or method you can access) is the read-only length member that tells you how many elements can be stored in that array object. The ‘[]’ syntax is the only other access that you have to the array object.

The following example shows the various ways that an array can be initialized, and how the array handles can be assigned to different array objects. It also shows that arrays of objects and arrays of primitives are almost identical in their use. The only difference is that arrays of objects hold handles while arrays of primitives hold the primitive values directly.

//: ArraySize.java
// Initialization & re-assignment of arrays
package c08;

class Weeble {} // A small mythical creature

public class ArraySize {
  public static void main(String[] args) {
    // Arrays of objects:
    Weeble[] a; // Null handle
    Weeble[] b = new Weeble[5]; // Null handles
    Weeble[] c = new Weeble[4];
    for(int i = 0; i < c.length; i++)
      c[i] = new Weeble();
    Weeble[] d = {
      new Weeble(), new Weeble(), new Weeble()
    };
    // Compile error: variable a not initialized:
    //!System.out.println("a.length=" + a.length);
    System.out.println("b.length = " + b.length);
    // The handles inside the array are 
    // automatically initialized to null:
    for(int i = 0; i < b.length; i++)
      System.out.println("b[" + i + "]=" + b[i]);
    System.out.println("c.length = " + c.length);
    System.out.println("d.length = " + d.length);
    a = d;
    System.out.println("a.length = " + a.length);
    // Java 1.1 initialization syntax:
    a = new Weeble[] {
      new Weeble(), new Weeble()
    };
    System.out.println("a.length = " + a.length);

    // Arrays of primitives:
    int[] e; // Null handle
    int[] f = new int[5];
    int[] g = new int[4];
    for(int i = 0; i < g.length; i++)
      g[i] = i*i;
    int[] h = { 11, 47, 93 };
    // Compile error: variable e not initialized:
    //!System.out.println("e.length=" + e.length);
    System.out.println("f.length = " + f.length);
    // The primitives inside the array are
    // automatically initialized to zero:
    for(int i = 0; i < f.length; i++)
      System.out.println("f[" + i + "]=" + f[i]);
    System.out.println("g.length = " + g.length);
    System.out.println("h.length = " + h.length);
    e = h;
    System.out.println("e.length = " + e.length);
    // Java 1.1 initialization syntax:
    e = new int[] { 1, 2 };
    System.out.println("e.length = " + e.length);
  }
} ///:~

Here’s the output from the program:

b.length = 5
b[0]=null
b[1]=null
b[2]=null
b[3]=null
b[4]=null
c.length = 4
d.length = 3
a.length = 3
a.length = 2
f.length = 5
f[0]=0
f[1]=0
f[2]=0
f[3]=0
f[4]=0
g.length = 4
h.length = 3
e.length = 3
e.length = 2

The array a is initially just a null handle, and the compiler prevents you from doing anything with this handle until you’ve properly initialized it. The array b is initialized to point to an array of Weeble handles, but no actual Weeble objects are ever placed in that array. However, you can still ask what the size of the array is, since b is pointing to a legitimate object. This brings up a slight drawback: you can’t find out how many elements are actually in the array, since length tells you only how many elements can be placed in the array; that is, the size of the array object, not the number of elements it actually holds. However, when an array object is created its handles are automatically initialized to null so you can see whether a particular array slot has an object in it by checking to see whether it’s null. Similarly, an array of primitives is automatically initialized to zero for numeric types, null for char, and false for boolean.

Array c shows the creation of the array object followed by the assignment of Weeble objects to all the slots in the array. Array d shows the “aggregate initialization” syntax that causes the array object to be created (implicitly with new on the heap, just like for array c) and initialized with Weeble objects, all in one statement.

The expression

a = d;

shows how you can take a handle that’s attached to one array object and assign it to another array object, just as you can do with any other type of object handle. Now both a and d are pointing to the same array object on the heap.

Java 1.1 adds a new array initialization syntax, which could be thought of as a “dynamic aggregate initialization.” The Java 1.0 aggregate initialization used by d must be used at the point of d’s definition, but with the Java 1.1 syntax you can create and initialize an array object anywhere. For example, suppose hide( ) is a method that takes an array of Weeble objects. You could call it by saying:

hide(d);

but in Java 1.1 you can also dynamically create the array you want to pass as the argument:

hide(new Weeble[] { new Weeble(), new Weeble() });

This new syntax provides a more convenient way to write code in some situations.

The second part of the above example shows that primitive arrays work just like object arrays except that primitive arrays hold the primitive values directly.

Collections of primitives

Collection classes can hold only handles to objects. An array, however, can be created to hold primitives directly, as well as handles to objects. It is possible to use the “wrapper” classes such as Integer, Double, etc. to place primitive values inside a collection, but as you’ll see later in this chapter in the WordCount.java example, the wrapper classes for primitives are only somewhat useful anyway. Whether you put primitives in arrays or wrap them in a class that’s placed in a collection is a question of efficiency. It’s much more efficient to create and access an array of primitives than a collection of wrapped primitives.

Of course, if you’re using a primitive type and you need the flexibility of a collection that automatically expands when more space is needed, the array won’t work and you’re forced to use a collection of wrapped primitives. You might think that there should be a specialized type of Vector for each of the primitive data types, but Java doesn’t provide this for you. Some sort of templatizing mechanism might someday provide a better way for Java to handle this problem.[32]

Returning an array

Suppose you’re writing a method and you don’t just want to return one thing, but a whole bunch of things. Languages like C and C++ make this difficult because you can’t just return an array, only a pointer to an array. This introduces problems because it becomes messy to control the lifetime of the array, which easily leads to memory leaks.

Java takes a similar approach, but you just “return an array.” Actually, of course, you’re returning a handle to an array, but with Java you never worry about responsibility for that array – it will be around as long as you need it, and the garbage collector will clean it up when you’re done.

As an example, consider returning an array of String:

//: IceCream.java
// Returning arrays from methods

public class IceCream {
  static String[] flav = {
    "Chocolate", "Strawberry",
    "Vanilla Fudge Swirl", "Mint Chip",
    "Mocha Almond Fudge", "Rum Raisin",
    "Praline Cream", "Mud Pie" 
  };
  static String[] flavorSet(int n) {
    // Force it to be positive & within bounds:
    n = Math.abs(n) % (flav.length + 1);
    String[] results = new String[n];
    int[] picks = new int[n];
    for(int i = 0; i < picks.length; i++)
      picks[i] = -1;
    for(int i = 0; i < picks.length; i++) {
      retry:
      while(true) {
        int t =
          (int)(Math.random() * flav.length);
        for(int j = 0; j < i; j++)
          if(picks[j] == t) continue retry;
        picks[i] = t;
        results[i] = flav[t];
        break;
      }
    }
    return results;
  }
  public static void main(String[] args) {
    for(int i = 0; i < 20; i++) {
      System.out.println(
        "flavorSet(" + i + ") = ");
      String[] fl = flavorSet(flav.length);
      for(int j = 0; j < fl.length; j++)
        System.out.println("\t" + fl[j]);
    }
  }
} ///:~

The method flavorSet( ) creates an array of String called results. The size of this array is n, determined by the argument you pass into the method. Then it proceeds to choose flavors randomly from the array flav and place them into results, which it finally returns. Returning an array is just like returning any other object – it’s a handle. It’s not important that the array was created within flavorSet( ), or that the array was created anyplace else, for that matter. The garbage collector takes care of cleaning up the array when you’re done with it, and the array will persist for as long as you need it.

As an aside, notice that when flavorSet( ) chooses flavors randomly, it ensures that a random choice hasn’t been picked before. This is performed in a seemingly infinite while loop that keeps making random choices until it finds one that’s not already in the picks array. (Of course, a String comparison could also have been performed to see if the random choice was already in the results array, but String comparisons are inefficient.) If it’s successful it adds the entry and breaks out to go find the next one (i gets incremented). But if t is a number that’s already in picks, then a labeled continue is used to jump back two levels, which forces a new t to be selected. It’s particularly convincing to watch this happen with a debugger.

main( ) prints out 20 full sets of flavors, so you can see that flavorSet( ) chooses the flavors in a random order each time. It’s easiest to see this if you redirect the output into a file. And while you’re looking at the file, remember, you’re not really hungry. (You just want the ice cream, you don’t need it.)

Collections

To summarize what we’ve seen so far, your first, most efficient choice to hold a group of objects should be an array, and you’re forced into this choice if you want to hold a group of primitives. In the remainder of the chapter we’ll look at the more general case, when you don’t know at the time you’re writing the program how many objects you’re going to need, or if you need a more sophisticated way to store your objects. Java provides four types of collection classes to solve this problem: Vector, BitSet, Stack, and Hashtable. Although compared to other languages that provide collections this is a fairly meager supply, you can nonetheless solve a surprising number of problems using these tools.

Among their other characteristics – Stack, for example, implements a LIFO (last-in, first-out) sequence, and Hashtable is an associative array that lets you associate any object with any other object – the Java collection classes will automatically resize themselves. Thus, you can put in any number of objects and you don’t need to worry about how big to make the collection while you’re writing the program.

Disadvantage: unknown type

The “disadvantage” to using the Java collections is that you lose type information when you put an object into a collection. This happens because, when the collection was written, the programmer of that collection had no idea what specific type you wanted to put in the collection, and making the collection hold only your type would prevent it from being a general-purpose tool. So instead, the collection holds handles to objects of type Object, which is of course every object in Java, since it’s the root of all the classes. (Of course, this doesn’t include primitive types, since they aren’t inherited from anything.) This is a great solution, except for these reasons:

  1. Since the type information is thrown away when you put an object handle into a collection, any type of object can be put into your collection, even if you mean it to hold only, say, cats. Someone could just as easily put a dog into the collection.
  2. Since the type information is lost, the only thing the collection knows it holds is a handle to an Object. You must perform a cast to the correct type before you use it.

On the up side, Java won’t let you misuse the objects that you put into a collection. If you throw a dog into a collection of cats, then go through and try to treat everything in the collection as a cat, you’ll get an exception when you get to the dog. In the same vein, if you try to cast the dog handle that you pull out of the cat collection into a cat, you’ll get an exception at run-time.

Here’s an example:

//: CatsAndDogs.java
// Simple collection example (Vector)
import java.util.*;

class Cat {
  private int catNumber;
  Cat(int i) {
    catNumber = i;
  }
  void print() {
    System.out.println("Cat #" + catNumber);
  }
}

class Dog {
  private int dogNumber;
  Dog(int i) {
    dogNumber = i;
  }
  void print() {
    System.out.println("Dog #" + dogNumber);
  }
}

public class CatsAndDogs {
  public static void main(String[] args) {
    Vector cats = new Vector();
    for(int i = 0; i < 7; i++)
      cats.addElement(new Cat(i));
    // Not a problem to add a dog to cats:
    cats.addElement(new Dog(7));
    for(int i = 0; i < cats.size(); i++)
      ((Cat)cats.elementAt(i)).print();
    // Dog is detected only at run-time
  }
} ///:~

You can see that using a Vector is straightforward: create one, put objects in using addElement( ), and later get them out with elementAt( ). (Note that Vector has a method size( ) to let you know how many elements have been added so you don’t inadvertently run off the end and cause an exception.)

The classes Cat and Dog are distinct – they have nothing in common except that they are Objects. (If you don’t explicitly say what class you’re inheriting from, you automatically inherit from Object.) The Vector class, which comes from java.util, holds Objects, so not only can you put Cat objects into this collection using the Vector method addElement( ), but you can also add Dog objects without complaint at either compile-time or run-time. When you go to fetch out what you think are Cat objects using the Vector method elementAt( ), you get back a handle to an Object that you must cast to a Cat. Then you need to surround the entire expression with parentheses to force the evaluation of the cast before calling the print( ) method for Cat, otherwise you’ll get a syntax error. Then, at run-time, when you try to cast the Dog object to a Cat, you’ll get an exception.

This is more than just an annoyance. It’s something that can create some difficult-to-find bugs. If one part (or several parts) of a program inserts objects into a collection, and you discover only in a separate part of the program through an exception that a bad object was placed in the collection, then you must find out where the bad insert occurred. You do this by code inspection, which is about the worst debugging tool you have. On the upside, it’s convenient to start with some standardized collection classes for programming, despite the scarcity and awkwardness.

Sometimes it works right anyway

It turns out that in some cases things seem to work correctly without casting back to your original type. The first case is quite special: the String class has some extra help from the compiler to make it work smoothly. Whenever the compiler expects a String object and it hasn’t got one, it will automatically call the toString( ) method that’s defined in Object and can be overridden by any Java class. This method produces the desired String object, which is then used wherever it was wanted.

Thus, all you need to do to make objects of your class print out is to override the toString( ) method, as shown in the following example:

//: WorksAnyway.java
// In special cases, things just seem
// to work correctly.
import java.util.*;

class Mouse {
  private int mouseNumber;
  Mouse(int i) {
    mouseNumber = i;
  }
  // Magic method:
  public String toString() {
    return "This is Mouse #" + mouseNumber;
  }
  void print(String msg) {
    if(msg != null) System.out.println(msg);
    System.out.println(
      "Mouse number " + mouseNumber);
  }
}

class MouseTrap {
  static void caughtYa(Object m) {
    Mouse mouse = (Mouse)m; // Cast from Object
    mouse.print("Caught one!");
  }
}

public class WorksAnyway {
  public static void main(String[] args) {
    Vector mice = new Vector();
    for(int i = 0; i < 3; i++)
      mice.addElement(new Mouse(i));
    for(int i = 0; i < mice.size(); i++) {
      // No cast necessary, automatic call
      // to Object.toString():
      System.out.println(
        "Free mouse: " + mice.elementAt(i));
      MouseTrap.caughtYa(mice.elementAt(i));
    }
  }
} ///:~

You can see the redefinition of toString( ) in Mouse. In the second for loop in main( ) you find the statement:

System.out.println("Free mouse: " + mice.elementAt(i));

After the ‘+’ sign the compiler expects to see a String object. elementAt( ) produces an Object, so to get the desired String the compiler implicitly calls toString( ). Unfortunately, you can work this kind of magic only with String; it isn’t available for any other type.

A second approach to hiding the cast has been placed inside Mousetrap. The caughtYa( ) method accepts not a Mouse, but an Object, which it then casts to a Mouse. This is quite presumptuous, of course, since by accepting an Object anything could be passed to the method. However, if the cast is incorrect – if you passed the wrong type – you’ll get an exception at run-time. This is not as good as compile-time checking but it’s still robust. Note that in the use of this method:

MouseTrap.caughtYa(mice.elementAt(i));

no cast is necessary.

Making a type-conscious Vector

You might not want to give up on this issue just yet. A more ironclad solution is to create a new class using the Vector, such that it will accept only your type and produce only your type:

//: GopherVector.java
// A type-conscious Vector
import java.util.*;

class Gopher {
  private int gopherNumber;
  Gopher(int i) {
    gopherNumber = i;
  }
  void print(String msg) {
    if(msg != null) System.out.println(msg);
    System.out.println(
      "Gopher number " + gopherNumber);
  }
}

class GopherTrap {
  static void caughtYa(Gopher g) {
    g.print("Caught one!");
  }
}

class GopherVector {
  private Vector v = new Vector();
  public void addElement(Gopher m) {
    v.addElement(m);
  }
  public Gopher elementAt(int index) {
    return (Gopher)v.elementAt(index);
  }
  public int size() { return v.size(); }
  public static void main(String[] args) {
    GopherVector gophers = new GopherVector();
    for(int i = 0; i < 3; i++)
      gophers.addElement(new Gopher(i));
    for(int i = 0; i < gophers.size(); i++)
      GopherTrap.caughtYa(gophers.elementAt(i));
  }
} ///:~

This is similar to the previous example, except that the new GopherVector class has a private member of type Vector (inheriting from Vector tends to be frustrating, for reasons you’ll see later), and methods just like Vector. However, it doesn’t accept and produce generic Objects, only Gopher objects.

Because a GopherVector will accept only a Gopher, if you were to say:

gophers.addElement(new Pigeon());

you would get an error message at compile time. This approach, while more tedious from a coding standpoint, will tell you immediately if you’re using a type improperly.

Note that no cast is necessary when using elementAt( ) – it’s always a Gopher.

Parameterized types

This kind of problem isn’t isolated – there are numerous cases in which you need to create new types based on other types, and in which it is useful to have specific type information at compile-time. This is the concept of a parameterized type. In C++, this is directly supported by the language in templates. At one point, Java had reserved the keyword generic to someday support parameterized types, but it’s uncertain if this will ever occur.

Enumerators (iterators)

In any collection class, you must have a way to put things in and a way to get things out. After all, that’s the primary job of a collection – to hold things. In the Vector, addElement( ) is the way that you insert objects, and elementAt( ) is one way to get things out. Vector is quite flexible – you can select anything at any time, and select multiple elements at once using different indexes.

If you want to start thinking at a higher level, there’s a drawback: you need to know the exact type of the collection in order to use it. This might not seem bad at first, but what if you start out using a Vector, and later on in your program you decide, for efficiency, that you want to change to a List (which is part of the Java 1.2 collections library)? Or you’d like to write a piece of code that doesn’t know or care what type of collection it’s working with.

The concept of an iterator can be used to achieve this next level of abstraction. This is an object whose job is to move through a sequence of objects and select each object in that sequence without the client programmer knowing or caring about the underlying structure of that sequence. In addition, an iterator is usually what’s called a “light-weight” object; that is, one that’s cheap to create. For that reason, you’ll often find seemingly strange constraints for iterators; for example, some iterators can move in only one direction.

The Java Enumeration[33] is an example of an iterator with these kinds of constraints. There’s not much you can do with one except:

  1. Ask a collection to hand you an Enumeration using a method called elements( ). This Enumeration will be ready to return the first element in the sequence on your first call to its nextElement( ) method.
  2. Get the next object in the sequence with nextElement( ).
  3. See if there are any more objects in the sequence with hasMoreElements( ).

That’s all. It’s a simple implementation of an iterator, but still powerful. To see how it works, let’s revisit the CatsAndDogs.java program from earlier in the chapter. In the original version, the method elementAt( ) was used to select each element, but in the following modified version an enumeration is used:

//: CatsAndDogs2.java
// Simple collection with Enumeration
import java.util.*;

class Cat2 {
  private int catNumber;
  Cat2(int i) {
    catNumber = i;
  }
  void print() {
    System.out.println("Cat number " +catNumber);
  }
}

class Dog2 {
  private int dogNumber;
  Dog2(int i) {
    dogNumber = i;
  }
  void print() {
    System.out.println("Dog number " +dogNumber);
  }
}

public class CatsAndDogs2 {
  public static void main(String[] args) {
    Vector cats = new Vector();
    for(int i = 0; i < 7; i++)
      cats.addElement(new Cat2(i));
    // Not a problem to add a dog to cats:
    cats.addElement(new Dog2(7));
    Enumeration e = cats.elements();
    while(e.hasMoreElements())
      ((Cat2)e.nextElement()).print();
    // Dog is detected only at run-time
  }
} ///:~

You can see that the only change is in the last few lines. Instead of:

    for(int i = 0; i < cats.size(); i++)
      ((Cat)cats.elementAt(i)).print();

an Enumeration is used to step through the sequence:

while(e.hasMoreElements())
      ((Cat2)e.nextElement()).print();

With the Enumeration, you don’t need to worry about the number of elements in the collection. That’s taken care of for you by hasMoreElements( ) and nextElement( ).

As another example, consider the creation of a general-purpose printing method:

//: HamsterMaze.java
// Using an Enumeration
import java.util.*;

class Hamster {
  private int hamsterNumber;
  Hamster(int i) {
    hamsterNumber = i;
  }
  public String toString() {
    return "This is Hamster #" + hamsterNumber;
  }
}

class Printer {
  static void printAll(Enumeration e) {
    while(e.hasMoreElements())
      System.out.println(
        e.nextElement().toString());
  }
}

public class HamsterMaze {
  public static void main(String[] args) {
    Vector v = new Vector();
    for(int i = 0; i < 3; i++)
      v.addElement(new Hamster(i));
    Printer.printAll(v.elements());
  }
} ///:~

Look closely at the printing method:

static void printAll(Enumeration e) {
  while(e.hasMoreElements())
    System.out.println(
      e.nextElement().toString());
}

Note that there’s no information about the type of sequence. All you have is an Enumeration, and that’s all you need to know about the sequence: that you can get the next object, and that you can know when you’re at the end. This idea of taking a collection of objects and passing through it to perform an operation on each one is powerful and will be seen throughout this book.

This particular example is even more generic, since it uses the ubiquitous toString( ) method (ubiquitous only because it’s part of the Object class). An alternative way to call print (although probably slightly less efficient, if you could even notice the difference) is:

System.out.println("" + e.nextElement());

which uses the “automatic conversion to String” that’s wired into Java. When the compiler sees a String, followed by a ‘+’, it expects another String to follow and calls toString( ) automatically. (In Java 1.1, the first String is unnecessary; any object will be converted to a String.) You can also perform a cast, which has the effect of calling toString( ):

System.out.println((String)e.nextElement());

In general, however, you’ll want to do something more than call Object methods, so you’ll run up against the type-casting issue again. You must assume you’ve gotten an Enumeration to a sequence of the particular type you’re interested in, and cast the resulting objects to that type (getting a run-time exception if you’re wrong).

Types of collections

The standard Java 1.0 and 1.1 library comes with a bare minimum set of collection classes, but they’re probably enough to get by with for many of your programming projects. (As you’ll see at the end of this chapter, Java 1.2 provides a radically redesigned and filled-out library of collections.)

Vector

The Vector is quite simple to use, as you’ve seen so far. Although most of the time you’ll just use addElement( ) to insert objects, elementAt( ) to get them out one at a time, and elements( ) to get an Enumeration to the sequence, there’s also a set of other methods that can be useful. As usual with the Java libraries, we won’t use or talk about them all here, but be sure to look them up in the electronic documentation to get a feel for what they can do.

Crashing Java

The Java standard collections contain a toString( ) method so they can produce a String representation of themselves, including the objects they hold. Inside of Vector, for example, the toString( ) steps through the elements of the Vector and calls toString( ) for each one. Suppose you’d like to print out the address of your class. It seems to make sense to simply refer to this (in particular, C++ programmers are prone to this approach):

//: CrashJava.java
// One way to crash Java
import java.util.*;

public class CrashJava {
  public String toString() {
    return "CrashJava address: " + this + "\n";
  }
  public static void main(String[] args) {
    Vector v = new Vector();
    for(int i = 0; i < 10; i++)
      v.addElement(new CrashJava());
    System.out.println(v);
  }
} ///:~

It turns out that if you simply create a CrashJava object and print it out, you’ll get an endless sequence of exceptions. However, if you place the CrashJava objects in a Vector and print out that Vector as shown here, it can’t handle it and you don’t even get an exception; Java just crashes. (But at least it didn’t bring down my operating system.) This was tested with Java 1.1.

What’s happening is automatic type conversion for Strings. When you say:

"CrashJava address: " + this

The compiler sees a String followed by a ‘+’ and something that’s not a String, so it tries to convert this to a String. It does this conversion by calling toString( ), which produces a recursive call. When this occurs inside a Vector, it appears that the stack overflows without the exception-handling mechanism getting a chance to respond.

If you really do want to print the address of the object in this case, the solution is to call the Object toString( ) method, which does just that. So instead of saying this, you’d say super.toString( ). (This only works if you're directly inheriting from Object or if none of your parent classes have overridden the toString( ) method).

BitSet

A BitSet is really a Vector of bits, and it is used if you want to efficiently store a lot of on-off information. It’s efficient only from the standpoint of size; if you’re looking for efficient access, it is slightly slower than using an array of some native type.

In addition, the minimum size of the BitSet is that of a long: 64 bits. This implies that if you’re storing anything smaller, like 8 bits, a BitSet will be wasteful, so you’re better off creating your own class to hold your flags.

In a normal Vector, the collection will expand as you add more elements. The BitSet does this as well – sort of. That is, sometimes it works and sometimes it doesn’t, which makes it appear that the Java version 1.0 implementation of BitSet is just badly done. (It is fixed in Java 1.1.) The following example shows how the BitSet works and demonstrates the version 1.0 bug:

//: Bits.java
// Demonstration of BitSet
import java.util.*;

public class Bits {
  public static void main(String[] args) {
    Random rand = new Random();
    // Take the LSB of nextInt():
    byte bt = (byte)rand.nextInt();
    BitSet bb = new BitSet();
    for(int i = 7; i >=0; i--)
      if(((1 << i) &  bt) != 0)
        bb.set(i);
      else
        bb.clear(i);
    System.out.println("byte value: " + bt);
    printBitSet(bb);

    short st = (short)rand.nextInt();
    BitSet bs = new BitSet();
    for(int i = 15; i >=0; i--)
      if(((1 << i) &  st) != 0)
        bs.set(i);
      else
        bs.clear(i);
    System.out.println("short value: " + st);
    printBitSet(bs);

    int it = rand.nextInt();
    BitSet bi = new BitSet();
    for(int i = 31; i >=0; i--)
      if(((1 << i) &  it) != 0)
        bi.set(i);
      else
        bi.clear(i);
    System.out.println("int value: " + it);
    printBitSet(bi);

    // Test bitsets >= 64 bits:
    BitSet b127 = new BitSet();
    b127.set(127);
    System.out.println("set bit 127: " + b127);
    BitSet b255 = new BitSet(65);
    b255.set(255);
    System.out.println("set bit 255: " + b255);
    BitSet b1023 = new BitSet(512);
// Without the following, an exception is thrown
// in the Java 1.0 implementation of BitSet:
//    b1023.set(1023);
    b1023.set(1024);
    System.out.println("set bit 1023: " + b1023);
  }
  static void printBitSet(BitSet b) {
    System.out.println("bits: " + b);
    String bbits = new String();
    for(int j = 0; j < b.size() ; j++)
      bbits += (b.get(j) ? "1" : "0");
    System.out.println("bit pattern: " + bbits);
  }
} ///:~

The random number generator is used to create a random byte, short, and int, and each one is transformed into a corresponding bit pattern in a BitSet. This works fine because a BitSet is 64 bits, so none of these cause it to increase in size. But in Java 1.0, when the BitSet is greater than 64 bits, some strange behavior occurs. If you set a bit that’s just one greater than the BitSet’s currently-allocated storage, it will expand nicely. But if you try to set bits at higher locations than that without first just touching the boundary, you’ll get an exception, since the BitSet won’t expand properly in Java 1.0. The example shows a BitSet of 512 bits being created. The constructor allocates storage for twice that number of bits. Then if you try to set bit 1024 or greater without first setting bit 1023, you’ll throw an exception in Java 1.0. Fortunately, this is fixed in Java 1.1, but avoid using the BitSet if you write code for Java 1.0.

Stack

A Stack is sometimes referred to as a “last-in, first-out” (LIFO) collection. That is, whatever you “push” on the Stack last is the first item you can “pop” out. Like all of the other collections in Java, what you push and pop are Objects, so you must cast what you pop.

What’s rather odd is that instead of using a Vector as a building block to create a Stack, Stack is inherited from Vector. So it has all of the characteristics and behaviors of a Vector plus some extra Stack behaviors. It’s difficult to know whether the designers explicitly decided that this was an especially useful way to do things, or whether it was just a naïve design.

Here’s a simple demonstration of Stack that reads each line from an array and pushes it as a String:

//: Stacks.java
// Demonstration of Stack Class
import java.util.*;

public class Stacks {
  static String[] months = { 
    "January", "February", "March", "April",
    "May", "June", "July", "August", "September",
    "October", "November", "December" };
  public static void main(String[] args) {
    Stack stk = new Stack();
    for(int i = 0; i < months.length; i++)
      stk.push(months[i] + " ");
    System.out.println("stk = " + stk);
    // Treating a stack as a Vector:
    stk.addElement("The last line");
    System.out.println(
      "element 5 = " + stk.elementAt(5));
    System.out.println("popping elements:");
    while(!stk.empty())
      System.out.println(stk.pop());
  }
} ///:~

Each line in the months array is inserted into the Stack with push( ), and later fetched from the top of the stack with a pop( ). To make a point, Vector operations are also performed on the Stack object. This is possible because, by virtue of inheritance, a Stack is a Vector. Thus, all operations that can be performed on a Vector can also be performed on a Stack, such as elementAt( ).

Hashtable

A Vector allows you to select from a sequence of objects using a number, so in a sense it associates numbers to objects. But what if you’d like to select from a sequence of objects using some other criterion? A Stack is an example: its selection criterion is “the last thing pushed on the stack.” A powerful twist on this idea of “selecting from a sequence” is alternately termed a map, a dictionary, or an associative array. Conceptually, it seems like a vector, but instead of looking up objects using a number, you look them up using another object! This is often a key process in a program.

The concept shows up in Java as the abstract class Dictionary. The interface for this class is straightforward: size( ) tells you how many elements are within, isEmpty( ) is true if there are no elements, put(Object key, Object value) adds a value (the thing you want), and associates it with a key (the thing you look it up with). get(Object key) produces the value given the corresponding key, and remove(Object key) removes the key-value pair from the list. There are enumerations: keys( ) produces an Enumeration of the keys, and elements( ) produces an Enumeration of all the values. That’s all there is to a Dictionary.

A Dictionary isn’t terribly difficult to implement. Here’s a simple approach, which uses two Vectors, one for keys and one for values:

//: AssocArray.java
// Simple version of a Dictionary
import java.util.*;

public class AssocArray extends Dictionary {
  private Vector keys = new Vector();
  private Vector values = new Vector();
  public int size() { return keys.size(); }
  public boolean isEmpty() {
    return keys.isEmpty();
  }
  public Object put(Object key, Object value) {
    keys.addElement(key);
    values.addElement(value);
    return key;
  }
  public Object get(Object key) {
    int index = keys.indexOf(key);
    // indexOf() Returns -1 if key not found:
    if(index == -1) return null;
    return values.elementAt(index);
  }
  public Object remove(Object key) {
    int index = keys.indexOf(key);
    if(index == -1) return null;
    keys.removeElementAt(index);
    Object returnval = values.elementAt(index);
    values.removeElementAt(index);
    return returnval;
  }
  public Enumeration keys() {
    return keys.elements();
  }
  public Enumeration elements() {
    return values.elements();
  }
  // Test it:
  public static void main(String[] args) {
    AssocArray aa = new AssocArray();
    for(char c = 'a'; c <= 'z'; c++)
      aa.put(String.valueOf(c),
             String.valueOf(c)
             .toUpperCase());
    char[] ca = { 'a', 'e', 'i', 'o', 'u' };
    for(int i = 0; i < ca.length; i++)
      System.out.println("Uppercase: " +
             aa.get(String.valueOf(ca[i])));
  }
} ///:~

The first thing you see in the definition of AssocArray is that it extends Dictionary. This means that AssocArray is a type of Dictionary, so you can make the same requests of it that you can a Dictionary. If you make your own Dictionary, as is done here, all you need to do is fill in all the methods that are in Dictionary. (And you must override all the methods because all of them – with the exception of the constructor – are abstract.)

The Vectors keys and values are linked by a common index number. That is, if you call put( ) with a key of “roof” and a value of “blue” (assuming you’re associating the various parts of a house with the colors they are to be painted) and there are already 100 elements in the AssocArray, then “roof” will be the 101 element of keys and “blue” will be the 101 element of values. And if you look at get( ), when you pass “roof” in as the key, it produces the index number with keys.indexOf( ), and then uses that index number to produce the value in the associated values vector.

The test in main( ) is simple; it’s just a map of lowercase characters to uppercase characters, which could obviously be done in a number of more efficient ways. But it shows that AssocArray is functional.

The standard Java library contains only one embodiment of a Dictionary, called Hashtable.[34] Java’s Hashtable has the same basic interface as AssocArray (since they both inherit Dictionary), but it differs in one distinct way: efficiency. If you look at what must be done for a get( ), it seems pretty slow to search through a Vector for the key. This is where Hashtable speeds things up. Instead of the tedious linear search for the key, it uses a special value called a hash code. The hash code is a way to take some information in the object in question and turn it into a “relatively unique” int for that object. All objects have a hash code, and hashCode( ) is a method in the root class Object. A Hashtable takes the hashCode( ) of the object and uses it to quickly hunt for the key. This results in a dramatic performance improvement.[35] The way that a Hashtable works is beyond the scope of this book[36] – all you need to know is that Hashtable is a fast Dictionary, and that a Dictionary is a useful tool.

As an example of the use of a Hashtable, consider a program to check the randomness of Java’s Math.random( ) method. Ideally, it would produce a perfect distribution of random numbers, but to test this you need to generate a bunch of random numbers and count the ones that fall in the various ranges. A Hashtable is perfect for this, since it associates objects with objects (in this case, the values produced by Math.random( ) with the number of times those values appear):

//: Statistics.java
// Simple demonstration of Hashtable
import java.util.*;

class Counter { 
  int i = 1; 
  public String toString() { 
    return Integer.toString(i); 
  }
}

class Statistics {
  public static void main(String[] args) {
    Hashtable ht = new Hashtable();
    for(int i = 0; i < 10000; i++) {
      // Produce a number between 0 and 20:
      Integer r = 
        new Integer((int)(Math.random() * 20));
      if(ht.containsKey(r))
        ((Counter)ht.get(r)).i++;
      else
        ht.put(r, new Counter());
    }
    System.out.println(ht);
  }
} ///:~

In main( ), each time a random number is generated it is wrapped inside an Integer object so that handle can be used with the Hashtable. (You can’t use a primitive with a collection, only an object handle.) The containsKey( ) method checks to see if this key is already in the collection. (That is, has the number been found already?) If so, the get( ) methods gets the associated value for the key, which in this case is a Counter object. The value i inside the counter is then incremented to indicate that one more of this particular random number has been found.

If the key has not been found yet, the method put( ) will place a new key-value pair into the Hashtable. Since Counter automatically initializes its variable i to one when it’s created, it indicates the first occurrence of this particular random number.

To display the Hashtable, it is simply printed out. The Hashtable toString( ) method moves through all the key-value pairs and calls the toString( ) for each one. The Integer toString( ) is pre-defined, and you can see the toString( ) for Counter. The output from one run (with some line breaks added) is:

{19=526, 18=533, 17=460, 16=513, 15=521, 14=495,
 13=512, 12=483, 11=488, 10=487, 9=514, 8=523,
 7=497, 6=487, 5=480, 4=489, 3=509, 2=503, 1=475,
 0=505}

You might wonder at the necessity of the class Counter, which seems like it doesn’t even have the functionality of the wrapper class Integer. Why not use int or Integer? Well, you can’t use an int because all of the collections can hold only Object handles. After seeing collections the wrapper classes might begin to make a little more sense to you, since you can’t put any of the primitive types in collections. However, the only thing you can do with the Java wrappers is to initialize them to a particular value and read that value. That is, there’s no way to change a value once a wrapper object has been created. This makes the Integer wrapper immediately useless to solve our problem, so we’re forced to create a new class that does satisfy the need.

Creating “key” classes

In the previous example, a standard library class (Integer) was used as a key for the Hashtable. It worked fine as a key, because it has all the necessary wiring to make it work correctly as a key. But a common pitfall occurs when using Hashtables when you create your own classes to be used as keys. For example, consider a weather predicting system that matches Groundhog objects to Prediction objects. It seems fairly straightforward: you create the two classes and use Groundhog as the key and Prediction as the value:

//: SpringDetector.java
// Looks plausible, but doesn't work right.
import java.util.*;

class Groundhog {
  int ghNumber;
  Groundhog(int n) { ghNumber = n; }
}

class Prediction {
  boolean shadow = Math.random() > 0.5;
  public String toString() {
    if(shadow)
      return "Six more weeks of Winter!";
    else
      return "Early Spring!";
  }
}

public class SpringDetector {
  public static void main(String[] args) {
    Hashtable ht = new Hashtable();
    for(int i = 0; i < 10; i++)
      ht.put(new Groundhog(i), new Prediction());
    System.out.println("ht = " + ht + "\n");
    System.out.println(
      "Looking up prediction for groundhog #3:");
    Groundhog gh = new Groundhog(3);
    if(ht.containsKey(gh))
      System.out.println((Prediction)ht.get(gh));
  }
} ///:~

Each Groundhog is given an identity number, so you can look up a Prediction in the Hashtable by saying “Give me the Prediction associated with Groundhog number 3.” The Prediction class contains a boolean that is initialized using Math.random( ), and a toString( ) that interprets the result for you. In main( ), a Hashtable is filled with Groundhogs and their associated Predictions. The Hashtable is printed so you can see that it has been filled. Then a Groundhog with an identity number of 3 is used to look up the prediction for Groundhog #3.

It seems simple enough, but it doesn’t work. The problem is that Groundhog is inherited from the common root class Object (which is what happens if you don’t specify a base class, thus all classes are ultimately inherited from Object). It is Object’s hashCode( ) method that is used to generate the hash code for each object, and by default it just uses the address of its object. Thus, the first instance of Groundhog(3) does not produce a hash code equal to the hash code for the second instance of Groundhog(3) that we tried to use as a lookup.

You might think that all you need to do is write an appropriate override for hashCode( ). But it still won’t work until you’ve done one more thing: override the equals( ) that is also part of Object. This method is used by the Hashtable when trying to determine if your key is equal to any of the keys in the table. Again, the default Object.equals( ) simply compares object addresses, so one Groundhog(3) is not equal to another Groundhog(3).

Thus, to use your own classes as keys in a Hashtable, you must override both hashCode( ) and equals( ), as shown in the following solution to the problem above:

//: SpringDetector2.java
// If you create a class that's used as a key in
// a Hashtable, you must override hashCode()
// and equals().
import java.util.*;

class Groundhog2 {
  int ghNumber;
  Groundhog2(int n) { ghNumber = n; }
  public int hashCode() { return ghNumber; }
  public boolean equals(Object o) {
    return (o instanceof Groundhog2)
      && (ghNumber == ((Groundhog2)o).ghNumber);
  }
}

public class SpringDetector2 {
  public static void main(String[] args) {
    Hashtable ht = new Hashtable();
    for(int i = 0; i < 10; i++)
      ht.put(new Groundhog2(i),new Prediction());
    System.out.println("ht = " + ht + "\n");
    System.out.println(
      "Looking up prediction for groundhog #3:");
    Groundhog2 gh = new Groundhog2(3);
    if(ht.containsKey(gh))
      System.out.println((Prediction)ht.get(gh));
  }
} ///:~

Note that this uses the Prediction class from the previous example, so SpringDetector.java must be compiled first or you’ll get a compile-time error when you try to compile SpringDetector2.java.

Groundhog2.hashCode( ) returns the ground hog number as an identifier. (In this example, the programmer is responsible for ensuring that no two ground hogs exist with the same ID number.) The hashCode( ) is not required to return a unique identifier, but the equals( ) method must be able to strictly determine whether two objects are equivalent.

Even though it appears that the equals( ) method is only checking to see whether the argument is an instance of Groundhog2 (using the instanceof keyword, which is fully explained in Chapter 11), the instanceof actually quietly does a second sanity check, to see if the object is null, since instanceof produces false if the left-hand argument is null. Assuming it’s the correct type and not null, the comparison is based on the actual ghNumbers. This time, when you run the program, you’ll see it produces the correct output. (Many of the Java library classes override the hashCode( ) and equals( ) methods to be based upon their contents.)

Properties: a type of Hashtable

In the first example in this book, a type of Hashtable was used called Properties. In that example, the lines:

Properties p = System.getProperties();
p.list(System.out);

called the static method getProperties( ) to get a special Properties object that described the system characteristics. The method list( ) is a method of Properties that sends the contents to any stream output that you choose. There’s also a save( ) method to allow you to write your property list to a file in a way that it can be retrieved later with the load( ) method.

Although the Properties class is inherited from Hashtable, it also contains a second Hashtable that acts to hold the list of “default” properties. So if a property isn’t found in the primary list, the defaults will be searched.

The Properties class is also available for use in your programs (an example is ClassScanner.java in Chapter 17). You can find more complete details in the Java library documentation.

Enumerators revisited

We can now demonstrate the true power of the Enumeration: the ability to separate the operation of traversing a sequence from the underlying structure of that sequence. In the following example, the class PrintData uses an Enumeration to move through a sequence and call the toString( ) method for every object. Two different types of collections are created, a Vector and a Hashtable, and they are each filled with, respectively, Mouse and Hamster objects. (These classes are defined earlier in the chapter; notice you must have compiled HamsterMaze.java and WorksAnyway.java for the following program to compile.) Because an Enumeration hides the structure of the underlying collection, PrintData doesn’t know or care what kind of collection the Enumeration comes from:

//: Enumerators2.java
// Revisiting Enumerations
import java.util.*;

class PrintData {
  static void print(Enumeration e) {
    while(e.hasMoreElements())
      System.out.println(
        e.nextElement().toString());
  }
}

class Enumerators2 {
  public static void main(String[] args) {
    Vector v = new Vector();
    for(int i = 0; i < 5; i++)
      v.addElement(new Mouse(i));

    Hashtable h = new Hashtable();
    for(int i = 0; i < 5; i++)
      h.put(new Integer(i), new Hamster(i));

    System.out.println("Vector");
    PrintData.print(v.elements());
    System.out.println("Hashtable");
    PrintData.print(h.elements());
  }
} ///:~

Note that PrintData.print( ) takes advantage of the fact that the objects in these collections are of class Object so it can call toString( ). It’s more likely that in your problem, you must make the assumption that your Enumeration is walking through a collection of some specific type. For example, you might assume that everything in the collection is a Shape with a draw( ) method. Then you must downcast from the Object that Enumeration.nextElement() returns to produce a Shape.

Sorting

One of the things missing in the Java 1.0 and 1.1 libraries is algorithmic operations, even simple sorting. So it makes sense to create a Vector that sorts itself using the classic Quicksort.

A problem with writing generic sorting code is that sorting must perform comparisons based on the actual type of the object. Of course, one approach is to write a different sorting method for every different type, but you should be able to recognize that this does not produce code that is easily re-used for new types.

A primary goal of programming design is to “separate things that change from things that stay the same,” and here, the code that stays the same is the general sort algorithm, but the thing that changes from one use to the next is the way objects are compared. So instead of hard-wiring the comparison code into many different sort routines, the technique of the callback will be used. With a callback, the part of the code that varies from case to case is encapsulated inside its own class, and the part of the code that’s always the same will call back to the code that changes. That way you can make different objects to express different ways of comparison and feed them to the same sorting code.

The following interface describes how to compare two objects, and thus encapsulates “the things that change” for this particular problem:

//: Compare.java
// Interface for sorting callback:
package c08;

interface Compare {
  boolean lessThan(Object lhs, Object rhs);
  boolean lessThanOrEqual(Object lhs, Object rhs);
} ///:~

For both methods, the lhs represents the “left hand” object and the rhs represents the “right hand” object in the comparison.

A subclass of Vector can be created that implements the Quicksort using Compare. The algorithm, which is known for its speed, will not be explained here. For details, see Practical Algorithms for Programmers, by Binstock & Rex, Addison-Wesley 1995.

//: SortVector.java
// A generic sorting vector
package c08;
import java.util.*;

public class SortVector extends Vector {
  private Compare compare; // To hold the callback
  public SortVector(Compare comp) {
    compare = comp;
  }
  public void sort() {
    quickSort(0, size() - 1);
  }
  private void quickSort(int left, int right) {
    if(right > left) {
      Object o1 = elementAt(right);
      int i = left - 1;
      int j = right;
      while(true) {
        while(compare.lessThan(
              elementAt(++i), o1))
          ;
        while(j > 0)
          if(compare.lessThanOrEqual(
             elementAt(--j), o1))
            break; // out of while
        if(i >= j) break;
        swap(i, j);
      }
      swap(i , right);
      quickSort(left, i-1);
      quickSort(i+1, right);
    }
  }
  private void swap(int loc1, int loc2) {
    Object tmp = elementAt(loc1);
    setElementAt(elementAt(loc2), loc1);
    setElementAt(tmp, loc2);
  }
} ///:~

You can now see the reason for the term “callback,” since the quickSort( ) method “calls back” to the methods in Compare. You can also see how this technique has produced generic, reusable code.

To use the SortVector, you must create a class that implements Compare for the kind of objects that you’re sorting. This is a place where an inner class is not essential, but it can make sense for code organization. Here’s an example for String objects:

//: StringSortTest.java
// Testing the generic sorting Vector
package c08;
import java.util.*;

public class StringSortTest {
  static class StringCompare implements Compare {
    public boolean lessThan(Object l, Object r) {
      return ((String)l).toLowerCase().compareTo(
        ((String)r).toLowerCase()) < 0;
    }
    public boolean 
    lessThanOrEqual(Object l, Object r) {
      return ((String)l).toLowerCase().compareTo(
        ((String)r).toLowerCase()) <= 0;
    }
  }
  public static void main(String[] args) {
    SortVector sv = 
      new SortVector(new StringCompare());
    sv.addElement("d");
    sv.addElement("A");
    sv.addElement("C");
    sv.addElement("c");
    sv.addElement("b");
    sv.addElement("B");
    sv.addElement("D");
    sv.addElement("a");
    sv.sort();
    Enumeration e = sv.elements();
    while(e.hasMoreElements())
      System.out.println(e.nextElement());
  }
} ///:~

The inner class is static because it does not need a link to an outer class in order for it to function.

You can see how, once the framework is set up, it’s easy to reuse a design like this – you simply write the class that encapsulates “the things that change” and hand an object to the SortVector.

The comparison forces the strings to lower case, so that the capital A’s end up next to the small a’s and not in some entirely different place. This example shows, however, a slight deficiency in this approach, since the test code above puts the uppercase and lowercase single letters of the same letter in the order that they appear: A a b B c C d D. This is not usually much of a problem, because you’re usually working with longer strings and in that situation the effect doesn’t show up. (The Java 1.2 collections provide sorting functionality that solves this problem.)

Inheritance (extends) is used here to create a new type of Vector – that is, SortVector is a Vector with some added functionality. The use of inheritance here is powerful but it presents problems. It turns out that some methods are final (described in Chapter 7), so you cannot override them. If you want to create a sorted Vector that accepts and produces only String objects you run into a wall, since addElement( ) and elementAt( ) are final, and these are precisely the methods you’d need to override so they accept and produce only String objects. No luck there.

On the other hand, consider composition: the placing of an object inside a new class. Rather than rewrite the above code to accomplish this, we can simply use a SortVector inside the new class. In this case, the inner class to implement the interface Compare will be created anonymously:

//: StrSortVector.java
// Automatically sorted Vector that 
// accepts and produces only Strings
package c08;
import java.util.*;

public class StrSortVector {
  private SortVector v = new SortVector(
    // Anonymous inner class:
    new Compare() {
      public boolean 
      lessThan(Object l, Object r) {
        return 
          ((String)l).toLowerCase().compareTo(
          ((String)r).toLowerCase()) < 0;
      }
      public boolean 
      lessThanOrEqual(Object l, Object r) {
        return 
          ((String)l).toLowerCase().compareTo(
          ((String)r).toLowerCase()) <= 0;
      }
    }
  );
  private boolean sorted = false;
  public void addElement(String s) {
    v.addElement(s);
    sorted = false;
  }
  public String elementAt(int index) {
    if(!sorted) {
      v.sort();
      sorted = true;
    }
    return (String)v.elementAt(index);
  }
  public Enumeration elements() {
    if(!sorted) {
      v.sort();
      sorted = true;
    }
    return v.elements();
  }
  // Test it:
  public static void main(String[] args) {
    StrSortVector sv = new StrSortVector();
    sv.addElement("d");
    sv.addElement("A");
    sv.addElement("C");
    sv.addElement("c");
    sv.addElement("b");
    sv.addElement("B");
    sv.addElement("D");
    sv.addElement("a");
    Enumeration e = sv.elements();
    while(e.hasMoreElements())
      System.out.println(e.nextElement());
  }
} ///:~

This quickly reuses the code from SortVector to create the desired functionality. However, not all of the public methods from SortVector and Vector appear in StrSortVector. When reusing code this way, you can make a definition in the new class for each one in the contained class, or you can start with just a few and periodically go back and add more when you need them. Eventually the new class design will settle down.

The advantage to this approach is that it will take only String objects and produce only String objects, and the checking happens at compile time instead of run time. Of course, that’s only true for addElement( ) and elementAt( ); elements( ) still produces an Enumeration that is untyped at compile time. Type checking for the Enumeration and in StrSortVector still happens, of course, it just happens at run-time by throwing exceptions if you do something wrong. It’s a trade-off: do you find out about something for sure at compile time or probably at run-time? (That is, “probably not while you’re testing the code” and “probably when the program user tries something you didn’t test for.”) Given the choices and the hassle, it’s easier to use inheritance and just grit your teeth while casting – again, if parameterized types are ever added to Java, they will solve this problem.

You can see there’s a flag called sorted in this class. You could sort the vector every time addElement( ) is called, and constantly keep it in a sorted state. But usually people add a lot of elements to a Vector before beginning to read it. So sorting after every addElement( ) would be less efficient than waiting until someone wants to read the vector and then sorting it, which is what is done here. The technique of delaying a process until it is absolutely necessary is called lazy evaluation. (There is an analogous technique called lazy initialization which waits until a field value is necessary before initializing it.)

The generic collection library

You’ve seen in this chapter that the standard Java library has some fairly useful collections, but far from a complete set. In addition, algorithms like sorting are not supported at all. One of the strengths of C++ is its libraries, in particular the Standard Template Library (STL) that provides a fairly full set of collections as well as many algorithms like sorting and searching that work with those collections. Based on this model, the ObjectSpace company was inspired to create the Generic Collection Library for Java (formerly called the Java Generic Library, but the abbreviation JGL is still used – the old name infringed on Sun’s copyright), which follows the design of the STL as much as possible (given the differences between the two languages). The JGL seems to fulfill many, if not all, of the needs for a collection library, or as far as one could go in this direction without C++’s template mechanism. The JGL includes linked lists, sets, queues, maps, stacks, sequences, and iterators that are far more functional than Enumeration, as well as a full set of algorithms such as searching and sorting. ObjectSpace also made, in some cases, more intelligent design decisions than the Sun library designers. For example, the methods in the JGL collections are not final so it’s easy to inherit and override those methods.

The JGL has been included in some vendors’ Java distributions and ObjectSpace has made the JGL freely available for all uses, including commercial use, at http://www.ObjectSpace.com. The online documentation that comes in the JGL package is quite good and should be adequate to get you started.

The new collections

To me, collection classes are one of the most powerful tools for raw programming. You might have gathered that I’m somewhat disappointed in the collections provided in Java through version 1.1. As a result, it’s a tremendous pleasure to see that collections were given proper attention in Java 1.2, and thoroughly redesigned (by Joshua Bloch at Sun). I consider the new collections to be one of the two major features in Java 1.2 (the other is the Swing library, covered in Chapter 13) because they significantly increase your programming muscle and help bring Java in line with more mature programming systems.

Some of the redesign makes things tighter and more sensible. For example, many names are shorter, cleaner, and easier to understand, as well as to type. Some names are changed to conform to accepted terminology: a particular favorite of mine is “iterator” instead of “enumeration.”

The redesign also fills out the functionality of the collections library. You can now have the behavior of linked lists, queues, and dequeues (double-ended queues, pronounced “decks”).

The design of a collections library is difficult (true of most library design problems). In C++, the STL covered the bases with many different classes. This was better than what was available prior to the STL (nothing), but it didn’t translate well into Java. The result was a rather confusing morass of classes. On the other extreme, I’ve seen a collections library that consists of a single class, “collection,” which acts like a Vector and a Hashtable at the same time. The designers of the new collections library wanted to strike a balance: the full functionality that you expect from a mature collections library, but easier to learn and use than the STL and other similar collections libraries. The result can seem a bit odd in places. Unlike some of the decisions made in the early Java libraries, these oddities were not accidents, but carefully considered decisions based on tradeoffs in complexity. It might take you a little while to get comfortable with some aspects of the library, but I think you’ll find yourself rapidly acquiring and using these new tools.

The new collections library takes the issue of “holding your objects” and divides it into two distinct concepts:

  1. Collection: a group of individual elements, often with some rule applied to them. A List must hold the elements in a particular sequence, and a Set cannot have any duplicate elements. (A bag, which is not implemented in the new collections library since Lists provide you with that functionality, has no such rules.)
  2. Map: a group of key-value object pairs (what you’ve seen up until now as a Hashtable). At first glance, this might seem like it ought to be a Collection of pairs, but when you try to implement it that way the design gets awkward, so it’s clearer to make it a separate concept. On the other hand, it’s convenient to look at portions of a Map by creating a Collection to represent that portion. Thus, a Map can return a Set of its keys, a List of its values, or a List of its pairs. Maps, like arrays, can easily be expanded to multiple dimensions without adding new concepts: you simply make a Map whose values are Maps (and the values of those Maps can be Maps, etc.).

Collections and Maps may be implemented in many different ways, according to your programming needs. It’s helpful to look at a diagram of the new collections:


This diagram can be a bit overwhelming at first, but throughout the rest of this section you’ll see that there are really only three collection components: Map, List, and Set, and only two or three implementations of each one (with, typically, a preferred version). When you see this, the new collections should not seem so daunting.

The dashed boxes represent interfaces, the dotted boxes represent abstract classes, and the solid boxes are regular (concrete) classes. The dashed arrows indicate that a particular class is implementing an interface (or in the case of an abstract class, partially implementing that interface). The double-line arrows show that a class can produce objects of the class the arrow is pointing to. For example, any Collection can produce an Iterator, while a List can produce a ListIterator (as well as an ordinary Iterator, since List is inherited from Collection).

The interfaces that are concerned with holding objects are Collection, List, Set, and Map. Typically, you’ll write the bulk of your code to talk to these interfaces, and the only place where you’ll specify the precise type you’re using is at the point of creation. So you can create a List like this:

List x = new LinkedList();

Of course, you can also decide to make x a LinkedList (instead of a generic List) and carry the precise type information around with x. The beauty (and the intent) of using the interface is that if you decide you want to change your implementation, all you need to do is change it at the point of creation, like this:

List x = new ArrayList();

The rest of your code can remain untouched.

In the class hierarchy, you can see a number of classes whose names begin with “Abstract,” and these can seem a bit confusing at first. They are simply tools that partially implement a particular interface. If you were making your own Set, for example, you wouldn’t start with the Set interface and implement all the methods, instead you’d inherit from AbstractSet and do the minimal necessary work to make your new class. However, the new collections library contains enough functionality to satisfy your needs virtually all the time. So for our purposes, you can ignore any class that begins with “Abstract.”

Therefore, when you look at the diagram, you’re really concerned with only those interfaces at the top of the diagram and the concrete classes (those with solid boxes around them). You’ll typically make an object of a concrete class, upcast it to the corresponding interface, and then use the interface throughout the rest of your code. Here’s a simple example, which fills a Collection with String objects and then prints each element in the Collection:

//: SimpleCollection.java
// A simple example using the new Collections
package c08.newcollections;
import java.util.*;

public class SimpleCollection {
  public static void main(String[] args) {
    Collection c = new ArrayList();
    for(int i = 0; i < 10; i++)
      c.add(Integer.toString(i));
    Iterator it = c.iterator();
    while(it.hasNext())
      System.out.println(it.next());
  }
} ///:~

All the code examples for the new collections libraries will be placed in the subdirectory newcollections, so you’ll be reminded that these work only with Java 1.2. As a result, you must invoke the program by saying:

java c08.newcollections.SimpleCollection

with a similar syntax for the rest of the programs in the package.

You can see that the new collections are part of the java.util library, so you don’t need to add any extra import statements to use them.

The first line in main( ) creates an ArrayList object and then upcasts it to a Collection. Since this example uses only the Collection methods, any object of a class inherited from Collection would work, but ArrayList is the typical workhorse Collection and takes the place of Vector.

The add( ) method, as its name suggests, puts a new element in the Collection. However, the documentation carefully states that add( ) “ensures that this Collection contains the specified element.” This is to allow for the meaning of Set, which adds the element only if it isn’t already there. With an ArrayList, or any sort of List, add( ) always means “put it in.”

All Collections can produce an Iterator via their iterator( ) method. An Iterator is just like an Enumeration, which it replaces, except:

  1. It uses a name (iterator) that is historically understood and accepted in the OOP community.
  2. It uses shorter method names than Enumeration: hasNext( ) instead of hasMoreElements( ), and next( ) instead of nextElement( ).
  3. It adds a new method, remove( ), which removes the last element produced by the Iterator. So you can call remove( ) only once for every time you call next( ).

In SimpleCollection.java, you can see that an Iterator is created and used to traverse the Collection, printing each element.

Using Collections

The following table shows everything you can do with a Collection, and thus, everything you can do with a Set or a List. (List also has additional functionality.) Maps are not inherited from Collection, and will be treated separately.

Boolean add(Object)

*Ensures that the Collection contains the argument. Returns false if it doesn’t add the argument.

Boolean addAll(Collection)

*Adds all the elements in the argument. Returns true if any elements were added.

void clear( )

*Removes all the elements in the Collection.

Boolean contains(Object)

True if the Collection contains the argument.

Boolean containsAll(Collection)

True if the Collection contains all the elements in the argument.

Boolean isEmpty( )

True if the Collection has no elements.

Iterator iterator( )

Returns an Iterator that you can use to move through the elements in the Collection.

Boolean remove(Object)

*If the argument is in the Collection, one instance of that element is removed. Returns true if a removal occurred.

Boolean removeAll(Collection)

*Removes all the elements that are contained in the argument. Returns true if any removals occurred.

Boolean retainAll(Collection)

*Retains only elements that are contained in the argument (an “intersection” from set theory). Returns true if any changes occurred.

int size( )

Returns the number of elements in the Collection.

Object[] toArray( )

Returns an array containing all the elements in the Collection.

Object[] toArray(Object[] a)

Returns an array containing all the elements in the Collection, whose type is that of the array a rather than plain Object (you must cast the array to the right type).


*This is an “optional” method, which means it might not be implemented by a particular Collection. If not, that method throws an UnsupportedOperationException. Exceptions will be covered in Chapter 9.

The following example demonstrates all of these methods. Again, these work with anything that inherits from Collection; an ArrayList is used as a kind of “least-common denominator”:

//: Collection1.java
// Things you can do with all Collections
package c08.newcollections;
import java.util.*;

public class Collection1 {
  // Fill with 'size' elements, start
  // counting at 'start':
  public static Collection 
  fill(Collection c, int start, int size) {
    for(int i = start; i < start + size; i++)
      c.add(Integer.toString(i));
    return c;
  }
  // Default to a "start" of 0:
  public static Collection 
  fill(Collection c, int size) {
    return fill(c, 0, size);
  }
  // Default to 10 elements:
  public static Collection fill(Collection c) {
    return fill(c, 0, 10);
  }
  // Create & upcast to Collection:
  public static Collection newCollection() {
    return fill(new ArrayList());
    // ArrayList is used for simplicity, but it's
    // only seen as a generic Collection 
    // everywhere else in the program.
  }
  // Fill a Collection with a range of values:
  public static Collection 
  newCollection(int start, int size) {
    return fill(new ArrayList(), start, size);
  }
  // Moving through a List with an iterator:
  public static void print(Collection c) {
    for(Iterator x = c.iterator(); x.hasNext();)
      System.out.print(x.next() + " ");
    System.out.println();
  }    
  public static void main(String[] args) {
    Collection c = newCollection();
    c.add("ten");
    c.add("eleven");
    print(c);
    // Make an array from the List:
    Object[] array = c.toArray(); 
    // Make a String array from the List:
    String[] str = 
      (String[])c.toArray(new String[1]);
    // Find max and min elements; this means
    // different things depending on the way
    // the Comparable interface is implemented:
    System.out.println("Collections.max(c) = " +
      Collections.max(c));
    System.out.println("Collections.min(c) = " +
      Collections.min(c));
    // Add a Collection to another Collection
    c.addAll(newCollection());
    print(c);
    c.remove("3"); // Removes the first one
    print(c);
    c.remove("3"); // Removes the second one
    print(c);
    // Remove all components that are in the
    // argument collection:
    c.removeAll(newCollection());
    print(c);
    c.addAll(newCollection());
    print(c);
    // Is an element in this Collection?
    System.out.println(
      "c.contains(\"4\") = " + c.contains("4"));
    // Is a Collection in this Collection?
    System.out.println(
      "c.containsAll(newCollection()) = " + 
      c.containsAll(newCollection()));
    Collection c2 = newCollection(5, 3);
    // Keep all the elements that are in both
    // c and c2 (an intersection of sets):
    c.retainAll(c2);
    print(c);
    // Throw away all the elements in c that
    // also appear in c2:
    c.removeAll(c2);
    System.out.println("c.isEmpty() = " +
      c.isEmpty());
    c = newCollection();
    print(c);
    c.clear(); // Remove all elements
    System.out.println("after c.clear():");
    print(c);
  }
} ///:~

The first methods provide a way to fill any Collection with test data, in this case just ints converted to Strings. The second method will be used frequently throughout the rest of this chapter.

The two versions of newCollection( ) create ArrayLists containing different sets of data and return them as Collection objects, so it’s clear that nothing other than the Collection interface is being used.

The print( ) method will also be used throughout the rest of this section. Since it moves through a Collection using an Iterator, which any Collection can produce, it will work with Lists and Sets and any Collection that a Map produces.

main( ) uses simple exercises to show all of the methods in Collection.

The following sections compare the various implementations of List, Set, and Map and indicate in each case (with an asterisk) which one should be your default choice. You’ll notice that the legacy classes Vector, Stack, and Hashtable are not included because in all cases there are preferred classes within the new collections.

Using Lists

List (interface)

Order is the most important feature of a List; it promises to maintain elements in a particular sequence. List adds a number of methods to Collection that allow insertion and removal of elements in the middle of a List. (This is recommended only for a LinkedList.) A List will produce a ListIterator, and using this you can traverse the List in both directions, as well as insert and remove elements in the middle of the list (again, recommended only for a LinkedList).

ArrayList*

A List backed by an array. Use instead of Vector as a general-purpose object holder. Allows rapid random access to elements, but is slow when inserting and removing elements from the middle of a list. ListIterator should be used only for back-and-forth traversal of an ArrayList, but not for inserting and removing elements, which is expensive compared to LinkedList.

LinkedList

Provides optimal sequential access, with inexpensive insertions and deletions from the middle of the list. Relatively slow for random access. (Use ArrayList instead.) Also has addFirst( ), addLast( ), getFirst( ), getLast( ), removeFirst( ), and removeLast( ) (which are not defined in any interfaces or base classes) to allow it to be used as a stack, a queue, and a dequeue.

The methods in the following example each cover a different group of activities: things that every list can do (basicTest( )), moving around with an Iterator (iterMotion( )) versus changing things with an Iterator (iterManipulation( )), seeing the effects of List manipulation (testVisual( )), and operations available only to LinkedLists.

//: List1.java
// Things you can do with Lists
package c08.newcollections;
import java.util.*;

public class List1 {
  // Wrap Collection1.fill() for convenience:
  public static List fill(List a) {
    return (List)Collection1.fill(a);
  }
  // You can use an Iterator, just as with a
  // Collection, but you can also use random
  // access with get():
  public static void print(List a) {
    for(int i = 0; i < a.size(); i++)
      System.out.print(a.get(i) + " ");
    System.out.println();
  }
  static boolean b;
  static Object o;
  static int i;
  static Iterator it;
  static ListIterator lit;
  public static void basicTest(List a) {
    a.add(1, "x"); // Add at location 1
    a.add("x"); // Add at end
    // Add a collection:
    a.addAll(fill(new ArrayList()));
    // Add a collection starting at location 3:
    a.addAll(3, fill(new ArrayList())); 
    b = a.contains("1"); // Is it in there?
    // Is the entire collection in there?
    b = a.containsAll(fill(new ArrayList()));
    // Lists allow random access, which is cheap
    // for ArrayList, expensive for LinkedList:
    o = a.get(1); // Get object at location 1
    i = a.indexOf("1"); // Tell index of object
    // indexOf, starting search at location 2:
    i = a.indexOf("1", 2);
    b = a.isEmpty(); // Any elements inside?
    it = a.iterator(); // Ordinary Iterator
    lit = a.listIterator(); // ListIterator
    lit = a.listIterator(3); // Start at loc 3
    i = a.lastIndexOf("1"); // Last match 
    i = a.lastIndexOf("1", 2); // ...after loc 2
    a.remove(1); // Remove location 1
    a.remove("3"); // Remove this object
    a.set(1, "y"); // Set location 1 to "y"
    // Keep everything that's in the argument
    // (the intersection of the two sets):
    a.retainAll(fill(new ArrayList()));
    // Remove elements in this range:
    a.removeRange(0, 2);
    // Remove everything that's in the argument:
    a.removeAll(fill(new ArrayList()));
    i = a.size(); // How big is it?
    a.clear(); // Remove all elements
  }
  public static void iterMotion(List a) {
    ListIterator it = a.listIterator();
    b = it.hasNext();
    b = it.hasPrevious();
    o = it.next();
    i = it.nextIndex();
    o = it.previous();
    i = it.previousIndex();
  }
  public static void iterManipulation(List a) {
    ListIterator it = a.listIterator();
    it.add("47");
    // Must move to an element after add():
    it.next();
    // Remove the element that was just produced:
    it.remove(); 
    // Must move to an element after remove():
    it.next();
    // Change the element that was just produced:
    it.set("47");
  }
  public static void testVisual(List a) {
    print(a);
    List b = new ArrayList();
    fill(b);
    System.out.print("b = ");
    print(b);
    a.addAll(b);
    a.addAll(fill(new ArrayList()));
    print(a);
    // Shrink the list by removing all the 
    // elements beyond the first 1/2 of the list
    System.out.println(a.size());
    System.out.println(a.size()/2);
    a.removeRange(a.size()/2, a.size()/2 + 2);
    print(a);
    // Insert, remove, and replace elements
    // using a ListIterator:
    ListIterator x = a.listIterator(a.size()/2);
    x.add("one"); 
    print(a);
    System.out.println(x.next());
    x.remove();
    System.out.println(x.next());
    x.set("47");
    print(a);
    // Traverse the list backwards:
    x = a.listIterator(a.size());
    while(x.hasPrevious())
      System.out.print(x.previous() + " ");
    System.out.println();
    System.out.println("testVisual finished");
  }
  // There are some things that only
  // LinkedLists can do:
  public static void testLinkedList() {
    LinkedList ll = new LinkedList();
    Collection1.fill(ll, 5);
    print(ll);
    // Treat it like a stack, pushing:
    ll.addFirst("one");
    ll.addFirst("two");
    print(ll);
    // Like "peeking" at the top of a stack:
    System.out.println(ll.getFirst());
    // Like popping a stack:
    System.out.println(ll.removeFirst());
    System.out.println(ll.removeFirst());
    // Treat it like a queue, pulling elements
    // off the tail end:
    System.out.println(ll.removeLast());
    // With the above operations, it's a dequeue!
    print(ll);
  }
  public static void main(String args[]) {
    // Make and fill a new list each time:
    basicTest(fill(new LinkedList()));
    basicTest(fill(new ArrayList()));
    iterMotion(fill(new LinkedList()));
    iterMotion(fill(new ArrayList()));
    iterManipulation(fill(new LinkedList()));
    iterManipulation(fill(new ArrayList()));
    testVisual(fill(new LinkedList()));
    testLinkedList();
  }
} ///:~

In basicTest( ) and iterMotion( ) the calls are simply made to show the proper syntax, and while the return value is captured, it is not used. In some cases, the return value isn’t captured since it isn’t typically used. You should look up the full usage of each of these methods in your online documentation before you use them.

Using Sets

Set has exactly the same interface as Collection, so there isn’t any extra functionality as there is with the two different Lists. Instead, the Set is exactly a Collection, it just has different behavior. (This is the ideal use of inheritance and polymorphism: to express different behavior.) A Set allows only one instance of each object value to exist (what constitutes the “value” of an object is more complex, as you shall see).

Set (interface)

Each element that you add to the Set must be unique; otherwise the Set doesn’t add the duplicate element. Objects added to a Set must define equals( ) to establish object uniqueness. Set has exactly the same interface as Collection. The Set interface does not guarantee it will maintain its elements in any particular order.

HashSet*

For Sets where fast lookup time is important. Objects must also define hashCode( ).

TreeSet

An ordered Set backed by a red-black tree. This way, you can extract an ordered sequence from a Set.

The following example does not show everything you can do with a Set, since the interface is the same as Collection and so was exercised in the previous example. Instead, this demonstrates the behavior that makes a Set unique:

//: Set1.java
// Things you can do with Sets
package c08.newcollections;
import java.util.*;

public class Set1 {
  public static void testVisual(Set a) {
    Collection1.fill(a);
    Collection1.fill(a);
    Collection1.fill(a);
    Collection1.print(a); // No duplicates!
    // Add another set to this one:
    a.addAll(a);
    a.add("one"); 
    a.add("one"); 
    a.add("one");
    Collection1.print(a);
    // Look something up:
    System.out.println("a.contains(\"one\"): " +
      a.contains("one"));
  }
  public static void main(String[] args) {
    testVisual(new HashSet());
    testVisual(new TreeSet());
  }
} ///:~

Duplicate values are added to the Set, but when it is printed you’ll see the Set has accepted only one instance of each value.

When you run this program you’ll notice that the order maintained by the HashSet is different from TreeSet, since each has a different way of storing elements so they can be located later. (TreeSet keeps them sorted, while HashSet uses a hashing function, which is designed specifically for rapid lookups.) When creating your own types, be aware that a Set needs a way to maintain a storage order, just as with the “groundhog” examples shown earlier in this chapter. To implement comparability with the new collections, however, you must implement the Comparable interface and define the compareTo( ) method (this will be described more fully later). Here’s an example:

//: Set2.java
// Putting your own type in a Set
package c08.newcollections;
import java.util.*;

class MyType implements Comparable {
  private int i;
  public MyType(int n) { i = n; }
  public boolean equals(Object o) {
    return 
      (o instanceof MyType) 
      && (i == ((MyType)o).i);
  }
  public int hashCode() { return i; }
  public String toString() { return i + " "; }
  public int compareTo(Object o) {
    int i2 = ((MyType) o).i;
    return (i2 < i ? -1 : (i2 == i ? 0 : 1));
  }
}

public class Set2 {
  public static Set fill(Set a, int size) {
    for(int i = 0; i < size; i++)
      a.add(new MyType(i));
    return a;
  }
  public static Set fill(Set a) {
    return fill(a, 10);
  }
  public static void test(Set a) {
    fill(a);
    fill(a); // Try to add duplicates
    fill(a);
    a.addAll(fill(new TreeSet()));
    System.out.println(a);
  }
  public static void main(String[] args) {
    test(new HashSet());
    test(new TreeSet());
  }
} ///:~

The definitions for equals( ) and hashCode( ) follow the form given in the “groundhog” examples. You must define an equals( ) in both cases, but the hashCode( ) is absolutely necessary only if the class will be placed in a HashSet (which is likely, since that should generally be your first choice as a Set implementation). However, as a programming style you should always override hashCode( ) when you override equals( ).

In the compareTo( ), note that I did not use the “simple and obvious” form return i-i2. Although this is a common programming error, it would only work properly if i and i2 were unsigned ints (if Java had an “unsigned” keyword, which it does not). It breaks for Java’s signed int, which are not big enough to represent the difference of two signed ints. If i is a large positive integer and j is a large negative integer, i-j will overflow and return a negative value, which will not work.

Using Maps

Map (interface)

Maintains key-value associations (pairs), so you can look up a value using a key.

HashMap*

Implementation based on a hash table. (Use this instead of Hashtable.) Provides constant-time performance for inserting and locating pairs. Performance can be adjusted via constructors that allow you to set the capacity and load factor of the hash table.

TreeMap

Implementation based on a red-black tree. When you view the keys or the pairs, they will be in sorted order (determined by Comparable or Comparator, discussed later). The point of a TreeMap is that you get the results in sorted order. TreeMap is the only Map with the subMap( ) method, which allows you to return a portion of the tree.

The following example contains two sets of test data and a fill( ) method that allows you to fill any map with any two-dimensional array of Objects. These tools will be used in other Map examples, as well.

//: Map1.java
// Things you can do with Maps
package c08.newcollections;
import java.util.*;

public class Map1 {
  public final static String[][] testData1 = {
    { "Happy", "Cheerful disposition" },
    { "Sleepy", "Prefers dark, quiet places" },
    { "Grumpy", "Needs to work on attitude" },
    { "Doc", "Fantasizes about advanced degree"},
    { "Dopey", "'A' for effort" },
    { "Sneezy", "Struggles with allergies" },
    { "Bashful", "Needs self-esteem workshop"},
  };
  public final static String[][] testData2 = {
    { "Belligerent", "Disruptive influence" },
    { "Lazy", "Motivational problems" },
    { "Comatose", "Excellent behavior" }
  };
  public static Map fill(Map m, Object[][] o) {
    for(int i = 0; i < o.length; i++)
      m.put(o[i][0], o[i][1]);
    return m;
  }
  // Producing a Set of the keys:
  public static void printKeys(Map m) {
    System.out.print("Size = " + m.size() +", ");
    System.out.print("Keys: ");
    Collection1.print(m.keySet());
  }
  // Producing a Collection of the values:
  public static void printValues(Map m) {
    System.out.print("Values: ");
    Collection1.print(m.values());
  }
  // Iterating through Map.Entry objects (pairs):
  public static void print(Map m) {
    Collection entries = m.entries();
    Iterator it = entries.iterator();
    while(it.hasNext()) {
      Map.Entry e = (Map.Entry)it.next();
      System.out.println("Key = " + e.getKey() +
        ", Value = " + e.getValue());
    }
  }
  public static void test(Map m) {
    fill(m, testData1);
    // Map has 'Set' behavior for keys:
    fill(m, testData1);
    printKeys(m);
    printValues(m);
    print(m);
    String key = testData1[4][0];
    String value = testData1[4][1];
    System.out.println("m.containsKey(\"" + key +
      "\"): " + m.containsKey(key));
    System.out.println("m.get(\"" + key + "\"): "
      + m.get(key));
    System.out.println("m.containsValue(\"" 
      + value + "\"): " + 
      m.containsValue(value)); 
    Map m2 = fill(new TreeMap(), testData2);
    m.putAll(m2);
    printKeys(m);
    m.remove(testData2[0][0]);
    printKeys(m);
    m.clear();
    System.out.println("m.isEmpty(): " 
      + m.isEmpty());
    fill(m, testData1);
    // Operations on the Set change the Map:
    m.keySet().removeAll(m.keySet());
    System.out.println("m.isEmpty(): " 
      + m.isEmpty());
  }
  public static void main(String args[]) {
    System.out.println("Testing HashMap");
    test(new HashMap());
    System.out.println("Testing TreeMap");
    test(new TreeMap());
  }
} ///:~

The printKeys( ), printValues( ), and print( ) methods are not only useful utilities, they also demonstrate the production of Collection views of a Map. The keySet( ) method produces a Set backed by the keys in the Map; here, it is treated as only a Collection. Similar treatment is given to values( ), which produces a List containing all the values in the Map. (Note that keys must be unique, while values can contain duplicates.) Since these Collections are backed by the Map, any changes in a Collection will be reflected in the associated Map.

The print( ) method grabs the Iterator produced by entries and uses it to print both the key and value for each pair. The rest of the program provides simple examples of each Map operation, and tests each type of Map.

When creating your own class to use as a key in a Map, you must deal with the same issues discussed previously for Sets.

Choosing an implementation

From the previous diagram, you can see that there are really only three collection components: Map, List, and Set, and only two or three implementations of each interface. If you need to use the functionality offered by a particular interface, how do you decide which particular implementation to use?

To understand the answer, you must be aware that each different implementation has its own features, strengths, and weaknesses. For example, you can see in the diagram that the “feature” of Hashtable, Vector, and Stack is that they are legacy classes, so that existing code doesn’t break. On the other hand, it’s best if you don’t use those for new (Java 1.2) code.

The distinction between the other collections often comes down to what they are ”backed by;” that is, the data structures that physically implement your desired interface. This means that, for example, ArrayList, LinkedList, and Vector (which is roughly equivalent to ArrayList) all implement the List interface so your program will produce the same results regardless of the one you use. However, ArrayList (and Vector) is backed by an array, while the LinkedList is implemented in the usual way for a doubly-linked list, as individual objects each containing data along with handles to the previous and next elements in the list. Because of this, if you want to do many insertions and removals in the middle of a list a LinkedList is the appropriate choice. (LinkedList also has additional functionality that is established in AbstractSequentialList.) If not, an ArrayList is probably faster.

As another example, a Set can be implemented as either an TreeSet or a HashSet. A TreeSet is backed by a TreeMap and is designed to produce a constantly-sorted set. However, if you’re going to have larger quantities in your Set, the performance of TreeSet insertions will get slow. When you’re writing a program that needs a Set, you should choose HashSet by default, and change to TreeSet when it's more important to have a constantly-sorted set.

Choosing between Lists

The most convincing way to see the differences between the implementations of List is with a performance test. The following code establishes an inner base class to use as a test framework, then creates an anonymous inner class for each different test. Each of these inner classes is called by the test( ) method. This approach allows you to easily add and remove new kinds of tests.

//: ListPerformance.java
// Demonstrates performance differences in Lists
package c08.newcollections;
import java.util.*;

public class ListPerformance {
  private static final int REPS = 100;
  private abstract static class Tester {
    String name;
    int size; // Test quantity
    Tester(String name, int size) { 
      this.name = name;
      this.size = size;
    }
    abstract void test(List a);
  }
  private static Tester[] tests = {
    new Tester("get", 300) { 
      void test(List a) {
        for(int i = 0; i < REPS; i++) {
          for(int j = 0; j < a.size(); j++)
            a.get(j);
        }
      }
    },
    new Tester("iteration", 300) { 
      void test(List a) {
        for(int i = 0; i < REPS; i++) {
          Iterator it = a.iterator();
          while(it.hasNext())
            it.next();
        }
      }
    },
    new Tester("insert", 1000) { 
      void test(List a) {
        int half = a.size()/2;
        String s = "test";
        ListIterator it = a.listIterator(half);
        for(int i = 0; i < size * 10; i++)
          it.add(s);
      }
    },
    new Tester("remove", 5000) { 
      void test(List a) {
        ListIterator it = a.listIterator(3);
        while(it.hasNext()) {
          it.next();
          it.remove();
        }
      }
    },
  };
  public static void test(List a) {
    // A trick to print out the class name:
    System.out.println("Testing " + 
      a.getClass().getName());
    for(int i = 0; i < tests.length; i++) {
      Collection1.fill(a, tests[i].size);
      System.out.print(tests[i].name);
      long t1 = System.currentTimeMillis();
      tests[i].test(a);
      long t2 = System.currentTimeMillis();
      System.out.println(": " + (t2 - t1));
    }
  }
  public static void main(String[] args) {
    test(new ArrayList());
    test(new LinkedList());
  }
} ///:~

The inner class Tester is abstract, to provide a base class for the specific tests. It contains a String to be printed when the test starts, a size parameter to be used by the test for quantity of elements or repetitions of tests, a constructor to initialize the fields, and an abstract method test( ) that does the work. All the different types of tests are collected in one place, the array tests, which is initialized with different anonymous inner classes that inherit from Tester. To add or remove tests, simply add or remove an inner class definition from the array, and everything else happens automatically.

The List that’s handed to test( ) is first filled with elements, then each test in the tests array is timed. The results will vary from machine to machine; they are intended to give only an order of magnitude comparison between the performance of the different collections. Here is a summary of one run:

Type

Get

Iteration

Insert

Remove

ArrayList

110

490

3790

8730

LinkedList

1980

220

110

110

You can see that random accesses (get( )) are cheap for ArrayLists and expensive for LinkedLists. (Oddly, iteration is faster for a LinkedList than an ArrayList, which is counter-intuitive.) On the other hand, insertions and removals from the middle of a list are dramatically cheaper for a LinkedList than for an ArrayList. The best approach is probably to choose an ArrayList as your default and to change to a LinkedList if you discover performance problems because of many insertions and removals from the middle of the list.

Choosing between Sets

You can choose between an TreeSet and a HashSet, depending on the size of the Set (if you need to produce an ordered sequence from a Set, use TreeSet). The following test program gives an indication of this tradeoff:

//: SetPerformance.java
package c08.newcollections;
import java.util.*;

public class SetPerformance {
  private static final int REPS = 200;
  private abstract static class Tester {
    String name;
    Tester(String name) { this.name = name; }
    abstract void test(Set s, int size);
  }
  private static Tester[] tests = {
    new Tester("add") { 
      void test(Set s, int size) {
        for(int i = 0; i < REPS; i++) {
          s.clear();
          Collection1.fill(s, size);
        }
      }
    },
    new Tester("contains") { 
      void test(Set s, int size) {
        for(int i = 0; i < REPS; i++)
          for(int j = 0; j < size; j++)
            s.contains(Integer.toString(j));
      }
    },
    new Tester("iteration") { 
      void test(Set s, int size) {
        for(int i = 0; i < REPS * 10; i++) {
          Iterator it = s.iterator();
          while(it.hasNext())
            it.next();
        }
      }
    },
  };
  public static void test(Set s, int size) {
    // A trick to print out the class name:
    System.out.println("Testing " + 
      s.getClass().getName() + " size " + size);
    Collection1.fill(s, size);
    for(int i = 0; i < tests.length; i++) {
      System.out.print(tests[i].name);
      long t1 = System.currentTimeMillis();
      tests[i].test(s, size);
      long t2 = System.currentTimeMillis();
      System.out.println(": " + 
        ((double)(t2 - t1)/(double)size));
    }
  }
  public static void main(String[] args) {
    // Small:
    test(new TreeSet(), 10);
    test(new HashSet(), 10);
    // Medium:
    test(new TreeSet(), 100);
    test(new HashSet(), 100);
    // Large:
    test(new HashSet(), 1000);
    test(new TreeSet(), 1000);
  }
} ///:~

The following table shows the results of one run (using Beta3 software on one particular platform; you should run the test yourself as well):

Type

Test size

Add

Contains

Iteration


10

22.0

11.0

16.0

TreeSet

100

22.5

13.2

12.1


1000

31.1

18.7

11.8


10

5.0

6.0

27.0

HashSet

100

6.6

6.6

10.9


1000

7.4

6.6

9.5

HashSet is generally superior to TreeSet for all operations, and the performance is effectively independent of size.

Choosing between Maps

When choosing between implementations of Map, the size of the Map is what most strongly affects performance, and the following test program gives an indication of this tradeoff:

//: MapPerformance.java
// Demonstrates performance differences in Maps
package c08.newcollections;
import java.util.*;

public class MapPerformance {
  private static final int REPS = 200;
  public static Map fill(Map m, int size) {
    for(int i = 0; i < size; i++) {
      String x = Integer.toString(i);
      m.put(x, x);
    }
    return m;
  }
  private abstract static class Tester {
    String name;
    Tester(String name) { this.name = name; }
    abstract void test(Map m, int size);
  }
  private static Tester[] tests = {
    new Tester("put") { 
      void test(Map m, int size) {
        for(int i = 0; i < REPS; i++) {
          m.clear();
          fill(m, size);
        }
      }
    },
    new Tester("get") { 
      void test(Map m, int size) {
        for(int i = 0; i < REPS; i++)
          for(int j = 0; j < size; j++)
            m.get(Integer.toString(j));
      }
    },
    new Tester("iteration") { 
      void test(Map m, int size) {
        for(int i = 0; i < REPS * 10; i++) {
          Iterator it = m.entries().iterator();
          while(it.hasNext())
            it.next();
        }
      }
    },
  };
  public static void test(Map m, int size) {
    // A trick to print out the class name:
    System.out.println("Testing " + 
      m.getClass().getName() + " size " + size);
    fill(m, size);
    for(int i = 0; i < tests.length; i++) {
      System.out.print(tests[i].name);
      long t1 = System.currentTimeMillis();
      tests[i].test(m, size);
      long t2 = System.currentTimeMillis();
      System.out.println(": " + 
        ((double)(t2 - t1)/(double)size));
    }
  }
  public static void main(String[] args) {
    // Small:
    test(new Hashtable(), 10);
    test(new HashMap(), 10);
    test(new TreeMap(), 10);
    // Medium:
    test(new Hashtable(), 100);
    test(new HashMap(), 100);
    test(new TreeMap(), 100);
    // Large:
    test(new HashMap(), 1000);
    test(new Hashtable(), 1000);
    test(new TreeMap(), 1000);
  }
} ///:~

Because the size of the map is the issue, you’ll see that the timing tests divide the time by the size to normalize each measurement. Here is one set of results. (Yours will probably be different.)

Type

Test size

Put

Get

Iteration


10

11.0

5.0

44.0

Hashtable

100

7.7

7.7

16.5


1000

8.0

8.0

14.4


10

16.0

11.0

22.0

TreeMap

100

25.8

15.4

13.2


1000

33.8

20.9

13.6


10

11.0

6.0

33.0

HashMap

100

8.2

7.7

13.7


1000

8.0

7.8

11.9

As you might expect, Hashtable performance is roughly equivalent to HashMap (you can also see that HashMap is generally a bit faster. Remember that HashMap is intended to replace Hashtable). The TreeMap is generally slower than the HashMap, so why would you use it? So you could use it not as a Map, but as a way to create an ordered list. The behavior of a tree is such that it’s always in order and doesn’t have to be specially sorted. (The way it is ordered will be discussed later.) Once you fill a TreeMap, you can call keySet( ) to get a Set view of the keys, then toArray( ) to produce an array of those keys. You can then use the static method Arrays.binarySearch( ) (discussed later) to rapidly find objects in your sorted array. Of course, you would probably only do this if, for some reason, the behavior of a HashMap was unacceptable, since HashMap is designed to rapidly find things. In the end, when you’re using a Map your first choice should be HashMap, and only if you need a constantly-sorted Map will you need TreeMap.

There is another performance issue that the above table does not address, and that is speed of creation. The following program tests creation speed for different types of Map:

//: MapCreation.java
// Demonstrates time differences in Map creation
package c08.newcollections;
import java.util.*;

public class MapCreation {
  public static void main(String[] args) {
    final long REPS = 100000;
    long t1 = System.currentTimeMillis();
    System.out.print("Hashtable");
    for(long i = 0; i < REPS; i++)
      new Hashtable();
    long t2 = System.currentTimeMillis();
    System.out.println(": " + (t2 - t1));
    t1 = System.currentTimeMillis();
    System.out.print("TreeMap");
    for(long i = 0; i < REPS; i++)
      new TreeMap();
    t2 = System.currentTimeMillis();
    System.out.println(": " + (t2 - t1));
    t1 = System.currentTimeMillis();
    System.out.print("HashMap");
    for(long i = 0; i < REPS; i++)
      new HashMap();
    t2 = System.currentTimeMillis();
    System.out.println(": " + (t2 - t1));
  }
} ///:~

At the time this program was written, the creation speed of TreeMap was dramatically faster than the other two types. This, along with the acceptable and consistent put( ) performance of TreeMap, suggests a possible strategy if you’re creating many Maps, and only later in your program doing many lookups: Create and fill TreeMaps, and when you start looking things up, convert the important TreeMaps into HashMaps using the HashMap(Map) constructor. Again, you should only worry about this sort of thing after it’s been proven that you have a performance bottleneck. (“First make it work, then make it fast – if you must.”)

Unsupported operations

It’s possible to turn an array into a List with the static Arrays.toList( ) method:

//: Unsupported.java
// Sometimes methods defined in the Collection
// interfaces don't work!
package c08.newcollections;
import java.util.*;

public class Unsupported {
  private static String[] s = {
    "one", "two", "three", "four", "five",
    "six", "seven", "eight", "nine", "ten",
  };
  static List a = Arrays.toList(s);
  static List a2 = Arrays.toList(
    new String[] { s[3], s[4], s[5] });
  public static void main(String[] args) {
    Collection1.print(a); // Iteration
    System.out.println(
      "a.contains(" + s[0] + ") = " + 
      a.contains(s[0]));
    System.out.println(
      "a.containsAll(a2) = " + 
      a.containsAll(a2));
    System.out.println("a.isEmpty() = " +
      a.isEmpty());
    System.out.println(
      "a.indexOf(" + s[5] + ") = " + 
      a.indexOf(s[5]));
    // Traverse backwards:
    ListIterator lit = a.listIterator(a.size());
    while(lit.hasPrevious())
      System.out.print(lit.previous());
    System.out.println();
    // Set the elements to different values:
    for(int i = 0; i < a.size(); i++)
      a.set(i, "47");
    Collection1.print(a);
    // Compiles, but won't run:
    lit.add("X"); // Unsupported operation
    a.clear(); // Unsupported
    a.add("eleven"); // Unsupported
    a.addAll(a2); // Unsupported
    a.retainAll(a2); // Unsupported
    a.remove(s[0]); // Unsupported
    a.removeAll(a2); // Unsupported
  }
} ///:~

You’ll discover that only a portion of the Collection and List interfaces are actually implemented. The rest of the methods cause the unwelcome appearance of something called an UnsupportedOperationException. You’ll learn all about exceptions in the next chapter, but the short story is that the Collection interface, as well as some of the other interfaces in the new collections library, contain “optional” methods, which might or might not be “supported” in the concrete class that implements that interface. Calling an unsupported method causes an UnsupportedOperationException to indicate a programming error.

“What?!?” you say, incredulous. “The whole point of interfaces and base classes is that they promise these methods will do something meaningful! This breaks that promise – it says that not only will calling some methods not perform a meaningful behavior, they will stop the program! Type safety was just thrown out the window!” It’s not quite that bad. With a Collection, List, Set, or Map, the compiler still restricts you to calling only the methods in that interface, so it’s not like Smalltalk (in which you can call any method for any object, and find out only when you run the program whether your call does anything). In addition, most methods that take a Collection as an argument only read from that Collection –all the “read” methods of Collection are not optional.

This approach prevents an explosion of interfaces in the design. Other designs for collection libraries always seem to end up with a confusing plethora of interfaces to describe each of the variations on the main theme and are thus difficult to learn. It’s not even possible to capture all of the special cases in interfaces, because someone can always invent a new interface. The “unsupported operation” approach achieves an important goal of the new collections library: it is simple to learn and use. For this approach to work, however:

  1. The UnsupportedOperationException must be a rare event. That is, for most classes all operations should work, and only in special cases should an operation be unsupported. This is true in the new collections library, since the classes you’ll use 99 percent of the time – ArrayList, LinkedList, HashSet, and HashMap, as well as the other concrete implementations – support all of the operations. The design does provide a “back door” if you want to create a new Collection without providing meaningful definitions for all the methods in the Collection interface, and yet still fit it into the existing library.
  2. When an operation is unsupported, there should be reasonable likelihood that an UnsupportedOperationException will appear at implementation time, rather than after you’ve shipped the product to the customer. After all, it indicates a programming error: you’ve used a class incorrectly. This point is less certain, and is where the experimental nature of this design comes into play. Only over time will we find out how well it works.

In the example above, Arrays.toList( ) produces a List that is backed by a fixed-size array. Therefore it makes sense that the only supported operations are the ones that don’t change the size of the array. If, on the other hand, a new interface were required to express this different kind of behavior (called, perhaps, “FixedSizeList”), it would throw open the door to complexity and soon you wouldn’t know where to start when trying to use the library.

The documentation for a method that takes a Collection, List, Set, or Map as an argument should specify which of the optional methods must be implemented. For example, sorting requires the set( ) and Iterator.set( ) methods but not add( ) and remove( ).

Sorting and searching

Java 1.2 adds utilities to perform sorting and searching for arrays or Lists. These utilities are static methods of two new classes: Arrays for sorting and searching arrays, and Collections for sorting and searching Lists.

Arrays

The Arrays class has an overloaded sort( ) and binarySearch( ) for arrays of all the primitive types, as well as for String and Object. Here’s an example that shows sorting and searching an array of byte (all the other primitives look the same) and an array of String:

//: Array1.java
// Testing the sorting & searching in Arrays
package c08.newcollections;
import java.util.*;

public class Array1 {
  static Random r = new Random();
  static String ssource = 
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
    "abcdefghijklmnopqrstuvwxyz";
  static char[] src = ssource.toCharArray();
  // Create a random String
  public static String randString(int length) {
    char[] buf = new char[length];
    int rnd;
    for(int i = 0; i < length; i++) {
      rnd = Math.abs(r.nextInt()) % src.length;
      buf[i] = src[rnd];
    }
    return new String(buf);
  }
  // Create a random array of Strings:
  public static 
  String[] randStrings(int length, int size) {
    String[] s = new String[size];
    for(int i = 0; i < size; i++)
      s[i] = randString(length);
    return s;
  }
  public static void print(byte[] b) {
    for(int i = 0; i < b.length; i++)
      System.out.print(b[i] + " ");
    System.out.println();
  }
  public static void print(String[] s) {
    for(int i = 0; i < s.length; i++)
      System.out.print(s[i] + " ");
    System.out.println();
  }
  public static void main(String[] args) {
    byte[] b = new byte[15];
    r.nextBytes(b); // Fill with random bytes
    print(b);
    Arrays.sort(b);
    print(b);
    int loc = Arrays.binarySearch(b, b[10]);
    System.out.println("Location of " + b[10] +
      " = " + loc);
    // Test String sort & search:
    String[] s = randStrings(4, 10);
    print(s);
    Arrays.sort(s);
    print(s);
    loc = Arrays.binarySearch(s, s[4]);
    System.out.println("Location of " + s[4] +
      " = " + loc);
  }
} ///:~

The first part of the class contains utilities to generate random String objects using an array of characters from which random letters can be selected. randString( ) returns a string of any length, and randStrings( ) creates an array of random Strings, given the length of each String and the desired size of the array. The two print( ) methods simplify the display of the sample arrays. In main( ), Random.nextBytes( ) fills the array argument with randomly-selected bytes. (There are no corresponding Random methods to create arrays of the other primitive data types.) Once you have an array, you can see that it’s only a single method call to perform a sort( ) or binarySearch( ). There’s an important warning concerning binarySearch( ): If you do not call sort( ) before you perform a binarySearch( ), unpredictable behavior can occur, including infinite loops.

Sorting and searching with Strings looks the same, but when you run the program you’ll notice something interesting: the sorting is lexicographic, so uppercase letters precede lowercase letters in the character set. Thus, all the capital letters are at the beginning of the list, followed by the lowercase letters, so ‘Z’ precedes ‘a’. It turns out that even telephone books are typically sorted this way.

Comparable and Comparator

What if this isn’t what you want? For example, the index in this book would not be too useful if you had to look in two places for everything that begins with ‘A’ or ‘a’.

When you want to sort an array of Object, there’s a problem. What determines the ordering of two Objects? Unfortunately, the original Java designers didn’t consider this an important problem, or it would have been defined in the root class Object. As a result, ordering must be imposed on Objects from the outside, and the new collections library provides a standard way to do this (which is almost as good as defining it in Object).

There is a sort( ) for arrays of Object (and String, of course, is an Object) that takes a second argument: an object that implements the Comparator interface (part of the new collections library) and performs comparisons with its single compare( ) method. This method takes the two objects to be compared as its arguments and returns a negative integer if the first argument is less than the second, zero if they’re equal, and a positive integer if the first argument is greater than the second. With this knowledge, the String portion of the example above can be re-implemented to perform an alphabetic sort:

//: AlphaComp.java
// Using Comparator to perform an alphabetic sort
package c08.newcollections;
import java.util.*;

public class AlphaComp implements Comparator {
  public int compare(Object o1, Object o2) {
    // Assume it's used only for Strings...
    String s1 = ((String)o1).toLowerCase();
    String s2 = ((String)o2).toLowerCase();
    return s1.compareTo(s2);
  }
  public static void main(String[] args) {
    String[] s = Array1.randStrings(4, 10);
    Array1.print(s);
    AlphaComp ac = new AlphaComp();
    Arrays.sort(s, ac);
    Array1.print(s);
    // Must use the Comparator to search, also:
    int loc = Arrays.binarySearch(s, s[3], ac);
    System.out.println("Location of " + s[3] +
     " = " + loc);
  }
} ///:~

By casting to String, the compare( ) method implicitly tests to ensure that it is used only with String objects – the run-time system will catch any discrepancies. After forcing both Strings to lower case, the String.compareTo( ) method produces the desired results.

When you use your own Comparator to perform a sort( ), you must use that same Comparator when using binarySearch( ).

The Arrays class has another sort( ) method that takes a single argument: an array of Object, but with no Comparator. This sort( ) method must also have some way to compare two Objects. It uses the natural comparison method that is imparted to a class by implementing the Comparable interface. This interface has a single method, compareTo( ), which compares the object to its argument and returns negative, zero, or positive depending on whether it is less than, equal to, or greater than the argument. A simple example demonstrates this:

//: CompClass.java
// A class that implements Comparable
package c08.newcollections;
import java.util.*;

public class CompClass implements Comparable {
  private int i;
  public CompClass(int ii) { i = ii; }
  public int compareTo(Object o) {
    // Implicitly tests for correct type:
    int argi = ((CompClass)o).i;
    if(i == argi) return 0;
    if(i < argi) return -1;
    return 1;
  }
  public static void print(Object[] a) {
    for(int i = 0; i < a.length; i++)
      System.out.print(a[i] + " ");
    System.out.println();
  }
  public String toString() { return i + ""; }
  public static void main(String[] args) {
    CompClass[] a = new CompClass[20];
    for(int i = 0; i < a.length; i++)
      a[i] = new CompClass(
        (int)(Math.random() *100));
    print(a);
    Arrays.sort(a);
    print(a);
    int loc = Arrays.binarySearch(a, a[3]);
    System.out.println("Location of " + a[3] +
     " = " + loc);
  }
} ///:~

Of course, your compareTo( ) method can be as complex as necessary.

Lists

A List can be sorted and searched in the same fashion as an array. The static methods to sort and search a List are contained in the class Collections, but they have similar signatures as the ones in Arrays: sort(List) to sort a List of objects that implement Comparable, binarySearch(List, Object) to find an object in the list, sort(List, Comparator) to sort a List using a Comparator, and binarySearch(List, Object, Comparator) to find an object in that list.[37] This example uses the previously-defined CompClass and AlphaComp to demonstrate the sorting tools in Collections:

//: ListSort.java
// Sorting and searching Lists with 'Collections'
package c08.newcollections;
import java.util.*;

public class ListSort {
  public static void main(String[] args) {
    final int SZ = 20;
    // Using "natural comparison method":
    List a = new ArrayList();
    for(int i = 0; i < SZ; i++)
      a.add(new CompClass(
        (int)(Math.random() *100)));
    Collection1.print(a);
    Collections.sort(a);
    Collection1.print(a);
    Object find = a.get(SZ/2);
    int loc = Collections.binarySearch(a, find);
    System.out.println("Location of " + find +
     " = " + loc);
    // Using a Comparator:
    List b = new ArrayList();
    for(int i = 0; i < SZ; i++)
      b.add(Array1.randString(4));
    Collection1.print(b);
    AlphaComp ac = new AlphaComp();
    Collections.sort(b, ac);
    Collection1.print(b);
    find = b.get(SZ/2);
    // Must use the Comparator to search, also:
    loc = Collections.binarySearch(b, find, ac);
    System.out.println("Location of " + find +
     " = " + loc);
  }
} ///:~ 

The use of these methods is identical to the ones in Arrays, but you’re using a List instead of an array.

The TreeMap must also order its objects according to Comparable or Comparator.

Utilities

There are a number of other useful utilities in the Collections class:

enumeration(Collection)

Produces an old-style Enumeration for the argument.

max(Collection)

min(Collection)

Produces the maximum or minimum element in the argument using the natural comparison method of the objects in the Collection.

max(Collection, Comparator)

min(Collection, Comparator)

Produces the maximum or minimum element in the Collection using the Comparator.

nCopies(int n, Object o)

Returns an immutable List of size n whose handles all point to o.

subList(List, int min, int max)

Returns a new List backed by the specified argument List that is a window into that argument with indexes starting at min and stopping just before max.

Note that min( ) and max( ) work with Collection objects, not with Lists, so you don’t need to worry about whether the Collection should be sorted or not. (As mentioned earlier, you do need to sort( ) a List or an array before performing a binarySearch( ).)

Making a Collection or Map unmodifiable

Often it is convenient to create a read-only version of a Collection or Map. The Collections class allows you to do this by passing the original container into a method that hands back a read-only version. There are four variations on this method, one each for Collection (if you don’t want to treat a Collection as a more specific type), List, Set, and Map. This example shows the proper way to build read-only versions of each:

//: ReadOnly.java
// Using the Collections.unmodifiable methods
package c08.newcollections;
import java.util.*;

public class ReadOnly {
  public static void main(String[] args) {
    Collection c = new ArrayList();
    Collection1.fill(c); // Insert useful data
    c = Collections.unmodifiableCollection(c);
    Collection1.print(c); // Reading is OK
    //! c.add("one"); // Can't change it
    
    List a = new ArrayList();
    Collection1.fill(a);
    a = Collections.unmodifiableList(a);
    ListIterator lit = a.listIterator();
    System.out.println(lit.next()); // Reading OK
    //! lit.add("one"); // Can't change it

    Set s = new HashSet();
    Collection1.fill(s);
    s = Collections.unmodifiableSet(s);
    Collection1.print(s); // Reading OK
    //! s.add("one"); // Can't change it
    
    Map m = new HashMap();
    Map1.fill(m, Map1.testData1);
    m = Collections.unmodifiableMap(m);
    Map1.print(m); // Reading OK
    //! m.put("Ralph", "Howdy!");
  }
} ///:~

In each case, you must fill the container with meaningful data before you make it read-only. Once it is loaded, the best approach is to replace the existing handle with the handle that is produced by the “unmodifiable” call. That way, you don’t run the risk of accidentally changing the contents once you’ve made it unmodifiable. On the other hand, this tool also allows you to keep a modifiable container as private within a class and to return a read-only handle to that container from a method call. So you can change it from within the class but everyone else can only read it.

Calling the “unmodifiable” method for a particular type does not cause compile-time checking, but once the transformation has occurred, any calls to methods that modify the contents of a particular container will produce an UnsupportedOperationException.

Synchronizing a Collection or Map

The synchronized keyword is an important part of the subject of multithreading, a more complicated topic that will not be introduced until Chapter 14. Here, I shall note only that the Collections class contains a way to automatically synchronize an entire container. The syntax is similar to the “unmodifiable” methods:

//: Synchronization.java
// Using the Collections.synchronized methods
package c08.newcollections;
import java.util.*;

public class Synchronization {
  public static void main(String[] args) {
    Collection c = 
      Collections.synchronizedCollection(
        new ArrayList());
    List list = Collections.synchronizedList(
      new ArrayList());
    Set s = Collections.synchronizedSet(
      new HashSet());
    Map m = Collections.synchronizedMap(
      new HashMap());
  }
} ///:~

In this case, you immediately pass the new container through the appropriate “synchronized” method; that way there’s no chance of accidentally exposing the unsynchronized version.

The new collections also have a mechanism to prevent more than one process from modifying the contents of a container. The problem occurs if you’re iterating through a container and some other process steps in and inserts, removes, or changes an object in that container. Maybe you’ve already passed that object, maybe it’s ahead of you, maybe the size of the container shrinks after you call size( ) – there are many scenarios for disaster. The new collections library incorporates a fail fast mechanism that looks for any changes to the container other than the ones your process is personally responsible for. If it detects that someone else is modifying the container, it immediately produces a ConcurrentModificationException. This is the “fail-fast” aspect – it doesn’t try to detect a problem later on using a more complex algorithm.

Summary

To review the collections provided in the standard Java (1.0 and 1.1) library (BitSet is not included here since it’s more of a special-purpose class):

  1. An array associates numerical indices to objects. It holds objects of a known type so you don’t have to cast the result when you’re looking up an object. It can be multidimensional, and it can hold primitives. However, its size cannot be changed once you create it.
  2. A Vector also associates numerical indices to objects – you can think of arrays and Vectors as random-access collections. The Vector automatically resizes itself as you add more elements. But a Vector can hold only Object handles, so it won’t hold primitives and you must always cast the result when you pull an Object handle out of a collection.
  3. A Hashtable is a type of Dictionary, which is a way to associate, not numbers, but objects with other objects. A Hashtable also supports random access to objects, in fact, its whole design is focused around rapid access.
  4. A Stack is a last-in, first-out (LIFO) queue.

If you’re familiar with data structures, you might wonder why there’s not a larger set of collections. From a functionality standpoint, do you really need a larger set of collections? With a Hashtable, you can put things in and find them quickly, and with an Enumeration, you can iterate through the sequence and perform an operation on every element in the sequence. That’s a powerful tool, and maybe it should be enough.

But a Hashtable has no concept of order. Vectors and arrays give you a linear order, but it’s expensive to insert an element into the middle of either one. In addition, queues, dequeues, priority queues, and trees are about ordering the elements, not just putting them in and later finding them or moving through them linearly. These data structures are also useful, and that’s why they were included in Standard C++. For this reason, you should consider the collections in the standard Java library only as a starting point, and, if you must use Java 1.0 or 1.1, use the JGL when your needs go beyond that.

If you can use Java 1.2 you should use only the new collections, which are likely to satisfy all your needs. Note that the bulk of this book was created using Java 1.1, so you’ll see that the collections used through the rest of the book are the ones that are available only in Java 1.1: Vector and Hashtable. This is a somewhat painful restriction at times, but it provides better backward compatibility with older Java code. If you’re writing new code in Java 1.2, the new collections will serve you much better.

Exercises

  1. Create a new class called Gerbil with an int gerbilNumber that’s initialized in the constructor (similar to the Mouse example in this chapter). Give it a method called hop( ) that prints out which gerbil number this is and that it’s hopping. Create a Vector and add a bunch of Gerbil objects to the Vector. Now use the elementAt( ) method to move through the Vector and call hop( ) for each Gerbil.
  2. Modify Exercise 1 so you use an Enumeration to move through the Vector while calling hop( ).
  3. In AssocArray.java, change the example so it uses a Hashtable instead of an AssocArray.
  4. Take the Gerbil class in Exercise 1 and put it into a Hashtable instead, associating the name of the Gerbil as a String (the key) for each Gerbil (the value) you put in the table. Get an Enumeration for the keys( ) and use it to move through the Hashtable, looking up the Gerbil for each key and printing out the key and telling the gerbil to hop( ).
  5. Change Exercise 1 in Chapter 7 to use a Vector to hold the Rodents and an Enumeration to move through the sequence of Rodents. Remember that a Vector holds only Objects so you must use a cast (i.e.: RTTI) when accessing individual Rodents.
  6. (Intermediate) In Chapter 7, locate the GreenhouseControls.java example, which consists of three files. In Controller.java, the class EventSet is just a collection. Change the code to use a Stack instead of an EventSet. This will require more than just replacing EventSet with Stack; you’ll also need to use an Enumeration to cycle through the set of events. You’ll probably find it easier if at times you treat the collection as a Stack and at other times as a Vector.
  7. (Challenging). Find the source code for Vector in the Java source code library that comes with all Java distributions. Copy this code and make a special version called intVector that holds only ints. Consider what it would take to make a special version of Vector for all the primitive types. Now consider what happens if you want to make a linked list class that works with all the primitive types. If parameterized types are ever implemented in Java, they will provide a way to do this work for you automatically (as well as many other benefits).

[32] This is one of the places where C++ is distinctly superior to Java, since C++ supports parameterized types with the template keyword.

[33] The term iterator is common in C++ and elsewhere in OOP, so it’s difficult to know why the Java team used a strange name. The collections library in Java 1.2 fixes this as well as many other problems.

[34] If you plan to use RMI (described in Chapter 15), you should be aware that there’s a problem when putting remote objects into a Hashtable. (See Core Java, by Cornell & Horstmann, Prentice-Hall 1997).

[35] If these speedups still don’t meet your performance needs, you can further accelerate table lookup by writing your own hash table routine. This avoids delays due to casting to and from Objects and synchronization built into the Java Class Library hash table routine. To reach even higher levels of performance, speed enthusiasts can use Donald Knuth’s The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition to replace overflow bucket lists with arrays that have two additional benefits: they can be optimized for disk storage characteristics and they can save most of the time of creating and garbage collecting individual records.

[36] The best reference I know of is Practical Algorithms for Programmers, by Andrew Binstock and John Rex, Addison-Wesley 1995.

[37] At the time of this writing, Collections.sort( ) has been modified to use a stable sort algorithm (one that does not reorder equal elements).

[ Previous Chapter ] [ Short TOC ] [ Table of Contents ] [ Index ] [ Next Chapter ]
Last Update:02/04/2000