1: Introduction to objects

Object-oriented programming appeals at multiple levels. For managers, it promises faster and cheaper development and maintenance. For analysts and designers, the modeling process becomes simpler and produces a clear, manageable design. For programmers, the elegance and clarity of the object model and the power of object-oriented tools and libraries make programming a much more pleasant task, and programmers experience an increase in productivity. Everybody wins, it would seem.

If there’s a downside, it is the expense of the learning curve. Thinking in objects is a dramatic departure from thinking procedurally, and the process of designing objects is much more challenging than procedural design, especially if you’re trying to create reusable objects. In the past, a novice practitioner of object-oriented programming was faced with a choice between two daunting tasks:

It is, in fact, difficult to design objects well – for that matter, it’s hard to design anything well. But the intent is that relatively few experts design the best objects for others to consume. Successful OOP languages incorporate not just language syntax and a compiler, but an entire development environment including a significant library of well-designed, easy-to-use objects. Thus, the primary job of most programmers is to use existing objects to solve their application problems. The goal of this chapter is to show you what object-oriented programming is and how simple it can be.

This chapter will introduce many of the ideas of Java and object-oriented programming on a conceptual level, but keep in mind that you’re not expected to be able to write full-fledged Java programs after reading this chapter. All the detailed descriptions and examples will follow throughout the course of this book.

All programming languages provide abstractions. It can be argued that the complexity of the problems you can solve is directly related to the kind and quality of abstraction. By “kind” I mean: what is it that you are abstracting? Assembly language is a small abstraction of the underlying machine. Many so-called “imperative” languages that followed (such as FORTRAN, BASIC, and C) were abstractions of assembly language. These languages are big improvements over assembly language, but their primary abstraction still requires you to think in terms of the structure of the computer rather than the structure of the problem you are trying to solve. The programmer must establish the association between the machine model (in the “solution space”) and the model of the problem that is actually being solved (in the “problem space”). The effort required to perform this mapping, and the fact that it is extrinsic to the programming language, produces programs that are difficult to write and expensive to maintain, and as a side effect created the entire “programming methods” industry.

The alternative to modeling the machine is to model the problem you’re trying to solve. Early languages such as LISP and APL chose particular views of the world (“all problems are ultimately lists” or “all problems are algorithmic”). PROLOG casts all problems into chains of decisions. Languages have been created for constraint-based programming and for programming exclusively by manipulating graphical symbols. (The latter proved to be too restrictive.) Each of these approaches is a good solution to the particular class of problem it is designed to solve, but when you step outside of that domain it becomes awkward.

The object-oriented approach goes a step further by providing tools for the programmer to represent elements in the problem space. This representation is general enough that the programmer is not constrained to any particular type of problem. We refer to the elements in the problem space and their representations in the solution space as “objects.” (Of course, you will also need other objects that don’t have problem-space analogs.) The idea is that the program is allowed to adapt itself to the lingo of the problem by adding new types of objects, so when you read the code describing the solution, you’re reading words that also express the problem. This is a more flexible and powerful language abstraction than what we’ve had before. Thus OOP allows you to describe the problem in terms of the problem, rather than in terms of the solution. There’s still a connection back to the computer, though. Each object looks quite a bit like a little computer; it has a state, and it has operations you can ask it to perform. However, this doesn’t seem like such a bad analogy to objects in the real world; they all have characteristics and behaviors.

Aristotle was probably the first to begin a careful study of the concept of type. He was known to speak of “the class of fishes and the class of birds.” The concept that all objects, while being unique, are also part of a set of objects that have characteristics and behaviors in common was directly used in the first object-oriented language, Simula-67, with its fundamental keyword class that introduces a new type into a program (thus class and type are often used synonymously[3]).

Simula, as its name implies, was created for developing simulations such as the classic “bank teller problem.” In this, you have a bunch of tellers, customers, accounts, transactions, etc. The members (elements) of each class share some commonality: every account has a balance, every teller can accept a deposit, etc. At the same time, each member has its own state; each account has a different balance, each teller has a name. Thus the tellers, customers, accounts, transactions, etc. can each be represented with a unique entity in the computer program. This entity is the object, and each object belongs to a particular class that defines its characteristics and behaviors.

Once a type is established, you can make as many objects of that type as you like, and then manipulate those objects as the elements that exist in the problem you are trying to solve. Indeed, one of the challenges of object-oriented programming is to create a one-to-one mapping between the elements in the problem space (the place where the problem actually exists) and the solution space (the place where you’re modeling that problem, such as a computer).

But how do you get an object to do useful work for you? There must be a way to make a request of that object so it will do something, such as complete a transaction, draw something on the screen or turn on a switch. And each object can satisfy only certain requests. The requests you can make of an object are defined by its interface, and the type is what determines the interface. The idea of type being equivalent to interface is fundamental in object-oriented programming.

As a simple example, consider a type that represents a light bulb. The name of the type/class is Light, and the requests that you can make of a Light object are to turn it on, turn it off, make it brighter or make it dimmer. You create a “handle” for a Light simply by declaring a name (lt) for it, and you make an object of type Light with the new keyword, assigning it to the handle with the = sign. To send a message to the object, you state the handle name and connect it to the message name with a period (dot). From the standpoint of the user of a pre-defined class, that’s pretty much all there is to programming with objects.
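The Light description can be sketched in Java roughly as follows. Only the class name Light and the handle name lt come from the text; the method names on(), off(), brighten() and dim(), the brightness scale, and the two query methods are assumptions made for illustration.

```java
// A minimal sketch of the Light type. Method names and the
// brightness field are illustrative assumptions, not a fixed API.
class Light {
    private boolean lit = false;
    private int brightness = 5; // hypothetical scale of 0..10

    void on() { lit = true; }
    void off() { lit = false; }
    void brighten() { if (brightness < 10) brightness++; }
    void dim() { if (brightness > 0) brightness--; }

    boolean isOn() { return lit; }
    int getBrightness() { return brightness; }
}

class LightDemo {
    public static void main(String[] args) {
        Light lt = new Light(); // declare the handle lt and attach a new object to it
        lt.on();                // send the "on" message through the handle
        lt.brighten();          // handle, dot, message name
        System.out.println(lt.isOn() + " " + lt.getBrightness());
    }
}
```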

It is helpful to break up the playing field into class creators (those who create new data types) and client programmers[4] (the class consumers who use the data types in their applications). The goal of the client programmer is to collect a toolbox full of classes to use for rapid application development. The goal of the class creator is to build a class that exposes only what’s necessary to the client programmer and keeps everything else hidden. Why? If it’s hidden, the client programmer can’t use it, which means that the class creator can change the hidden portion at will without worrying about the impact on anyone else.

The interface establishes what requests you can make for a particular object. However, there must be code somewhere to satisfy that request. This, along with the hidden data, comprises the implementation. From a procedural programming standpoint, it’s not that complicated. A type has a function associated with each possible request, and when you make a particular request to an object, that function is called. This process is often summarized by saying that you “send a message” (make a request) to an object, and the object figures out what to do with that message (it executes code).

In any relationship it’s important to have boundaries that are respected by all parties involved. When you create a library, you establish a relationship with the client programmer, who is another programmer, but one who is putting together an application or using your library to build a bigger library.

If all the members of a class are available to everyone, then the client programmer can do anything with that class and there’s no way to force any particular behaviors. Even though you might really prefer that the client programmer not directly manipulate some of the members of your class, without access control there’s no way to prevent it. Everything’s naked to the world.

There are two reasons for controlling access to members. The first is to keep client programmers’ hands off portions they shouldn’t touch – parts that are necessary for the internal machinations of the data type but not part of the interface that users need to solve their particular problems. This is actually a service to users because they can easily see what’s important to them and what they can ignore.

The second reason for access control is to allow the library designer to change the internal workings of the structure without worrying about how it will affect the client programmer. For example, you might implement a particular class in a simple fashion to ease development, and then later decide you need to rewrite it to make it run faster. If the interface and implementation are clearly separated and protected, you can accomplish this and require only a relink by the user.

Java uses three explicit keywords and one implied keyword to set the boundaries in a class: public, private, protected and the implied “friendly,” which is what you get if you don’t specify one of the other keywords. Their use and meaning are remarkably straightforward. These access specifiers determine who can use the definition that follows. public means the following definition is available to everyone. The private keyword, on the other hand, means that no one can access that definition except you, the creator of the type, inside function members of that type. private is a brick wall between you and the client programmer. If someone tries to access a private member, they’ll get a compile-time error. “Friendly” has to do with something called a “package,” which is Java’s way of making libraries. If something is “friendly” it’s available only within the package. (Thus this access level is sometimes referred to as “package access.”) protected acts like private, with the exception that an inheriting class has access to protected members, but not private members. (In Java, protected members are also accessible to other classes in the same package.) Inheritance will be covered shortly.
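To make the four access levels concrete, here is a small sketch. The Account class, its fields and its methods are hypothetical names invented for this illustration; the point is only where each member can and cannot be reached.

```java
// Hypothetical class showing the four access levels described above.
class Account {
    public String owner;       // public: available to everyone
    private long balanceCents; // private: only code inside Account can touch this
    protected String branch;   // protected: also visible to inheriting classes
    int auditCount;            // no keyword: "friendly" (package) access

    public Account(String owner, String branch) {
        this.owner = owner;
        this.branch = branch;
    }

    public void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents; // legal: we are inside the class
        auditCount++;
    }

    public long getBalanceCents() { return balanceCents; }
}

class AccessDemo {
    public static void main(String[] args) {
        Account a = new Account("Ada", "Main Street");
        a.deposit(250);
        System.out.println(a.owner + ": " + a.getBalanceCents());
        // a.balanceCents = 1000000; // would not compile: balanceCents is private
    }
}
```

Because the balance can change only through deposit( ), the class creator is free to alter how it is stored without touching any client code.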

Once a class has been created and tested, it should (ideally) represent a useful unit of code. It turns out that this reusability is not nearly so easy to achieve as many would hope; it takes experience and insight to achieve a good design. But once you have such a design, it begs to be reused. Code reuse is arguably the greatest leverage that object-oriented programming languages provide.

The simplest way to reuse a class is to just use an object of that class directly, but you can also place an object of that class inside a new class. We call this “creating a member object.” Your new class can be made up of any number and type of other objects, whatever is necessary to achieve the functionality desired in your new class. This concept is called composition, since you are composing a new class from existing classes. Sometimes composition is referred to as a “has-a” relationship, as in “a car has a trunk.”

Composition comes with a great deal of flexibility. The member objects of your new class are usually private, making them inaccessible to client programmers using the class. This allows you to change those members without disturbing existing client code. You can also replace the member objects at run time, which changes the behavior of your program dynamically. Inheritance, which is described next, does not have this flexibility since the compiler must place restrictions on classes created with inheritance.
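A short sketch of composition, under the assumption of two made-up classes, Engine and Car (the text’s own example is a car and its trunk; an engine is used here because swapping it shows the run-time flexibility more clearly):

```java
// Hypothetical classes illustrating composition ("a car has an engine").
class Engine {
    private final String kind;

    Engine(String kind) { this.kind = kind; }

    String describe() { return kind + " engine"; }
}

class Car {
    private Engine engine; // member object, hidden from client programmers

    Car(Engine engine) { this.engine = engine; }

    // The member object can be replaced while the program is running.
    void swapEngine(Engine replacement) { this.engine = replacement; }

    String describe() { return "car with " + engine.describe(); }
}

class CompositionDemo {
    public static void main(String[] args) {
        Car car = new Car(new Engine("petrol"));
        System.out.println(car.describe());
        car.swapEngine(new Engine("electric"));
        System.out.println(car.describe());
    }
}
```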

Because inheritance is so important in object-oriented programming it is often highly emphasized, and the new programmer can get the idea that inheritance should be used everywhere. This can result in awkward and overcomplicated designs. Instead, you should first look to composition when creating new classes, since it is simpler and more flexible. If you take this approach, your designs will stay cleaner. It will be reasonably obvious when you need inheritance.

By itself, the concept of an object is a convenient tool. It allows you to package data and functionality together by concept, so you can represent an appropriate problem-space idea rather than being forced to use the idioms of the underlying machine. In the programming language, these concepts are expressed as a data type (using the class keyword).

It seems a pity, however, to go to all the trouble to create a data type and then be forced to create a brand new one that might have similar functionality. It’s nicer if we can take the existing data type, clone it and make additions and modifications to the clone. This is effectively what you get with inheritance, with the exception that if the original class (called the base or super or parent class) is changed, the modified “clone” (called the derived or inherited or sub or child class) also reflects the appropriate changes. Inheritance is implemented in Java with the extends keyword. You make a new class and you say that it extends an existing class.

When you inherit you create a new type, and the new type contains not only all the members of the existing type (although the private ones are hidden away and inaccessible), but more importantly it duplicates the interface of the base class. That is, all the messages you can send to objects of the base class you can also send to objects of the derived class. Since we know the type of a class by the messages we can send to it, this means that the derived class is the same type as the base class. This type equivalence via inheritance is one of the fundamental gateways in understanding the meaning of object-oriented programming.

Since both the base class and derived class have the same interface, there must be some implementation to go along with that interface. That is, there must be a method to execute when an object receives a particular message. If you simply inherit a class and don’t do anything else, the methods from the base-class interface come right along into the derived class. That means objects of the derived class have not only the same type, they also have the same behavior, which doesn’t seem particularly interesting.
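As a rough illustration of inheriting without adding anything, the sketch below derives an empty Circle from a Shape. The members of Shape (a color field, setColor( ), getColor( ) and a draw( ) that returns a description) are assumptions chosen for the example.

```java
// Illustrative base class; its members are assumptions for this sketch.
class Shape {
    private String color = "black"; // private: hidden even from derived classes

    void setColor(String color) { this.color = color; }
    String getColor() { return color; }
    String draw() { return "drawing a generic shape"; }
}

// Circle adds nothing, yet it has the whole Shape interface.
class Circle extends Shape {
}

class InheritanceDemo {
    public static void main(String[] args) {
        Circle c = new Circle();
        c.setColor("red");                      // message defined in Shape
        System.out.println(c.getColor());
        System.out.println(c.draw());           // same behavior as the base class
        System.out.println(c instanceof Shape); // a Circle is also a Shape
    }
}
```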

You have two ways to differentiate your new derived class from the original base class it inherits from. The first is quite straightforward: you simply add brand new functions to the derived class. These new functions are not part of the base class interface. This means that the base class simply didn’t do as much as you wanted it to, so you add more functions. This simple and primitive use for inheritance is, at times, the perfect solution to your problem. However, you should look closely for the possibility that your base class might need these additional functions.

Although the extends keyword implies that you are going to add new functions to the interface, that’s not necessarily true. The second way to differentiate your new class is to change the behavior of an existing base-class function. This is referred to as overriding that function.
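Both ways of differentiating a derived class can be seen side by side in the following sketch. The radius field and the area( ) function are hypothetical additions; draw( ) and erase( ) return strings here only so the result is easy to inspect.

```java
class Shape {
    String draw() { return "generic shape"; }
    String erase() { return "erasing"; }
}

class Circle extends Shape {
    private final double radius; // hypothetical state for the example

    Circle(double radius) { this.radius = radius; }

    // First way: a brand new function that is not part of Shape's interface.
    double area() { return Math.PI * radius * radius; }

    // Second way: override an existing base-class function.
    @Override
    String draw() { return "circle"; }

    // erase() is neither added nor overridden, so Shape's version is inherited.
}

class OverrideDemo {
    public static void main(String[] args) {
        Circle c = new Circle(1.0);
        System.out.println(c.draw());  // Circle's version
        System.out.println(c.erase()); // Shape's version
        System.out.println(c.area());
    }
}
```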

There’s a certain debate that can occur about inheritance: Should inheritance override only base-class functions, and never add new functions that aren’t in the base class? This would mean that the derived type is exactly the same type as the base class since it has exactly the same interface. As a result, you can exactly substitute an object of the derived class for an object of the base class. This can be thought of as pure substitution. In a sense, this is the ideal way to treat inheritance. We often refer to the relationship between the base class and derived classes in this case as an is-a relationship, because you can say “a circle is a shape.” A test for inheritance is whether you can state the is-a relationship about the classes and have it make sense.

There are times when you must add new interface elements to a derived type, thus extending the interface and creating a new type. The new type can still be substituted for the base type, but the substitution isn’t perfect in a sense because your new functions are not accessible from the base type. This can be described as an is-like-a relationship; the new type has the interface of the old type but it also contains other functions, so you can’t really say it’s exactly the same. For example, consider an air conditioner. Suppose your house is wired with all the controls for cooling; that is, it has an interface that allows you to control cooling. Imagine that the air conditioner breaks down and you replace it with a heat pump, which can both heat and cool. The heat pump is-like-an air conditioner, but it can do more. Because your house is wired only to control cooling, it is restricted to communication with the cooling part of the new object. The interface of the new object has been extended, and the existing system doesn’t know about anything except the original interface.
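The air conditioner story translates into code quite directly. In this sketch the class names follow the text, while the House class and the method names cool( ), heat( ) and runCooling( ) are assumptions.

```java
class AirConditioner {
    String cool() { return "cooling"; }
}

// Is-like-an AirConditioner: same interface, plus something extra.
class HeatPump extends AirConditioner {
    String heat() { return "heating"; }
}

// The house is "wired" only for the cooling interface.
class House {
    private final AirConditioner unit;

    House(AirConditioner unit) { this.unit = unit; }

    String runCooling() { return unit.cool(); }

    // unit.heat() would not compile here: the handle's type is
    // AirConditioner, whose interface has no heat().
}

class HeatPumpDemo {
    public static void main(String[] args) {
        House house = new House(new HeatPump()); // substitution works
        System.out.println(house.runCooling());  // but only cooling is reachable
    }
}
```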

When you see the substitution principle it’s easy to feel like that’s the only way to do things, and in fact it is nice if your design works out that way. But you’ll find that there are times when it’s equally clear that you must add new functions to the interface of a derived class. With inspection both cases should be reasonably obvious.

One of the most important things you do with such a family of classes is to treat an object of a derived class as an object of the base class. This is important because it means you can write a single piece of code that ignores the specific details of type and talks just to the base class. That code is then decoupled from type-specific information, and thus is simpler to write and easier to understand. And, if a new type – a Triangle, for example – is added through inheritance, the code you write will work just as well for the new type of Shape as it did on the existing types. Thus the program is extensible.
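A minimal sketch of such type-independent code follows. The names Shape, Circle, Square, Line, doStuff( ), erase( ) and draw( ) are the ones this chapter uses; having the methods return strings (rather than actually drawing) and the ShapeDemo wrapper class are assumptions made so the example is easy to run and check.

```java
class Shape {
    String erase() { return "Shape.erase"; }
    String draw() { return "Shape.draw"; }
}

class Circle extends Shape {
    @Override String erase() { return "Circle.erase"; }
    @Override String draw() { return "Circle.draw"; }
}

class Square extends Shape {
    @Override String erase() { return "Square.erase"; }
    @Override String draw() { return "Square.draw"; }
}

class Line extends Shape {
    @Override String erase() { return "Line.erase"; }
    @Override String draw() { return "Line.draw"; }
}

class ShapeDemo {
    // Written purely in terms of Shape; it never asks for the exact type.
    static String doStuff(Shape s) {
        return s.erase() + ", " + s.draw();
    }

    public static void main(String[] args) {
        Circle c = new Circle();
        Square sq = new Square();
        Line l = new Line();
        System.out.println(doStuff(c)); // a Circle handle where a Shape is expected
        System.out.println(doStuff(sq));
        System.out.println(doStuff(l));
    }
}
```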

In a function such as doStuff( ), which takes a Shape handle and sends it the erase( ) and draw( ) messages, a Circle can be supplied instead: a Circle handle is being passed into a function that’s expecting a Shape handle. Since a Circle is a Shape it can be treated as one by doStuff( ). That is, any message that doStuff( ) can send to a Shape, a Circle can accept. So it is a completely safe and logical thing to do.

We call this process of treating a derived type as though it were its base type upcasting. The name cast is used in the sense of casting into a mold and the up comes from the way the inheritance diagram is typically arranged, with the base type at the top and the derived classes fanning out downward. Thus, casting to a base type is moving up the inheritance diagram: upcasting.

Notice that the code in doStuff( ) doesn’t say “If you’re a Circle, do this, if you’re a Square, do that, etc.” If you write that kind of code, which checks for all the possible types a Shape can actually be, it’s messy and you need to change it every time you add a new kind of Shape. Here, you just say “You’re a shape, I know you can erase( ) yourself, do it and take care of the details correctly.”

What’s amazing about the code in doStuff( ) is that somehow the right thing happens. Calling draw( ) for Circle causes different code to be executed than when calling draw( ) for a Square or a Line, but when the draw( ) message is sent to an anonymous Shape, the correct behavior occurs based on the actual type that the Shape handle happens to be connected to. This is amazing because when the Java compiler is compiling the code for doStuff( ), it cannot know exactly what types it is dealing with. So ordinarily, you’d expect it to end up calling the version of erase( ) for Shape, and draw( ) for Shape and not for the specific Circle, Square, or Line. And yet the right thing happens. Here’s how it works.

When you send a message to an object even though you don’t know what specific type it is, and the right thing happens, that’s called polymorphism. The process used by object-oriented programming languages to implement polymorphism is called dynamic binding. The compiler and run-time system handle the details; all you need to know is that it happens and more importantly how to design with it.

Some languages require you to use a special keyword to enable dynamic binding. In C++ this keyword is virtual. In Java, you never need to remember to add a keyword because functions are automatically dynamically bound. So you can always expect that when you send a message to an object, the object will do the right thing, even when upcasting is involved.

Often in a design, you want the base class to present only an interface for its derived classes. That is, you don’t want anyone to actually create an object of the base class, only to upcast to it so that its interface can be used. This is accomplished by making that class abstract using the abstract keyword. If anyone tries to make an object of an abstract class, the compiler prevents them. This is a tool to enforce a particular design.

You can also use the abstract keyword to describe a method that hasn’t been implemented yet – as a stub indicating “here is an interface function for all types inherited from this class, but at this point I don’t have any implementation for it.” An abstract method may be created only inside an abstract class. When the class is inherited, that method must be implemented, or the inherited class becomes abstract as well. Creating an abstract method allows you to put a method in an interface without being forced to provide a possibly meaningless body of code for that method.
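Here is a small sketch of an abstract class with one abstract method. Instrument, Flute, play( ) and tune( ) are hypothetical names used only for illustration.

```java
abstract class Instrument {
    // Stub: every instrument can be played, but there is no sensible default body.
    abstract String play();

    // An abstract class may still carry ordinary, implemented methods.
    String tune() { return "tuning"; }
}

class Flute extends Instrument {
    // Implementing play() is what makes Flute a concrete (non-abstract) class.
    @Override
    String play() { return "flute sound"; }
}

class AbstractDemo {
    public static void main(String[] args) {
        Instrument i = new Flute(); // upcast to the abstract base type
        System.out.println(i.play());
        System.out.println(i.tune());
        // Instrument bad = new Instrument(); // would not compile: Instrument is abstract
    }
}
```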

The interface keyword takes the concept of an abstract class one step further by preventing any function definitions at all. The interface is a very useful and commonly-used tool, as it provides the perfect separation of interface and implementation. In addition, you can combine many interfaces together, if you wish. (You cannot inherit from more than one regular class or abstract class.)
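The following sketch combines two interfaces in one class. Drawable, Erasable and Sketch are made-up names. Note that it follows the description above, in which an interface holds no function bodies; since Java 8 an interface may also supply default and static methods, which this example does not use.

```java
// Two hypothetical interfaces: declarations only, no implementation.
interface Drawable {
    String draw();
}

interface Erasable {
    String erase();
}

// A class may implement several interfaces, but extend only one class.
class Sketch implements Drawable, Erasable {
    @Override
    public String draw() { return "drawn"; }

    @Override
    public String erase() { return "erased"; }
}

class InterfaceDemo {
    public static void main(String[] args) {
        Sketch s = new Sketch();
        Drawable d = s; // the same object seen through one interface...
        Erasable e = s; // ...and through the other
        System.out.println(d.draw());
        System.out.println(e.erase());
    }
}
```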

One of the most important factors is the way objects are created and destroyed. Where is the data for an object and how is the lifetime of the object controlled? There are different philosophies at work here. C++ takes the approach that control of efficiency is the most important issue, so it gives the programmer a choice. For maximum run-time speed, the storage and lifetime can be determined while the program is being written, by placing the objects on the stack (these are sometimes called automatic or scoped variables) or in the static storage area. This places a priority on the speed of storage allocation and release, and control of these can be very valuable in some situations. However, you sacrifice flexibility because you must know the exact quantity, lifetime and type of objects while you’re writing the program. If you are trying to solve a more general problem such as computer-aided design, warehouse management or air-traffic control, this is too restrictive.

The second approach is to create objects dynamically in a pool of memory called the heap. In this approach you don’t know until run time how many objects you need, what their lifetime is or what their exact type is. Those are determined at the spur of the moment while the program is running. If you need a new object, you simply make it on the heap at the point that you need it. Because the storage is managed dynamically, at run time, the amount of time required to allocate storage on the heap is significantly longer than the time to create storage on the stack. (Creating storage on the stack is often a single assembly instruction to move the stack pointer down, and another to move it back up.) The dynamic approach makes the generally logical assumption that objects tend to be complicated, so the extra overhead of finding storage and releasing that storage will not have an important impact on the creation of an object. In addition, the greater flexibility is essential to solve the general programming problem.

To allow control of efficiency, C++ lets you decide whether objects are created while you write the program or at run time. You might think that since it’s more flexible, you’d always want to create objects on the heap rather than the stack. There’s another issue, however, and that’s the lifetime of an object. If you create an object on the stack or in static storage, the compiler determines how long the object lasts and can automatically destroy it. However, if you create it on the heap the compiler has no knowledge of its lifetime. There are two options for destroying such objects: you can determine programmatically when to destroy the object, or the environment can provide a feature called a garbage collector that automatically discovers when an object is no longer in use and destroys it. Of course, a garbage collector is much more convenient, but it requires that all applications be able to tolerate the existence of the garbage collector and the other overhead for garbage collection. This does not meet the design requirements of the C++ language and so it was not included, but Java does have a garbage collector (as does Smalltalk; Delphi does not, but one could be added, and third-party garbage collectors exist for C++).

If you don’t know how many objects you’re going to need to solve a particular problem, or how long they will last, you also don’t know how to store those objects. How can you know how much space to create for those objects? You can’t, since that information isn’t known until run time.

The solution to most problems in object-oriented design seems flippant: you create another type of object. The new type of object that solves this particular problem holds handles to other objects. Of course, you can do the same thing with an array, which is available in most languages. But there’s more. This new object, generally called a collection (also called a container, but the AWT uses that term in a different sense so this book will use “collection”), will expand itself whenever necessary to accommodate everything you place inside it. So you don’t need to know how many objects you’re going to hold in a collection. Just create a collection object and let it take care of the details.
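As a sketch of a collection that grows on demand, the example below uses ArrayList from the standard java.util library (a class from the newer collection library, so it postdates Java 1.0 and 1.1). The CollectionDemo class and its collect( ) function are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CollectionDemo {
    // Builds a collection whose size is not known until run time.
    static List<String> collect(int count) {
        List<String> names = new ArrayList<>(); // no size has to be stated up front
        for (int i = 0; i < count; i++) {
            names.add("item" + i); // the collection expands as needed
        }
        return names;
    }

    public static void main(String[] args) {
        // The count could just as well come from user input or a file.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        List<String> names = collect(count);
        System.out.println(names.size());
    }
}
```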

Fortunately, a good OOP language comes with a set of collections as part of the package. In C++, it’s the Standard Template Library (STL). Object Pascal has collections in its Visual Component Library (VCL). Smalltalk has a very complete set of collections. Java also has collections in its standard library. In some libraries, a generic collection is considered good enough for all needs, and in others (C++ in particular) the library has different types of collections for different needs: a vector for consistent access to all elements, and a linked list for consistent insertion at all elements, for example, so you can choose the particular type that fits your needs. These may include sets, queues, hash tables, trees, stacks, etc.

All collections have some way to put things in and get things out. The way that you place something into a collection is fairly obvious. There’s a function called “push” or “add” or a similar name. Fetching things out of a collection is not always as apparent; if it’s an array-like entity such as a vector, you might be able to use an indexing operator or function. But in many situations this doesn’t make sense. Also, a single-selection function is restrictive. What if you want to manipulate or compare a set of elements in the collection instead of just one?

The solution is an iterator, which is an object whose job is to select the elements within a collection and present them to the user of the iterator. As a class, it also provides a level of abstraction. This abstraction can be used to separate the details of the collection from the code that’s accessing that collection. The collection, via the iterator, is abstracted to be simply a sequence. The iterator allows you to traverse that sequence without worrying about the underlying structure – that is, whether it’s a vector, a linked list, a stack or something else. This gives you the flexibility to easily change the underlying data structure without disturbing the code in your program. Java began (in versions 1.0 and 1.1) with a standard iterator, called Enumeration, for all of its collection classes. Java 1.2 has added a much more complete collection library which contains an iterator called Iterator that does more than the older Enumeration.
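To see how an iterator hides the underlying structure, consider this sketch. It uses the standard java.util.Iterator together with ArrayList, LinkedList and TreeSet; the IteratorDemo class and its sum( ) function are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.TreeSet;

class IteratorDemo {
    // Sees only a sequence; knows nothing about the collection behind it.
    static int sum(Iterator<Integer> it) {
        int total = 0;
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        ArrayList<Integer> vector = new ArrayList<>(Arrays.asList(1, 2, 3));
        LinkedList<Integer> list = new LinkedList<>(Arrays.asList(1, 2, 3));
        TreeSet<Integer> set = new TreeSet<>(Arrays.asList(1, 2, 3));

        // Three different structures, one piece of traversal code.
        System.out.println(sum(vector.iterator()));
        System.out.println(sum(list.iterator()));
        System.out.println(sum(set.iterator()));
    }
}
```

Switching from one collection type to another changes only the line that creates the collection; sum( ) is untouched.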

From the design standpoint, all you really want is a sequence that can be manipulated to solve your problem. If a single type of sequence satisfied all of your needs, there’d be no reason to have different kinds. There are two reasons that you need a choice of collections. First, collections provide different types of interfaces and external behavior. A stack has a different interface and behavior than that of a queue, which is different from that of a set or a list. One of these might provide a more flexible solution to your problem than the other. Second, different collections have different efficiencies for certain operations. The best example is a vector and a list. Both are simple sequences that can have identical interfaces and external behaviors. But certain operations can have radically different costs. Randomly accessing elements in a vector is a constant-time operation; it takes the same amount of time regardless of the element you select. However, in a linked list it is expensive to move through the list to randomly select an element, and it takes longer to find an element if it is further down the list. On the other hand, if you want to insert an element in the middle of a sequence, it’s much cheaper in a list than in a vector. These and other operations have different efficiencies depending upon the underlying structure of the sequence. In the design phase, you might start with a list and, when tuning for performance, change to a vector. Because of the abstraction via iterators, you can change from one to the other with minimal impact on your code.

In the end, remember that a collection is only a storage cabinet to put objects in. If that cabinet solves all of your needs, it doesn’t really matter how it is implemented (a basic concept with most types of objects). If you’re working in a programming environment that has built-in overhead due to other factors (running under Windows, for example, or the cost of a garbage collector), then the cost difference between a vector and a linked list might not matter. You might need only one type of sequence. You can even imagine the “perfect” collection abstraction, which can automatically change its underlying implementation according to the way it is used.

One of the issues in OOP that has become especially prominent since the introduction of C++ is whether all classes should ultimately be inherited from a single base class. In Java (as with virtually all other OOP languages) the answer is “yes” and the name of this ultimate base class is simply Object. It turns out that the benefits of the singly-rooted hierarchy are many.

All objects in a singly-rooted hierarchy have an interface in common, so they are all ultimately the same type. The alternative (provided by C++) is that you don’t know that everything is the same fundamental type. From a backwards-compatibility standpoint this fits the model of C better and can be thought of as less restrictive, but when you want to do full-on object-oriented programming you must then build your own hierarchy to provide the same convenience that’s built into other OOP languages. And in any new class library you acquire, some other incompatible interface will be used. It requires effort (and possibly multiple inheritance) to work the new interface into your design. Is the extra “flexibility” of C++ worth it? If you need it – if you have a large investment in C – it’s quite valuable. If you’re starting from scratch, other alternatives such as Java can often be more productive.

All objects in a singly-rooted hierarchy (such as Java provides) can be guaranteed to have certain functionality. You know you can perform certain basic operations on every object in your system. A singly-rooted hierarchy, along with creating all objects on the heap, greatly simplifies argument passing (one of the more complex topics in C++).
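
A small sketch of what that guarantee looks like in practice: a method that takes an `Object` can accept anything and still call the operations every object inherits, such as `toString()` and `getClass()`. The `Plane` and `RootDemo` classes are hypothetical names invented for this example.

```java
// Hypothetical demo; Plane and RootDemo are illustrative names.
class RootDemo {
    // Accepts any object at all, because every class ultimately inherits from Object.
    static String describe(Object o) {
        return o.getClass().getName() + ": " + o.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Plane("N123")));
        System.out.println(describe("text"));
    }
}

class Plane {
    private final String callSign;

    Plane(String callSign) {
        this.callSign = callSign;
    }

    // Overrides one of the operations inherited from Object.
    public String toString() {
        return "Plane " + callSign;
    }
}
```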

A singly-rooted hierarchy makes it much easier to implement a garbage collector. The necessary support can be installed in the base class, and the garbage collector can thus send the appropriate messages to every object in the system. Without a singly-rooted hierarchy and a system to manipulate an object via a handle, it is difficult to implement a garbage collector.

You may wonder why, if it’s so beneficial, a singly-rooted hierarchy isn’t in C++. It’s the old bugaboo of efficiency and control. A singly-rooted hierarchy puts constraints on your program designs, and in particular it was perceived to put constraints on the use of existing C code. These constraints cause problems only in certain situations, but for maximum flexibility there is no requirement for a singly-rooted hierarchy in C++. In Java, which started from scratch and has no backward-compatibility issues with any existing language, it was a logical choice to use the singly-rooted hierarchy in common with most other object-oriented programming languages.

Because a collection is a tool that you’ll use frequently, it makes sense to have a library of collections that are built in a reusable fashion, so you can take one off the shelf and plug it into your program. Java provides such a library, although it is fairly limited in Java 1.0 and 1.1 (the Java 1.2 collections library, however, satisfies most needs).

To use such a collection, you simply add object handles to it, and later ask for them back. But, since the collection holds only Objects, when you add your object handle into the collection it is upcast to Object, thus losing its identity. When you fetch it back, you get an Object handle, and not a handle to the type that you put in. So how do you turn it back into something that has the useful interface of the object that you put into the collection?

Here, the cast is used again, but this time you’re not casting up the inheritance hierarchy to a more general type, you cast down the hierarchy to a more specific type. This manner of casting is called downcasting. With upcasting, you know, for example, that a Circle is a type of Shape so it’s safe to upcast, but you don’t know that an Object is necessarily a Circle or a Shape so it’s hardly safe to downcast unless you know that’s what you’re dealing with.

It’s not completely dangerous, however, because if you downcast to the wrong thing you’ll get a run-time error called an exception, which will be described shortly. When you fetch object handles from a collection, though, you must have some way to remember exactly what they are so you can perform a proper downcast.
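
The following sketch shows both outcomes of a downcast on objects fetched from a collection. It assumes the `List` and `ArrayList` types of the Java 1.2 collections library, used without parameterized types as this chapter describes; `Shape`, `Circle` and `DowncastDemo` are hypothetical classes for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; Shape, Circle and DowncastDemo are illustrative names.
@SuppressWarnings({"rawtypes", "unchecked"})
class DowncastDemo {
    static class Shape {
        String name() { return "shape"; }
    }

    static class Circle extends Shape {
        String name() { return "circle"; }
    }

    // The collection holds only Objects, so anything can be added to it.
    static List makeCollection() {
        List items = new ArrayList();
        items.add(new Circle());   // upcast to Object on the way in
        items.add("not a shape");  // nothing stops this at compile time
        return items;
    }

    // The downcast is checked at run-time; a wrong guess throws an exception.
    static String tryDowncast(Object o) {
        try {
            Circle c = (Circle) o;
            return "ok: " + c.name();
        } catch (ClassCastException e) {
            return "wrong type";
        }
    }

    public static void main(String[] args) {
        List items = makeCollection();
        System.out.println(tryDowncast(items.get(0)));
        System.out.println(tryDowncast(items.get(1)));
    }
}
```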

Downcasting and the run-time checks require extra time for the running program, and extra effort from the programmer. Wouldn’t it make sense to somehow create the collection so that it knows the types that it holds, eliminating the need for the downcast and possible mistake? The solution is parameterized types, which are classes that the compiler can automatically customize to work with particular types. For example, with a parameterized collection, the compiler could customize that collection so that it would accept only Shapes and fetch only Shapes.

Parameterized types are an important part of C++, partly because C++ has no singly-rooted hierarchy. In C++, the keyword that implements parameterized types is template. Java currently has no parameterized types since it is possible for it to get by – however awkwardly – using the singly-rooted hierarchy. At one point the word generic (the keyword used by Ada for its templates) was on a list of keywords that were “reserved for future implementation.” Some of these seemed to have mysteriously slipped into a kind of “keyword Bermuda Triangle” and it’s difficult to know what might eventually happen.

Each object requires resources in order to exist, most notably memory. When an object is no longer needed it must be cleaned up so that these resources are released for reuse. In simple programming situations the question of how an object is cleaned up doesn’t seem too challenging: you create the object, use it for as long as it’s needed, and then it should be destroyed. It’s not too hard, however, to encounter situations that are more complex.

Suppose, for example, you are designing a system to manage air traffic for an airport. (The same model might also work for managing crates in a warehouse, or a video rental system, or a kennel for boarding pets.) At first it seems simple: make a collection to hold airplanes, then create a new airplane and place it in the collection for each airplane that enters the air-traffic-control zone. For cleanup, simply delete the appropriate airplane object when a plane leaves the zone.

But perhaps you have some other system to record data about the planes; perhaps data that doesn’t require such immediate attention as the main controller function. Maybe it’s a record of the flight plans of all the small planes that leave the airport. So you have a second collection of small planes, and whenever you create a plane object you also put it in this collection if it’s a small plane. Then some background process performs operations on the objects in this collection during idle moments.

Now the problem is more difficult: how can you possibly know when to destroy the objects? When you’re done with the object, some other part of the system might not be. This same problem can arise in a number of other situations, and in programming systems (such as C++) in which you must explicitly delete an object when you’re done with it this can become quite complex.[6]

With Java, the garbage collector is designed to take care of the problem of releasing the memory (although this doesn’t include other aspects of cleaning up an object). The garbage collector “knows” when an object is no longer in use, and it then automatically releases the memory for that object. This, combined with the fact that all objects are inherited from the single root class Object and that you can create objects only one way, on the heap, makes the process of programming in Java much simpler than programming in C++. You have far fewer decisions to make and hurdles to overcome.

If all this is such a good idea, why didn’t they do the same thing in C++? Well, of course there’s a price you pay for all this programming convenience, and that price is run-time overhead. As mentioned before, in C++ you can create objects on the stack, and in this case they’re automatically cleaned up (but you don’t have the flexibility of creating as many as you want at run-time). Creating objects on the stack is the most efficient way to allocate storage for objects and to free that storage. Creating objects on the heap can be much more expensive. Always inheriting from a base class and making all function calls polymorphic also exact a small toll. But the garbage collector is a particular problem because you never quite know when it’s going to start up or how long it will take. This means that there’s an inconsistency in the rate of execution of a Java program, so you can’t use it in certain situations, such as when the rate of execution of a program is uniformly critical. (These are generally called real-time programs, although not all real-time programming problems are this stringent.)[7]

The designers of the C++ language, trying to woo C programmers (and most successfully, at that), did not want to add any features to the language that would impact the speed or the use of C++ in any situation where C might be used. This goal was realized, but at the price of greater complexity when programming in C++. Java is simpler than C++, but the tradeoff is in efficiency and sometimes applicability. For a significant portion of programming problems, however, Java is often the superior choice.

Ever since the beginning of programming languages, error handling has been one of the most difficult issues. Because it’s so hard to design a good error-handling scheme, many languages simply ignore the issue, passing the problem on to library designers who come up with halfway measures that can work in many situations but can easily be circumvented, generally by just ignoring them. A major problem with most error-handling schemes is that they rely on programmer vigilance in following an agreed-upon convention that is not enforced by the language. If the programmer is not vigilant, which is often the case when they are in a hurry, these schemes can easily be forgotten.

Exception handling wires error handling directly into the programming language and sometimes even the operating system. An exception is an object that is “thrown” from the site of the error and can be “caught” by an appropriate exception handler designed to handle that particular type of error. It’s as if exception handling is a different, parallel path of execution that can be taken when things go wrong. And because it uses a separate execution path, it doesn’t need to interfere with your normally-executing code. This makes that code simpler to write since you aren’t constantly forced to check for errors. In addition, a thrown exception is unlike an error value that’s returned from a function or a flag that’s set by a function in order to indicate an error condition. These can be ignored. An exception cannot be ignored, so it’s guaranteed to be dealt with at some point. Finally, exceptions provide a way to reliably recover from a bad situation. Instead of just exiting you are often able to set things right and restore the execution of a program, which produces much more robust programs.

Java’s exception handling stands out among programming languages, because in Java, exception-handling was wired in from the beginning and you’re forced to use it. If you don’t write your code to properly handle exceptions, you’ll get a compile-time error message. This guaranteed consistency makes error-handling much easier.
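
To make the throw-and-catch path concrete, here is a minimal sketch. `OutOfFuelException`, `burn()` and `fly()` are hypothetical names invented for this example. Because the exception extends `Exception`, the compiler insists that `burn()` declare it and that every caller either catch it or declare it in turn.

```java
// Hypothetical demo; all names here are illustrative.
class ExceptionDemo {
    // An exception is an ordinary object with its own type.
    static class OutOfFuelException extends Exception {
        OutOfFuelException(String message) {
            super(message);
        }
    }

    // The throws clause is mandatory: without it the compiler rejects the throw.
    static int burn(int fuel, int amount) throws OutOfFuelException {
        if (amount > fuel) {
            throw new OutOfFuelException("need " + amount + ", have " + fuel);
        }
        return fuel - amount;
    }

    // The normal path and the error path are kept separate.
    static String fly(int fuel, int amount) {
        try {
            return "remaining: " + burn(fuel, amount);
        } catch (OutOfFuelException e) {
            // Recover instead of exiting.
            return "recovered: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(fly(10, 3));
        System.out.println(fly(2, 5));
    }
}
```

Removing the `try`/`catch` from `fly()` without adding a `throws` clause produces a compile-time error, which is the enforcement described above.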

A fundamental concept in computer programming is the idea of handling more than one task at a time. Many programming problems require that the program be able to stop what it’s doing, deal with some other problem and return to the main process. The solution has been approached in many ways. Initially, programmers with low-level knowledge of the machine wrote interrupt service routines and the suspension of the main process was initiated through a hardware interrupt. Although this worked well, it was difficult and non-portable, so it made moving a program to a new type of machine slow and expensive.

Sometimes interrupts are necessary for handling time-critical tasks, but there’s a large class of problems in which you’re simply trying to partition the problem into separately-running pieces so that the whole program can be more responsive. Within a program, these separately-running pieces are called threads and the general concept is called multithreading. A common example of multithreading is the user interface. By using threads, a user can press a button and get a quick response rather than being forced to wait until the program finishes its current task.

Ordinarily, threads are just a way to allocate the time of a single processor. But if the operating system supports multiple processors, each thread can be assigned to a different processor and they can truly run in parallel. One of the convenient features of multithreading at the language level is that the programmer doesn’t need to worry about whether there are many processors or just one. The program is logically divided into threads and if the machine has more than one processor then the program runs faster, without any special adjustments.
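
A minimal sketch of a program divided into threads, using the standard `Thread` class. `Worker` and `runBoth()` are hypothetical names. Nothing in the code says how many processors exist; the two workers may share one processor or run on two.

```java
// Hypothetical demo; Worker and ThreadDemo are illustrative names.
class ThreadDemo {
    // Each separately-running piece of the program is one Thread object.
    static class Worker extends Thread {
        private final int limit;
        long result;

        Worker(int limit) {
            this.limit = limit;
        }

        // The body of the thread: sum the integers 1..limit.
        public void run() {
            for (int i = 1; i <= limit; i++) {
                result += i;
            }
        }
    }

    static long[] runBoth() {
        Worker a = new Worker(1000);
        Worker b = new Worker(2000);
        a.start(); // both workers now run independently of this method
        b.start();
        try {
            a.join(); // wait for each to finish before reading its result
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while waiting");
        }
        return new long[] { a.result, b.result };
    }

    public static void main(String[] args) {
        long[] sums = runBoth();
        System.out.println(sums[0] + " " + sums[1]);
    }
}
```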

All this makes threading sound pretty simple. There is a catch: shared resources. If you have more than one thread running that’s expecting to access the same resource, you have a problem. For example, two threads can’t simultaneously send information to a printer. To solve the problem, resources that can be shared, such as the printer, must be locked while they are being used. So a thread locks a resource, completes its task and then releases the lock so that another thread can use the resource.

Java’s threading is built into the language, which makes a complicated subject much simpler. The threading is supported on an object level, so one thread of execution is represented by one object. Java also provides limited resource locking. It can lock the memory of any object (which is, after all, one kind of shared resource) so that only one thread can use it at a time. This is accomplished with the synchronized keyword. Other types of resources must be locked explicitly by the programmer, typically by creating an object to represent the lock that all threads must check before accessing that resource.
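
As a sketch of object-level locking, the hypothetical `Counter` below marks its methods `synchronized`, so only one thread at a time can be inside them for a given `Counter` object. The class and method names are assumptions made for this example.

```java
// Hypothetical demo; Counter and SyncDemo are illustrative names.
class SyncDemo {
    static class Counter {
        private int count;

        // synchronized locks this Counter object for the duration of the call.
        synchronized void increment() {
            count++;
        }

        synchronized int value() {
            return count;
        }
    }

    // Starts several threads that all update one shared Counter.
    static int countWithThreads(int threads, final int perThread) {
        final Counter counter = new Counter();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        counter.increment();
                    }
                }
            };
            pool[t].start();
        }
        try {
            for (int t = 0; t < threads; t++) {
                pool[t].join(); // wait for every thread to finish
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while waiting");
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10000));
    }
}
```

Without `synchronized` on `increment()`, two threads could read the same old value of `count` and one update would be lost, so the final total could fall short.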

When you create an object, it exists for as long as you need it, but under no circumstances does it exist when the program terminates. While this makes sense at first, there are situations in which it would be incredibly useful if an object could exist and hold its information even while the program wasn’t running. Then the next time you started the program, the object would be there and it would have the same information it had the previous time the program was running. Of course you can get a similar effect now by writing the information to a file or to a database, but in the spirit of making everything an object it would be quite convenient to be able to declare an object persistent and have all the details taken care of for you.

Java 1.1 provides support for “lightweight persistence,” which means that you can easily store objects on disk and later retrieve them. The reason it’s “lightweight” is that you’re still forced to make explicit calls to do the storage and retrieval. In some future release more complete support for persistence might appear.
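
The explicit calls look roughly like the sketch below, which uses the standard `java.io` serialization classes. `FlightPlan` and the helper methods are hypothetical names. To keep the example self-contained it stores the object in a byte array; substituting a `FileOutputStream` and `FileInputStream` would put the same bytes on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo; FlightPlan and PersistDemo are illustrative names.
class PersistDemo {
    // Implementing Serializable marks the class as storable.
    static class FlightPlan implements Serializable {
        private static final long serialVersionUID = 1L;
        final String callSign;
        final int altitude;

        FlightPlan(String callSign, int altitude) {
            this.callSign = callSign;
            this.altitude = altitude;
        }
    }

    // Explicit call to store the object.
    static byte[] store(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(o);
        out.close();
        return bytes.toByteArray();
    }

    // Explicit call to retrieve it; the result comes back as an Object.
    static Object retrieve(byte[] data) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
        Object o = in.readObject();
        in.close();
        return o;
    }

    static FlightPlan roundTrip(FlightPlan plan) {
        try {
            return (FlightPlan) retrieve(store(plan));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FlightPlan copy = roundTrip(new FlightPlan("N123", 4500));
        System.out.println(copy.callSign + " at " + copy.altitude);
    }
}
```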

If Java is, in fact, yet another computer programming language, you may question why it is so important and why it is being promoted as a revolutionary step in computer programming. The answer isn’t immediately obvious if you’re coming from a traditional programming perspective. Although Java will solve traditional stand-alone programming problems, the reason it is important is that it will also solve programming problems on the World Wide Web.

The Web can seem a bit of a mystery at first, with all this talk of “surfing,” “presence” and “home pages.” There has even been a growing reaction against “Internet-mania,” questioning the economic value and outcome of such a sweeping movement. It’s helpful to step back and see what it really is, but to do this you must understand client/server systems, another aspect of computing that’s full of confusing issues.

The primary idea of a client/server system is that you have a central repository of information – some kind of data, typically in a database – that you want to distribute on demand to some set of people or machines. A key to the client/server concept is that the repository of information is centrally located so that it can be changed and so that those changes will propagate out to the information consumers. Taken together, the information repository, the software that distributes the information and the machine(s) where the information and software reside is called the server. The software that resides on the remote machine, and that communicates with the server, fetches the information, processes it, and displays it on the remote machine is called the client.

The basic concept of client/server computing, then, is not so complicated. The problems arise because you have a single server trying to serve many clients at once. Generally a database management system is involved so the designer “balances” the layout of data into tables for optimal use. In addition, systems often allow a client to insert new information into a server. This means you must ensure that one client’s new data doesn’t walk over another client’s new data, or that data isn’t lost in the process of adding it to the database. (This is called transaction processing.) As client software changes, it must be built, debugged and installed on the client machines, which turns out to be more complicated and expensive than you might think. It’s especially problematic to support multiple types of computers and operating systems. Finally, there’s the all-important performance issue: you might have hundreds of clients making requests of your server at any one time, and so any small delay is crucial. To minimize latency, programmers work hard to offload processing tasks, often to the client machine but sometimes to other machines at the server site using so-called middleware. (Middleware is also used to improve maintainability.)

So the simple idea of distributing information to people has so many layers of complexity in implementing it that the whole problem can seem hopelessly enigmatic. And yet it’s crucial: client/server computing accounts for roughly half of all programming activities. It’s responsible for everything from taking orders and credit-card transactions to the distribution of any kind of data – stock market, scientific, government – you name it. What we’ve come up with in the past is individual solutions to individual problems, inventing a new solution each time. These were hard to create and hard to use and the user had to learn a new interface for each one. The entire client/server problem needs to be solved in a big way.

The Web is actually one giant client-server system. It’s a bit worse than that, since you have all the servers and clients coexisting on a single network at once. You don’t need to know that, since all you care about is connecting to and interacting with one server at a time (even though you might be hopping around the world in your search for the correct server).

Initially it was a simple one-way process. You made a request of a server and it handed you a file, which your machine’s browser software (i.e., the client) would interpret by formatting it for display on your local machine. But in short order people began wanting to do more than just deliver pages from a server. They wanted full client/server capability so that the client could feed information back to the server, for example, to do database lookups on the server, to add new information to the server or to place an order (which required more security than the original systems offered). These are the changes we’ve been seeing in the development of the Web.

The Web browser was a big step forward: the concept that one piece of information could be displayed on any type of computer without change. However, browsers were still rather primitive and rapidly bogged down by the demands placed on them. They weren’t particularly interactive and tended to clog up both the server and the Internet because any time you needed to do something that required programming you had to send information back to the server to be processed. It could take many seconds or minutes to find out you had misspelled something in your request. Since the browser was just a viewer it couldn’t perform even the simplest computing tasks. (On the other hand, it was safe, since it couldn’t execute any programs on your local machine that contained bugs or viruses.)

To solve this problem, different approaches have been taken. To begin with, graphics standards have been enhanced to allow better animation and video within browsers. The remainder of the problem can be solved only by incorporating the ability to run programs on the client end, under the browser. This is called client-side programming.

The Web’s initial server-browser design provided for interactive content, but the interactivity was completely provided by the server. The server produced static pages for the client browser, which would simply interpret and display them. Basic HTML contains simple mechanisms for data gathering: text-entry boxes, check boxes, radio buttons, lists and drop-down lists, as well as a button that can only be programmed to reset the data on the form or “submit” the data on the form back to the server. This submission passes through the Common Gateway Interface (CGI) provided on all Web servers. The text within the submission tells CGI what to do with it. The most common action is to run a program located on the server in a directory that’s typically called “cgi-bin.” (If you watch the address window at the top of your browser when you push a button on a Web page, you can sometimes see “cgi-bin” within all the gobbledygook there.) These programs can be written in most languages. Perl is a common choice because it is designed for text manipulation and is interpreted, so it can be installed on any server regardless of processor or operating system.

Many powerful Web sites today are built strictly on CGI, and you can in fact do nearly anything with it. The problem is response time. The response of a CGI program depends on how much data must be sent as well as the load on both the server and the Internet. (On top of this, starting a CGI program tends to be slow.) The initial designers of the Web did not foresee how rapidly this bandwidth would be exhausted for the kinds of applications people developed. For example, any sort of dynamic graphing is nearly impossible to perform with consistency because a GIF file must be created and moved from the server to the client for each version of the graph. And you’ve no doubt had direct experience with something as simple as validating the data on an input form. You press the submit button on a page; the data is shipped back to the server; the server starts a CGI program that discovers an error, formats an HTML page informing you of the error and sends the page back to you; you must then back up a page and try again. Not only is this slow, it’s not elegant.

The solution is client-side programming. Most machines that run Web browsers are powerful engines capable of doing vast work, and with the original static HTML approach they are sitting there, just idly waiting for the server to dish up the next page. Client-side programming means that the Web browser is harnessed to do whatever work it can, and the result for the user is a much speedier and more interactive experience at your Web site.

The problem with discussions of client-side programming is that they aren’t very different from discussions of programming in general. The parameters are almost the same, but the platform is different: a Web browser is like a limited operating system. In the end, it’s still programming and this accounts for the dizzying array of problems and solutions produced by client-side programming. The rest of this section provides an overview of the issues and approaches in client-side programming.

One of the most significant steps forward in client-side programming is the development of the plug-in. This is a way for a programmer to add new functionality to the browser by downloading a piece of code that plugs itself into the appropriate spot in the browser. It tells the browser “from now on you can perform this new activity.” (You need to download the plug-in only once.) Some fast and powerful behavior is added to browsers via plug-ins, but writing a plug-in is not a trivial task and isn’t something you’d want to do as part of the process of building a particular site. The value of the plug-in for client-side programming is that it allows an expert programmer to develop a new language and add that language to a browser without the permission of the browser manufacturer. Thus, plug-ins provide the back door that allows the creation of new client-side programming languages (although not all languages are implemented as plug-ins).

Plug-ins resulted in an explosion of scripting languages. With a scripting language you embed the source code for your client-side program directly into the HTML page and the plug-in that interprets that language is automatically activated while the HTML page is being displayed. Scripting languages tend to be reasonably simple to understand, and because they are simply text that is part of an HTML page they load very quickly as part of the single server hit required to procure that page. The trade-off is that your code is exposed for everyone to see (and steal) but generally you aren’t doing amazingly sophisticated things with scripting languages so it’s not too much of a hardship.

This points out that scripting languages are really intended to solve specific types of problems, primarily the creation of richer and more interactive graphical user interfaces (GUIs). However, a scripting language might solve 80 percent of the problems encountered in client-side programming. Your problems might very well fit completely within that 80 percent, and since scripting languages tend to be easier and faster to develop, you should probably consider a scripting language before looking at a more involved solution such as Java or ActiveX programming.

The most commonly-discussed scripting languages are JavaScript (which has nothing to do with Java; it’s named that way just to grab some of Java’s marketing momentum), VBScript (which looks like Visual Basic) and Tcl/Tk, which comes from the popular cross-platform GUI-building language. There are others out there and no doubt more in development.

JavaScript is probably the most commonly supported. It comes built into both Netscape Navigator and the Microsoft Internet Explorer (IE). In addition, there are probably more JavaScript books out than for the other languages, and some tools automatically create pages using JavaScript. However, if you’re already fluent in Visual Basic or Tcl/Tk, you’ll be more productive using those scripting languages rather than learning a new one. (You’ll have your hands full dealing with the Web issues already.)

If a scripting language can solve 80 percent of the client-side programming problems, what about the other 20 percent – the “really hard stuff?” The most popular solution today is Java. Not only is it a powerful programming language built to be secure, cross-platform and international, but Java is being continuously extended to provide language features and libraries that elegantly handle problems that are difficult in traditional programming languages, such as multithreading, database access, network programming and distributed computing. Java allows client-side programming via the applet.

An applet is a mini-program that will run only under a Web browser. The applet is downloaded automatically as part of a Web page (just as, for example, a graphic is automatically downloaded). When the applet is activated it executes a program. This is part of its beauty – it provides you with a way to automatically distribute the client software from the server at the time the user needs the client software, and no sooner. They get the latest version of the client software without fail and without difficult re-installation. Because of the way Java is designed, the programmer needs to create only a single program, and that program automatically works with all computers that have browsers with built-in Java interpreters. (This safely includes the vast majority of machines.) Since Java is a full-fledged programming language, you can do as much work as possible on the client before and after making requests of the server. For example, you won’t need to send a request form across the Internet to discover that you’ve gotten a date or some other parameter wrong, and your client computer can quickly do the work of plotting data instead of waiting for the server to make a plot and ship a graphic image back to you. Not only do you get the immediate win of speed and responsiveness, but the general network traffic and load upon servers can be reduced, preventing the entire Internet from slowing down.
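
The date example can be sketched as ordinary Java that a client-side program might run before contacting the server. This is only the validation logic, not an applet; the `FormCheck` class, the `checkDate()` method and the YYYY-MM-DD format are all assumptions made for illustration.

```java
// Hypothetical demo; FormCheck and the YYYY-MM-DD format are assumptions.
class FormCheck {
    // Returns null when the date is acceptable, otherwise a message for the user.
    static String checkDate(String text) {
        String[] parts = text.trim().split("-");
        if (parts.length != 3) {
            return "use the form YYYY-MM-DD";
        }
        int year;
        int month;
        int day;
        try {
            year = Integer.parseInt(parts[0]);
            month = Integer.parseInt(parts[1]);
            day = Integer.parseInt(parts[2]);
        } catch (NumberFormatException e) {
            return "date fields must be numbers";
        }
        if (month < 1 || month > 12) {
            return "month must be between 1 and 12";
        }
        boolean leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
        int[] daysIn = { 31, leap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
        if (day < 1 || day > daysIn[month - 1]) {
            return "day is out of range for that month";
        }
        return null;
    }

    public static void main(String[] args) {
        // Only a clean result would be worth a trip to the server.
        System.out.println(checkDate("1998-02-30"));
        System.out.println(checkDate("1998-02-28"));
    }
}
```

The mistake is reported on the client at once, with no request sent across the network.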

One advantage a Java applet has over a scripted program is that it’s in compiled form, so the source code isn’t available to the client. On the other hand, a Java applet can be decompiled without too much trouble, and hiding your code is often not an important issue anyway. Two other factors can be important. As you will see later in the book, a compiled Java applet can comprise many modules and take multiple server “hits” (accesses) to download. (In Java 1.1 this is minimized by Java archives, called JAR files, that allow all the required modules to be packaged together for a single download.) A scripted program will just be integrated into the Web page as part of its text (and will generally be smaller and reduce server hits). This could be important to the responsiveness of your Web site. Another factor is the all-important learning curve. Regardless of what you’ve heard, Java is not a trivial language to learn. If you’re a Visual Basic programmer, moving to VBScript will be your fastest solution and since it will probably solve most typical client/server problems you might be hard pressed to justify learning Java. If you’re experienced with a scripting language you will certainly benefit from looking at JavaScript or VBScript before committing to Java, since they might fit your needs handily and you’ll be more productive sooner.

To some degree, the competitor to Java is Microsoft’s ActiveX, although it takes a completely different approach. ActiveX was originally a Windows-only solution, although it is now being developed via an independent consortium to become cross-platform. Effectively, ActiveX says “if your program connects to its environment just so, it can be dropped into a Web page and run under a browser that supports ActiveX.” (IE directly supports ActiveX and Netscape does so using a plug-in.) Thus, ActiveX does not constrain you to a particular language. If, for example, you’re already an experienced Windows programmer using a language such as C++, Visual Basic, or Borland’s Delphi, you can create ActiveX components with almost no changes to your programming knowledge. ActiveX also provides a path for the use of legacy code in your Web pages.

Automatically downloading and running programs across the Internet can sound like a virus-builder’s dream. ActiveX especially brings up the thorny issue of security in client-side programming. If you click on a Web site, you might automatically download any number of things along with the HTML page: GIF files, script code, compiled Java code, and ActiveX components. Some of these are benign; GIF files can’t do any harm, and scripting languages are generally limited in what they can do. Java was also designed to run its applets within a “sandbox” of safety, which prevents it from writing to disk or accessing memory outside the sandbox.

ActiveX is at the opposite end of the spectrum. Programming with ActiveX is like programming Windows – you can do anything you want. So if you click on a page that downloads an ActiveX component, that component might cause damage to the files on your disk. Of course, programs that you load onto your computer that are not restricted to running inside a Web browser can do the same thing. Viruses downloaded from Bulletin-Board Systems (BBSs) have long been a problem, but the speed of the Internet amplifies the difficulty.

The solution seems to be “digital signatures,” whereby code is verified to show who the author is. This is based on the idea that a virus works because its creator can be anonymous, so if you remove the anonymity individuals will be forced to be responsible for their actions. This seems like a good plan because it allows programs to be much more functional, and I suspect it will eliminate malicious mischief. If, however, a program has an unintentional bug that’s destructive it will still cause problems.

The Java approach is to prevent these problems from occurring, via the sandbox. The Java interpreter that lives on your local Web browser examines the applet for any untoward instructions as the applet is being loaded. In particular, the applet cannot write files to disk or erase files (one of the mainstays of the virus). Applets are generally considered to be safe, and since this is essential for reliable client-server systems, any bugs that allow viruses are rapidly repaired. (It’s worth noting that the browser software actually enforces these security restrictions, and some browsers allow you to select different security levels to provide varying degrees of access to your system.)

You might be skeptical of this rather draconian restriction against writing files to your local disk. For example, you may want to build a local database or save data for later use offline. The initial vision seemed to be that eventually everyone would be online to do anything important, but that was soon seen to be impractical (although low-cost “Internet appliances” might someday satisfy the needs of a significant segment of users). The solution is the “signed applet” that uses public-key encryption to verify that an applet does indeed come from where it claims it does. A signed applet can then go ahead and trash your disk, but the theory is that since you can now hold the applet creator accountable they won’t do vicious things. Java 1.1 provides a framework for digital signatures so that you will eventually be able to allow an applet to step outside the sandbox if necessary.
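The public-key idea behind a signed applet can be sketched with the standard `java.security` classes. This is only a minimal illustration under stated assumptions: the key pair is generated on the spot, the algorithm name "SHA256withRSA" is my choice rather than anything the text specifies, and the class `SignedCodeSketch` and the bytes standing in for applet code are hypothetical. It is not the applet-signing framework of Java 1.1 itself.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical sketch: sign some bytes with a private key, then check the
// signature with the matching public key.
class SignedCodeSketch {

    // The author signs the code with a private key that only the author holds.
    static byte[] sign(byte[] code, PrivateKey key) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(key);
        signer.update(code);
        return signer.sign();
    }

    // Anyone with the author's public key can check that the code is unchanged.
    static boolean verify(byte[] code, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(key);
        verifier.update(code);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Assumption: a throwaway 2048-bit RSA key pair stands in for the author's keys.
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair authorKeys = generator.generateKeyPair();

        byte[] code = "applet bytes".getBytes(StandardCharsets.UTF_8);
        byte[] tampered = "applet bytez".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(code, authorKeys.getPrivate());

        System.out.println("original verifies: "
                + verify(code, signature, authorKeys.getPublic()));
        System.out.println("tampered verifies: "
                + verify(tampered, signature, authorKeys.getPublic()));
    }
}
```

A successful check says only who signed the code and that it has not been altered since. It says nothing about whether the code is benign or free of bugs.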

Digital signatures have missed an important issue, which is the speed at which people move around on the Internet. If you download a buggy program and it does something untoward, how long will it be before you discover the damage? It could be days or even weeks. And by then, how will you track down the program that’s done it (and what good will it do at that point)?

The Web is the most general solution to the client/server problem, so it makes sense that you can use the same technology to solve a subset of the problem, in particular the classic client/server problem within a company. With traditional client/server approaches you have the problem of multiple different types of client computers, as well as the difficulty of installing new client software, both of which are handily solved with Web browsers and client-side programming. When Web technology is used for an information network that is restricted to a particular company, it is referred to as an Intranet. Intranets provide much greater security than the Internet, since you can physically control access to the servers within your company. In terms of training, it seems that once people understand the general concept of a browser it’s much easier for them to deal with differences in the way pages and applets look, so the learning curve for new kinds of systems seems to be reduced.

The security problem brings us to one of the divisions that seems to be automatically forming in the world of client-side programming. If your program is running on the Internet, you don’t know what platform it will be working under and you want to be extra careful that you don’t disseminate buggy code. You need something cross-platform and secure, like a scripting language or Java.

If you’re running on an Intranet, you might have a different set of constraints. It’s not uncommon that your machines could all be Intel/Windows platforms. On an Intranet, you’re responsible for the quality of your own code and can repair bugs when they’re discovered. In addition, you might already have a body of legacy code that you’ve been using in a more traditional client/server approach, whereby you must physically install client programs every time you do an upgrade. The time wasted in installing upgrades is the most compelling reason to move to browsers because upgrades are invisible and automatic. If you are involved in such an Intranet, the most sensible approach to take is ActiveX rather than trying to recode your programs in a new language.

When faced with this bewildering array of solutions to the client-side programming problem, the best plan of attack is a cost-benefit analysis. Consider the constraints of your problem and what would be the fastest way to get to your solution. Since client-side programming is still programming, it’s always a good idea to take the fastest development approach for your particular situation. This is an aggressive stance to prepare for inevitable encounters with the problems of program development.

This whole discussion has ignored the issue of server-side programming. What happens when you make a request of a server? Most of the time the request is simply “send me this file.” Your browser then interprets the file in some appropriate fashion: as an HTML page, a graphic image, a Java applet, a script program, etc. A more complicated request to a server generally involves a database transaction. A common scenario involves a request for a complex database search, which the server then formats into an HTML page and sends to you as the result. (Of course, if the client has more intelligence via Java or a scripting language, the raw data can be sent and formatted at the client end, which will be faster and less load on the server.) Or you might want to register your name in a database when you join a group or place an order, which will involve changes to that database. These database requests must be processed via some code on the server side, which is generally referred to as server-side programming. Traditionally, server-side programming has been performed using Perl and CGI scripts, but more sophisticated systems have been appearing. These include Java-based Web servers that allow you to perform all your server-side programming in Java by writing what are called servlets.
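The “search, then format the result as an HTML page” step can be sketched in plain Java. This is an illustration of the idea only, not the servlet API: the class `SearchPageSketch`, its methods, and the in-memory list that stands in for a database are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of server-side work: run a search, then format the
// matches into an HTML page to send back to the browser.
class SearchPageSketch {

    // Stand-in for a database query: keep the records that contain the term,
    // ignoring case.
    static List<String> search(List<String> records, String term) {
        List<String> matches = new ArrayList<>();
        for (String record : records) {
            if (record.toLowerCase().contains(term.toLowerCase())) {
                matches.add(record);
            }
        }
        return matches;
    }

    // Escape the characters that have special meaning in HTML text.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Format the matches as a simple HTML page.
    static String toHtml(String term, List<String> matches) {
        StringBuilder page = new StringBuilder();
        page.append("<html><body>\n");
        page.append("<h1>Results for ").append(escape(term)).append("</h1>\n");
        page.append("<ul>\n");
        for (String match : matches) {
            page.append("<li>").append(escape(match)).append("</li>\n");
        }
        page.append("</ul>\n</body></html>\n");
        return page.toString();
    }

    public static void main(String[] args) {
        List<String> records =
                Arrays.asList("Thinking in Java", "Thinking in C++", "Design Patterns");
        System.out.print(toHtml("thinking", search(records, "thinking")));
    }
}
```

In a real system the `search` step would be a database transaction and the page would be written to the network response rather than printed.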

Most of the brouhaha over Java has been about applets. Java is actually a general-purpose programming language that can solve any type of problem, at least in theory. And as pointed out previously, there might be more effective ways to solve most client/server problems. When you move out of the applet arena (and simultaneously release the restrictions, such as the one against writing to disk) you enter the world of general-purpose applications that run standalone, without a Web browser, just like any ordinary program does. Here, Java’s strength is not only in its portability, but also its programmability. As you’ll see throughout this book, Java has many features that allow you to create robust programs in a shorter period than with previous programming languages.

Be aware that this is a mixed blessing. You pay for the improvements through slower execution speed (although there is significant work going on in this area). Like any language, Java has built-in limitations that might make it inappropriate to solve certain types of programming problems. Java is a rapidly-evolving language, however, and as each new release comes out it becomes more and more attractive for solving larger sets of problems.

The object-oriented paradigm is a new and different way of thinking about programming and many folks have trouble at first knowing how to approach a project. Now that you know that everything is supposed to be an object, you can create a “good” design, one that will take advantage of all the benefits that OOP has to offer.

Books on OOP analysis and design are coming out of the woodwork. Most of these books are filled with lots of long words, awkward prose and important-sounding pronouncements.[9] I come away thinking the book would be better as a chapter or at the most a very short book and feeling annoyed that this process couldn’t be described simply and directly. (It disturbs me that people who purport to specialize in managing complexity have such trouble writing clear and simple books.) After all, the whole point of OOP is to make the process of software development easier, and although it would seem to threaten the livelihood of those of us who consult because things are complex, why not make it simple? So, hoping I’ve built a healthy skepticism within you, I shall endeavor to give you my own perspective on analysis and design in as few paragraphs as possible.

While you’re going through the development process, the most important issue is this: don’t get lost. It’s easy to do. Most of these methodologies are designed to solve the largest of problems. (This makes sense; these are the especially difficult projects that justify calling in that author as consultant, and justify the author’s large fees.) Remember that most projects don’t fit into that category, so you can usually have a successful analysis and design with a relatively small subset of what a methodology recommends. But some sort of process, no matter how limited, will generally get you on your way in a much better fashion than simply beginning to code.

The first step is to decide what steps you’re going to have in your process. It sounds simple (in fact, all of this sounds simple) and yet, often, people don’t even get around to phase one before they start coding. If your plan is “let’s jump in and start coding,” fine. (Sometimes that’s appropriate when you have a well-understood problem.) At least agree that this is the plan.

You might also decide at this phase that some additional process structure is necessary but not the whole nine yards. Understandably enough, some programmers like to work in “vacation mode” in which no structure is imposed on the process of developing their work: “It will be done when it’s done.” This can be appealing for a while, but I’ve found that having a few milestones along the way helps to focus and galvanize your efforts around those milestones instead of being stuck with the single goal of “finish the project.” In addition, it divides the project into more bite-sized pieces and makes it seem less threatening.

When I began to study story structure (so that I will someday write a novel) I was initially resistant to the idea, feeling that when I wrote I simply let it flow onto the page. What I found was that when I wrote about computers the structure was simple enough so I didn’t need to think much about it, but I was still structuring my work, albeit only semi-consciously in my head. So even if you think that your plan is to just start coding, you still go through the following phases while asking and answering certain questions.

In the previous generation of program design (procedural design), this would be called “creating the requirements analysis and system specification.” These, of course, were places to get lost: intimidatingly-named documents that could become big projects in their own right. Their intention was good, however. The requirements analysis says “Make a list of the guidelines we will use to know when the job is done and the customer is satisfied.” The system specification says “Here’s a description of what the program will do (not how) to satisfy the requirements.” The requirements analysis is really a contract between you and the customer (even if the customer works within your company or is some other object or system). The system specification is a top-level exploration into the problem and in some sense a discovery of whether it can be done and how long it will take. Since both of these will require consensus among people, I think it’s best to keep them as bare as possible – ideally, to lists and basic diagrams – to save time. You might have other constraints that require you to expand them into bigger documents.

It’s necessary to stay focused on the heart of what you’re trying to accomplish in this phase: determine what the system is supposed to do. The most valuable tool for this is a collection of what are called “use-cases.” These are essentially descriptive answers to questions that start with “What does the system do if ...” For example, “What does the auto-teller do if a customer has just deposited a check within 24 hours and there’s not enough in the account without the check to provide the desired withdrawal?” The use-case then describes what the auto-teller does in that case.
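One way to pin down a use-case is to write it as a small executable scenario. The sketch below is hypothetical throughout: the text poses the auto-teller question but does not give the answer, so the rule used here (a recently deposited check is held and cannot be withdrawn against) is an assumption, as are the class `AccountSketch` and its method names.

```java
// Hypothetical sketch: one auto-teller use-case written as code.
// Assumption (not from the text): a check deposited within the last 24 hours
// is "on hold" and does not count toward the amount that can be withdrawn.
class AccountSketch {
    private long clearedCents;   // money that is available now
    private long heldCents;      // recent check deposits, not yet cleared

    AccountSketch(long clearedCents) {
        this.clearedCents = clearedCents;
    }

    // A newly deposited check goes on hold rather than into cleared funds.
    void depositCheck(long cents) {
        heldCents += cents;
    }

    // Returns true and debits the account only if cleared funds cover the amount.
    boolean withdraw(long cents) {
        if (cents <= 0 || cents > clearedCents) {
            return false;
        }
        clearedCents -= cents;
        return true;
    }

    // Cleared funds plus held checks.
    long totalCents() {
        return clearedCents + heldCents;
    }

    public static void main(String[] args) {
        AccountSketch account = new AccountSketch(5000);   // 50.00 cleared
        account.depositCheck(10000);                       // 100.00 check, on hold
        System.out.println("withdraw 80.00 allowed: " + account.withdraw(8000));
        System.out.println("withdraw 30.00 allowed: " + account.withdraw(3000));
    }
}
```

Whatever the real answer turns out to be, writing the scenario down this concretely forces the question to be asked of the customer.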

You try to discover a full set of use-cases for your system, and once you’ve done that you’ve got the core of what the system is supposed to do. The nice thing about focusing on use-cases is that they always bring you back to the essentials and keep you from drifting off into issues that aren’t critical for getting the job done. That is, if you have a full set of use-cases you can describe your system and move onto the next phase. You probably won’t get it all figured out perfectly at this phase, but that’s OK. Everything will reveal itself in the fullness of time, and if you demand a perfect system specification at this point you’ll get stuck.

It helps to kick-start this phase by describing the system in a few paragraphs and then looking for nouns and verbs. The nouns become the objects and the verbs become the methods in the object interfaces. You’ll be surprised at how useful a tool this can be; sometimes it will accomplish the lion’s share of the work for you.
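As a sketch of the noun-and-verb technique, take the hypothetical one-line description “A member borrows a book from the library and later returns it.” The nouns suggest the classes `Member`, `Book` and `Library`, and the verbs suggest the methods `borrow` and `returnBook`. None of these names come from the text; they are only an illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical description: "A member borrows a book from the library and
// later returns it." Nouns become classes, verbs become methods.

class Member {
    final String name;
    Member(String name) { this.name = name; }
}

class Book {
    final String title;
    Book(String title) { this.title = title; }
}

class Library {
    // Which member currently holds each borrowed book.
    private final Map<Book, Member> loans = new HashMap<>();

    // Verb "borrow": succeeds only if nobody else has the book.
    boolean borrow(Member member, Book book) {
        if (loans.containsKey(book)) {
            return false;
        }
        loans.put(book, member);
        return true;
    }

    // Verb "return" ("return" is a Java keyword, hence the longer name).
    // Succeeds only if this member is the one holding the book.
    boolean returnBook(Member member, Book book) {
        return loans.remove(book, member);
    }

    boolean isAvailable(Book book) {
        return !loans.containsKey(book);
    }

    public static void main(String[] args) {
        Library library = new Library();
        Member ann = new Member("Ann");
        Book book = new Book("Thinking in Java");
        System.out.println("borrowed: " + library.borrow(ann, book));
        System.out.println("available: " + library.isAvailable(book));
        System.out.println("returned: " + library.returnBook(ann, book));
    }
}
```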

Although it’s a black art, at this point some kind of scheduling can be quite useful. You now have an overview of what you’re building so you’ll probably be able to get some idea of how long it will take. A lot of factors come into play here: if you estimate a long schedule then the company might not decide to build it, or a manager might have already decided how long the project should take and will try to influence your estimate. But it’s best to have an honest schedule from the beginning and deal with the tough decisions early. There have been a lot of attempts to come up with accurate scheduling techniques (like techniques to predict the stock market), but probably the best approach is to rely on your experience and intuition. Get a gut feeling for how long it will really take, then double that and add 10 percent. Your gut feeling is probably correct; you can get something working in that time. The “doubling” will turn that into something decent, and the 10 percent will deal with final polishing and details. However you want to explain it, and regardless of the moans and manipulations that happen when you reveal such a schedule, it just seems to work out that way.
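The rule of thumb is simple arithmetic, sketched here as a tiny helper. The text does not say whether the 10 percent applies to the original gut feeling or to the doubled figure; this sketch assumes the doubled figure, which gives a multiplier of 2.2. The class and method names are hypothetical.

```java
// Hypothetical helper for the rule of thumb "double it and add 10 percent".
// Assumption: the 10 percent is taken from the doubled figure, so the
// multiplier is 2 * 1.10 = 2.2. The text does not say which base to use.
class ScheduleSketch {

    // Returns the padded estimate in whole days, rounded up.
    static int paddedDays(int gutFeelingDays) {
        if (gutFeelingDays < 0) {
            throw new IllegalArgumentException("days must not be negative");
        }
        // Integer arithmetic: multiply by 220, then divide by 100 rounding up.
        return (gutFeelingDays * 220 + 99) / 100;
    }

    public static void main(String[] args) {
        System.out.println("gut feeling 10 days, schedule " + paddedDays(10) + " days");
        System.out.println("gut feeling 7 days, schedule " + paddedDays(7) + " days");
    }
}
```

Under that assumption a 10-day gut feeling becomes 22 days, and a 7-day gut feeling (15.4 days after padding) rounds up to 16.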

In this phase you must come up with a design that describes what the classes look like and how they will interact. A useful diagramming tool that has evolved over time is the Unified Modeling Language (UML). You can get the specification for UML at www.rational.com. UML can also be helpful as a descriptive tool during phase 1, and some of the diagrams you create there will probably show up unmodified in phase 2. You don’t need to use UML, but it can be helpful, especially if you want to put a diagram up on the wall for everyone to ponder, which is a good idea. An alternative to UML is a textual description of the objects and their interfaces (as I described in Thinking in C++), but this can be limiting.

The most successful consulting experiences I’ve had when coming up with an initial design involved standing in front of a team that hadn’t built an OOP project before and drawing objects on a whiteboard. We talked about how the objects should communicate with each other, and erased some of them and replaced them with other objects. The team (who knew what the project was supposed to do) actually created the design; they “owned” the design rather than having it given to them. All I was doing was guiding the process by asking the right questions, trying out the assumptions and taking the feedback from the team to modify those assumptions. The true beauty of the process was that the team learned how to do object-oriented design not by reviewing abstract examples, but by working on the one design that was most interesting to them at that moment: theirs.

You’ll know you’re done with phase 2 when you have described the objects and their interfaces. Well, most of them – there are usually a few that slip through the cracks and don’t make themselves known until phase 3. But that’s OK. All you are concerned with is that you eventually discover all of your objects. It’s nice to discover them early in the process but OOP provides enough structure so that it’s not so bad if you discover them later.

If you’re reading this book you’re probably a programmer, so now we’re at the part you’ve been trying to get to. By following a plan – no matter how simple and brief – and coming up with design structure before coding, you’ll discover that things fall together far more easily than if you dive in and start hacking, and this provides a great deal of satisfaction. Getting code to run and do what you want is fulfilling, even like some kind of drug if you look at the obsessive behavior of some programmers. But it’s my experience that coming up with an elegant solution is deeply satisfying at an entirely different level; it feels closer to art than technology. And elegance always pays off; it’s not a frivolous pursuit. Not only does it give you a program that’s easier to build and debug, but it’s also easier to understand and maintain, and that’s where the financial value lies.

After you build the system and get it running, it’s important to do a reality check, and here’s where the requirements analysis and system specification come in. Go through your program and make sure that all the requirements are checked off, and that all the use-cases work the way they’re described. Now you’re done. Or are you?

This is the point in the development cycle that has traditionally been called “maintenance,” a catch-all term that can mean everything from “getting it to work the way it was really supposed to in the first place” to “adding features that the customer forgot to mention before” to the more traditional “fixing the bugs that show up” and “adding new features as the need arises.” So many misconceptions have been applied to the term “maintenance” that it has taken on a slightly deceiving quality, partly because it suggests that you’ve actually built a pristine program and that all you need to do is change parts, oil it and keep it from rusting. Perhaps there’s a better term to describe what’s going on.

The term is iteration. That is, “You won’t get it right the first time, so give yourself the latitude to learn and to go back and make changes.” You might need to make a lot of changes as you learn and understand the problem more deeply. The elegance you’ll produce if you iterate until you’ve got it right will pay off, both in the short and the long run.

What it means to “get it right” isn’t just that the program works according to the requirements and the use-cases. It also means that the internal structure of the code makes sense to you, and feels like it fits together well, with no awkward syntax, oversized objects or ungainly exposed bits of code. In addition, you must have some sense that the program structure will survive the changes that it will inevitably go through during its lifetime, and that those changes can be made easily and cleanly. This is no small feat. You must not only understand what you’re building, but also how the program will evolve (what I call the vector of change). Fortunately, object-oriented programming languages are particularly adept at supporting this kind of continuing modification – the boundaries created by the objects are what tend to keep the structure from breaking down. They are also what allow you to make changes that would seem drastic in a procedural program without causing earthquakes throughout your code. In fact, support for iteration might be the most important benefit of OOP.
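A small sketch can show how an object boundary contains a change. Here a hypothetical `PriceList` interface is the boundary; the first iteration stores its data in two parallel lists, a later iteration replaces that with a hash map, and the client code in `Checkout` is the same for both. All names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the interface is the boundary. What sits behind it
// can be redesigned without touching client code.
interface PriceList {
    void setPrice(String item, int cents);
    int priceOf(String item);   // returns 0 for an unknown item
}

// First iteration: two parallel lists and a linear search.
class ListPriceList implements PriceList {
    private final List<String> items = new ArrayList<>();
    private final List<Integer> prices = new ArrayList<>();

    public void setPrice(String item, int cents) {
        int index = items.indexOf(item);
        if (index >= 0) {
            prices.set(index, cents);
        } else {
            items.add(item);
            prices.add(cents);
        }
    }

    public int priceOf(String item) {
        int index = items.indexOf(item);
        return index >= 0 ? prices.get(index) : 0;
    }
}

// Later iteration: the storage is replaced by a hash map.
class MapPriceList implements PriceList {
    private final Map<String, Integer> prices = new HashMap<>();

    public void setPrice(String item, int cents) {
        prices.put(item, cents);
    }

    public int priceOf(String item) {
        return prices.getOrDefault(item, 0);
    }
}

// Client code: written once against the interface, unaffected by the change.
class Checkout {
    static int total(PriceList prices, String... basket) {
        int sum = 0;
        for (String item : basket) {
            sum += prices.priceOf(item);
        }
        return sum;
    }

    public static void main(String[] args) {
        for (PriceList prices : new PriceList[] {new ListPriceList(), new MapPriceList()}) {
            prices.setPrice("tea", 250);
            prices.setPrice("milk", 120);
            System.out.println(total(prices, "tea", "milk", "tea"));
        }
    }
}
```

The redesign of the storage is exactly the kind of change that might ripple widely through a procedural program; here it stays behind the interface.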

With iteration, you create something that at least approximates what you think you’re building, and then you kick the tires, compare it to your requirements and see where it falls short. Then you can go back and fix it by redesigning and re-implementing the portions of the program that didn’t work right.[10] You might actually need to solve the problem, or an aspect of the problem, several times before you hit on the right solution. (A study of Design Patterns, described in Chapter 16, is usually helpful here.)

Iteration also occurs when you build a system, see that it matches your requirements and then discover it wasn’t actually what you wanted. When you see the system, you realize you want to solve a different problem. If you think this kind of iteration is going to happen, then you owe it to yourself to build your first version as quickly as possible so you can find out if it’s what you want.

Iteration is closely tied to incremental development. Incremental development means that you start with the core of your system and implement it as a framework upon which to build the rest of the system piece by piece. Then you start adding features one at a time. The trick to this is in designing a framework that will accommodate all the features you plan to add to it. (See Chapter 16 for more insight into this issue.) The advantage is that once you get the core framework working, each feature you add is like a small project in itself rather than part of a big project. Also, new features that are incorporated later in the development or maintenance phases can be added more easily. OOP supports incremental development because if your program is designed well, your increments will turn out to be discrete objects or groups of objects.
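The shape of incremental development can be sketched as a small core that knows nothing about individual features, plus features that arrive one at a time as separate objects. The text-processing domain and the names `Feature`, `Core`, `TrimFeature` and `UpperCaseFeature` are hypothetical, chosen only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a core framework plus features added one at a time.
interface Feature {
    String apply(String text);
}

// The core: it runs whatever features have been added, in order, and
// does not need to change when a new feature arrives.
class Core {
    private final List<Feature> features = new ArrayList<>();

    Core add(Feature feature) {
        features.add(feature);
        return this;   // allows chained calls
    }

    String process(String text) {
        String result = text;
        for (Feature feature : features) {
            result = feature.apply(result);
        }
        return result;
    }
}

// First increment: strip surrounding whitespace.
class TrimFeature implements Feature {
    public String apply(String text) {
        return text.trim();
    }
}

// Second increment, added later as its own small project.
class UpperCaseFeature implements Feature {
    public String apply(String text) {
        return text.toUpperCase();
    }
}

class IncrementalDemo {
    public static void main(String[] args) {
        Core core = new Core();
        System.out.println("[" + core.process("  hello  ") + "]");
        core.add(new TrimFeature());
        System.out.println("[" + core.process("  hello  ") + "]");
        core.add(new UpperCaseFeature());
        System.out.println("[" + core.process("  hello  ") + "]");
    }
}
```

Each increment is a discrete object, so the core can be tested and shipped before any particular feature exists.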

Of course you wouldn’t build a house without a lot of carefully-drawn plans. If you build a deck or a dog house, your plans won’t be so elaborate but you’ll still probably start with some kind of sketches to guide you on your way. Software development has gone to extremes. For a long time, people didn’t have much structure in their development, but then big projects began failing. In reaction, we ended up with methodologies that had an intimidating amount of structure and detail. These were too scary to use – it looked like you’d spend all your time writing documents and no time programming. (This was often the case.) I hope that what I’ve shown you here suggests a middle path – a sliding scale. Use an approach that fits your needs (and your personality). No matter how minimal you choose to make it, some kind of plan will make a big improvement in your project as opposed to no plan at all. Remember that, by some estimates, over 50 percent of projects fail.

Java looks a lot like C++, and so naturally it would seem that C++ will be replaced by Java. But I’m starting to question this logic. For one thing, C++ still has some features that Java doesn’t, and although there have been a lot of promises about Java someday being as fast as or faster than C++, the breakthroughs haven’t happened yet (it’s getting steadily faster, but still hasn’t touched C++). Also, there seems to be a perking interest in C++ in many fields, so I don’t think that language is going away any time soon. (Languages seem to hang around. Speaking at one of my “Intermediate/Advanced Java Seminars,” Allen Holub asserted that the two most commonly-used languages are Rexx and COBOL, in that order.)

I’m beginning to think that the strength of Java lies in a slightly different arena than that of C++. C++ is a language that doesn’t try to fit a mold. Certainly it has been adapted in a number of ways to solve particular problems. Some tools combine libraries, component models and code generation tools to solve the problem of developing windowed end-user applications (for Microsoft Windows). And yet, what do the vast majority of Windows developers use? Microsoft’s Visual Basic (VB). This despite the fact that VB produces the kind of code that becomes unmanageable when the program is only a few pages long (and syntax that can be positively mystifying). As successful and popular as VB is, from a language design viewpoint it’s a mountain of hacks. It would be nice to have the ease and power of VB without the resulting unmanageable code. And that’s where I think Java will shine: as the “next VB.” You may or may not shudder to hear this, but think about it: so much of Java is designed to make it easy for the programmer to solve application-level problems like networking and cross-platform UI, and yet it has a language design intended to allow the creation of very large and flexible bodies of code. Add to this the fact that Java has the most robust type checking and error-handling systems I’ve ever seen in a language and you have the makings of a significant leap forward in programming productivity.

Should you use Java instead of C++ for your project? Other than Web applets, there are two issues to consider. First, if you want to use a lot of existing libraries (and you’ll certainly get a lot of productivity gains there), or if you have an existing C or C++ code base, Java might slow your development down rather than speeding it up. If you’re developing all your code primarily from scratch, then the simplicity of Java over C++ will shorten your development time.

The biggest issue is speed. Interpreted Java has been slow, even 20 to 50 times slower than C in the original Java interpreters. This has improved quite a bit over time, but it will still remain an important number. Computers are about speed; if it weren’t significantly faster to do something on a computer then you’d do it by hand. (I’ve even heard it suggested that you start with Java, to gain the short development time, then use a tool and support libraries to translate your code to C++, if you need faster execution speed.)

The key to making Java feasible for most non-Web development projects is the appearance of speed improvements like so-called “just-in-time” (JIT) compilers and possibly even native code compilers (two of which exist at this writing). Of course, native-code compilers will eliminate the touted cross-platform execution of the compiled programs, but they will also bring the speed of the executable closer to that of C and C++. And cross compiling programs in Java should be a lot easier than doing so in C or C++. (In theory, you just recompile, but that promise has been made before for other languages.)

[9] The best introduction is still Grady Booch’s Object-Oriented Design with Applications, 2nd edition, Wiley & Sons 1996. His insights are clear and his prose is straightforward, although his notations are needlessly complex for most designs. (You can easily get by with a subset.)

[10] This is something like “rapid prototyping,” where you were supposed to build a quick-and-dirty version so that you could learn about the system, and then throw away your prototype and build it right. The trouble with rapid prototyping is that people didn’t throw away the prototype, but instead built upon it. Combined with the lack of structure in procedural programming, this often led to messy systems that were expensive to maintain.

Thinking in Java, 1st edition

©1998 by Bruce Eckel

1: Introduction to objects

The progress of abstraction

An object has an interface

The hidden implementation

Reusing the implementation

Inheritance: reusing the interface

Overriding base-class functionality

Is-a vs. is-like-a relationships

Interchangeable objects with polymorphism

Dynamic binding

Abstract base classes and interfaces

Object landscapes and lifetimes

Collections and iterators

The singly-rooted hierarchy

Collection libraries and support for easy collection use

Downcasting vs. templates/generics

The housekeeping dilemma: who should clean up?

Garbage collectors vs. efficiency and flexibility

Exception handling: dealing with errors

Multithreading

Persistence

Java and the Internet

What is the Web?

Client/Server computing

The Web as a giant server

Client-side programming[8]

Plug-ins

Scripting languages

Java

ActiveX

Security

Internet vs. Intranet

Server-side programming

A separate arena: applications

Analysis and Design

Staying on course

Phase 0: Let’s make a plan

Phase 1: What are we making?

Phase 2: How will we build it?

Phase 3: Let’s build it!

Phase 4: Iteration

Plans pay off

Java vs. C++?
