Why has object-oriented
programming had such a sweeping impact on the software development community?
Object-oriented
programming appeals at multiple levels. For managers, it promises faster and
cheaper development and maintenance. For analysts and designers, the modeling
process becomes simpler and produces a clear, manageable design. For
programmers, the elegance and clarity of the object model and the power of
object-oriented tools and libraries make programming a much more pleasant task,
and programmers experience an increase in productivity. Everybody wins, it would
seem.
If there’s a downside, it is
the expense of the learning curve. Thinking in objects is a dramatic departure
from thinking procedurally, and the process of
designing objects is much more challenging than
procedural design, especially if you’re trying to create reusable
objects. In the past, a novice practitioner of object-oriented
programming was faced with a choice between two daunting tasks:
It
is, in fact, difficult to design objects well – for that matter,
it’s hard to design anything well. But the intent is that
relatively few experts design the best objects for others to consume. Successful
OOP languages incorporate not just language syntax and a compiler, but an entire
development environment including a significant library of well-designed,
easy-to-use objects. Thus, the primary job of most programmers is to use
existing objects to solve their application problems. The goal of this chapter
is to show you what object-oriented programming is and how simple it can
be.
This chapter will introduce many of
the ideas of Java and object-oriented programming on a conceptual level, but
keep in mind that you’re not expected to be able to write full-fledged
Java programs after reading this chapter. All the detailed descriptions and
examples will follow throughout the course of this
book.
All programming languages provide
abstractions. It can be argued that the complexity of the problems you can solve
is directly related to the kind and quality of
abstraction. By “kind” I mean: what is it that
you are abstracting? Assembly language is a small abstraction of the underlying
machine. Many so-called “imperative” languages that followed (such
as FORTRAN, BASIC, and C) were abstractions of assembly language. These
languages are big improvements over assembly language, but their primary
abstraction still requires you to think in terms of the structure of the
computer rather than the structure of the problem you are trying to solve. The
programmer must establish the association between the machine model (in the
“solution space”) and the model of the problem that is actually
being solved (in the “problem space”). The effort required to
perform this mapping, and the fact that it is extrinsic to the programming
language, produces programs that are difficult to write and expensive to
maintain, and as a side effect created the entire “programming
methods” industry.
The alternative to modeling the
machine is to model the problem you’re trying to solve. Early languages
such as LISP and APL chose particular views of the world (“all problems
are ultimately lists” or “all problems are algorithmic”).
PROLOG casts all problems into chains of decisions. Languages have been created
for constraint-based programming and for programming exclusively by manipulating
graphical symbols. (The latter proved to be too restrictive.) Each of these
approaches is a good solution to the particular class of problem it is
designed to solve, but when you step outside of that domain it becomes awkward.
The object-oriented approach goes
a step further by providing tools for the programmer to represent elements in
the problem space. This representation is general enough
that the programmer is not constrained to any particular type of problem. We
refer to the elements in the problem space and their representations in the
solution space as “objects.” (Of course, you
will also need other objects that don’t have problem-space analogs.) The
idea is that the program is allowed to adapt itself to the lingo of the problem
by adding new types of objects, so when you read the code describing the
solution, you’re reading words that also express the problem. This is a
more flexible and powerful language abstraction than what we’ve had
before. Thus OOP allows you to describe the problem in terms of the problem,
rather than in the terms of the solution. There’s still a connection back
to the computer, though. Each object looks quite a bit like a little computer;
it has a state, and it has operations you can ask it to perform. However, this
doesn’t seem like such a bad analogy to objects in the real world; they
all have characteristics and behaviors.
Alan Kay summarized five basic
characteristics of Smalltalk, the first successful
object-oriented language and one of the languages upon which Java is based.
These characteristics represent a pure approach to object-oriented
programming:
Some language
designers have decided that object-oriented programming itself is not adequate
to easily solve all programming problems, and advocate the combination of
various approaches into multiparadigm programming
languages.[2]
Aristotle was probably the first to
begin a careful study of the concept of type. He was known to speak of
“the class of fishes and the class of birds.” The concept that all
objects, while being unique, are also part of a set of objects that have
characteristics and behaviors in common was directly used in the first
object-oriented language, Simula-67, with its fundamental keyword class
that introduces a new type into a program (thus class and type are
often used
synonymously[3]).
Simula, as its name implies, was
created for developing simulations such as the classic “bank teller
problem.” In this, you have a bunch of tellers, customers, accounts,
transactions, etc. The members (elements) of each class share some commonality:
every account has a balance, every teller can accept a deposit, etc. At the same
time, each member has its own state; each account has a different balance, each
teller has a name. Thus the tellers, customers, accounts, transactions, etc. can
each be represented with a unique entity in the computer program. This entity is
the object, and each object belongs to a particular class that defines its
characteristics and behaviors.
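The bank-teller idea can be sketched in Java. This is only an illustration under stated assumptions: the Account class, its field names, and the choice to store money as whole cents are hypothetical and do not come from Simula or from any particular library. The point is that the class defines what every account has and can do, while each object keeps its own state.

```java
// Hypothetical Account class for the bank-teller illustration.
class Account {
    private final String owner;   // state: differs from object to object
    private long balanceInCents;  // state: every account has a balance

    Account(String owner, long openingBalanceInCents) {
        this.owner = owner;
        this.balanceInCents = openingBalanceInCents;
    }

    // Behavior shared by every member of the class.
    void deposit(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceInCents += amountInCents;
    }

    long getBalanceInCents() { return balanceInCents; }
    String getOwner() { return owner; }
}

class AccountDemo {
    public static void main(String[] args) {
        Account first = new Account("Ada", 10_000);
        Account second = new Account("Grace", 2_500);
        first.deposit(500);  // changes only the state of 'first'
        System.out.println(first.getOwner() + ": " + first.getBalanceInCents());
        System.out.println(second.getOwner() + ": " + second.getBalanceInCents());
    }
}
```

Both objects answer the same requests, but the deposit affects only the object it was sent to.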
So, although what we really do in
object-oriented programming is create new data types, virtually all
object-oriented programming languages use the “class” keyword. When
you see the word “type” think “class” and vice
versa.
Once a type is established, you can
make as many objects of that type as you like, and then manipulate those objects
as the elements that exist in the problem you are trying to solve. Indeed, one
of the challenges of object-oriented programming is to create a one-to-one
mapping between the elements in the problem space (the place where the
problem actually exists) and the solution space (the place where
you’re modeling that problem, such as a computer).
But how do you get an object to do
useful work for you? There must be a way to make a request of that object so it
will do something, such as complete a transaction, draw something on the screen
or turn on a switch. And each object can satisfy only certain requests. The
requests you can make of an object are defined by its interface, and the
type is what determines the interface. The idea of type being equivalent to
interface is fundamental in object-oriented programming.
A simple example might be a
representation of a light bulb:
Light lt = new Light();
lt.on();
The name of the type/class is
Light, and the requests that you can make of a Light object are to
turn it on, turn it off, make it brighter or make it dimmer. You create a
“handle” for a Light simply by declaring a name (lt)
for it, and you make an object of type Light with the
new keyword, assigning it to the handle with the = sign. To send a
message to the object, you state the handle name and connect it to the message
name with a period (dot). From the standpoint of the user of a pre-defined
class, that’s pretty much all there is to programming with
objects.
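To make the example runnable you need a definition of Light somewhere. The text only names the requests (on, off, brighter, dimmer), so what follows is a minimal sketch: the method names brighten() and dim(), the 0 to 10 brightness scale, and the two query methods are assumptions added for illustration.

```java
// Hypothetical definition of the Light type described in the text.
class Light {
    private boolean lit = false;   // hidden state
    private int brightness = 5;    // assumed scale: 0 (dimmest) to 10 (brightest)

    // The interface: the requests you can make of a Light.
    void on()       { lit = true; }
    void off()      { lit = false; }
    void brighten() { if (brightness < 10) brightness++; }
    void dim()      { if (brightness > 0) brightness--; }

    // Query methods added so the effect of a request can be observed.
    boolean isOn()      { return lit; }
    int getBrightness() { return brightness; }
}

class LightDemo {
    public static void main(String[] args) {
        Light lt = new Light();  // create the object and attach it to the handle
        lt.on();                 // send the "on" message
        lt.brighten();
        System.out.println("on=" + lt.isOn() + ", brightness=" + lt.getBrightness());
    }
}
```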
It is helpful to break up the
playing field into class creators (those who create new data types) and
client
programmers[4]
(the class consumers who use the data types in their applications). The goal of
the client programmer is to collect a toolbox full of classes to use for rapid
application development. The goal of the class creator is to build a class that
exposes only what’s necessary to the client programmer and keeps
everything else hidden. Why? If it’s hidden, the client programmer
can’t use it, which means that the class creator can change the hidden
portion at will without worrying about the impact to anyone
else.
The interface establishes
what requests you can make for a particular object. However, there must
be code somewhere to satisfy that request. This, along with the hidden data,
comprises the implementation. From a procedural programming standpoint,
it’s not that complicated. A type has a function associated with each
possible request, and when you make a particular request to an object, that
function is called. This process is often summarized by saying that you
“send a message” (make a request) to an object, and the object
figures out what to do with that message (it executes code).
In any relationship it’s
important to have boundaries that are respected by all parties involved. When
you create a library, you establish a relationship with the client
programmer, who is another
programmer, but one who is putting together an application or using your library
to build a bigger library.
If all the members of a class are
available to everyone, then the client programmer can do anything with that
class and there’s no way to force any particular behaviors. Even though
you might really prefer that the client programmer not directly manipulate some
of the members of your class, without access control there’s no way to
prevent it. Everything’s naked to the world.
There are two reasons for
controlling access to members. The
first is to keep client programmers’ hands off portions they
shouldn’t touch – parts that are necessary for the internal
machinations of the data type but not part of the interface that users need to
solve their particular problems. This is actually a service to users because
they can easily see what’s important to them and what they can
ignore.
The second reason for access
control is to allow the library designer to change the internal workings of the
structure without worrying about how it will affect the client programmer. For
example, you might implement a particular class in a simple fashion to ease
development, and then later decide you need to rewrite it to make it run faster.
If the interface and implementation are clearly separated and protected, you can
accomplish this without requiring the user to change any client code.
Java uses three explicit keywords
and one implied keyword to set the boundaries in a class: public,
private, protected and the implied “friendly,” which
is what you get if you don’t specify one of the other keywords. Their use
and meaning are remarkably straightforward. These access specifiers
determine who
can use the definition that follows. public means
the following definition is available to everyone. The private
keyword, on the other hand, means that no one can access
that definition except you, the creator of the type, inside function members of
that type. private is a brick wall between you and the client programmer.
If someone tries to access a private member, they’ll get a compile-time
error. “Friendly” has to do with something called a
“package,” which is Java’s way of making libraries. If
something is “friendly” it’s available only within the
package. (Thus this access level is sometimes referred to as “package
access.”) protected acts much like private, with the
exception that an inheriting class has access to protected members, but
not private members. (In Java, protected members are also available to
other classes in the same package.) Inheritance will be covered
shortly.
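Here is a small sketch of the four access levels side by side. The Thermometer and LabeledThermometer classes are hypothetical, invented only to show where each kind of member can be reached, and the validation rule is an assumption.

```java
// Hypothetical class showing the four access levels.
class Thermometer {
    public static final double ABSOLUTE_ZERO_CELSIUS = -273.15; // available to everyone

    private double celsius;   // only code inside Thermometer can touch this
    protected String label;   // inheriting classes (and, in Java, the same package)
    int readings;             // no keyword: "friendly" (package) access

    public Thermometer(String label, double celsius) {
        this.label = label;
        this.celsius = checked(celsius);
    }

    public double getCelsius() {
        readings++;
        return celsius;
    }

    public void setCelsius(double celsius) {
        this.celsius = checked(celsius);
    }

    // Private helper: part of the implementation, not of the interface.
    private static double checked(double celsius) {
        if (celsius < ABSOLUTE_ZERO_CELSIUS) {
            throw new IllegalArgumentException("below absolute zero");
        }
        return celsius;
    }
}

class LabeledThermometer extends Thermometer {
    LabeledThermometer(String label, double celsius) {
        super(label, celsius);
    }

    String describe() {
        // label is protected, so the derived class may use it directly;
        // celsius is private, so it must go through the public interface.
        return label + ": " + getCelsius() + " C";
    }
}
```

Because celsius is private, a line such as `t.celsius = 3;` written outside Thermometer would be rejected by the compiler, which is what lets the class creator enforce the absolute-zero check.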
Once a class has been created and
tested, it should (ideally) represent a useful unit of code. It turns out that
this reusability is not nearly so easy to achieve as many would hope; it takes
experience and insight to achieve a good design. But once you have such a
design, it begs to be reused. Code reuse is arguably the greatest leverage that
object-oriented programming languages provide.
The simplest way to reuse a class
is to just use an object of that class directly, but you can also place an
object of that class inside a new class. We call this “creating a member
object.” Your new class can be made up of any number and type of other
objects, whatever is necessary to achieve the functionality desired in your new
class. This concept is called composition, since you are composing a new
class from existing classes. Sometimes composition is referred to as a
“has-a” relationship, as in “a car has a
trunk.”
Composition comes with a great deal
of flexibility. The member objects of your new class are usually private,
making them inaccessible to client programmers using the class. This allows you
to change those members without disturbing existing client code. You can also
change the member objects at run time, which provides great flexibility.
Inheritance, which is described next, does not have this flexibility since the
compiler must place restrictions on classes created with
inheritance.
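The “a car has a trunk” relationship might look like this in code. The Engine, Trunk and Car classes and the swapEngine() method are hypothetical names chosen for the sketch. The member objects are private, and one of them is replaced while the program runs.

```java
// Hypothetical classes illustrating composition.
class Engine {
    private final int horsepower;
    Engine(int horsepower) { this.horsepower = horsepower; }
    int getHorsepower() { return horsepower; }
}

class Trunk {
    private int items;
    void load() { items++; }
    int getItems() { return items; }
}

class Car {
    // Member objects: a Car has an Engine and has a Trunk.
    private Engine engine = new Engine(90);
    private final Trunk trunk = new Trunk();

    void loadTrunk()   { trunk.load(); }
    int itemsInTrunk() { return trunk.getItems(); }
    int power()        { return engine.getHorsepower(); }

    // The member object can be replaced at run time.
    void swapEngine(Engine replacement) { engine = replacement; }
}

class CarDemo {
    public static void main(String[] args) {
        Car car = new Car();
        car.loadTrunk();
        car.swapEngine(new Engine(150));
        System.out.println(car.power() + " hp, " + car.itemsInTrunk() + " item in trunk");
    }
}
```

Client code sees only the Car interface, so the Engine and Trunk classes could be rewritten or exchanged without disturbing it.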
Because inheritance is so important
in object-oriented programming it is often highly emphasized, and the new
programmer can get the idea that inheritance should be used everywhere. This can
result in awkward and overcomplicated designs. Instead, you should first look to
composition when creating new classes, since it is simpler and more flexible. If
you take this approach, your designs will stay cleaner. It will be reasonably
obvious when you need
inheritance.
By itself, the concept of an object
is a convenient tool. It allows you to package data and functionality together
by concept, so you can represent an appropriate problem-space idea rather
than being forced to use the idioms of the underlying machine. These concepts
are expressed in the primary idea of the programming language as a data type
(using the class keyword).
It seems a pity, however, to go to
all the trouble to create a data type and then be forced to create a brand new
one that might have similar functionality. It’s nicer if we can take the
existing data type, clone it and make additions and modifications to the clone.
This is effectively what you get with inheritance, with the exception
that if the original class (called the base or super or
parent class) is changed, the modified “clone” (called the
derived or inherited or sub or child class)
also reflects the appropriate changes. Inheritance is implemented in Java with
the extends keyword. You make a new class and you say that it
extends an existing class.
When you inherit you create a new
type, and the new type contains not only all the members of the existing type
(although the private ones are hidden away and inaccessible), but more
importantly it duplicates the interface of the base class. That is, all the
messages you can send to objects of the base class you can also send to objects
of the derived class. Since we know the type of a class by the messages we can
send to it, this means that the derived class is the same type as the base
class. This type equivalence via inheritance is one of the fundamental
gateways in understanding the meaning of object-oriented
programming.
Since both the base class and
derived class have the same interface, there must be some implementation to go
along with that interface. That is, there must be a method to execute when an
object receives a particular message. If you simply inherit a class and
don’t do anything else, the methods from the base-class interface come
right along into the derived class. That means objects of the derived class have
not only the same type, they also have the same behavior, which doesn’t
seem particularly interesting.
You have two ways to differentiate
your new derived class from the original base class it inherits from. The first
is quite straightforward: you simply add brand new functions to the derived
class. These new functions are not part of the base class interface. This means
that the base class simply didn’t do as much as you wanted it to, so you
add more functions. This simple and primitive use for inheritance is, at times,
the perfect solution to your problem. However, you should look closely for the
possibility that your base class might need these additional
functions.
Although the extends keyword
implies that you are going to add new functions to the interface, that’s
not necessarily true. The second way to differentiate your new class is to
change the behavior of an existing base-class function. This is referred
to as overriding that function.
To override a function, you simply
create a new definition for the function in the derived class. You’re
saying “I’m using the same interface function here, but I want it to
do something different for my new
type.”
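Both ways of differentiating a derived class can be seen in one short sketch. The Instrument and Guitar classes are hypothetical, and the @Override annotation is a feature of later Java versions than the one this chapter describes. It is optional, but it asks the compiler to confirm that a base-class function really is being overridden.

```java
// Hypothetical base class.
class Instrument {
    String play() { return "generic sound"; }
    String name() { return "instrument"; }
}

// Derived class: inherits name(), overrides play(), adds tune().
class Guitar extends Instrument {
    @Override
    String play() { return "strum"; }               // same interface function, new behavior

    String tune() { return "tuning six strings"; }  // brand new function, not in Instrument
}

class InheritanceDemo {
    public static void main(String[] args) {
        Guitar g = new Guitar();
        System.out.println(g.name());  // inherited unchanged from Instrument
        System.out.println(g.play());  // overridden in Guitar
        System.out.println(g.tune());  // exists only in Guitar
    }
}
```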
There’s a certain debate that
can occur about inheritance: Should inheritance override only base-class
functions? This means that the derived type is exactly the same type as
the base class since it has exactly the same interface. As a result, you can
exactly substitute an object of the derived class for an object of the base
class. This can be thought of as pure substitution. In a sense, this is
the ideal way to treat inheritance. We often refer to the relationship between
the base class and derived classes in this case as an is-a relationship,
because you can say “a circle is a shape.” A test for
inheritance is whether you can state the is-a relationship about the classes and
have it make sense.
There are times when you must add
new interface elements to a derived type, thus extending the interface and
creating a new type. The new type can still be substituted for the base type,
but the substitution isn’t perfect in a sense because your new functions
are not accessible from the base type. This can be described as an
is-like-a relationship; the new type has the interface of the old type
but it also contains other functions, so you can’t really say it’s
exactly the same. For example, consider an air conditioner. Suppose your house
is wired with all the controls for cooling; that is, it has an interface that
allows you to control cooling. Imagine that the air conditioner breaks down and
you replace it with a heat pump, which can both heat and cool. The heat pump
is-like-an air conditioner, but it can do more. Because your house is
wired only to control cooling, it is restricted to communication with the
cooling part of the new object. The interface of the new object has been
extended, and the existing system doesn’t know about anything except the
original interface.
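The air conditioner story translates fairly directly into code. In this sketch the class names AirConditioner, HeatPump and House and the methods cool(), heat() and lowerTemperature() are all assumptions made for illustration.

```java
// Hypothetical classes for the is-like-a relationship.
class AirConditioner {
    String cool() { return "cooling"; }
}

// A HeatPump is-like-an AirConditioner: same interface, plus heat().
class HeatPump extends AirConditioner {
    String heat() { return "heating"; }
}

class House {
    // The house is "wired" only for the AirConditioner interface.
    private final AirConditioner climateUnit;

    House(AirConditioner unit) { climateUnit = unit; }

    String lowerTemperature() { return climateUnit.cool(); }

    // climateUnit.heat() would not compile here: heat() is not part of
    // the AirConditioner interface, even if the object is really a HeatPump.
}

class HeatPumpDemo {
    public static void main(String[] args) {
        House house = new House(new HeatPump());  // substitution works
        System.out.println(house.lowerTemperature());
    }
}
```

The heat pump can be installed without rewiring the house, but the house has no way to ask for heat until its own interface to the unit is extended.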
When you see the substitution
principle it’s easy to feel like that’s the only way to do things,
and in fact it is nice if your design works out that way. But you’ll find
that there are times when it’s equally clear that you must add new
functions to the interface of a derived class. With inspection both cases should
be reasonably
obvious.
Inheritance usually ends up
creating a family of classes, all based on the same uniform interface. We
express this with an inverted tree
diagram:[5]
One of the most important things
you do with such a family of classes is to treat an object of a derived class as
an object of the base class. This is important because it means you can write a
single piece of code that ignores the specific details of type and talks just to
the base class. That code is then decoupled from type-specific
information, and thus is simpler to write and easier to understand. And, if a
new type – a Triangle, for example – is added through
inheritance, the code you write will work just as well for the new type of
Shape as it did on the existing types. Thus the program is
extensible.
Consider the above example. If you
write a function in Java:
void doStuff(Shape s) {
  s.erase();
  // ...
  s.draw();
}
This function speaks to any
Shape, so it is independent of the specific type of object it’s
drawing and erasing. If in some other program we use the doStuff( )
function:
Circle c = new Circle();
Triangle t = new Triangle();
Line l = new Line();
doStuff(c);
doStuff(t);
doStuff(l);
The calls to doStuff( )
automatically work right, regardless of the exact type of the object.
This is actually a pretty amazing
trick. Consider the line:
doStuff(c);
What’s happening here is that
a Circle handle is being passed into a function that’s expecting a
Shape handle. Since a Circle is a Shape it can be
treated as one by doStuff( ). That is, any message that
doStuff( ) can send to a Shape, a Circle can accept.
So it is a completely safe and logical thing to do.
We call this process of treating a
derived type as though it were its base type upcasting. The name cast
is used in the sense of casting into a mold and the up comes from the
way the inheritance diagram is typically arranged, with the base type at the top
and the derived classes fanning out downward. Thus, casting to a base type is
moving up the inheritance diagram: upcasting.
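The fragments above can be assembled into a complete program. This is a sketch under a few assumptions: the bodies of draw() and erase() are invented (they just return a description), and doStuff() returns a String instead of void so that the result of each call is visible.

```java
// Base type: the common interface for all shapes.
class Shape {
    String draw()  { return "Shape.draw"; }
    String erase() { return "Shape.erase"; }
}

class Circle extends Shape {
    @Override String draw()  { return "Circle.draw"; }
    @Override String erase() { return "Circle.erase"; }
}

class Triangle extends Shape {
    @Override String draw()  { return "Triangle.draw"; }
    @Override String erase() { return "Triangle.erase"; }
}

class Line extends Shape {
    @Override String draw()  { return "Line.draw"; }
    @Override String erase() { return "Line.erase"; }
}

class ShapeDemo {
    // Talks only to the base type; knows nothing about Circle, Triangle or Line.
    static String doStuff(Shape s) {
        return s.erase() + ", " + s.draw();
    }

    public static void main(String[] args) {
        Circle c = new Circle();
        Triangle t = new Triangle();
        Line l = new Line();
        System.out.println(doStuff(c));  // upcast: Circle handle passed as Shape
        System.out.println(doStuff(t));
        System.out.println(doStuff(l));
    }
}
```

No cast appears anywhere in main(). Passing a Circle handle where a Shape handle is expected is the upcast.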
An object-oriented program contains
some upcasting somewhere, because that’s how you decouple yourself from
knowing about the exact type you’re working with. Look at the code in
doStuff( ):
s.erase();
// ...
s.draw();
Notice that it doesn’t say
“If you’re a Circle, do this, if you’re a
Square, do that, etc.” If you write that kind of code, which checks
for all the possible types a Shape can actually be, it’s messy and
you need to change it every time you add a new kind of Shape. Here, you
just say “You’re a shape, I know you can erase( )
yourself, do it and take care of the details correctly.”
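To see why the type-checking style is a maintenance problem, compare the two approaches directly. The classes and both helper functions here are hypothetical. The sketch assumes a Triangle was added after drawByChecking() was written and that nobody remembered to update it.

```java
class Shape {
    String draw() { return "shape"; }
}

class Circle extends Shape {
    @Override String draw() { return "circle"; }
}

class Square extends Shape {
    @Override String draw() { return "square"; }
}

// Added later, after the functions below were written.
class Triangle extends Shape {
    @Override String draw() { return "triangle"; }
}

class TypeCheckDemo {
    // The messy style: asks "what are you?" and must be edited
    // every time a new kind of Shape is added.
    static String drawByChecking(Shape s) {
        if (s instanceof Circle) return "circle";
        if (s instanceof Square) return "square";
        return "shape";  // a Triangle falls through to here
    }

    // The object-oriented style: "you're a shape, draw yourself."
    static String drawByAsking(Shape s) {
        return s.draw();
    }

    public static void main(String[] args) {
        Shape t = new Triangle();
        System.out.println(drawByChecking(t));  // stale answer for the new type
        System.out.println(drawByAsking(t));    // correct without any change
    }
}
```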
What’s amazing about the code
in doStuff( ) is that somehow the right thing happens. Calling
draw( ) for Circle causes different code to be executed than
when calling draw( ) for a Square or a Line, but when
the draw( ) message is sent to an anonymous Shape, the
correct behavior occurs based on the actual type that the Shape handle
happens to be connected to. This is amazing because when the Java compiler is
compiling the code for doStuff( ), it cannot know exactly what types
it is dealing with. So ordinarily, you’d expect it to end up calling the
version of erase( ) for Shape, and draw( ) for
Shape and not for the specific Circle, Square, or
Line. And yet the right thing happens. Here’s how it
works.
When you send a message to an
object even though you don’t know what specific type it is, and the right
thing happens, that’s called polymorphism. The process used by
object-oriented programming languages to implement polymorphism is called
dynamic binding. The compiler and run-time system handle the details; all
you need to know is that it happens and more importantly how to design with
it.
Some languages require you to use a
special keyword to enable dynamic binding. In C++ this keyword is
virtual. In Java, you never need to remember to add a keyword because
functions are automatically dynamically bound. So you can always expect that
when you send a message to an object, the object will do the right thing, even
when upcasting is
involved.
Often in a design, you want the
base class to present only an interface for its derived classes. That is,
you don’t want anyone to actually create an object of the base class, only
to upcast to it so that its interface can be used. This is accomplished by
making that class abstract using the abstract keyword. If anyone
tries to make an object of an abstract class, the compiler prevents them.
This is a tool to enforce a particular design.
You can also use the
abstract keyword to describe a method that hasn’t been implemented
yet – as a stub indicating “here is an interface function for all
types inherited from this class, but at this point I don’t have any
implementation for it.” An abstract method may be created only
inside an abstract class. When the class is inherited, that method must
be implemented, or the inherited class becomes abstract as well. Creating
an abstract method allows you to put a method in an interface without
being forced to provide a possibly meaningless body of code for that
method.
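A minimal sketch of an abstract class with one abstract method follows. The area() and describe() methods and the Rectangle class are assumptions chosen for illustration. They show that an abstract class can mix unimplemented methods with ordinary ones that rely on them.

```java
// Hypothetical abstract base class: presents an interface only.
abstract class Shape {
    // No body: every concrete derived class must supply one.
    abstract double area();

    // An ordinary method may rely on the abstract one.
    String describe() { return "area=" + area(); }
}

class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    double area() { return width * height; }
}

class AbstractDemo {
    public static void main(String[] args) {
        // Shape s = new Shape();  // would not compile: Shape is abstract
        Shape s = new Rectangle(2.0, 3.0);  // upcasting to the abstract type is fine
        System.out.println(s.describe());
    }
}
```

If Rectangle left out area(), the compiler would insist that Rectangle be declared abstract too.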
The interface keyword takes
the concept of an abstract class one step further by preventing any
function definitions at all. The interface is a very useful and
commonly-used tool, as it provides the perfect separation of interface and
implementation. In addition, you can combine many interfaces together, if you
wish. (You cannot inherit from more than one regular class or abstract
class.)
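Combining interfaces might look like the following sketch. Drawable, Resizable and this Circle class are hypothetical names. Methods declared in an interface are public, so the implementing class must declare them public as well.

```java
// Two hypothetical interfaces: declarations only, no function bodies.
interface Drawable {
    String draw();
}

interface Resizable {
    void resize(double factor);
}

// One class can combine several interfaces.
class Circle implements Drawable, Resizable {
    private double radius;

    Circle(double radius) { this.radius = radius; }

    public String draw() { return "circle r=" + radius; }

    public void resize(double factor) { radius *= factor; }
}

class InterfaceDemo {
    public static void main(String[] args) {
        Circle c = new Circle(2.0);
        Resizable r = c;   // the same object seen through one interface...
        r.resize(1.5);
        Drawable d = c;    // ...and through the other
        System.out.println(d.draw());
    }
}
```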
Technically, OOP is just about
abstract data typing, inheritance and polymorphism, but other issues can be at
least as important. The remainder of this section will cover these
issues.
One of the most important factors
is the way objects are created and destroyed. Where is the data for an object
and how is the lifetime of the object controlled? There are different
philosophies at work here. C++ takes the approach that control of efficiency is
the most important issue, so it gives the programmer a choice. For maximum
run-time speed, the storage and lifetime can be determined while the program is
being written, by placing the objects on the stack (these are sometimes called
automatic or scoped variables) or in the static storage area. This
places a priority on the speed of storage allocation and release, and control of
these can be very valuable in some situations. However, you sacrifice
flexibility because you must know the exact quantity, lifetime and type of
objects while you’re writing the program. If you are trying to
solve a more general problem such as computer-aided design, warehouse management
or air-traffic control, this is too restrictive.
The second approach is to create
objects dynamically in a pool of memory called the heap. In this approach
you don’t know until run time how many objects you need, what their
lifetime is or what their exact type is. Those are determined at the spur of the
moment while the program is running. If you need a new object, you simply make
it on the heap at the point that you need it. Because the storage is managed
dynamically, at run time, the amount of time required to allocate storage on the
heap is significantly longer than the time to create storage on the stack.
(Creating storage on the stack is often a single assembly instruction to move
the stack pointer down, and another to move it back up.) The dynamic approach
makes the generally logical assumption that objects tend to be complicated, so
the extra overhead of finding storage and releasing that storage will not have
an important impact on the creation of an object. In addition, the greater
flexibility is essential to solve the general programming
problem.
C++ allows you to determine whether
the objects are created while you write the program or at run time to allow the
control of efficiency. You might think that since it’s more flexible,
you’d always want to create objects on the heap rather than the stack.
There’s another issue, however, and that’s the lifetime of an
object. If you create an object on the stack or in static storage, the compiler
determines how long the object lasts and can automatically destroy it. However,
if you create it on the heap the compiler has no knowledge of its lifetime. A
programmer has two options for destroying objects: you can determine
programmatically when to destroy the object, or the environment can provide a
feature called a garbage collector that automatically discovers when an
object is no longer in use and destroys it. Of course, a garbage collector is
much more convenient, but it requires that all applications be able to
tolerate the existence of the garbage collector and the other overhead for
garbage collection. This does not meet the design requirements of the C++
language and so it was not included, but Java does have a garbage collector (as
does Smalltalk; Delphi does not, but one could be added, and third-party garbage
collectors exist for C++).
If you don’t know how many
objects you’re going to need to solve a particular problem, or how long
they will last, you also don’t know how to store those objects. How can
you know how much space to create for those objects? You can’t, since that
information isn’t known until run time.
The solution to most problems in
object-oriented design seems flippant: you create another type of object. The
new type of object that solves this particular problem holds handles to other
objects. Of course, you can do the same thing with an array, which is available
in most languages. But there’s more. This new object, generally called a
collection (also called a container, but the AWT uses that term in
a different sense so this book will use “collection”), will expand
itself whenever necessary to accommodate everything you place inside it. So you
don’t need to know how many objects you’re going to hold in a
collection. Just create a collection object and let it take care of the
details.
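As a rough illustration, here is a collection that is never told how many objects it will hold. The sketch uses java.util.ArrayList from Java’s standard library. The generic type syntax belongs to later versions of Java than the one described in this chapter, and the collect() helper is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class CollectionGrowthDemo {
    // Builds a collection whose final size is known only at run time.
    static List<String> collect(int count) {
        List<String> names = new ArrayList<>();  // no capacity stated up front
        for (int i = 0; i < count; i++) {
            names.add("object-" + i);            // the collection expands as needed
        }
        return names;
    }

    public static void main(String[] args) {
        // The count could just as well come from user input or a file.
        List<String> names = collect(1000);
        System.out.println(names.size() + " objects, last is " + names.get(names.size() - 1));
    }
}
```

With an array you would have had to choose the size before adding the first element.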
Fortunately, a good OOP language
comes with a set of collections as part of the package. In C++, it’s the
Standard Template Library (STL). Object Pascal has collections in its Visual
Component Library (VCL). Smalltalk has a very complete set of collections. Java
also has collections in its standard library. In some libraries, a generic
collection is considered good enough for all needs, and in others (C++ in
particular) the library has different types of collections for different needs:
a vector for consistent access to all elements, and a linked list for consistent
insertion at all elements, for example, so you can choose the particular type
that fits your needs. These may include sets, queues, hash tables, trees,
stacks, etc.
All collections have some way to
put things in and get things out. The way that you place something into a
collection is fairly obvious. There’s a function called “push”
or “add” or a similar name. Fetching things out of a collection is
not always as apparent; if it’s an array-like entity such as a vector, you
might be able to use an indexing operator or function. But in many situations
this doesn’t make sense. Also, a single-selection function is restrictive.
What if you want to manipulate or compare a set of elements in the collection
instead of just one?
The solution is an iterator,
which is an object whose job is to select the elements within a collection and
present them to the user of the iterator. As a class, it also provides a level
of abstraction. This abstraction can be used to separate the details of the
collection from the code that’s accessing that collection. The collection,
via the iterator, is abstracted to be simply a sequence. The iterator allows you
to traverse that sequence without worrying about the underlying structure
– that is, whether it’s a vector, a linked list, a stack or
something else. This gives you the flexibility to easily change the underlying
data structure without disturbing the code in your program. Java began (in
versions 1.0 and 1.1) with a standard iterator, called Enumeration, for
all of its collection classes. Java 1.2 added a much more complete
collection library, which contains an iterator called Iterator that does
more than the older Enumeration.
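The separation an iterator provides can be sketched with java.util.Iterator. The join() function is a hypothetical helper. It sees only a sequence, so the same code serves an ArrayList and a LinkedList. The generic type syntax is from later Java versions.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class IteratorDemo {
    // Knows only that it has a sequence of elements,
    // not which kind of collection stands behind it.
    static String join(Iterator<String> it) {
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(", ");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> vectorLike = new ArrayList<>();
        List<String> linked = new LinkedList<>();
        for (String s : new String[] {"x", "y", "z"}) {
            vectorLike.add(s);
            linked.add(s);
        }
        System.out.println(join(vectorLike.iterator()));
        System.out.println(join(linked.iterator()));
    }
}
```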
From the design standpoint, all you
really want is a sequence that can be manipulated to solve your problem. If a
single type of sequence satisfied all of your needs, there’d be no reason
to have different kinds. There are two reasons that you need a choice of
collections. First, collections provide different types of interfaces and
external behavior. A stack has a different interface and behavior than that of a
queue, which is different from that of a set or a list. One of these might
provide a more flexible solution to your problem than the other. Second,
different collections have different efficiencies for certain operations. The
best example is a vector and a list. Both are simple sequences that can have
identical interfaces and external behaviors. But certain operations can have
radically different costs. Randomly accessing elements in a vector is a
constant-time operation; it takes the same amount of time regardless of the
element you select. However, in a linked list it is expensive to move through
the list to randomly select an element, and it takes longer to find an element
if it is further down the list. On the other hand, if you want to insert an
element in the middle of a sequence, it’s much cheaper in a list than in a
vector. These and other operations have different efficiencies depending upon
the underlying structure of the sequence. In the design phase, you might start
with a list and, when tuning for performance, change to a vector. Because of the
abstraction via iterators, you can change from one to the other with minimal
impact on your code.
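The swap described above might look roughly like this. The sketch assumes
the Java 1.2 List interface stands in for the common abstraction (rather
than an iterator alone), with LinkedList as the "list" and ArrayList as
the "vector." SequenceSwap and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class SequenceSwap {
    // Written against the List interface only, so it does not care
    // which implementation it is given.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Object insertInMiddleAndGet(List seq, Object extra) {
        seq.add("first");
        seq.add("last");
        seq.add(1, extra);   // insertion in the middle of the sequence
        return seq.get(1);   // indexed ("random") access
    }

    @SuppressWarnings("rawtypes")
    public static void main(String[] args) {
        // Design phase: start with a linked list.
        List seq = new LinkedList();
        System.out.println(insertInMiddleAndGet(seq, "middle"));
        // Tuning phase: only the construction changes; the code that
        // uses the sequence is untouched.
        List tuned = new ArrayList();
        System.out.println(insertInMiddleAndGet(tuned, "middle"));
    }
}
```

Both calls behave identically from the outside; only the cost of the
insertion and of the indexed fetch differs between the two structures.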
In the end, remember that a
collection is only a storage cabinet to put objects in. If that cabinet solves
all of your needs, it doesn’t really matter how it is implemented
(a basic concept with most types of objects). If you’re working in a
programming environment that has built-in overhead due to other factors (running
under Windows, for example, or the cost of a garbage collector), then the cost
difference between a vector and a linked list might not matter. You might need
only one type of sequence. You can even imagine the “perfect”
collection abstraction, which can automatically change its underlying
implementation according to the way it is
used.
One of the issues in OOP that has
become especially prominent since the introduction of C++ is whether all classes
should ultimately be inherited from a single base class. In Java (as with
virtually all other OOP languages) the answer is “yes” and the name
of this ultimate base class is simply Object. It turns out that
the benefits of the singly-rooted hierarchy are many.
All objects in a singly-rooted
hierarchy have an interface in common, so they are all ultimately the same type.
The alternative (provided by C++) is that you don’t know that everything
is the same fundamental type. From a backwards-compatibility standpoint this
fits the model of C better and can be thought of as less restrictive, but when
you want to do full-on object-oriented programming you must then build your own
hierarchy to provide the same convenience that’s built into other OOP
languages. And in any new class library you acquire, some other incompatible
interface will be used. It requires effort (and possibly multiple inheritance)
to work the new interface into your design. Is the extra
“flexibility” of C++ worth it? If you need it – if you have a
large investment in C – it’s quite valuable. If you’re
starting from scratch, other alternatives such as Java can often be more
productive.
All objects in a singly-rooted
hierarchy (such as Java provides) can be guaranteed to have certain
functionality. You know you can perform certain basic operations on every object
in your system. A singly-rooted hierarchy, along with creating all objects on
the heap, greatly simplifies argument passing (one of the more complex topics in
C++).
A singly-rooted hierarchy makes it
much easier to implement a garbage collector. The necessary support can be
installed in the base class, and the garbage collector can thus send the
appropriate messages to every object in the system. Without a singly-rooted
hierarchy and a system to manipulate an object via a handle, it is difficult to
implement a garbage collector.
Since run-time type information is
guaranteed to be in all objects, you’ll never end up with an object whose
type you cannot determine. This is especially important with system level
operations, such as exception handling, and to allow greater flexibility in
programming.
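To make the guaranteed functionality concrete, here is a small sketch
that relies only on methods every Java object inherits from Object
(getClass() and toString()). RootDemo and describe() are hypothetical
names chosen for the illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class RootDemo {
    // Accepts any object at all, because every class ultimately
    // extends Object and so inherits getClass() and toString().
    static String describe(Object o) {
        return o.getClass().getName() + " -> " + o.toString();
    }

    public static void main(String[] args) {
        Object[] things = { "text", new StringBuffer("buffer"), new RootDemo() };
        for (int i = 0; i < things.length; i++) {
            // The run-time type can always be determined.
            System.out.println(describe(things[i]));
        }
    }
}
```

The method never needs to know the concrete classes in advance; the type
information travels with each object.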
You may wonder why, if it’s
so beneficial, a singly-rooted hierarchy isn’t in C++. It’s the old
bugaboo of efficiency and control. A singly-rooted hierarchy puts constraints on
your program designs, and in particular it was perceived to put constraints on
the use of existing C code. These constraints cause problems only in certain
situations, but for maximum flexibility there is no requirement for a
singly-rooted hierarchy in C++. In Java, which started from scratch and has no
backward-compatibility issues with any existing language, it was a logical
choice to use the singly-rooted hierarchy in common with most other
object-oriented programming
languages.
Because a collection is a tool that
you’ll use frequently, it makes sense to have a library of collections
that are built in a reusable fashion, so you can take one off the shelf and plug
it into your program. Java provides such a library, although it is fairly
limited in Java 1.0 and 1.1 (the Java 1.2 collections library, however,
satisfies most needs).
To make these collections reusable,
they contain the one universal type in Java that was previously mentioned:
Object. The singly-rooted hierarchy means that everything is an
Object, so a collection that holds Objects can hold anything. This
makes it easy to reuse.
To use such a collection, you
simply add object handles to it, and later ask for them back. But, since the
collection holds only Objects, when you add your object handle into the
collection it is upcast to Object, thus losing its identity. When you
fetch it back, you get an Object handle, and not a handle to the type
that you put in. So how do you turn it back into something that has the useful
interface of the object that you put into the collection?
Here, the cast is used again, but
this time you’re not casting up the inheritance hierarchy to a more
general type, you cast down the hierarchy to a more specific type. This
manner of casting is called downcasting. With upcasting, you know, for
example, that a Circle is a type of Shape so it’s safe to
upcast, but you don’t know that an Object is necessarily a
Circle or a Shape so it’s hardly safe to downcast unless you
know that’s what you’re dealing with.
It’s not completely
dangerous, however, because if you downcast to the wrong thing you’ll get
a run-time error called an exception, which will be described shortly.
When you fetch object handles from a collection, though, you must have some way
to remember exactly what they are so you can perform a proper
downcast.
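A brief sketch of both outcomes, assuming a raw Object-holding collection
as described above: a correct downcast recovers the useful interface, and
a wrong one produces a ClassCastException at run-time. The Shape and
Circle classes are stand-ins for the ones used in the discussion, and
DowncastDemo is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class DowncastDemo {
    static class Shape {
        String name() { return "Shape"; }
    }

    static class Circle extends Shape {
        String name() { return "Circle"; }
        double radius() { return 1.0; }
    }

    // Tries to treat element i as a Circle; reports whether the
    // downcast was legal instead of letting the exception escape.
    @SuppressWarnings("rawtypes")
    static boolean fetchAsCircle(List holder, int i) {
        Object o = holder.get(i);      // comes back as a plain Object
        try {
            Circle c = (Circle) o;     // downcast, checked at run-time
            c.radius();                // Circle's interface is usable again
            return true;
        } catch (ClassCastException e) {
            return false;              // it was not a Circle after all
        }
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        List holder = new ArrayList();
        holder.add(new Circle());      // upcast to Object on the way in
        holder.add("not a shape");     // the collection accepts anything
        System.out.println(fetchAsCircle(holder, 0));
        System.out.println(fetchAsCircle(holder, 1));
    }
}
```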
Downcasting and the run-time checks
require extra time for the running program, and extra effort from the
programmer. Wouldn’t it make sense to somehow create the collection so
that it knows the types that it holds, eliminating the need for the downcast and
possible mistake? The solution is parameterized types, which are classes
that the compiler can automatically customize to work with particular types. For
example, with a parameterized collection, the compiler could customize that
collection so that it would accept only Shapes and fetch only
Shapes.
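Without compiler support, the effect can only be approximated by hand.
The following sketch assumes a wrapper class written specifically for
Shape; it is roughly the kind of code a parameterized type would let the
compiler produce for you. ShapeListDemo, ShapeList, and the simple Shape
classes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ShapeListDemo {
    static class Shape {
        String draw() { return "shape"; }
    }

    static class Circle extends Shape {
        String draw() { return "circle"; }
    }

    // A collection specialized by hand to accept and return only Shapes.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static class ShapeList {
        private final List items = new ArrayList(); // holds Objects inside

        void add(Shape s) { items.add(s); }         // only Shapes get in

        Shape get(int i) {
            return (Shape) items.get(i);            // the one downcast lives here
        }

        int size() { return items.size(); }
    }

    static String drawFirst() {
        ShapeList shapes = new ShapeList();
        shapes.add(new Circle());
        // shapes.add("text");  // would be rejected at compile time
        return shapes.get(0).draw();                // no cast at the call site
    }

    public static void main(String[] args) {
        System.out.println(drawFirst());
    }
}
```

The cost of this approach is that a new wrapper must be written for each
element type, which is the repetition a parameterized type removes.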
Parameterized types are an
important part of C++, partly because C++ has no singly-rooted hierarchy. In
C++, the keyword that implements parameterized types is template. Java
currently has no parameterized types since it is possible for it to get by
– however awkwardly – using the singly-rooted hierarchy. At one
point the word generic (the keyword used by Ada for its templates) was on
a list of keywords that were “reserved for future implementation.”
Some of these seemed to have mysteriously slipped into a kind of “keyword
Bermuda Triangle” and it’s difficult to know what might eventually
happen.
Each object requires resources in
order to exist, most notably memory. When an object is no longer needed it must
be cleaned up so that these resources are released for reuse. In simple
programming situations the question of how an object is cleaned up doesn’t
seem too challenging: you create the object, use it for as long as it’s
needed, and then it should be destroyed. It’s not too hard, however, to
encounter situations that are more complex.
Suppose, for example, you are
designing a system to manage air traffic for an airport. (The same model might
also work for managing crates in a warehouse, or a video rental system, or a
kennel for boarding pets.) At first it seems simple: make a collection to hold
airplanes, then create a new airplane and place it in the collection for each
airplane that enters the air-traffic-control zone. For cleanup, simply delete
the appropriate airplane object when a plane leaves the zone.
But perhaps you have some other
system to record data about the planes; perhaps data that doesn’t require
such immediate attention as the main controller function. Maybe it’s a
record of the flight plans of all the small planes that leave the airport. So
you have a second collection of small planes, and whenever you create a plane
object you also put it in this collection if it’s a small plane. Then some
background process performs operations on the objects in this collection during
idle moments.
Now the problem is more difficult:
how can you possibly know when to destroy the objects? When you’re done
with the object, some other part of the system might not be. This same problem
can arise in a number of other situations, and in programming systems (such as
C++) in which you must explicitly delete an object when you’re done with
it this can become quite
complex.[6]
With Java, the garbage collector is
designed to take care of the problem of releasing the memory (although this
doesn’t include other aspects of cleaning up an object). The garbage
collector “knows” when an object is no longer in use, and it then
automatically releases the memory for that object. This, combined with the fact
that all objects are inherited from the single root class Object and that
you can create objects only one way, on the heap, makes the process of
programming in Java much simpler than programming in C++. You have far fewer
decisions to make and hurdles to overcome.
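The air-traffic scenario can be sketched as follows, under the assumption
that both collections are ordinary Object-holding lists. The plane is
never deleted explicitly; it simply stays alive for as long as some
collection still holds a handle to it. AirTrafficDemo, Airplane, and the
identifier "N123" are all made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class AirTrafficDemo {
    static class Airplane {
        final String id;
        final boolean small;

        Airplane(String id, boolean small) {
            this.id = id;
            this.small = small;
        }
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean stillLoggedAfterDeparture() {
        List inZone = new ArrayList();        // the main controller's view
        List smallPlaneLog = new ArrayList(); // the background record

        Airplane plane = new Airplane("N123", true);
        inZone.add(plane);
        if (plane.small) {
            smallPlaneLog.add(plane);
        }

        inZone.remove(plane); // the plane leaves the zone
        plane = null;         // the local handle is dropped as well

        // No explicit destruction happened. The object is still reachable
        // through smallPlaneLog, so the garbage collector must keep it.
        Airplane logged = (Airplane) smallPlaneLog.get(0);
        return logged.id.equals("N123");
    }

    public static void main(String[] args) {
        System.out.println(stillLoggedAfterDeparture());
    }
}
```

Once the background process also removes the plane from its list, no
handle remains and the memory becomes eligible for collection, without
either part of the system having to coordinate with the other.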
If all this is such a good idea,
why didn’t they do the same thing in C++? Well of course there’s a
price you pay for all this programming convenience, and that price is run-time
overhead. As mentioned before, in C++ you can create objects on the stack, and
in this case they’re automatically cleaned up (but you don’t have
the flexibility of creating as many as you want at run-time). Creating objects
on the stack is the most efficient way to allocate storage for objects and to
free that storage. Creating objects on the heap can be much more expensive.
Always inheriting from a base class and making all function calls polymorphic
also exacts a small toll. But the garbage collector is a particular problem
because you never quite know when it’s going to start up or how long it
will take. This means that there’s an inconsistency in the rate of
execution of a Java program, so you can’t use it in certain situations,
such as when the rate of execution of a program is uniformly critical. (These
are generally called real time programs, although not all real-time
programming problems are this
stringent.)[7]
The designers of the C++ language,
trying to woo C programmers (and most successfully, at that), did not want to
add any features to the language that would impact the speed or the use of C++
in any situation where C might be used. This goal was realized, but at the price
of greater complexity when programming in C++. Java is simpler than C++, but the
tradeoff is in efficiency and sometimes applicability. For a significant portion
of programming problems, however, Java is often the superior
choice.
Ever since the beginning of
programming languages, error handling has been one of the most difficult issues.
Because it’s so hard to design a good error-handling scheme, many
languages simply ignore the issue, passing the problem on to library designers
who come up with halfway measures that can work in many situations but can
easily be circumvented, generally by just ignoring them. A major problem with
most error-handling schemes is that they rely on programmer vigilance in
following an agreed-upon convention that is not enforced by the language. If the
programmer is not vigilant, which often happens when they are in a hurry,
these schemes can easily be forgotten.
Exception handling wires
error handling directly into the programming language and sometimes even the
operating system. An exception is an object that is “thrown” from
the site of the error and can be “caught” by an appropriate
exception handler designed to handle that particular type of error.
It’s as if exception handling is a different, parallel path of execution
that can be taken when things go wrong. And because it uses a separate execution
path, it doesn’t need to interfere with your normally-executing code. This
makes that code simpler to write since you aren’t constantly forced to
check for errors. In addition, a thrown exception is unlike an error value
that’s returned from a function or a flag that’s set by a function
in order to indicate an error condition. These can be ignored. An exception
cannot be ignored so it’s guaranteed to be dealt with at some point.
Finally, exceptions provide a way to reliably recover from a bad situation.
Instead of just exiting you are often able to set things right and restore the
execution of a program, which produces much more robust
programs.
Java’s exception handling
stands out among programming languages, because in Java, exception handling was
wired in from the beginning and you’re forced to use it. If you
don’t write your code to properly handle exceptions, you’ll get a
compile-time error message. This guaranteed consistency makes error handling
much easier.
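Here is a small sketch of throwing, catching, and recovering. It assumes
a checked exception, meaning one that extends Exception but not
RuntimeException, since that is the kind the compiler insists you deal
with. ExceptionDemo, OutOfPaperException, and print() are hypothetical
names.

```java
// Hypothetical demo class; the names are illustrative only.
class ExceptionDemo {
    // A checked exception: it extends Exception, not RuntimeException.
    static class OutOfPaperException extends Exception {
        OutOfPaperException(String message) {
            super(message);
        }
    }

    // The throws clause is part of the method's contract. A caller that
    // neither catches nor declares OutOfPaperException will not compile.
    static String print(int pagesLeft) throws OutOfPaperException {
        if (pagesLeft <= 0) {
            throw new OutOfPaperException("tray is empty");
        }
        return "printed";
    }

    // The normal path and the error path are written separately.
    static String printWithRecovery(int pagesLeft) {
        try {
            return print(pagesLeft);
        } catch (OutOfPaperException e) {
            // Recover and carry on instead of exiting the program.
            return "recovered: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(printWithRecovery(5));
        System.out.println(printWithRecovery(0));
    }
}
```

Removing the try block from printWithRecovery() without adding a throws
clause turns the omission into a compile-time error rather than a silent
run-time failure.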
It’s worth noting that
exception handling isn’t an object-oriented feature, although in
object-oriented languages the exception is normally represented with an object.
Exception handling existed before object-oriented
languages.
A fundamental concept in computer
programming is the idea of handling more than one task at a time. Many
programming problems require that the program be able to stop what it’s
doing, deal with some other problem and return to the main process. The solution
has been approached in many ways. Initially, programmers with low-level
knowledge of the machine wrote interrupt service routines and the
suspension of the main process was initiated through a hardware interrupt.
Although this worked well, it was difficult and non-portable, so it made moving
a program to a new type of machine slow and expensive.
Sometimes interrupts are necessary
for handling time-critical tasks, but there’s a large class of problems in
which you’re simply trying to partition the problem into
separately-running pieces so that the whole program can be more responsive.
Within a program, these separately-running pieces are called threads and
the general concept is called multithreading. A common example of
multithreading is the user interface. By using threads, a user can press a
button and get a quick response rather than being forced to wait until the
program finishes its current task.
Ordinarily, threads are just a way
to allocate the time of a single processor. But if the operating system supports
multiple processors, each thread can be assigned to a different processor and
they can truly run in parallel. One of the convenient features of multithreading
at the language level is that the programmer doesn’t need to worry about
whether there are many processors or just one. The program is logically divided
into threads and if the machine has more than one processor then the program
runs faster, without any special adjustments.
All this makes threading sound
pretty simple. There is a catch: shared resources. If you have more than one
thread running that’s expecting to access the same resource you have a
problem. For example, two threads can’t simultaneously send information
to a printer. To solve the problem, resources that can be shared, such as the
printer, must be locked while they are being used. So a thread locks a resource,
completes its task and then releases the lock so that someone else can use the
resource.
Java’s threading is built
into the language, which makes a complicated subject much simpler. The threading
is supported on an object level, so one thread of execution is represented by
one object. Java also provides limited resource locking. It can lock the memory
of any object (which is, after all, one kind of shared resource) so that only
one thread can use it at a time. This is accomplished with the
synchronized keyword. Other types of resources must be locked explicitly
by the programmer, typically by creating an object to represent the lock that
all threads must check before accessing that
resource.
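As a minimal sketch of object-level locking, the shared resource below is
a simple counter whose methods are marked synchronized, so only one
thread at a time can be inside them. SyncDemo, Counter, and
runTwoThreads() are hypothetical names.

```java
// Hypothetical demo class; the names are illustrative only.
class SyncDemo {
    static class Counter {
        private int count = 0;

        // synchronized locks this Counter object for the duration of the
        // call, so two threads cannot interleave inside increment().
        synchronized void increment() {
            count++;
        }

        synchronized int value() {
            return count;
        }
    }

    static int runTwoThreads(final int perThread) throws InterruptedException {
        final Counter shared = new Counter();
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    shared.increment();
                }
            }
        };
        Thread a = new Thread(work); // each thread of execution is an object
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();                    // wait for both threads to finish
        b.join();
        return shared.value();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTwoThreads(100000)); // prints 200000
    }
}
```

Without the synchronized keyword on increment(), the two threads could
overwrite each other's updates and the final total could come up short.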
When you create an object, it
exists for as long as you need it, but under no circumstances does it continue
to exist after the program terminates. While this makes sense at first, there are situations in
which it would be incredibly useful if an object could exist and hold its
information even while the program wasn’t running. Then the next
time you started the program, the object would be there and it would have the
same information it had the previous time the program was running. Of course you
can get a similar effect now by writing the information to a file or to a
database, but in the spirit of making everything an object it would be quite
convenient to be able to declare an object persistent and have all the
details taken care of for you.
Java 1.1
provides support for “lightweight persistence,” which means that you
can easily store objects on disk and later retrieve them. The reason it’s
“lightweight” is that you’re still forced to make explicit
calls to do the storage and retrieval. In some future release more complete
support for persistence might
appear.
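The explicit calls mentioned above might look roughly like this, assuming
the object serialization classes in java.io (ObjectOutputStream and
ObjectInputStream) and a class that implements Serializable. The Settings
class, its fields, and the temporary file are invented for the sketch.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; the names are illustrative only.
class PersistenceDemo {
    // Implementing Serializable marks the class as storable.
    static class Settings implements Serializable {
        private static final long serialVersionUID = 1L;
        String user;
        int fontSize;

        Settings(String user, int fontSize) {
            this.user = user;
            this.fontSize = fontSize;
        }
    }

    // Explicit call number one: write the object to disk.
    static void store(Settings s, File f) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f));
        try {
            out.writeObject(s);
        } finally {
            out.close();
        }
    }

    // Explicit call number two: read it back and downcast from Object.
    static Settings retrieve(File f) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new FileInputStream(f));
        try {
            return (Settings) in.readObject();
        } finally {
            in.close();
        }
    }

    static String roundTrip() throws IOException, ClassNotFoundException {
        File f = File.createTempFile("settings", ".ser");
        try {
            store(new Settings("pat", 12), f);
            Settings back = retrieve(f);
            return back.user + ":" + back.fontSize;
        } finally {
            f.delete(); // clean up the temporary file
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```

In a real program the store and retrieve steps would happen in different
runs, which is what lets the object's information outlive the program.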
If Java is, in fact, yet another
computer programming language, you may question why it is so important and why
it is being promoted as a revolutionary step in computer programming. The answer
isn’t immediately obvious if you’re coming from a traditional
programming perspective. Although Java will solve traditional stand-alone
programming problems, the reason it is important is that it will also solve
programming problems on the World Wide
Web.
The Web can seem a bit of a mystery
at first, with all this talk of “surfing,” “presence”
and “home pages.” There has even been a growing reaction against
“Internet-mania,” questioning the economic value and outcome of such
a sweeping movement. It’s helpful to step back and see what it really is,
but to do this you must understand client/server systems, another aspect of
computing that’s full of confusing issues.
The primary idea of a client/server
system is that you have a central repository of information – some kind of
data, typically in a database – that you want to distribute on demand to
some set of people or machines. A key to the client/server concept is that the
repository of information is centrally located so that it can be changed
and so that those changes will propagate out to the information consumers. Taken
together, the information repository, the software that distributes the
information and the machine(s) where the information and software reside are
called the server. The software that resides on the remote machine, and
that communicates with the server, fetches the information, processes it, and
displays it on the remote machine is called the client.
The basic concept of client/server
computing, then, is not so complicated. The problems arise because you have a
single server trying to serve many clients at once. Generally a database
management system is involved so the designer “balances” the layout
of data into tables for optimal use. In addition, systems often allow a client
to insert new information into a server. This means you must ensure that one
client’s new data doesn’t walk over another client’s new data,
or that data isn’t lost in the process of adding it to the database. (This
is called transaction processing.) As client software changes, it must be
built, debugged and installed on the client machines, which turns out to be more
complicated and expensive than you might think. It’s especially
problematic to support multiple types of computers and operating systems.
Finally, there’s the all-important performance issue: you might have
hundreds of clients making requests of your server at any one time, and so any
small delay is crucial. To minimize latency, programmers work hard to offload
processing tasks, often to the client machine but sometimes to other machines at
the server site using so-called middleware. (Middleware is also used to
improve maintainability.)
So the simple idea of distributing
information to people has so many layers of complexity in implementing it that
the whole problem can seem hopelessly enigmatic. And yet it’s crucial:
client/server computing accounts for roughly half of all programming activities.
It’s responsible for everything from taking orders and credit-card
transactions to the distribution of any kind of data – stock market,
scientific, government – you name it. What we’ve come up with in the
past is individual solutions to individual problems, inventing a new solution
each time. These were hard to create and hard to use and the user had to learn a
new interface for each one. The entire client/server problem needs to be solved
in a big way.
The Web is actually one giant
client/server system. It’s a bit worse than that, since you have all the
servers and clients coexisting on a single network at once. You don’t need
to know that, since all you care about is connecting to and interacting with one
server at a time (even though you might be hopping around the world in your
search for the correct server).
Initially it was a simple one-way
process. You made a request of a server and it handed you a file, which your
machine’s browser software (i.e., the client) would interpret by formatting it
for display on your local machine. But in short order people began wanting to do more than
just deliver pages from a server. They wanted full client/server capability so
that the client could feed information back to the server, for example, to do
database lookups on the server, to add new information to the server or to place
an order (which required more security than the original systems offered). These
are the changes we’ve been seeing in the development of the
Web.
The Web browser was a big step
forward: the concept that one piece of information could be displayed on any
type of computer without change. However, browsers were still rather primitive
and rapidly bogged down by the demands placed on them. They weren’t
particularly interactive and tended to clog up both the server and the Internet
because any time you needed to do something that required programming you had to
send information back to the server to be processed. It could take many seconds
or minutes to find out you had misspelled something in your request. Since the
browser was just a viewer it couldn’t perform even the simplest computing
tasks. (On the other hand, it was safe, since it couldn’t execute any
programs on your local machine that contained bugs or viruses.)
To solve this problem, different
approaches have been taken. To begin with, graphics standards have been enhanced
to allow better animation and video within browsers. The remainder of the
problem can be solved only by incorporating the ability to run programs on the
client end, under the browser. This is called client-side
programming.
The Web’s initial
server-browser design provided for interactive content, but the interactivity
was completely provided by the server. The server produced static pages for the
client browser, which would simply interpret and display them. Basic HTML
contains simple mechanisms for data gathering: text-entry boxes, check boxes,
radio buttons, lists and drop-down lists, as well as a button that can only be
programmed to reset the data on the form or “submit” the data on the
form back to the server. This submission passes through the Common Gateway
Interface (CGI) provided on all Web servers. The text within the submission
tells CGI what to do with it. The most common action is to run a program located
on the server in a directory that’s typically called
“cgi-bin.” (If you watch the address window at the top of your
browser when you push a button on a Web page, you can sometimes see
“cgi-bin” within all the gobbledygook there.) These programs can be
written in most languages. Perl is a common choice because it is designed for
text manipulation and is interpreted, so it can be installed on any server
regardless of processor or operating system.
Many powerful Web sites today are
built strictly on CGI, and you can in fact do nearly anything with it. The
problem is response time. The response of a CGI program depends on how much data
must be sent as well as the load on both the server and the Internet. (On top of
this, starting a CGI program tends to be slow.) The initial designers of the Web
did not foresee how rapidly this bandwidth would be exhausted for the kinds of
applications people developed. For example, any sort of dynamic graphing is
nearly impossible to perform with consistency because a GIF file must be created
and moved from the server to the client for each version of the graph. And
you’ve no doubt had direct experience with something as simple as
validating the data on an input form. You press the submit button on a page; the
data is shipped back to the server; the server starts a CGI program that
discovers an error, formats an HTML page informing you of the error and sends
the page back to you; you must then back up a page and try again. Not only is
this slow, it’s not elegant.
The solution is client-side
programming. Most machines that run Web browsers are powerful engines capable of
doing vast work, and with the original static HTML approach they are sitting
there, just idly waiting for the server to dish up the next page. Client-side
programming means that the Web browser is harnessed to do whatever work it can,
and the result for the user is a much speedier and more interactive experience
at your Web site.
The problem with discussions of
client-side programming is that they aren’t very different from
discussions of programming in general. The parameters are almost the same, but
the platform is different: a Web browser is like a limited operating system. In
the end, it’s still programming and this accounts for the dizzying array
of problems and solutions produced by client-side programming. The rest of this
section provides an overview of the issues and approaches in client-side
programming.
One of the most significant steps
forward in client-side programming is the development of the plug-in. This is a
way for a programmer to add new functionality to the browser by downloading a
piece of code that plugs itself into the appropriate spot in the browser. It
tells the browser “from now on you can perform this new activity.”
(You need to download the plug-in only once.) Some fast and powerful behavior is
added to browsers via plug-ins, but writing a plug-in is not a trivial task and
isn’t something you’d want to do as part of the process of building
a particular site. The value of the plug-in for client-side programming is that
it allows an expert programmer to develop a new language and add that language
to a browser without the permission of the browser manufacturer. Thus,
plug-ins provide the back door that allows the creation of new client-side
programming languages (although not all languages are implemented as
plug-ins).
Plug-ins resulted in an explosion
of scripting languages. With a scripting language you embed the source code for
your client-side program directly into the HTML page and the plug-in that
interprets that language is automatically activated while the HTML page is being
displayed. Scripting languages tend to be reasonably simple to understand, and
because they are simply text that is part of an HTML page they load very quickly
as part of the single server hit required to procure that page. The trade-off is
that your code is exposed for everyone to see (and steal) but generally you
aren’t doing amazingly sophisticated things with scripting languages so
it’s not too much of a hardship.
This points out that scripting
languages are really intended to solve specific types of problems, primarily the
creation of richer and more interactive graphical user interfaces (GUIs).
However, a scripting language might solve 80 percent of the problems encountered
in client-side programming. Your problems might very well fit completely within
that 80 percent, and since scripting languages tend to be easier and faster to
develop, you should probably consider a scripting language before looking at a
more involved solution such as Java or ActiveX programming.
The most commonly-discussed
scripting languages are JavaScript (which has nothing to do with Java;
it’s named that way just to grab some of Java’s marketing momentum),
VBScript (which looks like Visual Basic) and Tcl/Tk, which comes from the
popular cross-platform GUI-building language. There are others out there and no
doubt more in development.
JavaScript is probably the most
commonly supported. It comes built into both Netscape Navigator and the
Microsoft Internet Explorer (IE). In addition, there are probably more
JavaScript books out than for the other languages, and some tools automatically
create pages using JavaScript. However, if you’re already fluent in Visual
Basic or Tcl/Tk, you’ll be more productive using those scripting languages
rather than learning a new one. (You’ll have your hands full dealing with
the Web issues already.)
If a scripting language can solve
80 percent of the client-side programming problems, what about the other 20
percent – the “really hard stuff?” The most popular solution
today is Java. Not only is it a powerful programming language built to be
secure, cross-platform and international, but Java is being continuously
extended to provide language features and libraries that elegantly handle
problems that are difficult in traditional programming languages, such as
multithreading, database access, network programming and distributed computing.
Java allows client-side programming via the applet.
An applet is a mini-program that
will run only under a Web browser. The applet is downloaded automatically as
part of a Web page (just as, for example, a graphic is automatically
downloaded). When the applet is activated it executes a program. This is part of
its beauty – it provides you with a way to automatically distribute the
client software from the server at the time the user needs the client software,
and no sooner. The user gets the latest version of the client software without fail
and without difficult re-installation. Because of the way Java is designed, the
programmer needs to create only a single program, and that program automatically
works with all computers that have browsers with built-in Java interpreters.
(This safely includes the vast majority of machines.) Since Java is a
full-fledged programming language, you can do as much work as possible on the
client before and after making requests of the server. For example, you
won’t need to send a request form across the Internet to discover that
you’ve gotten a date or some other parameter wrong, and your client
computer can quickly do the work of plotting data instead of waiting for the
server to make a plot and ship a graphic image back to you. Not only do you get
the immediate win of speed and responsiveness, but the general network traffic
and load upon servers can be reduced, preventing the entire Internet from
slowing down.
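To make the date example concrete, here is a minimal sketch of the kind of check a client program can run before anything is sent to the server. The class name ClientSideCheck and the method isValidDate are hypothetical, and the sketch assumes a modern Java runtime with the java.time package, which postdates the Java versions discussed in this chapter.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical client-side check: validate a date field locally
// before any request is sent to the server.
class ClientSideCheck {
    // "uuuu" with STRICT resolution rejects impossible dates such as February 30.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd")
                         .withResolverStyle(ResolverStyle.STRICT);

    // Returns true only if the text is a real calendar date in year-month-day form.
    static boolean isValidDate(String text) {
        try {
            LocalDate.parse(text, FORMAT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] inputs = { "1998-02-28", "1998-02-30", "not a date" };
        for (String input : inputs) {
            // Only valid input would be worth a round trip to the server.
            System.out.println(input + " -> "
                + (isValidDate(input) ? "send request" : "fix locally"));
        }
    }
}
```

The point is only that the bad input is caught on the client machine, so the server never sees the malformed request.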
One advantage a Java applet has
over a scripted program is that it’s in compiled form, so the source code
isn’t available to the client. On the other hand, a Java applet can be
decompiled without too much trouble, and hiding your code is often not an
important issue anyway. Two other factors can be important. As you will see
later in the book, a compiled Java applet can comprise many modules and take
multiple server “hits” (accesses) to download. (In Java
1.1 this is minimized by Java archives, called JAR files,
that allow all the required modules to be packaged together for a single
download.) A scripted program will just be integrated into the Web page as part
of its text (and will generally be smaller and reduce server hits). This could
be important to the responsiveness of your Web site. Another factor is the
all-important learning curve. Regardless of what you’ve heard, Java is not
a trivial language to learn. If you’re a Visual Basic programmer, moving
to VBScript will be your fastest solution, and since it will probably solve most
typical client/server problems, you might be hard pressed to justify learning
Java. If you’re experienced with a scripting language you will certainly
benefit from looking at JavaScript or VBScript before committing to Java, since
they might fit your needs handily and you’ll be more productive
sooner.
To some degree, the competitor to
Java is Microsoft’s ActiveX, although it takes a completely different
approach. ActiveX was originally a Windows-only solution, although it is now
being developed via an independent consortium to become cross-platform.
Effectively, ActiveX says “if your program connects to its environment
just so, it can be dropped into a Web page and run under a browser that supports
ActiveX.” (IE directly supports ActiveX and Netscape does so using a
plug-in.) Thus, ActiveX does not constrain you to a particular language. If, for
example, you’re already an experienced Windows programmer using a language
such as C++, Visual Basic, or Borland’s Delphi, you can create ActiveX
components with almost no changes to your programming knowledge. ActiveX also
provides a path for the use of legacy code in your Web pages.
Automatically downloading and
running programs across the Internet can sound like a virus-builder’s
dream. ActiveX especially brings up the thorny issue of security in client-side
programming. If you click on a Web site, you might automatically download any
number of things along with the HTML page: GIF files, script code, compiled Java
code, and ActiveX components. Some of these are benign; GIF files can’t do
any harm, and scripting languages are generally limited in what they can do.
Java was also designed to run its applets within a “sandbox” of
safety, which prevents it from writing to disk or accessing memory outside the
sandbox.
ActiveX is at the opposite end of
the spectrum. Programming with ActiveX is like programming Windows – you
can do anything you want. So if you click on a page that downloads an ActiveX
component, that component might cause damage to the files on your disk. Of
course, programs that you load onto your computer that are not restricted to
running inside a Web browser can do the same thing. Viruses downloaded from
Bulletin-Board Systems (BBSs) have long been a problem, but the speed of the
Internet amplifies the difficulty.
The solution seems to be
“digital signatures,” whereby code is verified to show who the
author is. This is based on the idea that a virus works because its creator can
be anonymous, so if you remove the anonymity individuals will be forced to be
responsible for their actions. This seems like a good plan because it allows
programs to be much more functional, and I suspect it will eliminate malicious
mischief. If, however, a program has an unintentional bug that’s
destructive it will still cause problems.
The Java approach is to prevent
these problems from occurring, via the sandbox. The Java interpreter that lives
on your local Web browser examines the applet for any untoward instructions as
the applet is being loaded. In particular, the applet cannot write files to disk
or erase files (one of the mainstays of the virus). Applets are generally
considered to be safe, and since this is essential for reliable client-server
systems, any bugs that allow viruses are rapidly repaired. (It’s worth
noting that the browser software actually enforces these security restrictions,
and some browsers allow you to select different security levels to provide
varying degrees of access to your system.)
You might be skeptical of this
rather draconian restriction against writing files to your local disk. For
example, you may want to build a local database or save data for later use
offline. The initial vision seemed to be that eventually everyone would be
online to do anything important, but that was soon seen to be impractical
(although low-cost “Internet appliances” might someday satisfy the
needs of a significant segment of users). The solution is the “signed
applet” that uses public-key encryption to verify that an applet does
indeed come from where it claims it does. A signed applet can then go ahead and
trash your disk, but the theory is that since you can now hold the applet
creator accountable they won’t do vicious things. Java
1.1 provides a framework for digital signatures so that
you will eventually be able to allow an applet to step outside the sandbox if
necessary.
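The idea behind a signature can be sketched with the general-purpose java.security classes. This is not the applet-signing machinery itself, only an illustration of the underlying public-key check: the author signs the bytes with a private key, and anyone holding the matching public key can confirm the bytes are unchanged. The class and method names are hypothetical, and the choice of RSA with SHA-256 is an assumption made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical sketch of the public-key check behind signed code.
class SignedCodeSketch {
    // Assumed algorithm for the example; the text does not name one.
    private static final String ALGORITHM = "SHA256withRSA";

    // The author signs the code bytes with a private key.
    static byte[] sign(byte[] code, PrivateKey key) throws GeneralSecurityException {
        Signature signer = Signature.getInstance(ALGORITHM);
        signer.initSign(key);
        signer.update(code);
        return signer.sign();
    }

    // The receiver checks the bytes against the author's public key.
    static boolean verify(byte[] code, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance(ALGORITHM);
        verifier.initVerify(key);
        verifier.update(code);
        return verifier.verify(signature);
    }

    // Signs some stand-in bytes, optionally tampers with them, then verifies.
    static boolean roundTrip(boolean tamper) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            KeyPair author = generator.generateKeyPair();

            byte[] code = "stand-in for applet bytes".getBytes(StandardCharsets.UTF_8);
            byte[] signature = sign(code, author.getPrivate());
            if (tamper) {
                code[0] ^= 1; // flip one bit after signing
            }
            return verify(code, signature, author.getPublic());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("untampered code verifies: " + roundTrip(false));
        System.out.println("tampered code verifies:   " + roundTrip(true));
    }
}
```

Notice what the check does and does not tell you: it establishes who signed the bytes and that they have not been altered, but it says nothing about whether the code is well behaved.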
Digital signatures miss an
important issue, which is the speed at which people move around on the Internet. If
you download a buggy program and it does something untoward, how long will it be
before you discover the damage? It could be days or even weeks. And by then, how
will you track down the program that’s done it, and what good will it do
at that point?
The Web is the most general
solution to the client/server problem, so it makes sense that you can use the
same technology to solve a subset of the problem, in particular the classic
client/server problem within a company. With traditional client/server
approaches you have the problem of multiple different types of client computers,
as well as the difficulty of installing new client software, both of which are
handily solved with Web browsers and client-side programming. When Web
technology is used for an information network that is restricted to a particular
company, it is referred to as an Intranet. Intranets provide much greater
security than the Internet, since you can physically control access to the
servers within your company. In terms of training, it seems that once people
understand the general concept of a browser it’s much easier for them to
deal with differences in the way pages and applets look, so the learning curve
for new kinds of systems seems to be reduced.
The security problem brings us to
one of the divisions that seems to be automatically forming in the world of
client-side programming. If your program is running on the Internet, you
don’t know what platform it will be working under and you want to be extra
careful that you don’t disseminate buggy code. You need something
cross-platform and secure, like a scripting language or Java.
If you’re running on an
Intranet, you might have a different set of constraints. It’s not uncommon
that your machines could all be Intel/Windows platforms. On an Intranet,
you’re responsible for the quality of your own code and can repair bugs
when they’re discovered. In addition, you might already have a body of
legacy code that you’ve been using in a more traditional client/server
approach, whereby you must physically install client programs every time you do
an upgrade. The time wasted in installing upgrades is the most compelling reason
to move to browsers, where upgrades are invisible and automatic. If you are
involved in such an Intranet, the most sensible approach to take is ActiveX
rather than trying to recode your programs in a new language.
When faced with this bewildering
array of solutions to the client-side programming problem, the best plan of
attack is a cost-benefit analysis. Consider the constraints of your problem and
what would be the fastest way to get to your solution. Since client-side
programming is still programming, it’s always a good idea to take the
fastest development approach for your particular situation. This is an
aggressive stance to prepare for inevitable encounters with the problems of
program development.
This whole discussion has ignored
the issue of server-side programming. What happens when you make a request of a
server? Most of the time the request is simply “send me this file.”
Your browser then interprets the file in some appropriate fashion: as an HTML
page, a graphic image, a Java applet, a script program, etc. A more complicated
request to a server generally involves a database transaction. A common scenario
involves a request for a complex database search, which the server then formats
into an HTML page and sends to you as the result. (Of course, if the client has
more intelligence via Java or a scripting language, the raw data can be sent and
formatted at the client end, which will be faster and put less load on the server.)
Or you might want to register your name in a database when you join a group or
place an order, which will involve changes to that database. These database
requests must be processed via some code on the server side, which is generally
referred to as server-side programming.
Traditionally, server-side programming has been performed using Perl and CGI
scripts, but more sophisticated systems have been appearing. These include
Java-based Web servers that allow you to perform all your server-side
programming in Java by writing what are called servlets.
Most of the brouhaha over Java has
been about applets. Java is actually a general-purpose programming language that
can solve any type of problem, at least in theory. And as pointed out
previously, there might be more effective ways to solve most client/server
problems. When you move out of the applet arena (and simultaneously release the
restrictions, such as the one against writing to disk) you enter the world of
general-purpose applications that run standalone, without a Web browser, just
like any ordinary program does. Here, Java’s strength is not only in its
portability, but also its programmability. As you’ll see throughout this
book, Java has many features that allow you to create robust programs in a
shorter period than with previous programming languages.
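As a small illustration of stepping outside the applet restrictions, here is a sketch of a standalone program that saves data to the local disk and reads it back. The class name StandaloneApp and the method saveAndReload are hypothetical, and the sketch assumes the java.nio.file classes found in current Java releases rather than the libraries available when this chapter was written.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical standalone program: no browser, no sandbox, so it may use the disk.
class StandaloneApp {
    // Writes the text to a temporary file, reads it back, then cleans up.
    static String saveAndReload(String data) {
        try {
            Path file = Files.createTempFile("standalone-demo", ".txt");
            try {
                Files.write(file, data.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(saveAndReload("data saved for later use offline"));
    }
}
```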
Be aware that this is a mixed
blessing. You pay for the improvements through slower execution speed (although
there is significant work going on in this area). Like any language, Java has
built-in limitations that might make it inappropriate to solve certain types of
programming problems. Java is a rapidly-evolving language, however, and as each
new release comes out it becomes more and more attractive for solving larger
sets of problems.
The object-oriented paradigm is a
new and different way of thinking about programming and many folks have trouble
at first knowing how to approach a project. Now that you know that everything is
supposed to be an object, you can create a “good” design, one that
will take advantage of all the benefits that OOP has to offer.
Books on OOP analysis and design
are coming out of the woodwork. Most of these books are filled with lots of long
words, awkward prose and important-sounding
pronouncements.[9]
I come away thinking the book would be better as a chapter or at the most a very
short book and feeling annoyed that this process couldn’t be described
simply and directly. (It disturbs me that people who purport to specialize in
managing complexity have such trouble writing clear and simple books.) After
all, the whole point of OOP is to make the process of
software development easier, and although it would seem to threaten the
livelihood of those of us who consult because things are complex, why not make
it simple? So, hoping I’ve built a healthy skepticism within you, I shall
endeavor to give you my own perspective on analysis and design in as few
paragraphs as possible.
While you’re going through
the development process, the most important issue is this: don’t get lost.
It’s easy to do. Most of these methodologies are
designed to solve the largest of problems. (This makes sense; these are the
especially difficult projects that justify calling in that author as consultant,
and justify the author’s large fees.) Remember that most projects
don’t fit into that category, so you can usually have a successful
analysis and design with a relatively small subset of what a methodology
recommends. But some sort of process, no matter how limited, will generally get
you on your way in a much better fashion than simply beginning to
code.
That said, if you’re looking
at a methodology that contains tremendous detail and suggests many steps and
documents, it’s still difficult to know when to stop. Keep in mind what
you’re trying to discover: what the objects are, and what their interfaces are.
If you come up
with nothing more than the objects and their interfaces then you can write a
program. For various reasons you might need more descriptions and documents than
this, but you can’t really get away with any less.
The process can be undertaken in
four phases, and a phase 0 which is just the initial commitment to using some
kind of structure.
The first step is to decide what
steps you’re going to have in your process. It sounds simple (in fact,
all of this sounds simple) and yet, often, people don’t even get
around to phase one before they start coding. If your plan is “let’s
jump in and start coding,” fine. (Sometimes that’s appropriate when
you have a well-understood problem.) At least agree that this is the
plan.
You might also decide at this phase
that some additional process structure is necessary but not the whole nine
yards. Understandably enough, some programmers like to work in “vacation
mode” in which no structure is imposed on the process of developing their
work: “It will be done when it’s done.” This can be appealing
for a while, but I’ve found that having a few milestones along the way
helps to focus and galvanize your efforts around those milestones instead of
being stuck with the single goal of “finish the project.” In
addition, it divides the project into more bite-sized pieces and makes it seem
less threatening.
When I began to study story
structure (so that I will someday write a novel) I was initially resistant to
the idea, feeling that when I wrote I simply let it flow onto the page. What I
found was that when I wrote about computers the structure was simple enough so I
didn’t need to think much about it, but I was still structuring my work,
albeit only semi-consciously in my head. So even if you think that your plan is
to just start coding, you still go through the following phases while asking and
answering certain questions.
In the previous generation of
program design (procedural design), this phase would be called “creating the
requirements analysis and system specification.”
These, of course, were places to get lost: intimidatingly-named documents that
could become big projects in their own right. Their intention was good, however.
The requirements analysis says “Make a list of the guidelines we will use
to know when the job is done and the customer is satisfied.” The system
specification says “Here’s a description of what the program
will do (not how) to satisfy the requirements.” The requirements
analysis is really a contract between you and the customer (even if the customer
works within your company or is some other object or system). The system
specification is a top-level exploration into the problem and in some sense a
discovery of whether it can be done and how long it will take. Since both of
these will require consensus among people, I think it’s best to keep them
as bare as possible – ideally, to lists and basic diagrams – to save
time. You might have other constraints that require you to expand them into
bigger documents.
It’s necessary to stay
focused on the heart of what you’re trying to accomplish in this phase:
determine what the system is supposed to do. The most valuable tool for this is
a collection of what are called “use-cases.”
These are essentially descriptive answers to questions that start with
“What does the system do if ...” For example, “What does the
auto-teller do if a customer has deposited a check within the last 24 hours and
there’s not enough in the account without the check to provide the desired
withdrawal?” The use-case then describes what the auto-teller does in that
case.
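A use-case like this one can also be written down as a scenario you can run. The sketch below is only an illustration: the answer it gives (the check money is held for 24 hours, so the withdrawal is declined) is an assumed policy, not something the question itself dictates, and every name in it is hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: one use-case written down as a runnable scenario.
class AutoTellerUseCase {
    // Assumed policy, not from the text: check funds are held for 24 hours.
    static final Duration HOLD = Duration.ofHours(24);

    static class Account {
        private long clearedCents;        // money available right now
        private long heldCheckCents;      // check money still on hold
        private Instant checkDepositedAt; // null when nothing is on hold

        Account(long openingCents) {
            clearedCents = openingCents;
        }

        void depositCheck(long cents, Instant when) {
            heldCheckCents += cents;
            checkDepositedAt = when;
        }

        // Returns true if the cash is dispensed, false if the request is declined.
        boolean withdraw(long cents, Instant when) {
            boolean holdOver = checkDepositedAt != null
                && !when.isBefore(checkDepositedAt.plus(HOLD));
            if (holdOver) {
                // The hold has expired, so the check money becomes available.
                clearedCents += heldCheckCents;
                heldCheckCents = 0;
                checkDepositedAt = null;
            }
            if (cents > clearedCents) {
                return false;
            }
            clearedCents -= cents;
            return true;
        }
    }

    // Scenario: $50 in the account, a $200 check deposited,
    // then a $100 withdrawal attempted some hours later.
    static boolean withdrawalAllowed(long hoursAfterDeposit) {
        Instant deposit = Instant.parse("1998-01-01T09:00:00Z");
        Account account = new Account(50_00);
        account.depositCheck(200_00, deposit);
        return account.withdraw(100_00, deposit.plus(Duration.ofHours(hoursAfterDeposit)));
    }

    public static void main(String[] args) {
        System.out.println("2 hours after deposit:  " + withdrawalAllowed(2));
        System.out.println("25 hours after deposit: " + withdrawalAllowed(25));
    }
}
```

Whether you write the use-case as prose or as a scenario like this, the value is the same: it pins down one concrete answer to one “what does the system do if” question.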
You try to discover a full set of
use-cases for your system, and once you’ve done that you’ve got the
core of what the system is supposed to do. The nice thing about focusing on
use-cases is that they always bring you back to the essentials and keep you from
drifting off into issues that aren’t critical for getting the job done.
That is, if you have a full set of use-cases you can describe your system and
move on to the next phase. You probably won’t get it all figured out
perfectly at this phase, but that’s OK. Everything will reveal itself in
the fullness of time, and if you demand a perfect system specification at this
point you’ll get stuck.
It helps to kick-start this phase
by describing the system in a few paragraphs and then looking for nouns and
verbs. The nouns become the objects and the verbs become the methods in the
object interfaces. You’ll be surprised at how useful a tool this can be;
sometimes it will accomplish the lion’s share of the work for
you.
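Here is a rough sketch of the noun-and-verb technique, using a made-up one-sentence description: “A library lends a book to a member, and later takes the book back.” The nouns (library, book, member) become classes and the verbs (lend, take back) become methods. The description and all the names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: nouns from the description become classes,
// verbs become methods in their interfaces.
class NounsAndVerbs {
    // Noun: member
    static class Member {
        final String name;
        Member(String name) { this.name = name; }
    }

    // Noun: book
    static class Book {
        final String title;
        Book(String title) { this.title = title; }
    }

    // Noun: library
    static class Library {
        private final Map<Book, Member> loans = new HashMap<>();

        // Verb: lend. Fails if the book is already out on loan.
        boolean lend(Book book, Member member) {
            if (loans.containsKey(book)) {
                return false;
            }
            loans.put(book, member);
            return true;
        }

        // Verb: take back. Fails if the book was not on loan.
        boolean takeBack(Book book) {
            return loans.remove(book) != null;
        }
    }

    public static void main(String[] args) {
        Library library = new Library();
        Book book = new Book("Thinking in Java");
        Member member = new Member("Pat");
        System.out.println("first loan:  " + library.lend(book, member));
        System.out.println("second loan: " + library.lend(book, member));
        System.out.println("returned:    " + library.takeBack(book));
    }
}
```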
Although it’s a black art, at
this point some kind of scheduling can be quite useful.
You now have an overview of what you’re building so you’ll probably
be able to get some idea of how long it will take. A lot of factors come into
play here: if you estimate a long schedule then the company might decide not to
build it, or a manager might have already decided how long the project should
take and will try to influence your estimate. But it’s best to have an
honest schedule from the beginning and deal with the tough decisions early.
There have been a lot of attempts to come up with accurate scheduling techniques
(like techniques to predict the stock market), but probably the best approach is
to rely on your experience and intuition. Get a gut feeling for how long it will
really take, then double that and add 10 percent. Your gut feeling is probably
correct; you can get something working in that time. The
“doubling” will turn that into something decent, and the 10 percent
will deal with final polishing and details. However you want to explain it, and
regardless of the moans and manipulations that happen when you reveal such a
schedule, it just seems to work out that
way.
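As a worked example of that rule of thumb, the sketch below turns a gut feeling into a schedule. It assumes the 10 percent is taken on the doubled figure, which is one reasonable reading of the rule; the class and method names are hypothetical. Under that assumption, a gut feeling of 10 weeks becomes about 22 weeks.

```java
// Hypothetical sketch of the "double it and add 10 percent" rule of thumb.
class ScheduleEstimate {
    // Assumption: the 10 percent is applied to the doubled figure.
    static double estimateWeeks(double gutFeelingWeeks) {
        double decent = gutFeelingWeeks * 2; // doubling: turn "working" into "decent"
        double polish = decent * 0.10;       // 10 percent: final polishing and details
        return decent + polish;
    }

    public static void main(String[] args) {
        System.out.println("gut feeling of 10 weeks -> schedule of about "
            + Math.round(estimateWeeks(10)) + " weeks");
    }
}
```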
In this phase you must come up with
a design that describes what the classes look like and how they will interact. A
useful diagramming tool that has evolved over time is the
Unified Modeling Language (UML). You can get the
specification for UML at www.rational.com. UML can also be helpful as a
descriptive tool during phase 1, and some of the diagrams you create there will
probably show up unmodified in phase 2. You don’t need to use UML, but it
can be helpful, especially if you want to put a diagram up on the wall for
everyone to ponder, which is a good idea. An alternative to UML is a textual
description of the objects and their interfaces (as I described in Thinking
in C++), but this can be limiting.
The most successful consulting
experiences I’ve had when coming up with an initial design involved
standing in front of a team, who hadn’t built an OOP project before, and
drawing objects on a whiteboard. We talked about how the objects should
communicate with each other, and erased some of them and replaced them with
other objects. The team (who knew what the project was supposed to do) actually
created the design; they “owned” the design rather than having it
given to them. All I was doing was guiding the process by asking the right
questions, trying out the assumptions and taking the feedback from the team to
modify those assumptions. The true beauty of the process was that the team
learned how to do object-oriented design not by reviewing abstract examples, but
by working on the one design that was most interesting to them at that moment:
theirs.
You’ll know you’re done
with phase 2 when you have described the objects and their interfaces. Well,
most of them – there are usually a few that slip through the cracks and
don’t make themselves known until phase 3. But that’s OK. All you
are concerned with is that you eventually discover all of your objects.
It’s nice to discover them early in the process but OOP provides enough
structure so that it’s not so bad if you discover them
later.
If you’re reading this book
you’re probably a programmer, so now we’re at the part you’ve
been trying to get to. By following a plan – no matter how simple and
brief – and coming up with design structure before coding, you’ll
discover that things fall together far more easily than if you dive in and start
hacking, and this provides a great deal of satisfaction. Getting code to run and
do what you want is fulfilling, even like some kind of drug if you look at the
obsessive behavior of some programmers. But it’s my experience that coming
up with an elegant solution is deeply satisfying at an entirely different level;
it feels closer to art than technology. And elegance
always pays off; it’s not a frivolous pursuit. Not only does it give you a
program that’s easier to build and debug, but it’s also easier to
understand and maintain, and that’s where the financial value
lies.
After you build the system and get
it running, it’s important to do a reality check, and here’s where
the requirements analysis and system specification come in. Go through your
program and make sure that all the requirements are checked off, and that all
the use-cases work the way they’re described. Now you’re done. Or
are you?
This is the point in the
development cycle that has traditionally been called “maintenance,”
a catch-all term that can mean everything from “getting it to work the way
it was really supposed to in the first place” to “adding features
that the customer forgot to mention before” to the more traditional
“fixing the bugs that show up” and “adding new features as the
need arises.” So many misconceptions have been applied to the term
“maintenance” that it has taken on a slightly deceiving quality,
partly because it suggests that you’ve actually built a pristine program
and that all you need to do is change parts, oil it and keep it from rusting.
Perhaps there’s a better term to describe what’s going
on.
The term is
iteration. That is, “You won’t get it
right the first time, so give yourself the latitude to learn and to go back and
make changes.” You might need to make a lot of changes as you learn and
understand the problem more deeply. The elegance you’ll produce if you
iterate until you’ve got it right will pay off, both in the short and the
long run.
What it means to “get it
right” isn’t just that the program works according to the
requirements and the use-cases. It also means that the internal structure of the
code makes sense to you, and feels like it fits together well, with no awkward
syntax, oversized objects or ungainly exposed bits of code. In addition, you
must have some sense that the program structure will survive the changes that it
will inevitably go through during its lifetime, and that those changes can be
made easily and cleanly. This is no small feat. You must not only understand
what you’re building, but also how the program will evolve (what I call
the vector of change). Fortunately, object-oriented
programming languages are particularly adept at supporting this kind of
continuing modification – the boundaries created by the objects are what
tend to keep the structure from breaking down. They are also what allow you to
make changes that would seem drastic in a procedural program without causing
earthquakes throughout your code. In fact, support for iteration might be the
most important benefit of OOP.
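To see how an object boundary contains a change, consider this small sketch. The calling code talks only to an interface, so the first implementation (a simple list, scanned on every query) can be replaced by a second one (a map of running totals) without touching the caller. All the names here are hypothetical, and the example is deliberately tiny.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the interface is the boundary; what lies behind it can change.
class VectorOfChange {
    interface WordCounter {
        void add(String word);
        int count(String word);
    }

    // First iteration: store every word and scan the list on each query.
    static class ListCounter implements WordCounter {
        private final List<String> words = new ArrayList<>();

        public void add(String word) {
            words.add(word);
        }

        public int count(String word) {
            int total = 0;
            for (String stored : words) {
                if (stored.equals(word)) {
                    total++;
                }
            }
            return total;
        }
    }

    // Later iteration: keep running totals instead. Callers do not change.
    static class MapCounter implements WordCounter {
        private final Map<String, Integer> totals = new HashMap<>();

        public void add(String word) {
            totals.merge(word, 1, Integer::sum);
        }

        public int count(String word) {
            return totals.getOrDefault(word, 0);
        }
    }

    // Calling code written once, against the interface only.
    static int countJava(WordCounter counter) {
        counter.add("java");
        counter.add("c++");
        counter.add("java");
        return counter.count("java");
    }

    public static void main(String[] args) {
        System.out.println("list-based: " + countJava(new ListCounter()));
        System.out.println("map-based:  " + countJava(new MapCounter()));
    }
}
```

The redesign is drastic inside the object, yet from the caller’s side nothing has moved.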
With iteration, you create
something that at least approximates what you think you’re building, and
then you kick the tires, compare it to your requirements and see where it falls
short. Then you can go back and fix it by redesigning and re-implementing the
portions of the program that didn’t work
right.[10]
You might actually need to solve the problem, or an aspect of the problem,
several times before you hit on the right solution. (A study of Design
Patterns, described in Chapter 16, is usually helpful
here.)
Iteration also occurs when you
build a system, see that it matches your requirements and then discover it
wasn’t actually what you wanted. When you see the system, you realize you
want to solve a different problem. If you think this kind of iteration is going
to happen, then you owe it to yourself to build your first version as quickly as
possible so you can find out if it’s what you want.
Iteration is closely tied to
incremental development. Incremental development
means that you start with the core of your system and implement it as a
framework upon which to build the rest of the system piece by piece. Then you
start adding features one at a time. The trick to this is in designing a
framework that will accommodate all the features you plan to add to it. (See
Chapter 16 for more insight into this issue.) The advantage is that once you get
the core framework working, each feature you add is like a small project in
itself rather than part of a big project. Also, new features that are
incorporated later in the development or maintenance phases can be added more
easily. OOP supports incremental development because if your program is designed
well, your increments will turn out to be discrete objects or groups of
objects.
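A minimal sketch of incremental development might look like this: the core framework is a pipeline that knows only about a Feature interface, and each new feature arrives as its own small object. The names Pipeline, Feature, Trim and UpperCase are all hypothetical, and a real framework would of course need more thought about which features it must accommodate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a small core framework plus features added one at a time.
class IncrementalSketch {
    // The contract every increment must satisfy.
    interface Feature {
        String apply(String text);
    }

    // The core: built first, and unchanged as features are added.
    static class Pipeline {
        private final List<Feature> features = new ArrayList<>();

        Pipeline add(Feature feature) {
            features.add(feature);
            return this;
        }

        String run(String input) {
            String result = input;
            for (Feature feature : features) {
                result = feature.apply(result);
            }
            return result;
        }
    }

    // Increment 1: a discrete object.
    static class Trim implements Feature {
        public String apply(String text) {
            return text.trim();
        }
    }

    // Increment 2: another discrete object, added later.
    static class UpperCase implements Feature {
        public String apply(String text) {
            return text.toUpperCase();
        }
    }

    public static void main(String[] args) {
        Pipeline pipeline = new Pipeline();
        System.out.println("[" + pipeline.run("  hello  ") + "]");
        pipeline.add(new Trim());
        System.out.println("[" + pipeline.run("  hello  ") + "]");
        pipeline.add(new UpperCase());
        System.out.println("[" + pipeline.run("  hello  ") + "]");
    }
}
```

Each feature is a small project in itself, and the core never has to be reopened to accept it.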
Of course you wouldn’t build
a house without a lot of carefully-drawn plans. If you build a deck or a dog
house, your plans won’t be so elaborate but you’ll still probably
start with some kind of sketches to guide you on your way. Software development
has gone to extremes. For a long time, people didn’t have much structure
in their development, but then big projects began failing. In reaction, we ended
up with methodologies that had an intimidating amount of structure and detail.
These were too scary to use – it looked like you’d spend all your
time writing documents and no time programming. (This was often the case.) I
hope that what I’ve shown you here suggests a middle path – a
sliding scale. Use an approach that fits your needs (and your personality). No
matter how minimal you choose to make it, some kind of plan will make a
big improvement in your project as opposed to no plan at all. Remember that, by
some estimates, over 50 percent of projects
fail.
Java looks a lot like C++, and so
naturally it would seem that C++ will be replaced by Java. But I’m
starting to question this logic. For one thing, C++ still has some features that
Java doesn’t, and although there have been a lot of promises about Java
someday being as fast as or faster than C++, the breakthroughs haven’t
happened yet (it’s getting steadily faster, but still hasn’t touched
C++). Also, there seems to be a growing interest in C++ in many fields, so I
don’t think that language is going away any time soon. (Languages seem to
hang around. Speaking at one of my “Intermediate/Advanced Java
Seminars,” Allen Holub asserted that the two most commonly-used languages
are Rexx and COBOL, in that order.)
I’m beginning to think that
the strength of Java lies in a slightly different arena than that of C++. C++ is
a language that doesn’t try to fit a mold. Certainly it has been adapted
in a number of ways to solve particular problems. Some tools combine libraries,
component models and code generation tools to solve the problem of developing
windowed end-user applications (for Microsoft Windows). And yet, what do the
vast majority of Windows developers use? Microsoft’s Visual Basic (VB).
This despite the fact that VB produces the kind of code that becomes
unmanageable when the program is only a few pages long (and syntax that can be
positively mystifying). As successful and popular as VB is, from a language
design viewpoint it’s a mountain of hacks. It would be nice to have the
ease and power of VB without the resulting unmanageable code. And that’s
where I think Java will shine: as the “next VB.” You may or may not
shudder to hear this, but think about it: so much of Java is designed to make it
easy for the programmer to solve application-level problems like networking and
cross-platform UI, and yet it has a language design intended to allow the
creation of very large and flexible bodies of code. Add to this the fact that
Java has the most robust type checking and error-handling systems I’ve
ever seen in a language and you have the makings of a significant leap forward
in programming productivity.
Should you use Java instead of C++
for your project? Other than Web applets, there are two issues to consider.
First, if you want to use a lot of existing libraries (and you’ll
certainly get a lot of productivity gains there), or if you have an existing C
or C++ code base, Java might slow your development down rather than speeding it
up. If you’re developing all your code primarily from scratch, then the
simplicity of Java over C++ will shorten your development time.
The biggest issue is speed.
Interpreted Java has been slow, even 20 to 50 times slower than C in the
original Java interpreters. This has improved quite a bit over time, but the
speed difference remains an important number. Computers are about speed; if it weren’t
significantly faster to do something on a computer then you’d do it by
hand. (I’ve even heard it suggested that you start with Java, to gain the
short development time, then use a tool and support libraries to translate your
code to C++, if you need faster execution speed.)
The key to making Java feasible for
most non-Web development projects is the appearance of speed improvements like
so-called “just-in time” (JIT) compilers and
possibly even native code compilers (two of which exist at this writing). Of
course, native-code compilers will eliminate the touted cross-platform execution
of the compiled programs, but they will also bring the speed of the executable
closer to that of C and C++. And cross compiling programs in Java should be a
lot easier than doing so in C or C++. (In theory, you just recompile, but that
promise has been made before for other languages.)
You can find comparisons of Java
and C++, observations about Java realities and practicality and coding
guidelines in the appendices.
[1]
Fortunately, this has changed significantly with the advent of third-party
libraries and the Standard C++ library.
[2]
See Multiparadigm Programming in Leda by Timothy Budd (Addison-Wesley
1995).
[3]
Some people make a distinction, stating that type determines the interface while
class is a particular implementation of that interface.
[4]
I’m indebted to my friend Scott Meyers for this term.
[5]
This uses the Unified Notation, which will primarily be used in this
book.
[6]
Note that this is true only for objects that are created on the heap, with
new. However, the problem described, and indeed any general programming
problem, requires objects to be created on the heap.
[7]
According to a technical reader for this book, one existing real-time Java
implementation (www.newmonics.com) has guarantees on garbage collector
performance.
[8]
The material in this section is adapted from an article by the author that
originally appeared on Mainspring, at www.mainspring.com. Used with
permission.
[9]
The best introduction is still Grady Booch’s Object-Oriented Design
with Applications, 2nd edition, Wiley & Sons 1996. His
insights are clear and his prose is straightforward, although his notations are
needlessly complex for most designs. (You can easily get by with a
subset.)
[10]
This is something like “rapid
prototyping,” where you were supposed to build a quick-and-dirty version
so that you could learn about the system, and then throw away your prototype and
build it right. The trouble with rapid prototyping is that people didn’t
throw away the prototype, but instead built upon it. Combined with the lack of
structure in procedural programming, this often leads to messy systems that are
expensive to maintain.