Creating a good
input/output (IO) system is one of the more difficult tasks for the language
designer.
This is evidenced by the number of
different approaches. The challenge seems to be in covering all eventualities.
Not only are there different kinds of IO that you want to communicate with
(files, the console, network connections), but you need to talk to them in a
wide variety of ways (sequential, random-access, binary, character, by lines, by
words, etc.).
The Java library designers attacked
the problem by creating lots of classes. In fact, there are so many classes for
Java’s IO system that it can be intimidating at first (ironically, the
Java IO design actually prevents an explosion of classes). There has also been a
significant change in the IO library between Java
1.0 and Java 1.1. Instead of
simply replacing the old library with a new one, the designers at Sun extended
the old library and added the new one alongside it. As a result you can
sometimes end up mixing the old and new libraries and creating even more
intimidating code.
This chapter will help you
understand the variety of IO classes in the standard Java library and how to use
them. The first portion of the chapter will introduce the “old” Java
1.0 IO stream library, since there is a significant
amount of existing code that uses that library. The remainder of the chapter
will introduce the new features in the Java 1.1 IO library. Note that when you
compile some of the code in the first part of the chapter with a Java 1.1
compiler you can get a “deprecated feature”
warning message at compile time. The code still works; the compiler is just
suggesting that you use certain new features that are described in the latter
part of this chapter. It is valuable, however, to see the difference between the
old and new way of doing things and that’s why it was left in – to
increase your understanding (and to allow you to read code written for Java
1.0).
The Java library classes for IO are
divided by input and output, as
you can see by looking at the online Java class hierarchy with your Web browser.
By inheritance, all classes derived from
InputStream have basic
methods called read( )
for reading a single byte or array of bytes. Likewise, all classes derived
from OutputStream have
basic methods called
write( ) for writing a
single byte or array of bytes. However, you won’t generally use these
methods directly; they exist so that more sophisticated classes can use them
while providing a more useful interface. Thus, you’ll rarely create your stream object by
using a single class, but instead will layer multiple objects together to
provide your desired functionality. The fact that you create more than one
object to create a single resulting stream is the primary reason that
Java’s stream library is confusing.
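As a concrete sketch of this layering (the file name layer.txt is made up for illustration), three objects combine into a single buffered, formatted input stream; no single class provides all three responsibilities:

```java
import java.io.*;

public class LayerDemo {
  // Three layered objects produce one buffered, formatted input stream:
  static String firstLine(String filename) throws IOException {
    DataInputStream in = new DataInputStream(  // formatted reading: readLine(), readInt(), ...
      new BufferedInputStream(                 // adds buffering
        new FileInputStream(filename)));       // the actual byte source
    String line = in.readLine();
    in.close();
    return line;
  }
  public static void main(String[] args) throws IOException {
    // Create a small file so there is something to read back:
    FileOutputStream out = new FileOutputStream("layer.txt");
    out.write("hello\nworld\n".getBytes());
    out.close();
    System.out.println(firstLine("layer.txt")); // prints "hello"
  }
}
```

Each layer contributes exactly one responsibility, which is the layering idea this chapter keeps returning to.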
It’s helpful to categorize
the classes by their functionality. The library designers started by deciding
that all classes that had anything to do with input would be inherited from
InputStream and all classes that were associated with output would be
inherited from OutputStream.
InputStream’s job is
to represent classes that produce input from different sources, such as an
array of bytes, a String object, a file, or a “pipe.” Each source
has an associated subclass of InputStream:
In addition, FilterInputStream is also a type of InputStream; it provides a
base class for “decorator” classes that attach attributes or useful interfaces
to input streams. This is discussed later.
| Class | Function | Constructor Arguments | How to use it |
|---|---|---|---|
| ByteArrayInputStream | Allows a buffer in memory to be used as an InputStream. | The buffer from which to extract the bytes. | As a source of data. Connect it to a FilterInputStream object to provide a useful interface. |
| StringBufferInputStream | Converts a String into an InputStream. | A String. The underlying implementation actually uses a StringBuffer. | As a source of data. Connect it to a FilterInputStream object to provide a useful interface. |
| FileInputStream | For reading information from a file. | A String representing the file name, or a File or FileDescriptor object. | As a source of data. Connect it to a FilterInputStream object to provide a useful interface. |
This category includes the classes
that decide where your output will go: an array of bytes (no String,
however; presumably you can create one using the array of bytes), a file, or a
“pipe.”
In addition, the
FilterOutputStream provides a base class for "decorator" classes that
attach attributes or useful interfaces to output streams. This is discussed
later.
The use of layered objects to
dynamically and transparently add responsibilities to individual objects is
referred to as the
decorator pattern.
(Patterns[44]
are the subject of Chapter 16.) The decorator pattern specifies that all objects
that wrap around your initial object have the same interface, to make the use of
the decorators transparent – you send the same message to an object
whether it’s been decorated or not. This is the reason for the existence
of the “filter” classes in the Java IO library: the abstract
“filter” class is the base class for all the decorators. (A
decorator must have the same interface as the object it decorates, but the
decorator can also extend the interface, which occurs in several of the
“filter” classes).
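A minimal sketch of that transparency (the class and method names here are invented for illustration): code written against the InputStream interface neither knows nor cares whether the stream it receives has been decorated:

```java
import java.io.*;

public class TransparentDecorator {
  // Written against the base interface, so it works whether or not
  // the stream has been wrapped in decorators:
  static int countBytes(InputStream in) throws IOException {
    int n = 0;
    while(in.read() != -1)
      n++;
    return n;
  }
  public static void main(String[] args) throws IOException {
    byte[] data = "decorate me".getBytes();
    // Same message to a plain stream...
    System.out.println(countBytes(
      new ByteArrayInputStream(data)));   // 11
    // ...and to a decorated one:
    System.out.println(countBytes(
      new BufferedInputStream(
        new ByteArrayInputStream(data)))); // 11
  }
}
```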
Decorators are often used when
subclassing requires a large number of subclasses to support every possible
combination needed – so many that subclassing becomes impractical. The
Java IO library requires many different combinations of features, which is why
the decorator pattern is a good approach. There is a drawback to the decorator
pattern, however. Decorators give you much more flexibility while you’re
writing a program (since you can easily mix and match attributes), but they add
complexity to your code. The reason that the Java IO library is awkward to use
is that you must create many classes – the “core” IO type plus
all the decorators – in order to get the single IO object that you want.
The classes that provide the
decorator interface to control a particular InputStream or
OutputStream are the FilterInputStream and
FilterOutputStream – which don’t have very intuitive names.
They are derived, respectively, from InputStream and OutputStream,
and they are abstract classes, in theory to provide a common interface for all
the different ways you want to talk to a stream. In fact,
FilterInputStream and FilterOutputStream simply mimic their base
classes, which is the key requirement of the
decorator.
The FilterInputStream
classes accomplish two significantly different things. DataInputStream
allows you to read different types of primitive data as well as String
objects. (All the methods start with “read,” such as
readByte( ), readFloat( ), etc.) This, along with its
companion DataOutputStream, allows you to move primitive data from one
place to another via a stream. These “places” are determined by the
classes in Table 10-1. If you’re reading data in blocks and parsing it
yourself, you won’t need DataInputStream, but in most other cases
you will want to use it to automatically format the data you
read.
The remaining classes modify the
way an InputStream behaves internally: whether it’s buffered or
unbuffered, if it keeps track of the lines it’s reading (allowing you to
ask for line numbers or set the line number), and whether you can push back a
single character. The last two classes look a lot like support for building a
compiler (that is, they were added to support the construction of the Java
compiler), so you probably won’t use them in general programming.
You’ll probably need to
buffer your input almost every time, regardless of the IO device you’re
connecting to, so it would have made more sense for the IO library to make a
special case for unbuffered input rather than buffered input.
| Class | Function | Constructor Arguments | How to use it |
|---|---|---|---|
| DataInputStream | Used in concert with DataOutputStream, so you can read primitives (int, char, long, etc.) from a stream in a portable fashion. | InputStream | Contains a full interface to allow you to read primitive types. |
The complement to
DataInputStream is DataOutputStream, which formats each of the
primitive types and String objects onto a stream in such a way that any
DataInputStream, on any machine, can read them. All the methods start
with “write,” such as writeByte( ),
writeFloat( ), etc.
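Here is a sketch of such a round trip, using an in-memory byte array as the “place” so that no file is needed; the values are arbitrary:

```java
import java.io.*;

public class DataRoundTrip {
  public static void main(String[] args) throws IOException {
    // Write primitives into a byte buffer in memory:
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeInt(42);
    out.writeDouble(3.14);
    out.close();
    // Read them back, in exactly the same order they were written:
    DataInputStream in = new DataInputStream(
      new ByteArrayInputStream(buf.toByteArray()));
    System.out.println(in.readInt());    // 42
    System.out.println(in.readDouble()); // 3.14
    in.close();
  }
}
```

The same pair of classes works unchanged if the byte-array streams are replaced by file streams.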
If you want to do true formatted
output, for example, to the console, use a PrintStream. This is the
endpoint that allows you to print all of the primitive data types and
String objects in a viewable format as opposed to
DataOutputStream, whose goal is to put them on a stream in a way that
DataInputStream can portably reconstruct them. The System.out
static object is a PrintStream.
The two important methods in
PrintStream are print( ) and println( ), which
are overloaded to print out all the various types. The difference between
print( ) and println( ) is that the latter adds a
newline when it’s done.
BufferedOutputStream is a
modifier and tells the stream to use buffering so you don’t get a physical
write every time you write to the stream. You’ll probably always want to
use this with files, and possibly console IO.
RandomAccessFile is used for
files containing records of known size so that you can move from one record to
another using
seek( ), then read
or change the records. The records don’t have to be the same size; you
just have to be able to determine how big they are and where they are placed in
the file.
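The record arithmetic can be sketched as follows (the file name rec.dat and the one-double-per-record layout are invented for illustration): since each record is a known size, the byte offset of record n is simply n times that size:

```java
import java.io.*;

public class RecordSeek {
  static final int RECORD_SIZE = 8; // one double per record

  // Read record n without reading anything that precedes it:
  static double readRecord(String fname, int n) throws IOException {
    RandomAccessFile rf = new RandomAccessFile(fname, "r");
    rf.seek(n * RECORD_SIZE); // jump directly to the record
    double d = rf.readDouble();
    rf.close();
    return d;
  }

  public static void main(String[] args) throws IOException {
    // Write five fixed-size records:
    RandomAccessFile rf = new RandomAccessFile("rec.dat", "rw");
    for(int i = 0; i < 5; i++)
      rf.writeDouble(i * 10.0); // records 0..4
    rf.close();
    System.out.println(readRecord("rec.dat", 3)); // prints 30.0
  }
}
```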
At first it’s a little bit
hard to believe that RandomAccessFile is not part of the
InputStream or OutputStream hierarchy. It has no association with
those hierarchies other than that it happens to implement the
DataInput and
DataOutput interfaces
(which are also implemented by DataInputStream and
DataOutputStream). It doesn’t even use any of the functionality of
the existing InputStream or OutputStream classes –
it’s a completely separate class, written from scratch, with all of its
own (mostly native) methods. The reason for this may be that
RandomAccessFile has essentially different behavior than the other IO
types, since you can move forward and backward within a file. In any event, it
stands alone, as a direct descendant of Object.
Essentially, a
RandomAccessFile works like a DataInputStream pasted together with
a DataOutputStream and the methods getFilePointer( ) to find
out where you are in the file, seek( ) to move to a new point in the
file, and length( ) to determine the maximum size of the file. In
addition, the constructors require a second argument (identical to
fopen( ) in C) indicating whether you are just randomly reading
(“r”) or reading and writing (“rw”).
There’s no support for write-only files, which could suggest that
RandomAccessFile might have worked well if it were inherited from
DataInputStream.
What’s even more frustrating
is that you could easily imagine wanting to seek within other types of streams,
such as a ByteArrayInputStream, but the seeking methods are available
only in RandomAccessFile, which works for files only.
BufferedInputStream does allow you to
mark( ) a position
(whose value is held in a single internal variable) and
reset( ) to that
position, but this is limited and not too
useful.
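A small sketch of that limited facility (the input string is arbitrary): mark( ) remembers a single position and reset( ) returns to it:

```java
import java.io.*;

public class MarkReset {
  public static void main(String[] args) throws IOException {
    BufferedInputStream in = new BufferedInputStream(
      new ByteArrayInputStream("abcdef".getBytes()));
    System.out.print((char)in.read());   // 'a'
    in.mark(10); // remember the current position (one internal value)
    System.out.print((char)in.read());   // 'b'
    System.out.print((char)in.read());   // 'c'
    in.reset();  // move back to the mark
    System.out.println((char)in.read()); // 'b' again
    in.close();
  }
}
```

The argument to mark( ) is the read limit: how many bytes may be read before the mark becomes invalid.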
The
File class has a
deceiving name – you might think it refers to a file, but it
doesn’t. It can represent either the name of a particular file or
the names of a set of files in a directory. If it’s a set of files,
you can ask for the set with the
list( ) method, and
this returns an array of String. It makes sense to return an array rather
than one of the flexible collection classes because the number of elements is
fixed, and if you want a different directory listing you just create a different
File object. In fact, “FilePath” would have been a better
name. This section shows a complete example of the use of this class, including
the associated
FilenameFilter
interface.
Suppose you’d like to see a
directory listing. The File object can be listed in two ways. If you call
list( ) with no arguments, you’ll get the full list that the
File object contains. However, if you want a restricted list, for
example, all of the files with an extension of .java, then you use a
“directory filter,” which is a class that tells how to select the
File objects for display.
Here’s the code for the
example:
//: DirList.java
// Displays directory listing
package c10;
import java.io.*;

public class DirList {
  public static void main(String[] args) {
    try {
      File path = new File(".");
      String[] list;
      if(args.length == 0)
        list = path.list();
      else
        list = path.list(new DirFilter(args[0]));
      for(int i = 0; i < list.length; i++)
        System.out.println(list[i]);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}

class DirFilter implements FilenameFilter {
  String afn;
  DirFilter(String afn) { this.afn = afn; }
  public boolean accept(File dir, String name) {
    // Strip path information:
    String f = new File(name).getName();
    return f.indexOf(afn) != -1;
  }
}
///:~
The DirFilter class
“implements” the interface FilenameFilter. (Interfaces
were covered in Chapter 7.) It’s useful to see how simple the
FilenameFilter interface is:
public interface FilenameFilter {
  boolean accept(File dir, String name);
}
It says that all that this type of
object does is provide a method called accept( ). The whole reason
behind the creation of this class is to provide the accept( ) method
to the list( ) method so that list( ) can call
back accept( ) to determine which file names should be included
in the list. Thus, this technique is often referred to as a
callback or sometimes a
functor (that is, DirFilter is a functor
because its only job is to hold a method). Because list( ) takes a
FilenameFilter object as its argument, it means that you can pass an
object of any class that implements FilenameFilter to choose (even at
run-time) how the list( ) method will behave. The purpose of a
callback is to provide flexibility in the behavior of code.
DirFilter shows that just
because an interface contains only a set of methods, you’re not
restricted to writing only those methods. (You must at least provide definitions
for all the methods in an interface, however.) In this case, a
DirFilter constructor is also defined.
The accept( ) method
must accept a File object representing the directory that a particular
file is found in, and a String containing the name of that file. You
might choose to use or ignore either of these arguments, but you will probably
at least use the file name. Remember that the list( ) method is
calling accept( ) for each of the file names in the directory object
to see which one should be included – this is indicated by the
boolean result returned by accept( ).
To make sure that what you’re
working with is only the name and contains no path information, all you have to
do is take the String object and create a File object out of it,
then call getName( ) which strips away all the path information (in
a platform-independent way). Then accept( ) uses the
String class
indexOf( ) method to see if the search string afn appears
anywhere in the name of the file. If afn is found within the string, the
return value is the starting index of afn, but if it’s not found
the return value is -1. Keep in mind that this is a simple string search and
does not have regular expression “wildcard” matching such as
“fo?.b?r*” which is much more difficult to
implement.
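A quick illustration of indexOf( )'s return values, using a file name of the kind the filter would see:

```java
public class IndexOfDemo {
  public static void main(String[] args) {
    String name = "DirList.java";
    System.out.println(name.indexOf(".java"));  // 7: starting index of the match
    System.out.println(name.indexOf(".class")); // -1: not found
  }
}
```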
The list( ) method
returns an array. You can query this array for its length and then move through
it selecting the array elements. This ability to easily pass an array in and out
of a method is a tremendous improvement over the behavior of C and
C++.
This example is ideal for rewriting
using an
anonymous
inner class (described in Chapter 7). As a first cut, a method filter( )
is created that returns a handle to a
FilenameFilter:
//: DirList2.java
// Uses Java 1.1 anonymous inner classes
import java.io.*;

public class DirList2 {
  public static FilenameFilter filter(final String afn) {
    // Creation of anonymous inner class:
    return new FilenameFilter() {
      String fn = afn;
      public boolean accept(File dir, String n) {
        // Strip path information:
        String f = new File(n).getName();
        return f.indexOf(fn) != -1;
      }
    }; // End of anonymous inner class
  }
  public static void main(String[] args) {
    try {
      File path = new File(".");
      String[] list;
      if(args.length == 0)
        list = path.list();
      else
        list = path.list(filter(args[0]));
      for(int i = 0; i < list.length; i++)
        System.out.println(list[i]);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}
///:~
Note that the argument to
filter( ) must be
final. This is required
by the anonymous inner class so that it can use an object from outside its
scope.
This design is an improvement
because the FilenameFilter class is now tightly bound to DirList2.
However, you can take this approach one step further and define the anonymous
inner class as an argument to list( ), in which case it’s even
smaller:
//: DirList3.java
// Building the anonymous inner class "in-place"
import java.io.*;

public class DirList3 {
  public static void main(final String[] args) {
    try {
      File path = new File(".");
      String[] list;
      if(args.length == 0)
        list = path.list();
      else
        list = path.list(
          new FilenameFilter() {
            public boolean accept(File dir, String n) {
              String f = new File(n).getName();
              return f.indexOf(args[0]) != -1;
            }
          });
      for(int i = 0; i < list.length; i++)
        System.out.println(list[i]);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}
///:~
The argument to main( )
is now final, since the anonymous inner class uses args[0]
directly.
This shows you how anonymous inner
classes allow the creation of quick-and-dirty classes to solve problems. Since
everything in Java revolves around classes, this can be a useful coding
technique. One benefit is that it keeps the code that solves a particular
problem isolated together in one spot. On the other hand, it is not always as
easy to read, so you must use it judiciously.
Ah, you say that you want the file
names sorted? Since there’s no support for sorting in Java 1.0 or
Java 1.1 (although sorting is included in Java
1.2), it will have to be added into the program directly
using the SortVector created in Chapter 8:
//: SortedDirList.java
// Displays sorted directory listing
import java.io.*;
import c08.*;

public class SortedDirList {
  private File path;
  private String[] list;
  public SortedDirList(final String afn) {
    path = new File(".");
    if(afn == null)
      list = path.list();
    else
      list = path.list(
        new FilenameFilter() {
          public boolean accept(File dir, String n) {
            String f = new File(n).getName();
            return f.indexOf(afn) != -1;
          }
        });
    sort();
  }
  void print() {
    for(int i = 0; i < list.length; i++)
      System.out.println(list[i]);
  }
  private void sort() {
    StrSortVector sv = new StrSortVector();
    for(int i = 0; i < list.length; i++)
      sv.addElement(list[i]);
    // The first time an element is pulled from
    // the StrSortVector the list is sorted:
    for(int i = 0; i < list.length; i++)
      list[i] = sv.elementAt(i);
  }
  // Test it:
  public static void main(String[] args) {
    SortedDirList sd;
    if(args.length == 0)
      sd = new SortedDirList(null);
    else
      sd = new SortedDirList(args[0]);
    sd.print();
  }
}
///:~
A few other improvements have been
made. Instead of creating path and list as local variables to
main( ), they are members of the class so their values can be
accessible for the lifetime of the object. In fact, main( ) is now
just a way to test the class. You can see that the constructor of the class
automatically sorts the list once that list has been created.
The sort is case-insensitive so you
don’t end up with a list of all the words starting with capital letters,
followed by the rest of the words starting with all the lowercase letters.
However, you’ll notice that within a group of file names that begin with
the same letter the capitalized words are listed first, which is still not quite
the desired behavior for the sort. This problem will be fixed in Java
1.2.
The File class is more than
just a representation for an existing directory path, file, or group of files.
You can also use a File object to create a new
directory or an entire directory
path if it doesn’t exist. You can also look at the
characteristics of files (size,
last modification date, read/write), see whether a File object represents
a file or a directory, and delete a file. This program shows the remaining
methods available with the File class:
//: MakeDirectories.java
// Demonstrates the use of the File class to
// create directories and manipulate files.
import java.io.*;

public class MakeDirectories {
  private final static String usage =
    "Usage:MakeDirectories path1 ...\n" +
    "Creates each path\n" +
    "Usage:MakeDirectories -d path1 ...\n" +
    "Deletes each path\n" +
    "Usage:MakeDirectories -r path1 path2\n" +
    "Renames from path1 to path2\n";
  private static void usage() {
    System.err.println(usage);
    System.exit(1);
  }
  private static void fileData(File f) {
    System.out.println(
      "Absolute path: " + f.getAbsolutePath() +
      "\n Can read: " + f.canRead() +
      "\n Can write: " + f.canWrite() +
      "\n getName: " + f.getName() +
      "\n getParent: " + f.getParent() +
      "\n getPath: " + f.getPath() +
      "\n length: " + f.length() +
      "\n lastModified: " + f.lastModified());
    if(f.isFile())
      System.out.println("it's a file");
    else if(f.isDirectory())
      System.out.println("it's a directory");
  }
  public static void main(String[] args) {
    if(args.length < 1) usage();
    if(args[0].equals("-r")) {
      if(args.length != 3) usage();
      File old = new File(args[1]),
        rname = new File(args[2]);
      old.renameTo(rname);
      fileData(old);
      fileData(rname);
      return; // Exit main
    }
    int count = 0;
    boolean del = false;
    if(args[0].equals("-d")) {
      count++;
      del = true;
    }
    for( ; count < args.length; count++) {
      File f = new File(args[count]);
      if(f.exists()) {
        System.out.println(f + " exists");
        if(del) {
          System.out.println("deleting..." + f);
          f.delete();
        }
      }
      else { // Doesn't exist
        if(!del) {
          f.mkdirs();
          System.out.println("created " + f);
        }
      }
      fileData(f);
    }
  }
}
///:~
In fileData( ) you can
see the various file investigation methods put to use to display information
about the file or directory path.
The first method that’s
exercised by main( ) is
renameTo( ), which
allows you to rename (or move) a file to an entirely new path represented by the
argument, which is another File object. This also works with directories
of any length.
If you experiment with the above
program, you’ll find that you can make a directory path of any complexity
because mkdirs( )
will do all the work for you. In Java 1.0, the -d
flag reports that the directory is deleted but it’s still there; in Java
1.1 the directory is actually
deleted.
Although there are a lot of IO
stream classes in the library that can be combined in many different ways, there
are just a few ways that you’ll probably end up using them. However, they
require attention to get the correct combinations. The following rather long
example shows the creation and use of typical IO
configurations so you can use it as a reference when writing your own code. Note
that each configuration begins with a commented number and title that
corresponds to the heading for the appropriate explanation that follows in the
text.
//: IOStreamDemo.java
// Typical IO Stream Configurations
import java.io.*;
import com.bruceeckel.tools.*;

public class IOStreamDemo {
  public static void main(String[] args) {
    try {
      // 1. Buffered input file
      DataInputStream in =
        new DataInputStream(
          new BufferedInputStream(
            new FileInputStream(args[0])));
      String s, s2 = new String();
      while((s = in.readLine()) != null)
        s2 += s + "\n";
      in.close();

      // 2. Input from memory
      StringBufferInputStream in2 =
        new StringBufferInputStream(s2);
      int c;
      while((c = in2.read()) != -1)
        System.out.print((char)c);

      // 3. Formatted memory input
      try {
        DataInputStream in3 =
          new DataInputStream(
            new StringBufferInputStream(s2));
        while(true)
          System.out.print((char)in3.readByte());
      } catch(EOFException e) {
        System.out.println(
          "End of stream encountered");
      }

      // 4. Line numbering & file output
      try {
        LineNumberInputStream li =
          new LineNumberInputStream(
            new StringBufferInputStream(s2));
        DataInputStream in4 =
          new DataInputStream(li);
        PrintStream out1 =
          new PrintStream(
            new BufferedOutputStream(
              new FileOutputStream(
                "IODemo.out")));
        while((s = in4.readLine()) != null )
          out1.println(
            "Line " + li.getLineNumber() + s);
        out1.close(); // finalize() not reliable!
      } catch(EOFException e) {
        System.out.println(
          "End of stream encountered");
      }

      // 5. Storing & recovering data
      try {
        DataOutputStream out2 =
          new DataOutputStream(
            new BufferedOutputStream(
              new FileOutputStream("Data.txt")));
        out2.writeBytes(
          "Here's the value of pi: \n");
        out2.writeDouble(3.14159);
        out2.close();
        DataInputStream in5 =
          new DataInputStream(
            new BufferedInputStream(
              new FileInputStream("Data.txt")));
        System.out.println(in5.readLine());
        System.out.println(in5.readDouble());
      } catch(EOFException e) {
        System.out.println(
          "End of stream encountered");
      }

      // 6. Reading/writing random access files
      RandomAccessFile rf =
        new RandomAccessFile("rtest.dat", "rw");
      for(int i = 0; i < 10; i++)
        rf.writeDouble(i*1.414);
      rf.close();
      rf = new RandomAccessFile("rtest.dat", "rw");
      rf.seek(5*8);
      rf.writeDouble(47.0001);
      rf.close();
      rf = new RandomAccessFile("rtest.dat", "r");
      for(int i = 0; i < 10; i++)
        System.out.println(
          "Value " + i + ": " + rf.readDouble());
      rf.close();

      // 7. File input shorthand
      InFile in6 = new InFile(args[0]);
      String s3 = new String();
      System.out.println(
        "First line in file: " + in6.readLine());
      in6.close();

      // 8. Formatted file output shorthand
      PrintFile out3 = new PrintFile("Data2.txt");
      out3.print("Test of PrintFile");
      out3.close();

      // 9. Data file output shorthand
      OutFile out4 = new OutFile("Data3.txt");
      out4.writeBytes("Test of outDataFile\n\r");
      out4.writeChars("Test of outDataFile\n\r");
      out4.close();
    } catch(FileNotFoundException e) {
      System.out.println(
        "File Not Found:" + args[0]);
    } catch(IOException e) {
      System.out.println("IO Exception");
    }
  }
}
///:~
Of course, one common thing
you’ll want to do is print formatted output to the console, but
that’s already been simplified in the package com.bruceeckel.tools
created in Chapter 5.
Parts 1 through 4 demonstrate the
creation and use of input streams (although part 4 also shows the simple use of
an output stream as a testing tool).
To open a file for input, you use a
FileInputStream with a
String or a File object as the file name. For speed, you’ll
want that file to be buffered so you give the resulting handle to the
constructor for a
BufferedInputStream. To
read input in a formatted fashion, you give that resulting handle to the
constructor for a
DataInputStream, which is
your final object and the interface you read from.
In this example, only the
readLine( ) method
is used, but of course any of the DataInputStream methods are available.
When you reach the end of the file, readLine( ) returns null
so that is used to break out of the while loop.
The String s2 is used to
accumulate the entire contents of the file (including newlines that must be
added since readLine( ) strips them off). s2 is then used in
the later portions of this program. Finally, close( ) is called to
close the file. Technically, close( ) will be called when
finalize( ) is run, and this is supposed to happen (whether or not
garbage collection occurs) as the program exits. However, Java
1.0 has a rather important bug, so this doesn’t
happen. In Java 1.1 you must explicitly call
System.runFinalizersOnExit(true) to guarantee
that finalize( ) will be called for every object in the system. The
safest approach is to explicitly call
close( ) for
files.
This piece takes the String
s2 that now contains the entire contents of the file and uses it to create a
StringBufferInputStream.
(A String, not a StringBuffer, is required
as the constructor argument.) Then read( ) is used to read each
character one at a time and send it out to the console. Note that
read( ) returns the next byte as an int and thus it must be
cast to a char to print properly.
The interface for
StringBufferInputStream is limited, so you usually enhance it by wrapping
it inside a
DataInputStream. However,
if you choose to read the characters out a byte at a time using
readByte( ), any value is valid so the return value cannot be used
to detect the end of input. Instead, you can use the
available( ) method
to find out how many more characters are available. Here’s an example that
shows how to read a file one byte at a time:
//: TestEOF.java
// Testing for the end of file while reading
// a byte at a time.
import java.io.*;

public class TestEOF {
  public static void main(String[] args) {
    try {
      DataInputStream in =
        new DataInputStream(
          new BufferedInputStream(
            new FileInputStream("TestEOF.java")));
      while(in.available() != 0)
        System.out.print((char)in.readByte());
    } catch (IOException e) {
      System.err.println("IOException");
    }
  }
}
///:~
Note that available( )
works differently depending on what sort of medium you’re reading from
– it’s literally “the number of bytes that can be read
without blocking.”
With a file this means the whole file, but with a different kind of stream this
might not be true, so use it thoughtfully.
You could also detect the end of
input in cases like these by catching an exception. However, the use of
exceptions for control flow is considered a misuse of that
feature.
This example shows the use of the
LineNumberInputStream to
keep track of the input line numbers. Here, you cannot simply gang all the
constructors together, since you have to keep a handle to the
LineNumberInputStream. (Note that this is not an inheritance
situation, so you cannot simply cast in4 to a
LineNumberInputStream.) Thus, li holds the handle to the
LineNumberInputStream, which is then used to create a
DataInputStream for easy reading.
This example also shows how to
write formatted data to a file. First, a
FileOutputStream is
created to connect to the file. For efficiency, this is made a
BufferedOutputStream,
which is what you’ll virtually always want to do, but you’re forced
to do it explicitly. Then for the formatting it’s turned into a
PrintStream. The data
file created this way is readable as an ordinary text file.
One of the methods that indicates
when a DataInputStream is
exhausted is
readLine( ), which
returns null when there are no more strings to read. Each line is printed
to the file along with its line number, which is acquired through
li.
You’ll see an explicit
close( ) for out1, which would make sense if the
program were to turn around and read the same file again. However, this program
ends without ever looking at the file IODemo.out. As mentioned before, if
you don’t call close( ) for all your output files, you might
discover that the buffers don’t get flushed so they’re
incomplete.
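A minimal sketch of why the explicit close( ) matters (the file name flush.txt is made up): close( ) flushes the BufferedOutputStream so the data actually reaches the file:

```java
import java.io.*;

public class CloseFlush {
  public static void main(String[] args) throws IOException {
    PrintStream out = new PrintStream(
      new BufferedOutputStream(
        new FileOutputStream("flush.txt")));
    out.println("buffered line");
    out.close(); // flushes the buffer; skip this and the file may stay empty
    // Read the line back to show it arrived:
    DataInputStream in = new DataInputStream(
      new FileInputStream("flush.txt"));
    System.out.println(in.readLine()); // prints "buffered line"
    in.close();
  }
}
```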
The two primary kinds of output
streams are separated by the way they write data: one writes it for human
consumption, and the other writes it to be re-acquired by a
DataInputStream. The
RandomAccessFile stands
alone, although its data format is compatible with the DataInputStream
and
DataOutputStream.
A
PrintStream formats data
so it’s readable by a human. To output data so that it can be recovered by
another stream, you use a DataOutputStream to write the data and a
DataInputStream to recover the data. Of course, these streams could be
anything, but here a file is used, buffered for both reading and
writing.
Note that the character string is
written using
writeBytes( ) and
not writeChars( ).
If you use the latter, you’ll be writing the 16-bit Unicode characters.
Since there is no complementary “readChars” method in
DataInputStream, you’re stuck pulling these characters off one at a
time with
readChar( ). So for
ASCII, it’s easier to write the characters as bytes followed by a newline;
then use readLine( )
to read back the bytes as a regular ASCII line.
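A sketch of the difference, run through a memory buffer rather than a file to keep it self-contained: writeBytes( ) emits one byte per character, writeChars( ) two:

```java
import java.io.*;

public class BytesVsChars {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeBytes("ascii line\n"); // one byte per character: 11 bytes
    out.writeChars("AB");           // two bytes per character: 4 bytes
    out.close();
    System.out.println(buf.size()); // 15
    DataInputStream in = new DataInputStream(
      new ByteArrayInputStream(buf.toByteArray()));
    System.out.println(in.readLine()); // "ascii line"
    System.out.println(in.readChar()); // 'A', pulled off one at a time
    System.out.println(in.readChar()); // 'B'
    in.close();
  }
}
```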
The
writeDouble( )
stores the double number to the stream and the complementary
readDouble( )
recovers it. But for any of the reading methods to work correctly, you must know
the exact placement of the data item in the stream, since it would be equally
possible to read the stored double as a simple sequence of bytes, or as a
char, etc. So you must either have a fixed format for the data in the
file or extra information must be stored in the file that you parse to determine
where the data is located.
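As a sketch of this fixed-format requirement (the values and names here are arbitrary), writing a double followed by an int means reading back a double followed by an int, in exactly that order:

```java
import java.io.*;

public class DataRoundTrip {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeDouble(3.14159); // 8 bytes
    out.writeInt(42);         // 4 bytes
    DataInputStream in = new DataInputStream(
      new ByteArrayInputStream(buf.toByteArray()));
    // Must read back in the same order, with the same types:
    System.out.println(in.readDouble()); // 3.14159
    System.out.println(in.readInt());    // 42
  }
}
```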
As previously noted, the
RandomAccessFile is almost totally isolated from the rest of the IO
hierarchy, save for the fact that it implements the DataInput and
DataOutput interfaces. So you cannot combine it with any of the aspects
of the InputStream and OutputStream subclasses. Even though it
might make sense to treat a ByteArrayInputStream as a random-access
element, you can use RandomAccessFile only to open a file. You must
assume a RandomAccessFile is properly buffered, since you cannot add
buffering to it yourself.
The one option you have is in the
second constructor argument: you can open a RandomAccessFile to read
(“r”) or read and write
(“rw”).
Using a RandomAccessFile is
like using a combined DataInputStream and DataOutputStream
(because it implements the equivalent interfaces). In addition, you can see that
seek( ) is used to
move about in the file and change one of the
values.
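A minimal sketch of seek( ) (the names are arbitrary, and a temporary file is used so the example is self-contained): five doubles are written, each occupying 8 bytes, then the fourth one is overwritten in place.

```java
import java.io.*;

public class SeekDemo {
  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("seekdemo", ".dat");
    RandomAccessFile rf = new RandomAccessFile(f, "rw");
    for(int i = 0; i < 5; i++)
      rf.writeDouble(i);       // each double is 8 bytes
    rf.seek(3 * 8);            // jump to the fourth double
    rf.writeDouble(47.0001);   // overwrite it in place
    rf.seek(3 * 8);
    System.out.println(rf.readDouble()); // 47.0001
    rf.close();
    f.delete();
  }
}
```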
Since there are certain canonical
forms that you’ll be using regularly with files, you may wonder why you
have to do all of that typing – this is one of the drawbacks of the
decorator pattern. This portion shows the creation and use of shorthand versions
of typical file reading and writing configurations. These shorthands are placed
in the package com.bruceeckel.tools that was begun in Chapter 5
(See page 196). To add each class to the library, simply place it in the
appropriate directory and add the package statement.
The creation of an object that
reads a file from a buffered DataInputStream can be encapsulated into a
class called InFile:
```java
//: InFile.java
// Shorthand class for opening an input file
package com.bruceeckel.tools;
import java.io.*;

public class InFile extends DataInputStream {
  public InFile(String filename)
      throws FileNotFoundException {
    super(
      new BufferedInputStream(
        new FileInputStream(filename)));
  }
  public InFile(File file)
      throws FileNotFoundException {
    this(file.getPath());
  }
} ///:~
```
Both the String versions of
the constructor and the File versions are included, to parallel the
creation of a FileInputStream.
Now you can reduce your chances of
repetitive stress syndrome while creating files, as seen in the
example.
The same kind of approach can be
taken to create a PrintStream that writes to a buffered file.
Here’s the extension to com.bruceeckel.tools:
```java
//: PrintFile.java
// Shorthand class for opening an output file
// for human-readable output.
package com.bruceeckel.tools;
import java.io.*;

public class PrintFile extends PrintStream {
  public PrintFile(String filename)
      throws IOException {
    super(
      new BufferedOutputStream(
        new FileOutputStream(filename)));
  }
  public PrintFile(File file)
      throws IOException {
    this(file.getPath());
  }
} ///:~
```
Note that it is not possible for a
constructor to catch an exception that’s thrown by a base-class
constructor.
Finally, the same kind of shorthand
can create a buffered output file for data storage (as opposed to human-readable
storage):
```java
//: OutFile.java
// Shorthand class for opening an output file
// for data storage.
package com.bruceeckel.tools;
import java.io.*;

public class OutFile extends DataOutputStream {
  public OutFile(String filename)
      throws IOException {
    super(
      new BufferedOutputStream(
        new FileOutputStream(filename)));
  }
  public OutFile(File file)
      throws IOException {
    this(file.getPath());
  }
} ///:~
```
It is curious (and unfortunate)
that the Java library designers didn’t think to provide these conveniences
as part of their
standard.
Following the approach pioneered in
Unix of “standard input,” “standard output,” and
“standard error output,” Java has
System.in,
System.out, and
System.err. Throughout
the book you’ve seen how to write to standard output using
System.out, which is already pre-wrapped as a PrintStream object.
System.err is likewise a PrintStream, but System.in is a
raw InputStream, with no wrapping. This means that while you can use
System.out and System.err right away, System.in must be
wrapped before you can read from it.
Typically, you’ll want to
read input a line at a time using readLine( ), so you’ll want
to wrap System.in in a DataInputStream. This is the
“old” Java 1.0 way to do line input. A bit
later in the chapter you’ll see the Java 1.1
solution. Here’s an example that simply echoes each line that you type
in:
```java
//: Echo.java
// How to read from standard input
import java.io.*;

public class Echo {
  public static void main(String[] args) {
    DataInputStream in =
      new DataInputStream(
        new BufferedInputStream(System.in));
    String s;
    try {
      while((s = in.readLine()).length() != 0)
        System.out.println(s);
      // An empty line terminates the program
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
} ///:~
```
The reason for the try block
is that readLine( )
can throw an IOException. Note that System.in should also be
buffered, as with most streams.
It’s a bit inconvenient that
you’re forced to wrap System.in in a DataInputStream in each
program, but perhaps it was designed this way to allow maximum
flexibility.
The
PipedInputStream and
PipedOutputStream have
been mentioned only briefly in this chapter. This is not to suggest that they
aren’t useful, but their value is not apparent until you begin to
understand multithreading, since the piped streams are used to communicate
between threads. This is covered along with an example in Chapter
14.
Although
StreamTokenizer is not
derived from InputStream or OutputStream, it works only with
InputStream objects, so it rightfully belongs in the IO portion of the
library.
The StreamTokenizer class is
used to break any InputStream into a sequence of
“tokens,” which are bits of text delimited by whatever you choose.
For example, your tokens could be words, and then they would be delimited by
white space and punctuation.
Consider a program to count the
occurrence of words in a text file:
```java
//: SortedWordCount.java
// Counts words in a file, outputs
// results in sorted form.
import java.io.*;
import java.util.*;
import c08.*; // Contains StrSortVector

class Counter {
  private int i = 1;
  int read() { return i; }
  void increment() { i++; }
}

public class SortedWordCount {
  private FileInputStream file;
  private StreamTokenizer st;
  private Hashtable counts = new Hashtable();
  SortedWordCount(String filename)
      throws FileNotFoundException {
    try {
      file = new FileInputStream(filename);
      st = new StreamTokenizer(file);
      st.ordinaryChar('.');
      st.ordinaryChar('-');
    } catch(FileNotFoundException e) {
      System.out.println(
        "Could not open " + filename);
      throw e;
    }
  }
  void cleanup() {
    try {
      file.close();
    } catch(IOException e) {
      System.out.println(
        "file.close() unsuccessful");
    }
  }
  void countWords() {
    try {
      while(st.nextToken() !=
            StreamTokenizer.TT_EOF) {
        String s;
        switch(st.ttype) {
          case StreamTokenizer.TT_EOL:
            s = new String("EOL");
            break;
          case StreamTokenizer.TT_NUMBER:
            s = Double.toString(st.nval);
            break;
          case StreamTokenizer.TT_WORD:
            s = st.sval; // Already a String
            break;
          default: // single character in ttype
            s = String.valueOf((char)st.ttype);
        }
        if(counts.containsKey(s))
          ((Counter)counts.get(s)).increment();
        else
          counts.put(s, new Counter());
      }
    } catch(IOException e) {
      System.out.println(
        "st.nextToken() unsuccessful");
    }
  }
  Enumeration values() {
    return counts.elements();
  }
  Enumeration keys() { return counts.keys(); }
  Counter getCounter(String s) {
    return (Counter)counts.get(s);
  }
  Enumeration sortedKeys() {
    Enumeration e = counts.keys();
    StrSortVector sv = new StrSortVector();
    while(e.hasMoreElements())
      sv.addElement((String)e.nextElement());
    // This call forces a sort:
    return sv.elements();
  }
  public static void main(String[] args) {
    try {
      SortedWordCount wc =
        new SortedWordCount(args[0]);
      wc.countWords();
      Enumeration keys = wc.sortedKeys();
      while(keys.hasMoreElements()) {
        String key = (String)keys.nextElement();
        System.out.println(key + ": " +
          wc.getCounter(key).read());
      }
      wc.cleanup();
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
```
It makes sense to present these in
a sorted form, but since Java 1.0 and Java
1.1 don’t have any sorting methods, that will have
to be mixed in. This is easy enough to do with a StrSortVector. (This was
created in Chapter 8, and is part of the package created in that chapter.
Remember that the starting directory for all the subdirectories in this book
must be in your class path for the program to compile
successfully.)
To open the file, a
FileInputStream is used, and to turn the file into words a
StreamTokenizer is created from the FileInputStream. In
StreamTokenizer, there is a default list of separators, and you can add
more with a set of methods. Here, ordinaryChar( ) is used to say
“This character has no significance that I’m interested in,”
so the parser doesn’t include it as part of any of the words that it
creates. For example, saying st.ordinaryChar('.') means that periods will
not be included as parts of the words that are parsed. You can find more
information in the online documentation that comes with Java.
In countWords( ), the
tokens are pulled one at a time from the stream, and the ttype
information is used to determine what to do with each token, since a token can
be an end-of-line, a number, a string, or a single character.
Once a token is found, the
Hashtable counts is queried to see if it already
contains the token as a key. If it does, the corresponding Counter object
is incremented to indicate that another instance of this word has been found. If
not, a new Counter is created – since the Counter
constructor initializes its value to one, this also acts to count the
word.
SortedWordCount is not a
type of Hashtable, so composition was used instead of inheritance. It performs a specific
type of functionality, so even though the keys( ) and
values( ) methods must be re-exposed, that still doesn’t mean
that inheritance should be used
since a number of Hashtable methods are inappropriate here. In addition,
other methods like getCounter( ), which get the Counter for a
particular String, and sortedKeys( ), which produces an
Enumeration, finish the change in the shape of
SortedWordCount’s interface.
In main( ) you can see
the use of a SortedWordCount to open and count the words in a file
– it just takes two lines of code. Then an enumeration to a sorted list of
keys (words) is extracted, and this is used to pull out each key and associated
Count. Note that the call to cleanup( ) is necessary to
ensure that the file is closed.
Although it isn’t part of the
IO library, the StringTokenizer has sufficiently similar functionality to
StreamTokenizer that it will be described here.
The
StringTokenizer returns the tokens within a
string one at a time. These tokens are consecutive characters delimited by tabs,
spaces, and newlines. Thus, the tokens of the string “Where is my
cat?” are “Where”, “is”, “my”, and
“cat?” Like the StreamTokenizer, you can tell the
StringTokenizer to break up the input in any way that you want, but with
StringTokenizer you do this by passing a second argument to the
constructor, which is a String of the delimiters you wish to use. In
general, if you need more sophistication, use a
StreamTokenizer.
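A small sketch (the input string and delimiters here are arbitrary) shows the second constructor argument in action:

```java
import java.util.StringTokenizer;

public class DelimDemo {
  public static void main(String[] args) {
    // Split on commas and colons instead of whitespace:
    StringTokenizer st = new StringTokenizer("a,b:c", ",:");
    while(st.hasMoreTokens())
      System.out.println(st.nextToken());
    // Prints a, b, and c on separate lines
  }
}
```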
You ask a StringTokenizer
object for the next token in the string using the nextToken( )
method. Call hasMoreTokens( ) first, since nextToken( ) throws a
NoSuchElementException if no tokens remain.
As an example, the following
program performs a limited analysis of a sentence, looking for key phrase
sequences to indicate whether happiness or sadness is implied.
```java
//: AnalyzeSentence.java
// Look for particular sequences
// within sentences.
import java.util.*;

public class AnalyzeSentence {
  public static void main(String[] args) {
    analyze("I am happy about this");
    analyze("I am not happy about this");
    analyze("I am not! I am happy");
    analyze("I am sad about this");
    analyze("I am not sad about this");
    analyze("I am not! I am sad");
    analyze("Are you happy about this?");
    analyze("Are you sad about this?");
    analyze("It's you! I am happy");
    analyze("It's you! I am sad");
  }
  static StringTokenizer st;
  static void analyze(String s) {
    prt("\nnew sentence >> " + s);
    boolean sad = false;
    st = new StringTokenizer(s);
    while (st.hasMoreTokens()) {
      String token = next();
      // Look until you find one of the
      // two starting tokens:
      if(!token.equals("I") &&
         !token.equals("Are"))
        continue; // Top of while loop
      if(token.equals("I")) {
        String tk2 = next();
        if(!tk2.equals("am")) // Must be after I
          break; // Out of while loop
        else {
          String tk3 = next();
          if(tk3.equals("sad")) {
            sad = true;
            break; // Out of while loop
          }
          if (tk3.equals("not")) {
            String tk4 = next();
            if(tk4.equals("sad"))
              break; // Leave sad false
            if(tk4.equals("happy")) {
              sad = true;
              break;
            }
          }
        }
      }
      if(token.equals("Are")) {
        String tk2 = next();
        if(!tk2.equals("you"))
          break; // Must be after Are
        String tk3 = next();
        if(tk3.equals("sad"))
          sad = true;
        break; // Out of while loop
      }
    }
    if(sad) prt("Sad detected");
  }
  static String next() {
    if(st.hasMoreTokens()) {
      String s = st.nextToken();
      prt(s);
      return s;
    }
    else
      return "";
  }
  static void prt(String s) {
    System.out.println(s);
  }
} ///:~
```
For each string being analyzed, a
while loop is entered and tokens are pulled off the string. Notice the
first if statement, which says to continue (go back to the
beginning of the loop and start again) if the token is neither an
“I” nor an “Are.” This
means that it will get tokens until an “I” or an “Are”
is found. You might think to use == instead of the
equals( ) method,
but that won’t work correctly, since == compares handle values
while equals( ) compares contents.
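A two-line sketch makes the difference concrete (the string contents are arbitrary):

```java
public class EqualsDemo {
  public static void main(String[] args) {
    String a = new String("sad");
    String b = new String("sad");
    System.out.println(a == b);      // false: different handles
    System.out.println(a.equals(b)); // true: same contents
  }
}
```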
The logic of the rest of the
analyze( ) method is that the pattern that’s being searched
for is “I am sad,” “I am not happy,” or “Are you
sad?” Without the break statement, the code for this would be even
messier than it is. You should be aware that a typical parser (this is a
primitive example of one) normally has a table of these tokens and a piece of
code that moves through the states in the table as new tokens are
read.
You should think of the
StringTokenizer only as shorthand for a simple and specific kind of
StreamTokenizer. However, if you have a String that you want to
tokenize and StringTokenizer is too limited, all you have to do is turn
it into a stream with StringBufferInputStream and then use that to create
a much more powerful
StreamTokenizer.
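Such a conversion might look like the following sketch (the input string is arbitrary; StringBufferInputStream draws a deprecation warning under Java 1.1, as discussed in the next section):

```java
import java.io.*;

public class StringToStream {
  public static void main(String[] args) throws IOException {
    // Turn a String into a stream, then tokenize it:
    StreamTokenizer st = new StreamTokenizer(
      new StringBufferInputStream("3.14 is pi"));
    while(st.nextToken() != StreamTokenizer.TT_EOF) {
      if(st.ttype == StreamTokenizer.TT_NUMBER)
        System.out.println("number: " + st.nval);
      else if(st.ttype == StreamTokenizer.TT_WORD)
        System.out.println("word: " + st.sval);
    }
  }
}
```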
At this point you might be
scratching your head, wondering if there is another design for IO streams that
could require more typing. Could someone have come up with an odder
design? Prepare yourself: Java 1.1 makes some significant modifications
to the IO stream library. When you see the
Reader and
Writer classes your first
thought (like mine) might be that these were meant to replace the
InputStream and OutputStream classes. But that’s not the
case. Although some aspects of the original streams library are deprecated (if
you use them you will receive a warning from the compiler), the old streams have
been left in for backward compatibility.
As a
result there are situations in which you have more layers of wrapping
with the new IO stream library than with the old. Again, this is a drawback of
the decorator pattern – the price you pay for added
flexibility.
The most important reason for
adding the Reader and Writer hierarchies in Java
1.1 is for
internationalization. The old IO
stream hierarchy supports only 8-bit byte streams and doesn’t handle the
16-bit Unicode characters well. Since Unicode is used for internationalization
(and Java’s native char is 16-bit
Unicode), the Reader and
Writer hierarchies were added to support Unicode in all IO operations. In
addition, the new libraries are designed for faster operations than the
old.
As is the practice in this book, I
will attempt to provide an overview of the classes but assume that you will use
online documentation to determine all the details, such as the exhaustive list
of methods.
Almost all of the Java
1.0 IO stream classes have corresponding Java
1.1 classes to provide native Unicode manipulation. It
would be easiest to say “Always use the new classes, never use the old
ones,” but things are not that simple. Sometimes you are forced into using
the Java 1.0 IO stream classes because of the library design; in particular, the
java.util.zip libraries are new additions to the old stream library and
they rely on old stream components. So the most sensible approach to take is to
try to use the Reader and Writer classes whenever you can,
and you’ll discover the situations when you have to drop back into the old
libraries because your code won’t compile.
Here is a table that shows the
correspondence between the sources and sinks of information (that is, where the
data physically comes from or goes to) in the old and new libraries.
Sources & Sinks | Corresponding Java 1.1 class |
---|---|
InputStream | Reader |
OutputStream | Writer |
FileInputStream | FileReader |
FileOutputStream | FileWriter |
StringBufferInputStream | StringReader |
(no corresponding class) | StringWriter |
ByteArrayInputStream | CharArrayReader |
ByteArrayOutputStream | CharArrayWriter |
PipedInputStream | PipedReader |
PipedOutputStream | PipedWriter |
In general, you’ll find that
the interfaces in the old library components and the new ones are similar if not
identical.
In Java
1.0, streams were adapted for particular needs using
“decorator” subclasses of FilterInputStream and
FilterOutputStream. The Java 1.1 IO streams continue
the use of this idea, but the model of deriving all of the decorators from the
same “filter” base class is not followed. This can make it a bit
confusing if you’re trying to understand it by looking at the class
hierarchy.
In the following table, the
correspondence is a rougher approximation than in the previous table. The
difference is because of the class organization: while
BufferedOutputStream is a subclass of FilterOutputStream,
BufferedWriter is not a subclass of FilterWriter (which,
even though it is abstract, has no subclasses and so appears to have been
put in either as a placeholder or simply so you wouldn’t wonder where it
was). However, the interfaces to the classes are quite a close match and
it’s apparent that you’re supposed to use the new versions instead
of the old whenever possible (that is, except in cases where you’re forced
to produce a Stream instead of a Reader or Writer).
Filters | Corresponding Java 1.1 class |
---|---|
FilterInputStream | FilterReader |
FilterOutputStream | FilterWriter (abstract class with no subclasses) |
BufferedInputStream | BufferedReader |
BufferedOutputStream | BufferedWriter |
DataInputStream | use DataInputStream |
PrintStream | PrintWriter |
LineNumberInputStream | LineNumberReader |
StreamTokenizer | StreamTokenizer |
PushbackInputStream | PushbackReader |
There’s one direction
that’s quite clear: Whenever you want to use readLine( ), you
shouldn’t do it with a DataInputStream any more (this is met with a
deprecation message at compile time), but instead use a BufferedReader.
Other than this, DataInputStream is still a “preferred”
member of the Java 1.1 IO library.
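For example, a minimal line-reading sketch with BufferedReader (using a StringReader to stand in for any Reader source):

```java
import java.io.*;

public class LineRead {
  public static void main(String[] args) throws IOException {
    BufferedReader in = new BufferedReader(
      new StringReader("first line\nsecond line"));
    String s;
    // readLine() returns null at the end of input:
    while((s = in.readLine()) != null)
      System.out.println(s);
  }
}
```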
To make the transition to using a
PrintWriter easier, it has constructors that take any OutputStream
object. However, PrintWriter has no more support for formatting than
PrintStream does; the interfaces are virtually the
same.
Apparently, the Java library
designers felt that they got some of the classes right the first time so there
were no changes to these and you can go on using them as they are:
Java 1.0 classes without corresponding Java 1.1 classes |
---|
DataOutputStream |
File |
RandomAccessFile |
SequenceInputStream |
The DataOutputStream, in
particular, is used without change, so for storing and retrieving data in a
transportable format you’re forced to stay in the InputStream and
OutputStream hierarchies.
To see the effect of the new
classes, let’s look at the appropriate portion of the
IOStreamDemo.java example modified to use the Reader and
Writer classes:
```java
//: NewIODemo.java
// Java 1.1 IO typical usage
import java.io.*;

public class NewIODemo {
  public static void main(String[] args) {
    try {
      // 1. Reading input by lines:
      BufferedReader in =
        new BufferedReader(
          new FileReader(args[0]));
      String s, s2 = new String();
      while((s = in.readLine()) != null)
        s2 += s + "\n";
      in.close();
      // 1b. Reading standard input:
      BufferedReader stdin =
        new BufferedReader(
          new InputStreamReader(System.in));
      System.out.print("Enter a line:");
      System.out.println(stdin.readLine());
      // 2. Input from memory
      StringReader in2 = new StringReader(s2);
      int c;
      while((c = in2.read()) != -1)
        System.out.print((char)c);
      // 3. Formatted memory input
      try {
        DataInputStream in3 =
          new DataInputStream(
            // Oops: must use deprecated class:
            new StringBufferInputStream(s2));
        while(true)
          System.out.print((char)in3.readByte());
      } catch(EOFException e) {
        System.out.println("End of stream");
      }
      // 4. Line numbering & file output
      try {
        LineNumberReader li =
          new LineNumberReader(
            new StringReader(s2));
        BufferedReader in4 =
          new BufferedReader(li);
        PrintWriter out1 =
          new PrintWriter(
            new BufferedWriter(
              new FileWriter("IODemo.out")));
        while((s = in4.readLine()) != null )
          out1.println(
            "Line " + li.getLineNumber() + s);
        out1.close();
      } catch(EOFException e) {
        System.out.println("End of stream");
      }
      // 5. Storing & recovering data
      try {
        DataOutputStream out2 =
          new DataOutputStream(
            new BufferedOutputStream(
              new FileOutputStream("Data.txt")));
        out2.writeDouble(3.14159);
        out2.writeBytes("That was pi");
        out2.close();
        DataInputStream in5 =
          new DataInputStream(
            new BufferedInputStream(
              new FileInputStream("Data.txt")));
        BufferedReader in5br =
          new BufferedReader(
            new InputStreamReader(in5));
        // Must use DataInputStream for data:
        System.out.println(in5.readDouble());
        // Can now use the "proper" readLine():
        System.out.println(in5br.readLine());
      } catch(EOFException e) {
        System.out.println("End of stream");
      }
      // 6. Reading and writing random access
      // files is the same as before.
      // (not repeated here)
    } catch(FileNotFoundException e) {
      System.out.println(
        "File Not Found:" + args[1]);
    } catch(IOException e) {
      System.out.println("IO Exception");
    }
  }
} ///:~
```
In general, you’ll see that
the conversion is fairly straightforward and the code looks quite similar. There
are some important differences, though. First of all, since random access files
have not changed, section 6 is not repeated.
Section 1 shrinks a bit because if
all you’re doing is reading line input you need only to wrap a
BufferedReader around a FileReader. Section 1b shows the new way
to wrap System.in for
reading
console
input, and this expands because System.in is an InputStream and
BufferedReader needs a Reader argument, so
InputStreamReader is brought in to perform the
translation.
In section 2 you can see that if
you have a String and want to read from it you just use a
StringReader instead of a StringBufferInputStream and the rest of
the code is identical.
Section 3 shows a bug in the design
of the new IO stream library. If you have a String and you want to read
from it, you’re not supposed to use a
StringBufferInputStream any more. When you compile code involving a
StringBufferInputStream constructor, you get a deprecation message
telling you to not use it. Instead, you’re supposed to use a
StringReader. However, if you want to do formatted memory input as in
section 3, you’re forced to use a DataInputStream – there is
no “DataReader” to replace it – and a DataInputStream
constructor requires an InputStream argument. So you have no choice but
to use the deprecated StringBufferInputStream class. The compiler will
give you a deprecation message but there’s nothing you can do about
it.[45]
Section 4 is a reasonably
straightforward translation from the old streams to the new, with no surprises.
In section 5, you’re forced to use all the old streams classes because
DataOutputStream and DataInputStream require them and there are no
alternatives. However, you don’t get any deprecation messages at compile
time. If a stream is deprecated, typically its constructor produces a
deprecation message to prevent you from using the entire class, but in the case
of DataInputStream only the readLine( ) method is deprecated
since you’re supposed to use a BufferedReader for
readLine( ) (but a DataInputStream for all other formatted
input).
If you compare section 5 with that
section in IOStreamDemo.java, you’ll notice that in this
version, the data is written before the text. That’s because a bug
was introduced in Java 1.1, which is shown in the
following code:
```java
//: IOBug.java
// Java 1.1 (and higher?) IO Bug
import java.io.*;

public class IOBug {
  public static void main(String[] args)
      throws Exception {
    DataOutputStream out =
      new DataOutputStream(
        new BufferedOutputStream(
          new FileOutputStream("Data.txt")));
    out.writeDouble(3.14159);
    out.writeBytes("That was the value of pi\n");
    out.writeBytes("This is pi/2:\n");
    out.writeDouble(3.14159/2);
    out.close();
    DataInputStream in =
      new DataInputStream(
        new BufferedInputStream(
          new FileInputStream("Data.txt")));
    BufferedReader inbr =
      new BufferedReader(
        new InputStreamReader(in));
    // The doubles written BEFORE the line of text
    // read back correctly:
    System.out.println(in.readDouble());
    // Read the lines of text:
    System.out.println(inbr.readLine());
    System.out.println(inbr.readLine());
    // Trying to read the doubles after the line
    // produces an end-of-file exception:
    System.out.println(in.readDouble());
  }
} ///:~
```
It appears that anything you write
after a call to writeBytes( ) is not recoverable. This is a rather
limiting bug, and we can hope that it will be fixed by the time you read this.
You should run the above program to test it; if you don’t get an exception
and the values print correctly then you’re out of the
woods.
Java 1.1
has added methods in class System that allow you to redirect the standard
input, output, and error IO streams using simple static method calls:
setIn(InputStream), setOut(PrintStream), and setErr(PrintStream).
Redirecting output is especially
useful if you suddenly start creating a large amount of output on your screen
and it’s scrolling past faster than you can read it. Redirecting input is
valuable for a command-line program in which you want to test a particular
user-input sequence repeatedly. Here’s a simple example that shows the use
of these methods:
```java
//: Redirecting.java
// Demonstrates the use of redirection for
// standard IO in Java 1.1
import java.io.*;

class Redirecting {
  public static void main(String[] args) {
    try {
      BufferedInputStream in =
        new BufferedInputStream(
          new FileInputStream(
            "Redirecting.java"));
      // Produces deprecation message:
      PrintStream out =
        new PrintStream(
          new BufferedOutputStream(
            new FileOutputStream("test.out")));
      System.setIn(in);
      System.setOut(out);
      System.setErr(out);
      BufferedReader br =
        new BufferedReader(
          new InputStreamReader(System.in));
      String s;
      while((s = br.readLine()) != null)
        System.out.println(s);
      out.close(); // Remember this!
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
} ///:~
```
This program attaches standard
input to a file, and redirects standard output and standard error to another
file.
This is another example in which a
deprecation message is inevitable. The message you can get when compiling with
the -deprecation flag is:
Note: The constructor
java.io.PrintStream(java.io.OutputStream)
has been
deprecated.
However, both
System.setOut( ) and System.setErr( ) require a
PrintStream object as an argument, so you are forced to call the
PrintStream constructor. You might wonder why, if Java
1.1 deprecates the entire PrintStream class by
deprecating the constructor, the library designers would at the same time
add new methods to System that require a
PrintStream rather than a PrintWriter, which is the new and
preferred replacement. It’s a
mystery.
Java 1.1
has also added some classes to support reading and writing streams in a
compressed format. These are wrapped around existing IO classes to provide
compression functionality.
One aspect of these Java 1.1
classes stands out: They are not derived from the new Reader and
Writer classes, but instead are part of the InputStream and
OutputStream hierarchies. So you might be forced to mix the two types of
streams. (Remember that you can use InputStreamReader and
OutputStreamWriter to provide easy conversion between one type and
another.)
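A small sketch of the two bridge classes (using in-memory byte streams so it is self-contained; the string is arbitrary): OutputStreamWriter lets a Writer feed an OutputStream, and InputStreamReader lets a Reader consume an InputStream.

```java
import java.io.*;

public class BridgeDemo {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    // Writer-to-OutputStream bridge:
    Writer w = new OutputStreamWriter(bytes);
    w.write("hello");
    w.flush();
    // InputStream-to-Reader bridge:
    Reader r = new InputStreamReader(
      new ByteArrayInputStream(bytes.toByteArray()));
    StringBuffer sb = new StringBuffer();
    int c;
    while((c = r.read()) != -1)
      sb.append((char)c);
    System.out.println(sb); // hello
  }
}
```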
Java 1.1 compression class | Function |
---|---|
CheckedInputStream | getChecksum( ) produces a checksum for any InputStream (not just decompression) |
CheckedOutputStream | getChecksum( ) produces a checksum for any OutputStream (not just compression) |
DeflaterOutputStream | Base class for compression classes |
ZipOutputStream | A DeflaterOutputStream that compresses data into the Zip file format |
GZIPOutputStream | A DeflaterOutputStream that compresses data into the GZIP file format |
InflaterInputStream | Base class for decompression classes |
ZipInputStream | An InflaterInputStream that decompresses data that has been stored in the Zip file format |
GZIPInputStream | An InflaterInputStream that decompresses data that has been stored in the GZIP file format |
Although there are many compression
algorithms, Zip and GZIP are possibly the most commonly used. Thus you can
easily manipulate your compressed data with the many tools available for reading
and writing these formats.
The GZIP interface is simple and
thus is probably more appropriate when you have a single stream of data that you
want to compress (rather than a collection of dissimilar pieces of data).
Here’s an example that compresses a single file:
```java
//: GZIPcompress.java
// Uses Java 1.1 GZIP compression to compress
// a file whose name is passed on the command
// line.
import java.io.*;
import java.util.zip.*;

public class GZIPcompress {
  public static void main(String[] args) {
    try {
      BufferedReader in =
        new BufferedReader(
          new FileReader(args[0]));
      BufferedOutputStream out =
        new BufferedOutputStream(
          new GZIPOutputStream(
            new FileOutputStream("test.gz")));
      System.out.println("Writing file");
      int c;
      while((c = in.read()) != -1)
        out.write(c);
      in.close();
      out.close();
      System.out.println("Reading file");
      BufferedReader in2 =
        new BufferedReader(
          new InputStreamReader(
            new GZIPInputStream(
              new FileInputStream("test.gz"))));
      String s;
      while((s = in2.readLine()) != null)
        System.out.println(s);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
```
The use of the compression classes
is straightforward – you simply wrap your output stream in a
GZIPOutputStream or ZipOutputStream and your input stream in a
GZIPInputStream or ZipInputStream. All else is ordinary IO reading
and writing. This is, however, a good example of when you’re forced to mix
the old IO streams with the new: in uses the Reader classes,
whereas GZIPOutputStream’s constructor can accept only an
OutputStream object, not a Writer
object.
The Java
1.1 library that supports the Zip format is much more
extensive. With it you can easily store multiple files, and there’s even a
separate class to make the process of reading a Zip file easy. The library uses
the standard Zip format so that it works seamlessly with all the tools currently
downloadable on the Internet. The following example has the same form as the
previous example, but it handles as many command-line arguments as you want. In
addition, it shows the use of the Checksum
classes to calculate and verify the checksum for the file. There are two
Checksum types: Adler32 (which is faster)
and CRC32 (which is slower but slightly more
accurate).
```java
//: ZipCompress.java
// Uses Java 1.1 Zip compression to compress
// any number of files whose names are passed
// on the command line.
import java.io.*;
import java.util.*;
import java.util.zip.*;

public class ZipCompress {
  public static void main(String[] args) {
    try {
      FileOutputStream f =
        new FileOutputStream("test.zip");
      CheckedOutputStream csum =
        new CheckedOutputStream(
          f, new Adler32());
      ZipOutputStream out =
        new ZipOutputStream(
          new BufferedOutputStream(csum));
      out.setComment("A test of Java Zipping");
      // Can't read the above comment, though
      for(int i = 0; i < args.length; i++) {
        System.out.println(
          "Writing file " + args[i]);
        BufferedReader in =
          new BufferedReader(
            new FileReader(args[i]));
        out.putNextEntry(new ZipEntry(args[i]));
        int c;
        while((c = in.read()) != -1)
          out.write(c);
        in.close();
      }
      out.close();
      // Checksum valid only after the file
      // has been closed!
      System.out.println("Checksum: " +
        csum.getChecksum().getValue());
      // Now extract the files:
      System.out.println("Reading file");
      FileInputStream fi =
        new FileInputStream("test.zip");
      CheckedInputStream csumi =
        new CheckedInputStream(
          fi, new Adler32());
      ZipInputStream in2 =
        new ZipInputStream(
          new BufferedInputStream(csumi));
      ZipEntry ze;
      System.out.println("Checksum: " +
        csumi.getChecksum().getValue());
      while((ze = in2.getNextEntry()) != null) {
        System.out.println("Reading file " + ze);
        int x;
        while((x = in2.read()) != -1)
          System.out.write(x);
      }
      in2.close();
      // Alternative way to open and read
      // zip files:
      ZipFile zf = new ZipFile("test.zip");
      Enumeration e = zf.entries();
      while(e.hasMoreElements()) {
        ZipEntry ze2 = (ZipEntry)e.nextElement();
        System.out.println("File: " + ze2);
        // ... and extract the data as before
      }
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
```
For each file to add to the
archive, you must call putNextEntry( ) and pass it a
ZipEntry object. The
ZipEntry object contains an extensive interface that allows you to get
and set all the data available on that particular entry in your Zip file: name,
compressed and uncompressed sizes, date, CRC checksum, extra field data,
comment, compression method, and whether it’s a directory entry. However,
even though the Zip format has a way to set a password, this is not supported in
Java’s Zip library. And although CheckedInputStream and
CheckedOutputStream support both Adler32 and CRC32
checksums, the ZipEntry class supports only an interface for CRC. This is
a restriction of the underlying Zip format, and it prevents you from using
the faster Adler32 for the per-entry checksum.
To extract files,
ZipInputStream has a getNextEntry( ) method that returns the
next ZipEntry if there is one. As a more succinct alternative, you can
read the file using a ZipFile object, which has a method
entries( ) that returns an Enumeration of the ZipEntry
objects.
In order to read the checksum you
must somehow have access to the associated Checksum object. Here, a
handle to the CheckedOutputStream and CheckedInputStream objects
is retained, but you could also just hold onto a handle to the Checksum
object.
A baffling method in Zip streams is
setComment( ). As shown above, you can set a comment when
you’re writing a file, but there’s no way to recover the comment in
the ZipInputStream. Comments appear to be supported fully on an
entry-by-entry basis only via ZipEntry.
Of course, you are not limited to
files when using the GZIP or Zip libraries – you can
compress anything, including data to be sent through a network
connection.
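To make that concrete, here is a minimal sketch (the class and method names are invented for illustration) that GZIP-compresses an ordinary byte array in memory. The same GZIPOutputStream and GZIPInputStream wrappers could just as easily be wrapped around a network socket's streams:

```java
// Hypothetical sketch: GZIP compression of in-memory data,
// using the same stream wrappers you'd put around a socket.
import java.io.*;
import java.util.zip.*;

public class GZipBytes {
  public static byte[] compress(byte[] data)
      throws IOException {
    ByteArrayOutputStream buf =
      new ByteArrayOutputStream();
    GZIPOutputStream out = new GZIPOutputStream(buf);
    out.write(data);
    out.close(); // Must close to flush the GZIP trailer
    return buf.toByteArray();
  }
  public static byte[] decompress(byte[] data)
      throws IOException {
    GZIPInputStream in = new GZIPInputStream(
      new ByteArrayInputStream(data));
    ByteArrayOutputStream buf =
      new ByteArrayOutputStream();
    int c;
    while((c = in.read()) != -1)
      buf.write(c);
    return buf.toByteArray();
  }
  public static void main(String[] args)
      throws Exception {
    byte[] original =
      "A test of in-memory GZIP".getBytes();
    byte[] restored = decompress(compress(original));
    System.out.println(new String(restored));
  }
}
```

Note that the GZIPOutputStream must be closed (or at least finished) before the compressed bytes are complete; forgetting this is a common source of truncated output.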
The Zip format is also used in the
Java 1.1 JAR (Java ARchive) file
format, which is a way to collect a group of files into a single compressed
file, just like Zip. However, like everything else in Java, JAR files are
cross-platform so you don’t need to worry about platform issues. You can
also include audio and image files as well as class files.
JAR files are particularly helpful
when you deal with the Internet. Before JAR files, your Web browser would have
to make repeated requests of a Web server in order to download all of the files
that make up an applet. In addition, each of these files was uncompressed. By
combining all of the files for a particular applet into a single JAR file, only
one server request is necessary and the transfer is faster because of
compression. And each entry in a JAR file can be digitally signed for security
(refer to the Java documentation for details).
A JAR file consists of a single
file containing a collection of zipped files along with a
“manifest” that describes them. (You can
create your own manifest file; otherwise the jar program will do it for
you.) You can find out more about JAR manifests in the online
documentation.
The jar utility that comes
with Sun’s JDK automatically compresses the files of your choice. You
invoke it on the command line:
jar [options] destination [manifest] inputfile(s)
The options are simply a collection
of letters (no hyphen or any other indicator is necessary). These
are:
c        Creates a new or empty archive.
t        Lists the table of contents.
x        Extracts all files.
x file   Extracts the named file.
f        Says: “I’m going to give you the name of the file.” If you
         don’t use this, jar assumes that its input will come from
         standard input, or, if it is creating a file, its output will
         go to standard output.
m        Says that the first argument will be the name of the
         user-created manifest file.
v        Generates verbose output describing what jar is doing.
0        Only stores the files; doesn’t compress them (use this to
         create a JAR file that you can put in your classpath).
M        Doesn’t automatically create a manifest file.
If a subdirectory is included in
the files to be put into the JAR file, that subdirectory is automatically added,
including all of its subdirectories, etc. Path information is also
preserved.
Here are some typical ways to
invoke jar:
jar cf myJarFile.jar *.class
This creates a JAR file called
myJarFile.jar that contains all of the class files in the current
directory, along with an automatically-generated manifest file.
jar cmf myJarFile.jar myManifestFile.mf *.class
Like the previous example, but
adding a user-created manifest file called
myManifestFile.mf.
jar tf myJarFile.jar
Produces a table of contents of the
files in myJarFile.jar.
jar tvf myJarFile.jar
Adds the “verbose” flag
to give more detailed information about the files in
myJarFile.jar.
jar cvf myApp.jar audio classes image
Assuming audio,
classes, and image are subdirectories, this combines all of the
subdirectories into the file myApp.jar. The “verbose” flag is
also included to give extra feedback while the jar program is
working.
If you create a JAR file using the
0 (zero) option, that file can be placed in your CLASSPATH:
CLASSPATH="lib1.jar;lib2.jar;"
Then Java can search
lib1.jar and lib2.jar for class files.
The jar tool isn’t as
useful as a zip utility. For example, you can’t add or update files
to an existing JAR file; you can create JAR files only from scratch. Also, you
can’t move files into a JAR file, erasing them as they are moved. However,
a JAR file created on one platform will be transparently readable by the
jar tool on any other platform (a problem that sometimes plagues
zip utilities).
Java 1.1
has added an interesting feature called
object serialization that
allows you to take any object that implements the
Serializable interface and turn it into a
sequence of bytes that can later be restored fully into the original object.
This is even true across a network, which means that the serialization mechanism
automatically compensates for differences in operating systems. That is, you can
create an object on a Windows machine, serialize it, and send it across the
network to a Unix machine where it will be correctly reconstructed. You
don’t have to worry about the data representations on the different
machines, the byte ordering, or any other details.
By itself, object serialization is
interesting because it allows you to implement
lightweight
persistence. Remember that persistence means an object’s lifetime is
not determined by whether a program is executing – the object lives
in between invocations of the program. By taking a serializable
object and writing it to disk, then restoring that object when the program is
re-invoked, you’re able to produce the effect of persistence. The reason
it’s called “lightweight” is that you can’t simply
define an object using some kind of “persistent” keyword and let the
system take care of the details (although this might happen in the future).
Instead, you must explicitly serialize and de-serialize the objects in your
program.
Object serialization was added to
the language to support two major features. Java
1.1’s remote method invocation (RMI) allows
objects that live on other machines to behave as if they live on your machine.
When sending messages to remote objects, object serialization is necessary to
transport the arguments and return values. RMI is discussed in Chapter
15.
Object serialization is also
necessary for Java Beans, introduced in Java 1.1. When a
Bean is used, its state information is generally configured at design time. This
state information must be stored and later recovered when the program is
started; object serialization performs this task.
Serializing an object is quite
simple, as long as the object implements the Serializable interface (this
interface is just a flag and has no methods). In Java
1.1, many standard library classes have been changed so
they’re serializable, including all of the wrappers for the primitive
types, all of the collection classes, and many others. Even Class objects
can be serialized. (See Chapter 11 for the implications of
this.)
To serialize an object, you create
some sort of OutputStream object and then wrap it inside an
ObjectOutputStream
object. At this point you need only call
writeObject( ) and
your object is serialized and sent to the OutputStream. To reverse the
process, you wrap an InputStream inside an ObjectInputStream and
call readObject( ).
What comes back is, as usual, a handle to an upcast Object, so you must
downcast to set things straight.
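The round trip just described fits in a few lines. Here is a minimal sketch (the Point class is invented for illustration, and a byte array stands in for whatever OutputStream you'd actually use):

```java
// Minimal serialization round trip: wrap a stream in an
// ObjectOutputStream, call writeObject(), then reverse it.
import java.io.*;

class Point implements Serializable {
  int x, y;
  Point(int x, int y) { this.x = x; this.y = y; }
}

public class MinimalSerial {
  static Point roundTrip(Point p) throws Exception {
    ByteArrayOutputStream buf =
      new ByteArrayOutputStream();
    ObjectOutputStream out =
      new ObjectOutputStream(buf);
    out.writeObject(p);  // Serializes the whole object
    out.close();
    ObjectInputStream in =
      new ObjectInputStream(
        new ByteArrayInputStream(buf.toByteArray()));
    // readObject() hands back an Object, so downcast:
    return (Point)in.readObject();
  }
  public static void main(String[] args)
      throws Exception {
    Point p = roundTrip(new Point(1, 2));
    System.out.println(p.x + "," + p.y);
  }
}
```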
A particularly clever aspect of
object serialization is that it not only saves an image of your object but it
also follows all the handles contained in your object and saves those
objects, and follows all the handles in each of those objects, etc. This is
sometimes referred to as the
“web of objects”
that a single object can be connected to, and it includes arrays of handles to
objects as well as member objects. If you had to maintain your own object
serialization scheme, maintaining the code to follow all these links would be a
bit mind-boggling. However, Java object serialization seems to pull it off
flawlessly, no doubt using an optimized algorithm that traverses the web of
objects. The following example tests the serialization mechanism by making a
“worm” of linked objects, each of which has a link to the next
segment in the worm as well as an array of handles to objects of a different
class, Data:
//: Worm.java
// Demonstrates object serialization in Java 1.1
import java.io.*;

class Data implements Serializable {
  private int i;
  Data(int x) { i = x; }
  public String toString() {
    return Integer.toString(i);
  }
}

public class Worm implements Serializable {
  // Generate a random int value:
  private static int r() {
    return (int)(Math.random() * 10);
  }
  private Data[] d = {
    new Data(r()), new Data(r()), new Data(r())
  };
  private Worm next;
  private char c;
  // Value of i == number of segments
  Worm(int i, char x) {
    System.out.println(" Worm constructor: " + i);
    c = x;
    if(--i > 0)
      next = new Worm(i, (char)(x + 1));
  }
  Worm() {
    System.out.println("Default constructor");
  }
  public String toString() {
    String s = ":" + c + "(";
    for(int i = 0; i < d.length; i++)
      s += d[i].toString();
    s += ")";
    if(next != null)
      s += next.toString();
    return s;
  }
  public static void main(String[] args) {
    Worm w = new Worm(6, 'a');
    System.out.println("w = " + w);
    try {
      ObjectOutputStream out =
        new ObjectOutputStream(
          new FileOutputStream("worm.out"));
      out.writeObject("Worm storage");
      out.writeObject(w);
      out.close(); // Also flushes output
      ObjectInputStream in =
        new ObjectInputStream(
          new FileInputStream("worm.out"));
      String s = (String)in.readObject();
      Worm w2 = (Worm)in.readObject();
      System.out.println(s + ", w2 = " + w2);
    } catch(Exception e) {
      e.printStackTrace();
    }
    try {
      ByteArrayOutputStream bout =
        new ByteArrayOutputStream();
      ObjectOutputStream out =
        new ObjectOutputStream(bout);
      out.writeObject("Worm storage");
      out.writeObject(w);
      out.flush();
      ObjectInputStream in =
        new ObjectInputStream(
          new ByteArrayInputStream(
            bout.toByteArray()));
      String s = (String)in.readObject();
      Worm w3 = (Worm)in.readObject();
      System.out.println(s + ", w3 = " + w3);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
To make things interesting, the
array of Data objects inside Worm are initialized with random
numbers. (This way you don’t suspect the compiler of keeping some kind of
meta-information.) Each Worm segment is labeled with a char
that’s automatically generated in the process of recursively generating
the linked list of Worms. When you create a Worm, you tell the
constructor how long you want it to be. To make the next handle it calls
the Worm constructor with a length of one less, etc. The final
next handle is left as null, indicating the end of the
Worm.
The point of all this was to make
something reasonably complex that couldn’t easily be serialized. The act
of serializing, however, is quite simple. Once the ObjectOutputStream is
created from some other stream, writeObject( ) serializes the
object. Notice the call to writeObject( ) for a String, as
well. You can also write all the primitive data types using the same methods as
DataOutputStream (they share the same interface).
There are two separate try
blocks that look similar. The first writes and reads a file and the second, for
variety, writes and reads a ByteArray. You can read and write an object
using serialization to any DataInputStream or DataOutputStream
including, as you will see in the networking chapter, a network. The output from
one run was:
Worm constructor: 6
Worm constructor: 5
Worm constructor: 4
Worm constructor: 3
Worm constructor: 2
Worm constructor: 1
w = :a(262):b(100):c(396):d(480):e(316):f(398)
Worm storage, w2 = :a(262):b(100):c(396):d(480):e(316):f(398)
Worm storage, w3 = :a(262):b(100):c(396):d(480):e(316):f(398)
You can see that the deserialized
object really does contain all of the links that were in the original
object.
Note that no constructor, not even
the default constructor, is called in the process of deserializing a
Serializable object. The entire object is restored by recovering data
from the InputStream.
Object serialization is another
Java 1.1 feature that is not part of the new
Reader and Writer hierarchies, but instead uses the old
InputStream and OutputStream hierarchies. Thus you might encounter
situations in which you’re forced to mix the two
hierarchies.
You might wonder what’s
necessary for an object to be recovered from its serialized state. For example,
suppose you serialize an object and send it as a file or through a network to
another machine. Could a program on the other machine reconstruct the object
using only the contents of the file?
The best way to answer this
question is (as usual) by performing an experiment. The following file goes in
the subdirectory for this chapter:
//: Alien.java
// A serializable class
import java.io.*;

public class Alien implements Serializable {
} ///:~
The file that creates and
serializes an Alien object goes in the same directory:
//: FreezeAlien.java
// Create a serialized output file
import java.io.*;

public class FreezeAlien {
  public static void main(String[] args)
      throws Exception {
    ObjectOutput out =
      new ObjectOutputStream(
        new FileOutputStream("file.x"));
    Alien zorcon = new Alien();
    out.writeObject(zorcon);
  }
} ///:~
Rather than catching and handling
exceptions, this program takes the quick and dirty approach of passing the
exceptions out of main( ), so they’ll be reported on the
command line.
Once the program is compiled and
run, copy the resulting file.x to a subdirectory called xfiles,
where the following code goes:
//: ThawAlien.java
// Try to recover a serialized file without the
// class of object that's stored in that file.
package c10.xfiles;
import java.io.*;

public class ThawAlien {
  public static void main(String[] args)
      throws Exception {
    ObjectInputStream in =
      new ObjectInputStream(
        new FileInputStream("file.x"));
    Object mystery = in.readObject();
    System.out.println(
      mystery.getClass().toString());
  }
} ///:~
This program opens the file and
reads in the object mystery successfully. However, as soon as you try to
find out anything about the object – which requires the Class
object for Alien – the Java Virtual Machine (JVM) cannot find
Alien.class (unless it happens to be in the Classpath, which it
shouldn’t be in this example). You’ll get a
ClassNotFoundException. (Once again, all evidence of alien life vanishes
before proof of its existence can be verified!)
If you expect to do much after
you’ve recovered an object that has been serialized, you must make sure
that the JVM can find the associated .class file either in the local
class path or somewhere on the
Internet.
As you can see, the default
serialization mechanism is trivial to use. But what if you have special needs?
Perhaps you have special security issues and you don’t want to serialize
portions of your object, or perhaps it just doesn’t make sense for one
sub-object to be serialized if that part needs to be created anew when the
object is recovered.
You can
control the process of
serialization by implementing the
Externalizable interface
instead of the
Serializable interface.
The Externalizable interface extends the Serializable interface
and adds two methods,
writeExternal( ) and
readExternal( ),
that are automatically called for your object during serialization and
deserialization so that you can perform your special
operations.
The following example shows simple
implementations of the Externalizable interface methods. Note that
Blip1 and Blip2 are nearly identical except for a subtle
difference (see if you can discover it by looking at the code):
//: Blips.java
// Simple use of Externalizable & a pitfall
import java.io.*;
import java.util.*;

class Blip1 implements Externalizable {
  public Blip1() {
    System.out.println("Blip1 Constructor");
  }
  public void writeExternal(ObjectOutput out)
      throws IOException {
    System.out.println("Blip1.writeExternal");
  }
  public void readExternal(ObjectInput in)
      throws IOException, ClassNotFoundException {
    System.out.println("Blip1.readExternal");
  }
}

class Blip2 implements Externalizable {
  Blip2() {
    System.out.println("Blip2 Constructor");
  }
  public void writeExternal(ObjectOutput out)
      throws IOException {
    System.out.println("Blip2.writeExternal");
  }
  public void readExternal(ObjectInput in)
      throws IOException, ClassNotFoundException {
    System.out.println("Blip2.readExternal");
  }
}

public class Blips {
  public static void main(String[] args) {
    System.out.println("Constructing objects:");
    Blip1 b1 = new Blip1();
    Blip2 b2 = new Blip2();
    try {
      ObjectOutputStream o =
        new ObjectOutputStream(
          new FileOutputStream("Blips.out"));
      System.out.println("Saving objects:");
      o.writeObject(b1);
      o.writeObject(b2);
      o.close();
      // Now get them back:
      ObjectInputStream in =
        new ObjectInputStream(
          new FileInputStream("Blips.out"));
      System.out.println("Recovering b1:");
      b1 = (Blip1)in.readObject();
      // OOPS! Throws an exception:
//!   System.out.println("Recovering b2:");
//!   b2 = (Blip2)in.readObject();
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
The output for this program
is:
Constructing objects:
Blip1 Constructor
Blip2 Constructor
Saving objects:
Blip1.writeExternal
Blip2.writeExternal
Recovering b1:
Blip1 Constructor
Blip1.readExternal
The reason that the Blip2
object is not recovered is that trying to do so causes an exception. Can you see
the difference between Blip1 and Blip2? The constructor for
Blip1 is public, while the constructor for Blip2 is not,
and that causes the exception upon recovery. Try making Blip2’s
constructor public and removing the //! comments to see the
correct results.
When b1 is recovered, the
Blip1 default constructor is called. This is different from recovering a
Serializable object, in which the object is constructed entirely from its
stored bits, with no constructor calls. With an Externalizable object,
all the normal default construction behavior occurs (including the
initializations at the point of field definition), and then
readExternal( ) is called. You need to be aware of this – in
particular the fact that all the default construction always takes place –
to produce the correct behavior in your Externalizable
objects.
Here’s an example that shows
what you must do to fully store and retrieve an Externalizable
object:
//: Blip3.java
// Reconstructing an externalizable object
import java.io.*;
import java.util.*;

class Blip3 implements Externalizable {
  int i;
  String s; // No initialization
  public Blip3() {
    System.out.println("Blip3 Constructor");
    // s, i not initialized
  }
  public Blip3(String x, int a) {
    System.out.println("Blip3(String x, int a)");
    s = x;
    i = a;
    // s & i initialized only in non-default
    // constructor.
  }
  public String toString() { return s + i; }
  public void writeExternal(ObjectOutput out)
      throws IOException {
    System.out.println("Blip3.writeExternal");
    // You must do this:
    out.writeObject(s);
    out.writeInt(i);
  }
  public void readExternal(ObjectInput in)
      throws IOException, ClassNotFoundException {
    System.out.println("Blip3.readExternal");
    // You must do this:
    s = (String)in.readObject();
    i = in.readInt();
  }
  public static void main(String[] args) {
    System.out.println("Constructing objects:");
    Blip3 b3 = new Blip3("A String ", 47);
    System.out.println(b3.toString());
    try {
      ObjectOutputStream o =
        new ObjectOutputStream(
          new FileOutputStream("Blip3.out"));
      System.out.println("Saving object:");
      o.writeObject(b3);
      o.close();
      // Now get it back:
      ObjectInputStream in =
        new ObjectInputStream(
          new FileInputStream("Blip3.out"));
      System.out.println("Recovering b3:");
      b3 = (Blip3)in.readObject();
      System.out.println(b3.toString());
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
The fields s and i
are initialized only in the second constructor, but not in the default
constructor. This means that if you don’t initialize s and i
in readExternal( ), s will be null and i will be zero
(since the storage for the
object gets wiped to zero in the first step of object creation). If you comment
out the two lines of code following the phrases “You must do this”
and run the program, you’ll see that when the object is recovered,
s is null and i is zero.
If you are inheriting from an
Externalizable object, you’ll typically call the base-class
versions of writeExternal( ) and readExternal( ) to
provide proper storage and retrieval of the base-class
components.
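A sketch of what that chaining typically looks like (the class names Base and Derived are invented for illustration; note that Externalizable requires a public default constructor in each class):

```java
// Chaining writeExternal()/readExternal() through a base class
// so both the base and derived portions are stored and recovered.
import java.io.*;

class Base implements Externalizable {
  int i;
  public Base() {} // Public default constructor required
  public void writeExternal(ObjectOutput out)
      throws IOException {
    out.writeInt(i);
  }
  public void readExternal(ObjectInput in)
      throws IOException, ClassNotFoundException {
    i = in.readInt();
  }
}

class Derived extends Base {
  String s;
  public Derived() {}
  public void writeExternal(ObjectOutput out)
      throws IOException {
    super.writeExternal(out); // Base-class portion first
    out.writeObject(s);
  }
  public void readExternal(ObjectInput in)
      throws IOException, ClassNotFoundException {
    super.readExternal(in);   // Recover in the same order
    s = (String)in.readObject();
  }
}

public class ExternalChain {
  static Derived roundTrip(Derived d) throws Exception {
    ByteArrayOutputStream buf =
      new ByteArrayOutputStream();
    ObjectOutputStream out =
      new ObjectOutputStream(buf);
    out.writeObject(d);
    out.close();
    ObjectInputStream in =
      new ObjectInputStream(
        new ByteArrayInputStream(buf.toByteArray()));
    return (Derived)in.readObject();
  }
  public static void main(String[] args)
      throws Exception {
    Derived d = new Derived();
    d.i = 47;
    d.s = "both portions survive";
    Derived d2 = roundTrip(d);
    System.out.println(d2.i + ": " + d2.s);
  }
}
```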
So to make things work correctly
you must not only write the important data from the object during the
writeExternal( ) method (there is no default behavior that writes
any of the member objects for an Externalizable object), but you must
also recover that data in the readExternal( ) method. This can be a
bit confusing at first because the default construction behavior for an
Externalizable object can make it seem like some kind of storage and
retrieval takes place automatically. It does not.
When you’re controlling
serialization, there might be a particular subobject that you don’t want
Java’s serialization mechanism to automatically save and restore. This is
commonly the case if that subobject represents sensitive information that you
don’t want to serialize, such as a password. Even if that information is
private in the object, once it’s serialized it’s possible for
someone to access it by reading a file or intercepting a network
transmission.
One way to prevent sensitive parts
of your object from being serialized is to implement your class as
Externalizable, as shown previously. Then nothing is automatically
serialized and you can explicitly serialize only the necessary parts inside
writeExternal( ).
If you’re working with a
Serializable object, however, all serialization happens automatically. To
control this, you can turn off serialization on a field-by-field basis using the
transient
keyword, which says “Don’t bother saving or restoring this –
I’ll take care of it.”
For example, consider a Logon
object that keeps information about a particular logon session. Suppose
that, once you verify the login, you want to store the data, but without the
password. The easiest way to do this is by implementing
Serializable and marking the password
field as transient. Here’s what it looks like:
//: Logon.java
// Demonstrates the "transient" keyword
import java.io.*;
import java.util.*;

class Logon implements Serializable {
  private Date date = new Date();
  private String username;
  private transient String password;
  Logon(String name, String pwd) {
    username = name;
    password = pwd;
  }
  public String toString() {
    String pwd =
      (password == null) ? "(n/a)" : password;
    return "logon info: \n   " +
      "username: " + username +
      "\n   date: " + date.toString() +
      "\n   password: " + pwd;
  }
  public static void main(String[] args) {
    Logon a = new Logon("Hulk", "myLittlePony");
    System.out.println("logon a = " + a);
    try {
      ObjectOutputStream o =
        new ObjectOutputStream(
          new FileOutputStream("Logon.out"));
      o.writeObject(a);
      o.close();
      // Delay:
      int seconds = 5;
      long t = System.currentTimeMillis()
             + seconds * 1000;
      while(System.currentTimeMillis() < t)
        ;
      // Now get them back:
      ObjectInputStream in =
        new ObjectInputStream(
          new FileInputStream("Logon.out"));
      System.out.println(
        "Recovering object at " + new Date());
      a = (Logon)in.readObject();
      System.out.println("logon a = " + a);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
You can see that the date
and username fields are ordinary (not transient), and thus are
automatically serialized. However, the password is transient, and
so is not stored to disk; also the serialization mechanism makes no attempt to
recover it. The output is:
logon a = logon info: 
   username: Hulk
   date: Sun Mar 23 18:25:53 PST 1997
   password: myLittlePony
Recovering object at Sun Mar 23 18:25:59 PST 1997
logon a = logon info: 
   username: Hulk
   date: Sun Mar 23 18:25:53 PST 1997
   password: (n/a)
When the object is recovered, the
password field is null. Note that toString( ) must
check for a null value of password because if you try to assemble
a String object using the overloaded ‘+’ operator, and
that operator encounters a null handle, you’ll get a
NullPointerException. (Newer versions of Java might contain code to avoid
this problem.)
You can also see that the
date field is stored to and recovered from disk and not generated
anew.
Since Externalizable objects
do not store any of their fields by default, the transient keyword is for
use with Serializable objects only.
If you’re not keen on
implementing the Externalizable interface,
there’s another approach. You can implement the Serializable
interface and add (notice I say “add” and not
“override” or “implement”) methods called
writeObject( ) and
readObject( ) that
will automatically be called when the object is serialized and deserialized,
respectively. That is, if you provide these two methods they will be used
instead of the default serialization.
The methods must have these exact
signatures:
private void writeObject(ObjectOutputStream stream)
    throws IOException;
private void readObject(ObjectInputStream stream)
    throws IOException, ClassNotFoundException
From a design standpoint, things
get really weird here. First of all, you might think that because these methods
are not part of a base class or the Serializable interface, they ought to
be defined in their own interface(s). But notice that they are defined as
private, which means they are to be called only by other members of this
class. However, you don’t actually call them from other members of this
class, but instead the writeObject( ) and readObject( )
methods of the ObjectOutputStream and ObjectInputStream objects
call your object’s writeObject( ) and
readObject( ) methods. (Notice my tremendous restraint in not
launching into a long diatribe about using the same method names here. In a
word: confusing.) You might wonder how the ObjectOutputStream and
ObjectInputStream objects have access to private methods of your
class. We can only assume that this is part of the serialization
magic.
In any event, anything defined in
an interface is automatically public so if
writeObject( ) and readObject( ) must be private,
then they can’t be part of an interface. Since you must follow the
signatures exactly, the effect is the same as if you’re implementing an
interface.
It would appear that when you call
ObjectOutputStream.writeObject( ), the Serializable object
that you pass it to is interrogated (using reflection, no doubt) to see if it
implements its own writeObject( ). If so, the normal serialization
process is skipped and the writeObject( ) is called. The same sort
of situation exists for readObject( ).
There’s one other twist.
Inside your writeObject( ), you can choose to perform the default
writeObject( ) action by calling defaultWriteObject( ).
Likewise, inside readObject( ) you can call
defaultReadObject( ). Here is a simple example that demonstrates how
you can control the storage and retrieval of a Serializable
object:
//: SerialCtl.java
// Controlling serialization by adding your own
// writeObject() and readObject() methods.
import java.io.*;

public class SerialCtl implements Serializable {
  String a;
  transient String b;
  public SerialCtl(String aa, String bb) {
    a = "Not Transient: " + aa;
    b = "Transient: " + bb;
  }
  public String toString() {
    return a + "\n" + b;
  }
  private void writeObject(ObjectOutputStream stream)
      throws IOException {
    stream.defaultWriteObject();
    stream.writeObject(b);
  }
  private void readObject(ObjectInputStream stream)
      throws IOException, ClassNotFoundException {
    stream.defaultReadObject();
    b = (String)stream.readObject();
  }
  public static void main(String[] args) {
    SerialCtl sc =
      new SerialCtl("Test1", "Test2");
    System.out.println("Before:\n" + sc);
    ByteArrayOutputStream buf =
      new ByteArrayOutputStream();
    try {
      ObjectOutputStream o =
        new ObjectOutputStream(buf);
      o.writeObject(sc);
      // Now get it back:
      ObjectInputStream in =
        new ObjectInputStream(
          new ByteArrayInputStream(
            buf.toByteArray()));
      SerialCtl sc2 = (SerialCtl)in.readObject();
      System.out.println("After:\n" + sc2);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
In this example, one String
field is ordinary and the other is transient, to prove that the
non-transient field is saved by the
defaultWriteObject( )
method and the transient field is saved and restored explicitly. The
fields are initialized inside the constructor rather than at the point of
definition to prove that they are not being initialized by some automatic
mechanism during deserialization.
If you are going to use the default
mechanism to write the non-transient parts of your object, you must call
defaultWriteObject( ) as the first operation in
writeObject( ) and
defaultReadObject( )
as the first operation in readObject( ). These are strange method
calls. It would appear, for example, that you are calling
defaultWriteObject( ) for an ObjectOutputStream and passing
it no arguments, and yet it somehow turns around and knows the handle to your
object and how to write all the non-transient parts.
Spooky.
The storage and retrieval of the
transient objects uses more familiar code. And yet, think about what
happens here. In main( ), a SerialCtl object is created, and
then it’s serialized to an ObjectOutputStream. (Notice in this case
that a buffer is used instead of a file – it’s all the same to the
ObjectOutputStream.) The serialization occurs in the
line:
o.writeObject(sc);
The writeObject( )
method must be examining sc to see if it has its own
writeObject( ) method. (Not by checking the interface – there
isn’t one – or the class type, but by actually hunting for the
method using reflection.) If it does, it uses that. A similar approach holds
true for readObject( ). Perhaps this was the only practical way that
they could solve the problem, but it’s certainly strange.
It’s possible that you might
want to change the version of a serializable class (objects of the original
class might be stored in a database, for example). This is supported but
you’ll probably do it only in special cases, and it requires an extra
depth of understanding that we will not attempt to achieve here. The JDK 1.1 HTML
documents downloadable from Sun (which might be part of your Java
package’s online documents) cover this topic quite
thoroughly.
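The key piece of that versioning support is the serialVersionUID field. A minimal sketch (the Account class is invented for illustration): if you don't declare one, a version number is computed from the class's structure and changes whenever the class changes, so previously stored objects become unreadable; pinning it explicitly lets compatible changes slide by.

```java
// Pinning a serializable class's version explicitly so that
// compatible changes don't invalidate stored objects.
import java.io.*;

class Account implements Serializable {
  // Without this line, a UID is computed from the class's
  // structure and changes whenever the class changes:
  static final long serialVersionUID = 1L;
  int balance;
}

public class VersionCheck {
  public static void main(String[] args) {
    // ObjectStreamClass reports the UID in effect:
    System.out.println(
      ObjectStreamClass.lookup(Account.class)
        .getSerialVersionUID());
  }
}
```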
It’s quite appealing to use
serialization technology to store some of the state of
your program so that you can easily restore the program to the current state
later. But before you can do this, some questions must be answered. What happens
if you serialize two objects that both have a handle to a third object? When you
restore those two objects from their serialized state, do you get only one
occurrence of the third object? What if you serialize your two objects to
separate files and deserialize them in different parts of your
code?
Here’s an example that shows
the problem:
//: MyWorld.java
import java.io.*;
import java.util.*;

class House implements Serializable {}

class Animal implements Serializable {
  String name;
  House preferredHouse;
  Animal(String nm, House h) {
    name = nm;
    preferredHouse = h;
  }
  public String toString() {
    return name + "[" + super.toString() +
      "], " + preferredHouse + "\n";
  }
}

public class MyWorld {
  public static void main(String[] args) {
    House house = new House();
    Vector animals = new Vector();
    animals.addElement(
      new Animal("Bosco the dog", house));
    animals.addElement(
      new Animal("Ralph the hamster", house));
    animals.addElement(
      new Animal("Fronk the cat", house));
    System.out.println("animals: " + animals);
    try {
      ByteArrayOutputStream buf1 =
        new ByteArrayOutputStream();
      ObjectOutputStream o1 =
        new ObjectOutputStream(buf1);
      o1.writeObject(animals);
      o1.writeObject(animals); // Write a 2nd set
      // Write to a different stream:
      ByteArrayOutputStream buf2 =
        new ByteArrayOutputStream();
      ObjectOutputStream o2 =
        new ObjectOutputStream(buf2);
      o2.writeObject(animals);
      // Now get them back:
      ObjectInputStream in1 =
        new ObjectInputStream(
          new ByteArrayInputStream(
            buf1.toByteArray()));
      ObjectInputStream in2 =
        new ObjectInputStream(
          new ByteArrayInputStream(
            buf2.toByteArray()));
      Vector animals1 = (Vector)in1.readObject();
      Vector animals2 = (Vector)in1.readObject();
      Vector animals3 = (Vector)in2.readObject();
      System.out.println("animals1: " + animals1);
      System.out.println("animals2: " + animals2);
      System.out.println("animals3: " + animals3);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~
One thing that’s interesting
here is that it’s possible to use object serialization to and from a byte
array as a way of doing a “deep copy” of any object that’s
Serializable. (A deep copy means that you’re duplicating the entire
web of objects, rather than just the basic object and its handles.) Copying is
covered in depth in Chapter 12.
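As a sketch of this deep-copy technique (the Point class and the deepCopy( ) helper here are invented for illustration; they are not part of the chapter's examples):

```java
import java.io.*;

// Hypothetical Serializable class used only for demonstration:
class Point implements Serializable {
  int x, y;
  Point(int x, int y) { this.x = x; this.y = y; }
}

public class DeepCopy {
  // Serialize to a byte array, then deserialize from it. The
  // result is a completely independent web of objects: no
  // handles are shared with the original.
  public static Object deepCopy(Object obj) throws Exception {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(buf);
    out.writeObject(obj);
    out.flush();
    ObjectInputStream in = new ObjectInputStream(
      new ByteArrayInputStream(buf.toByteArray()));
    return in.readObject();
  }
  public static void main(String[] args) throws Exception {
    Point p = new Point(1, 2);
    Point copy = (Point)deepCopy(p);
    System.out.println(copy != p);     // A distinct object
    System.out.println(copy.x == p.x); // With the same data
  }
}
```

Note that the object (and everything it refers to) must be Serializable for this to work, and copying a large web of objects this way is far slower than a hand-written copy.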
Animal objects contain
fields of type House. In main( ), a Vector of these
Animals is created; it is serialized twice to one stream and then
once more to a separate stream. When these are deserialized and printed, you see the
following results for one run (the objects will be in different memory locations
each run):
animals: [Bosco the dog[Animal@1cc76c], House@1cc769
, Ralph the hamster[Animal@1cc76d], House@1cc769
, Fronk the cat[Animal@1cc76e], House@1cc769
]
animals1: [Bosco the dog[Animal@1cca0c], House@1cca16
, Ralph the hamster[Animal@1cca17], House@1cca16
, Fronk the cat[Animal@1cca1b], House@1cca16
]
animals2: [Bosco the dog[Animal@1cca0c], House@1cca16
, Ralph the hamster[Animal@1cca17], House@1cca16
, Fronk the cat[Animal@1cca1b], House@1cca16
]
animals3: [Bosco the dog[Animal@1cca52], House@1cca5c
, Ralph the hamster[Animal@1cca5d], House@1cca5c
, Fronk the cat[Animal@1cca61], House@1cca5c
]
Of course you expect that the
deserialized objects have different addresses from their originals. But notice
that in animals1 and animals2 the same addresses appear, including
the references to the House object that both share. On the other hand,
when animals3 is recovered the system has no way of knowing that the
objects in this other stream are aliases of the objects in the first stream, so
it makes a completely different web of objects.
As long as you’re serializing
everything to a single stream, you’ll be able to recover the same web of
objects that you wrote, with no accidental duplication of objects. Of course,
you can change the state of your objects in between the time you write the first
and the last, but that’s your responsibility – the objects will be
written in whatever state they are in (and with whatever connections they have
to other objects) at the time you serialize them.
The safest thing to do if you want
to save the state of a system is to serialize as an “atomic”
operation. If you serialize some things, do some other work, and serialize some
more, etc., then you will not be storing the system safely. Instead, put all the
objects that comprise the state of your system in a single collection and simply
write that collection out in one operation. Then you can restore it with a
single method call as well.
The following example is an
imaginary computer-aided design (CAD) system that demonstrates the approach. In
addition, it throws in the issue of static fields – if you look at
the documentation you’ll see that Class is Serializable, so
it should be easy to store the static fields by simply serializing the
Class object. That seems
like a sensible approach, anyway.
//: CADState.java
// Saving and restoring the state of a
// pretend CAD system.
import java.io.*;
import java.util.*;

abstract class Shape implements Serializable {
  public static final int
    RED = 1, BLUE = 2, GREEN = 3;
  private int xPos, yPos, dimension;
  private static Random r = new Random();
  private static int counter = 0;
  abstract public void setColor(int newColor);
  abstract public int getColor();
  public Shape(int xVal, int yVal, int dim) {
    xPos = xVal;
    yPos = yVal;
    dimension = dim;
  }
  public String toString() {
    return getClass().toString() +
      " color[" + getColor() +
      "] xPos[" + xPos +
      "] yPos[" + yPos +
      "] dim[" + dimension + "]\n";
  }
  public static Shape randomFactory() {
    int xVal = r.nextInt() % 100;
    int yVal = r.nextInt() % 100;
    int dim = r.nextInt() % 100;
    switch(counter++ % 3) {
      default:
      case 0: return new Circle(xVal, yVal, dim);
      case 1: return new Square(xVal, yVal, dim);
      case 2: return new Line(xVal, yVal, dim);
    }
  }
}

class Circle extends Shape {
  private static int color = RED;
  public Circle(int xVal, int yVal, int dim) {
    super(xVal, yVal, dim);
  }
  public void setColor(int newColor) {
    color = newColor;
  }
  public int getColor() {
    return color;
  }
}

class Square extends Shape {
  private static int color;
  public Square(int xVal, int yVal, int dim) {
    super(xVal, yVal, dim);
    color = RED;
  }
  public void setColor(int newColor) {
    color = newColor;
  }
  public int getColor() {
    return color;
  }
}

class Line extends Shape {
  private static int color = RED;
  public static void
  serializeStaticState(ObjectOutputStream os)
      throws IOException {
    os.writeInt(color);
  }
  public static void
  deserializeStaticState(ObjectInputStream os)
      throws IOException {
    color = os.readInt();
  }
  public Line(int xVal, int yVal, int dim) {
    super(xVal, yVal, dim);
  }
  public void setColor(int newColor) {
    color = newColor;
  }
  public int getColor() {
    return color;
  }
}

public class CADState {
  public static void main(String[] args)
      throws Exception {
    Vector shapeTypes, shapes;
    if(args.length == 0) {
      shapeTypes = new Vector();
      shapes = new Vector();
      // Add handles to the class objects:
      shapeTypes.addElement(Circle.class);
      shapeTypes.addElement(Square.class);
      shapeTypes.addElement(Line.class);
      // Make some shapes:
      for(int i = 0; i < 10; i++)
        shapes.addElement(Shape.randomFactory());
      // Set all the static colors to GREEN:
      for(int i = 0; i < 10; i++)
        ((Shape)shapes.elementAt(i))
          .setColor(Shape.GREEN);
      // Save the state vector:
      ObjectOutputStream out =
        new ObjectOutputStream(
          new FileOutputStream("CADState.out"));
      out.writeObject(shapeTypes);
      Line.serializeStaticState(out);
      out.writeObject(shapes);
    } else { // There's a command-line argument
      ObjectInputStream in =
        new ObjectInputStream(
          new FileInputStream(args[0]));
      // Read in the same order they were written:
      shapeTypes = (Vector)in.readObject();
      Line.deserializeStaticState(in);
      shapes = (Vector)in.readObject();
    }
    // Display the shapes:
    System.out.println(shapes);
  }
} ///:~
The Shape class
implements Serializable, so anything that is
inherited from Shape is automatically Serializable as well. Each
Shape contains data, and each derived Shape class contains a
static field that determines the color of all of those types of
Shapes. (Placing a static field in the base class would result in
only one field, since static fields are not duplicated in derived
classes.) Methods in the base class can be overridden to set the color for the
various types (static methods are not dynamically bound, so these are
normal methods). The randomFactory( ) method creates a different
Shape each time you call it, using random values for the Shape
data.
Circle and Square are
straightforward extensions of Shape; the only difference is that
Circle initializes color at the point of definition and
Square initializes it in the constructor. We’ll leave the
discussion of Line for later.
In main( ), one
Vector is used to hold the Class objects and the other to hold the
shapes. If you don’t provide a command line argument the shapeTypes
Vector is created and the Class objects are added, and then the
shapes Vector is created and Shape objects are added. Next,
all the static color values are set to GREEN, and
everything is serialized to the file CADState.out.
If you provide a command line
argument (presumably CADState.out), that file is opened and used to
restore the state of the program. In both situations, the resulting
Vector of Shapes is printed out. The results from one run
are:
>java CADState
[class Circle color[3] xPos[-51] yPos[-99] dim[38]
, class Square color[3] xPos[2] yPos[61] dim[-46]
, class Line color[3] xPos[51] yPos[73] dim[64]
, class Circle color[3] xPos[-70] yPos[1] dim[16]
, class Square color[3] xPos[3] yPos[94] dim[-36]
, class Line color[3] xPos[-84] yPos[-21] dim[-35]
, class Circle color[3] xPos[-75] yPos[-43] dim[22]
, class Square color[3] xPos[81] yPos[30] dim[-45]
, class Line color[3] xPos[-29] yPos[92] dim[17]
, class Circle color[3] xPos[17] yPos[90] dim[-76]
]
>java CADState CADState.out
[class Circle color[1] xPos[-51] yPos[-99] dim[38]
, class Square color[0] xPos[2] yPos[61] dim[-46]
, class Line color[3] xPos[51] yPos[73] dim[64]
, class Circle color[1] xPos[-70] yPos[1] dim[16]
, class Square color[0] xPos[3] yPos[94] dim[-36]
, class Line color[3] xPos[-84] yPos[-21] dim[-35]
, class Circle color[1] xPos[-75] yPos[-43] dim[22]
, class Square color[0] xPos[81] yPos[30] dim[-45]
, class Line color[3] xPos[-29] yPos[92] dim[17]
, class Circle color[1] xPos[17] yPos[90] dim[-76]
]
You can see that the values of
xPos, yPos, and dim were all stored and recovered
successfully, but there’s something wrong with the retrieval of the
static information. It’s all ‘3’ going in, but it
doesn’t come out that way. Circles have a value of 1 (RED,
which is the definition), and Squares have a value of 0 (remember, they
are initialized in the constructor). It’s as if the statics
didn’t get serialized at all! That’s right – even though class
Class is Serializable, it doesn’t do what you expect. So if
you want to serialize statics, you must do it yourself.
This is what the
serializeStaticState( ) and deserializeStaticState( )
static methods in Line are for. You can see that they are
explicitly called as part of the storage and retrieval process. (Note that the
order of writing to the serialized file and of reading back from it must be
the same.) Thus, to make CADState.java run correctly you must (1) add
serializeStaticState( ) and deserializeStaticState( ) methods to
each of the shape classes, (2) remove the Vector shapeTypes and all code related to it,
and (3) add calls to the new serialize and deserialize static methods in the
shapes.
Another issue you might have to
think about is security, since serialization also saves private data. If
you have a security issue, those fields should be marked as transient.
But then you have to design a secure way to store that information so that when
you do a restore you can reset those private variables.
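To sketch how transient interacts with serialization (the Login class and its field names are invented for illustration):

```java
import java.io.*;

// Hypothetical class for demonstration. The password field is
// marked transient, so it is never written to the stream.
class Login implements Serializable {
  String user;
  transient String password;
  Login(String user, String password) {
    this.user = user;
    this.password = password;
  }
}

public class TransientDemo {
  public static void main(String[] args) throws Exception {
    ByteArrayOutputStream buf =
      new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(buf);
    out.writeObject(new Login("hank", "secret"));
    out.flush();
    ObjectInputStream in = new ObjectInputStream(
      new ByteArrayInputStream(buf.toByteArray()));
    Login recovered = (Login)in.readObject();
    System.out.println(recovered.user);     // "hank" survives
    System.out.println(recovered.password); // null: never stored
  }
}
```

On recovery the transient field comes back with its default value (null for object handles), which is why you need some separate, secure mechanism to restore it yourself.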
The Java IO stream library does
seem to satisfy the basic requirements: you can perform reading and writing with
the console, a file, a block of memory, or even across the Internet (as you will
see in Chapter 15). It’s possible (by inheriting from InputStream
and OutputStream) to create new types of input and output objects. And
you can even add a simple extensibility to the kinds of objects a stream will
accept by redefining the toString( ) method that’s
automatically called when you pass an object to a method that’s expecting
a String (Java’s limited “automatic type conversion”).
There are questions left unanswered
by the documentation and design of the IO stream library. For example, it would
have been nice if you could say that you want an exception thrown if you try to
overwrite a file when opening it for output – some programming systems
allow you to specify that you want to open an output file, but only if it
doesn’t already exist. In Java, it appears that you are supposed to use a
File object to determine whether a file exists, because if you open it as
a FileOutputStream or FileWriter it will always be overwritten.
By representing both files and directory paths, the File class also
suggests poor design by violating the maxim “Don’t try to do too
much in a single class.”
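A sketch of the workaround just described, checking with a File object before opening the stream (the openIfAbsent( ) helper and file name are invented for illustration, and note that the check and the open are not atomic, so a race with another process is still possible):

```java
import java.io.*;

public class SafeOpen {
  // Open a file for output only if it doesn't already exist.
  // You must perform this check yourself, since
  // FileOutputStream silently overwrites an existing file.
  public static FileOutputStream openIfAbsent(String name)
      throws IOException {
    File f = new File(name);
    if(f.exists())
      throw new IOException(name + " already exists");
    return new FileOutputStream(f);
  }
  public static void main(String[] args) throws IOException {
    FileOutputStream out = openIfAbsent("fresh.tmp");
    out.close();
    try {
      openIfAbsent("fresh.tmp"); // Second attempt fails
    } catch(IOException e) {
      System.out.println("Refused: " + e.getMessage());
    }
    new File("fresh.tmp").delete(); // Clean up
  }
}
```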
The IO stream library brings up
mixed feelings. It does much of the job and it’s portable. But if you
don’t already understand the decorator pattern, the design is
non-intuitive, so there’s extra overhead in learning and teaching it.
It’s also incomplete: there’s no support for the kind of output
formatting that almost every other language’s IO package supports. (This
was not remedied in Java 1.1, which missed the opportunity to change the library
design completely, and instead added even more special cases and
complexity.) The Java 1.1 changes to the IO library
haven’t been replacements, but rather additions, and it seems that the
library designers couldn’t quite get straight which features are
deprecated and which are preferred, resulting in annoying deprecation messages
that show up the contradictions in the library design.
However, once you do
understand the decorator pattern and begin using the library in situations that
require its flexibility, you can begin to benefit from this design, at which
point its cost in extra lines of code may not bother you as
much.
[44]
In Design Patterns, Erich Gamma et al., Addison-Wesley 1995.
Described later in this book.
[45]
Perhaps by the time you read this, the bug will be fixed.