A Taste of Java's I/O Package: Streams, Files, and So Much More
From a programmer's point of view, the user is a peripheral that types when you issue a read request.
—Peter Williams
The Java platform includes a number of packages that are concerned with the movement of data into and out of programs. These packages differ in the kinds of abstractions they provide for dealing with I/O (input/output).
The java.io package defines I/O in terms of streams. Streams are ordered sequences of data that have a source (input streams) or destination (output streams). The I/O classes isolate programmers from the specific details of the underlying operating system, while enabling access to system resources through files and other means. Most stream types (such as those dealing with files) support the methods of some basic interfaces and abstract classes, with few (if any) additions. The best way to understand the I/O package is to start with the basic interfaces and abstract classes.
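As a minimal sketch of what "a source" means in practice, the following code reads from the abstract InputStream type without knowing where the bytes come from. The class name StreamSourceSketch and the method countBytes are hypothetical names chosen for illustration, and an in-memory ByteArrayInputStream stands in for a file or other system resource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamSourceSketch {
    // Works with any InputStream: the source could be a file, a socket,
    // or (as in main below) a byte array held in memory.
    static int countBytes(InputStream in) throws IOException {
        int total = 0;
        while (in.read() != -1) {   // read() returns -1 at end of stream
            total++;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30, 40};
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.println(countBytes(in));   // prints 4
        }
    }
}
```

Because countBytes depends only on the abstract class, the same method would work unchanged if it were handed a stream connected to a file.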
The java.nio package and its subpackages define I/O in terms of buffers and channels. Buffers are data stores (similar to arrays) that can be read from or written to. Channels represent connections to entities capable of performing I/O operations, including buffers, files, and sockets. The "n" in nio is commonly understood as meaning "new" (the nio package was added to the platform after the original stream-based io package), but it originally stood for "non-blocking" because one of the key differences between channel-based I/O and stream-based I/O is that channels allow for non-blocking I/O operations, as well as interruptible blocking operations. This is a powerful capability that is critical in the design of high-throughput server-style applications.
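To give a feel for the buffer and channel model, here is a small sketch that drains a channel through a ByteBuffer. It uses only blocking reads, so it does not show the non-blocking capability itself. The class name ChannelBufferSketch and the method readAll are hypothetical, the channel is created over an in-memory stream with Channels.newChannel purely for convenience, and the code assumes the data is ASCII text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

class ChannelBufferSketch {
    // Repeatedly fills a buffer from the channel, then empties it.
    static String readAll(ReadableByteChannel channel) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8);   // deliberately small
        StringBuilder text = new StringBuilder();
        while (channel.read(buffer) != -1) {          // -1 signals end of data
            buffer.flip();                            // switch from filling to draining
            while (buffer.hasRemaining()) {
                text.append((char) buffer.get());     // assumption: ASCII bytes only
            }
            buffer.clear();                           // make the buffer ready to fill again
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello channel".getBytes(StandardCharsets.US_ASCII);
        try (ReadableByteChannel channel =
                 Channels.newChannel(new ByteArrayInputStream(data))) {
            System.out.println(readAll(channel));     // prints hello channel
        }
    }
}
```

Note the explicit flip and clear calls: a buffer tracks its own position, so the program switches it between "being filled" and "being drained" rather than passing offsets around.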
The java.net package provides specific support for network I/O, based around the use of sockets, with an underlying stream or channel-based model.
This chapter is mainly concerned with the stream-based model of the java.io package. A short introduction to some of the capabilities of the java.nio package is given in "A Taste of New I/O" on page 565, but the use of non-blocking I/O and the java.net network I/O are advanced topics, beyond the scope of this book.
20.1 Streams Overview
The package java.io has two major parts: character streams and byte streams. Characters are 16-bit UTF-16 code units, whereas bytes are (as always) 8 bits. I/O is either text-based or data-based (binary). Text-based I/O works with streams of human-readable characters, such as the source code for a program. Data-based I/O works with streams of binary data, such as the bit pattern for an image. The character streams are used for text-based I/O, while byte streams are used for data-based I/O. Streams that work with bytes cannot properly carry characters, and some character-related issues are not meaningful with byte streams, though the byte streams can also be used for older text-based protocols that use 7- or 8-bit characters. The byte streams are called input streams and output streams, and the character streams are called readers and writers. For nearly every input stream there is a corresponding output stream, and for most input or output streams there is a corresponding reader or writer character stream of similar functionality, and vice versa.
Because of these overlaps, this chapter describes the streams in fairly general terms. When we talk simply about streams, we mean any of the streams. When we talk about input streams or output streams, we mean the byte variety. The character streams are referred to as readers and writers. For example, when we talk about the Buffered streams we mean the entire family of BufferedInputStream, BufferedOutputStream, BufferedReader, and BufferedWriter. When we talk about Buffered byte streams we mean both BufferedInputStream and BufferedOutputStream. When we talk about Buffered character streams, we mean BufferedReader and BufferedWriter.
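The difference between the byte variety and the character variety can be seen by reading the same data both ways. The sketch below is illustrative only: the class name BytesVersusCharsSketch and its two methods are hypothetical, and the choice of UTF-8 as the encoding is an assumption made for the example. An InputStreamReader is used to convert the byte stream into a character stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class BytesVersusCharsSketch {
    // Counts the units delivered by a byte stream (an input stream).
    static int countBytes(byte[] data) throws IOException {
        int count = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    // Counts the units delivered by a character stream (a reader)
    // that decodes the same bytes as UTF-8.
    static int countChars(byte[] data) throws IOException {
        int count = 0;
        try (Reader reader = new InputStreamReader(
                 new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "caf" plus e-acute: four characters, but five bytes in UTF-8
        byte[] encoded = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(encoded) + " bytes, "
                           + countChars(encoded) + " chars");   // prints 5 bytes, 4 chars
    }
}
```

The input stream reports five units and the reader reports four, which is why a byte stream alone cannot properly carry characters outside a 7- or 8-bit encoding.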
The classes and interfaces in java.io can be broadly split into five groups:
- The general classes for building different types of byte and character streams (input and output streams, readers and writers, and classes for converting between them) are covered in Sections 20.2 through 20.4.
- A range of classes that define various types of streams—filtered streams, buffered streams, piped streams, and some specific instances of those streams, such as a line number reader and a stream tokenizer—are discussed in Section 20.5.
- The data stream classes and interfaces for reading and writing primitive values and strings are discussed in Section 20.6.
- Classes and interfaces for interacting with files in a system-independent manner are discussed in Section 20.7.
- The classes and interfaces that form the object serialization mechanism, which transforms objects into byte streams and allows objects to be reconstituted from the data read from a byte stream, are discussed in Section 20.8.
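As a brief preview of one of these groups, the sketch below writes a few primitive values and a string with a DataOutputStream and reads them back with a DataInputStream. The class name DataRoundTripSketch and the method roundTrip are hypothetical, and an in-memory byte array is assumed as the destination so that the example needs no file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataRoundTripSketch {
    // Writes an int, a double, and a string, then reads them back
    // in the same order in which they were written.
    static String roundTrip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(42);
            out.writeDouble(2.5);
            out.writeUTF("taste");
        }
        try (DataInputStream in = new DataInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            int i = in.readInt();
            double d = in.readDouble();
            String s = in.readUTF();
            return i + " " + d + " " + s;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip());   // prints 42 2.5 taste
    }
}
```

The data stream carries no type information, so the reading side must know the order and types that the writing side used.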
Some of the output streams provide convenience methods for producing formatted output, using instances of the java.util.Formatter class. You get formatted input by binding an input stream to a java.util.Scanner object. Details of formatting and scanning are covered in Chapter 22.
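The following sketch shows both directions in miniature. It is only an illustration: the class name FormatAndScanSketch and its methods are hypothetical, a PrintWriter over a StringWriter is used for output (PrintStream offers the same printf method), and Locale.ROOT is assumed so that the results do not depend on the default locale.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Scanner;

class FormatAndScanSketch {
    // Formatted output: printf delegates to a java.util.Formatter.
    static String describe(String name, int count) {
        StringWriter text = new StringWriter();
        try (PrintWriter out = new PrintWriter(text)) {
            out.printf(Locale.ROOT, "%s holds %d items", name, count);
        }
        return text.toString();
    }

    // Formatted input: a Scanner bound to an input stream parses tokens.
    static int sum(InputStream in) {
        int total = 0;
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            scanner.useLocale(Locale.ROOT);
            while (scanner.hasNextInt()) {
                total += scanner.nextInt();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(describe("queue", 3));   // prints queue holds 3 items
        byte[] input = "3 4 5".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sum(new ByteArrayInputStream(input)));   // prints 12
    }
}
```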
The IOException class is used by many methods in java.io to signal exceptional conditions. Some extended classes of IOException signal specific problems, but most problems are signaled by an IOException object with a descriptive string. Details are provided in Section 20.9 on page 563. Any method that throws an IOException will do so when an error occurs that is directly related to the stream. In particular, invoking a method on a closed stream may result in an IOException. Unless there are particular circumstances under which the IOException will be thrown, this exception is not documented for each individual method of each class.
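The closed-stream case can be sketched as follows. The class name ClosedStreamSketch and the method throwsAfterClose are hypothetical. BufferedReader is used because it checks that it is still open before reading; other streams behave differently (ByteArrayInputStream, for example, documents that closing it has no effect), which is why the text says only that an exception may result.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class ClosedStreamSketch {
    // Returns true if reading from a closed BufferedReader raises IOException.
    static boolean throwsAfterClose() {
        BufferedReader reader = new BufferedReader(new StringReader("one line"));
        try {
            reader.close();
            reader.readLine();   // the reader is already closed
            return false;
        } catch (IOException e) {
            return true;         // the closed reader rejected the read
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsAfterClose());   // prints true
    }
}
```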
Similarly, NullPointerException and IndexOutOfBoundsException can be expected to be thrown whenever a null reference is passed to a method, or a supplied index falls outside the bounds of an array. Only those situations where this does not occur are explicitly documented.
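A small sketch makes these two conventions concrete, using the three-argument read method of a byte stream. The class name ArgumentChecksSketch and the method tryRead are hypothetical. The variable is declared as ByteArrayInputStream because that class declares its read method without IOException, which keeps the example short.

```java
import java.io.ByteArrayInputStream;

class ArgumentChecksSketch {
    // Attempts a bulk read into the given array region and reports
    // which unchecked exception, if any, was thrown.
    static String tryRead(byte[] target, int offset, int length) {
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        try {
            in.read(target, offset, length);
            return "ok";
        } catch (NullPointerException e) {
            return "NullPointerException";
        } catch (IndexOutOfBoundsException e) {
            return "IndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryRead(new byte[4], 0, 4));    // prints ok
        System.out.println(tryRead(null, 0, 1));           // prints NullPointerException
        System.out.println(tryRead(new byte[4], 2, 10));   // prints IndexOutOfBoundsException
    }
}
```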
All code presented in this chapter uses the types in java.io, and every example has imported java.io.* even when there is no explicit import statement in the code.