20.10 A Taste of New I/O
The java.nio package ("New I/O") and its subpackages give you access to high-performance I/O, albeit with more complexity. Instead of a simple stream model, you have control over buffers, channels, and other abstractions that let you get maximum speed for your I/O needs. This approach is recommended only for those who have a demonstrated need.
The model for rapid I/O is to use buffers of primitive values to walk through the data carried by channels. Buffers are containers for data and are associated with channels that connect to external data sources. There are buffer types for all primitive types except boolean: A FloatBuffer works with float values, for example. The ByteBuffer is more general; it can handle any of those primitive types with methods such as getFloat and putLong. MappedByteBuffer helps you map a large file into memory for quick access. You can use character set decoders and encoders to translate buffers of bytes to and from Unicode.
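As a minimal sketch of the buffer operations just described, the following class stores a long and a float in a ByteBuffer and reads them back, and then uses a character set to turn characters into bytes and back again. The class name BufferSketch and its method names are made up for illustration, and the choice of UTF-8 is an assumption; any supported character set would do.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

public class BufferSketch {
    // Writes a long and a float into a ByteBuffer, then reads them back.
    public static String roundTrip(long l, float f) {
        ByteBuffer bb = ByteBuffer.allocate(12); // 8 bytes (long) + 4 bytes (float)
        bb.putLong(l);
        bb.putFloat(f);
        bb.flip(); // switch the buffer from writing to reading
        return bb.getLong() + " " + bb.getFloat();
    }

    // Encodes text to bytes and decodes the bytes back to characters.
    // UTF-8 is an illustrative choice, not something the text prescribes.
    public static String encodeDecode(String text) {
        Charset utf8 = Charset.forName("UTF-8");
        ByteBuffer bytes = utf8.encode(text);
        CharBuffer chars = utf8.decode(bytes);
        return chars.toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L, 1.5f));
        System.out.println(encodeDecode("caf\u00e9"));
    }
}
```

The call to flip is the step most easily forgotten: it sets the limit to the current position and resets the position to zero, so that the values just written can be read from the start.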
Channels come from objects that access external data, namely files and sockets. FileInputStream has a getChannel method that returns a channel for that stream, as do RandomAccessFile, java.net.Socket, and others.
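For example, a channel obtained from a RandomAccessFile can fill a ByteBuffer directly. The sketch below is one possible way to read the first few bytes of a file; the class name ChannelSketch and the method readPrefix are hypothetical, and the try-with-resources statement assumes Java 7 or later.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ChannelSketch {
    // Reads up to max bytes from the start of the file through its channel.
    public static byte[] readPrefix(File file, int max) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r");
             FileChannel fc = raf.getChannel()) {
            ByteBuffer bb = ByteBuffer.allocate(max);
            // A single read may transfer fewer bytes than requested,
            // so keep reading until the buffer is full or end-of-file.
            while (bb.hasRemaining() && fc.read(bb) != -1) {
                // nothing to do; read advances the buffer position
            }
            bb.flip(); // prepare to drain what was read
            byte[] out = new byte[bb.remaining()];
            bb.get(out);
            return out;
        }
    }
}
```

If the file is shorter than max, the returned array is simply shorter; no exception is thrown.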
Here is some code that will let you efficiently access a large text file in a specified encoding:
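    import java.io.*;
    import java.nio.*;
    import java.nio.channels.*;
    import java.nio.charset.*;

    public static int count(File file, String charSet, char ch)
        throws IOException
    {
        Charset charset = Charset.forName(charSet);
        CharsetDecoder decoder = charset.newDecoder();
        FileInputStream fis = new FileInputStream(file);
        try {
            FileChannel fc = fis.getChannel();

            // Get the file's size and then map it into memory
            // (map rejects sizes larger than Integer.MAX_VALUE)
            long size = fc.size();
            MappedByteBuffer bb =
                fc.map(FileChannel.MapMode.READ_ONLY, 0, size);

            // Decode the mapped bytes into characters
            CharBuffer cb = decoder.decode(bb);

            // Loop over the decoded characters, not the byte count:
            // in a multibyte encoding there are fewer chars than bytes
            int count = 0;
            for (int i = 0; i < cb.length(); i++)
                if (cb.charAt(i) == ch)
                    count++;
            return count;
        } finally {
            fis.close();    // also closes the associated channel
        }
    }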
We use a FileInputStream to get a channel for the file. Then we create a mapped buffer for the entire file. What a "mapped buffer" does may vary with the platform, but for large files (greater than a few tens of kilobytes) you can assume that it will be at least as efficient as streaming through the data, and almost certainly much more efficient. The decoder for the specified character set then decodes the mapped bytes, which gives us a CharBuffer from which to read. [4]
The CharBuffer not only lets you read (decoded) characters from the file, it also acts as a CharSequence and, therefore, can be used with the regular expression mechanism.
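Because Pattern.matcher accepts any CharSequence, a CharBuffer can be handed to it directly. The following sketch counts the matches of a regular expression in a buffer; the class name RegexOnBuffer and the method countMatches are made up for illustration.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexOnBuffer {
    // Counts non-overlapping matches of regex in the buffer's
    // remaining characters (from its position up to its limit).
    public static int countMatches(CharBuffer cb, String regex) {
        Matcher m = Pattern.compile(regex).matcher(cb);
        int n = 0;
        while (m.find())
            n++;
        return n;
    }

    public static void main(String[] args) {
        CharBuffer cb = CharBuffer.wrap("one two  three");
        System.out.println(countMatches(cb, "\\w+"));
    }
}
```

A buffer produced by decoding a mapped file could be passed in the same way, which lets you search a large file without first copying it into a String.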
In addition to high-performance I/O, the new I/O package also provides a different programming model that allows non-blocking I/O operations. This is an advanced topic well beyond the scope of this book, but suffice it to say that this model allows a small number of threads to efficiently manage a large number of simultaneous I/O connections.
There is also a reliable file locking mechanism: You can lock a FileChannel and receive a java.nio.channels.FileLock object that represents either a shared or exclusive lock on a file. You can release the FileLock when you are done with it.
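A minimal sketch of locking, assuming Java 7 or later (where FileLock can be used in a try-with-resources statement): the hypothetical method appendLocked below takes an exclusive lock on the whole file, appends one byte, and releases the lock when the block exits. The class and method names are illustrative only.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class LockSketch {
    // Appends one byte while holding an exclusive lock on the file,
    // and returns the file's size after the write.
    public static long appendLocked(File file, byte b) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel fc = raf.getChannel();
             FileLock lock = fc.lock()) { // blocks until the exclusive lock is held
            fc.position(fc.size());       // move to end of file
            ByteBuffer bb = ByteBuffer.wrap(new byte[] { b });
            while (bb.hasRemaining())
                fc.write(bb);
            return fc.size();
        } // the lock is released first, then the channel and file are closed
    }
}
```

To request a shared lock instead, you can use the three-argument form, such as fc.lock(0, Long.MAX_VALUE, true), on a channel opened for reading. Whether a lock actually prevents other programs from touching the file depends on the platform, so treat it as a protocol between cooperating programs.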
Nothing has really happened until it has been recorded.
—Virginia Woolf