20.7 Working with Files
The java.io package provides a number of classes that help you work with files in the underlying system. The File stream classes allow you to read from and write to files and the FileDescriptor class allows the system to represent underlying file system resources as objects. RandomAccessFile lets you deal with files as randomly accessed streams of bytes or characters. Actual interaction with the local file system is through the File class, which provides an abstraction of file pathnames, including path component separators, and useful methods to manipulate file names.
20.7.1 File Streams and FileDescriptor
The File streams—FileInputStream, FileOutputStream, FileReader, and FileWriter—allow you to treat a file as a stream for input or output. Each type is instantiated with one of three constructors:
- A constructor that takes a String that is the name of the file.
- A constructor that takes a File object that refers to the file (see Section 20.7.3 on page 543).
- A constructor that takes a FileDescriptor object (see below).
If a file does not exist, the input streams will throw a FileNotFoundException. Accessing a file requires a security check and a SecurityException is thrown if you do not have permission to access that file—see "Security" on page 677.
With a byte or character output stream, the first two constructor types create the file if it does not exist, or truncate it if it does exist. You can control truncation by using the overloaded forms of these two constructors that take a second argument: a boolean that, if true, causes each individual write to append to the file. If this boolean is false, the file will be truncated and new data added. If the file does not exist, the file will be created and the boolean will be ignored.
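The effect of the append flag can be sketched as follows. This is a minimal illustration, assuming Java 7 or later for try-with-resources; the class name AppendDemo and its helper methods are hypothetical and exist only for this example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class AppendDemo {
    // Writes text to the file. With append == true the bytes are added at the
    // end; with append == false the file is truncated before writing.
    static void write(File file, String text, boolean append) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file, append)) {
            out.write(text.getBytes(StandardCharsets.US_ASCII));
        }
    }

    // Reads the whole file back as ASCII text.
    static String read(File file) throws IOException {
        try (FileInputStream in = new FileInputStream(file)) {
            StringBuilder sb = new StringBuilder();
            for (int b; (b = in.read()) != -1; )
                sb.append((char) b);
            return sb.toString();
        }
    }

    // Returns the contents seen after an appending write and after a
    // later truncating write, separated by " / ".
    static String demo() throws IOException {
        File file = File.createTempFile("append", ".txt");
        file.deleteOnExit();
        write(file, "first", false);    // truncating write
        write(file, "+second", true);   // appending write keeps "first"
        String afterAppend = read(file);
        write(file, "third", false);    // truncating write discards earlier data
        return afterAppend + " / " + read(file);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());     // prints "first+second / third"
    }
}
```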
The byte File streams also provide a getChannel method for integration with the java.nio facilities. It returns a java.nio.channels.FileChannel object for accessing the file.
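As a small sketch of getChannel, the following asks the channel of a FileInputStream for the file's size. The class and method names are hypothetical, and try-with-resources assumes Java 7 or later.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

class ChannelDemo {
    // Returns the size of the file in bytes as reported by its FileChannel.
    static long sizeViaChannel(File file) throws IOException {
        try (FileInputStream in = new FileInputStream(file)) {
            FileChannel channel = in.getChannel(); // channel tied to this stream
            return channel.size();
        }
    }
}
```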
A FileDescriptor object represents a system-dependent value that describes an open file. You can get a file descriptor object by invoking getFD on a File byte stream—you cannot obtain the file descriptor from File character streams. You can test the validity of a FileDescriptor by invoking its boolean valid method—file descriptors created directly with the no-arg constructor of FileDescriptor are not valid.
You can use a FileDescriptor object to create a new File stream to the same file as another stream without needing to know the file's pathname. You must be careful to avoid unexpected interactions between two streams doing different things with the same file. You cannot predict what happens, for example, when two threads write to the same file using two different FileOutputStream objects at the same time.
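A second stream on the same descriptor might be created as in the sketch below. The class and method names are hypothetical. Only one thread is involved here, so the two writes land one after the other because both streams share a single open file; this says nothing about the concurrent case described above.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class SharedDescriptorDemo {
    // Opens one stream by name and a second on the same descriptor, writes
    // one byte through each in turn, and returns the resulting file contents.
    static String writeThroughBoth(File file) throws IOException {
        try (FileOutputStream first = new FileOutputStream(file)) {
            // No pathname is needed: the second stream reuses the open descriptor.
            FileOutputStream second = new FileOutputStream(first.getFD());
            first.write('a');
            second.write('b');
            // Closing first at the end of this block releases the shared
            // descriptor, so second must not be used afterward.
        }
        try (FileInputStream in = new FileInputStream(file)) {
            StringBuilder sb = new StringBuilder();
            for (int b; (b = in.read()) != -1; )
                sb.append((char) b);
            return sb.toString();
        }
    }
}
```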
The flush method of FileOutputStream and FileWriter guarantees that the buffer is flushed to the underlying file. It does not guarantee that the data is committed to disk—the underlying file system may do its own buffering. You can guarantee that the data is committed to disk by invoking the sync method on the file's FileDescriptor object, which will either force the data to disk or throw a SyncFailedException if the underlying system cannot fulfill this contract.
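The difference between flush and sync can be sketched like this. The class SyncDemo and the method writeDurably are hypothetical names chosen for the example.

```java
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;

class SyncDemo {
    // Writes data, then asks the system to commit it to the storage device.
    // Returns whether the descriptor was valid while the stream was open.
    static boolean writeDurably(File file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
            out.flush();                      // hand any buffered bytes to the file
            FileDescriptor fd = out.getFD();
            fd.sync();                        // may throw SyncFailedException
            return fd.valid();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("sync", ".bin");
        file.deleteOnExit();
        System.out.println(writeDurably(file, new byte[] {1, 2, 3}));
        // A descriptor made with the no-arg constructor is not valid.
        System.out.println(new FileDescriptor().valid());
    }
}
```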
20.7.2 RandomAccessFile
The RandomAccessFile class provides a more sophisticated file mechanism than the File streams do. A random access file behaves like a large array of bytes stored in the file system. There is a kind of cursor, or index into the implied array, called the file pointer; input operations read bytes starting at the file pointer and advance the file pointer past the bytes read. If the random access file is created in read/write mode, then output operations are also available; output operations write bytes starting at the file pointer and advance the file pointer past the bytes written.
RandomAccessFile is not a subclass of InputStream, OutputStream, Reader, or Writer because it can do both input and output and can work with both characters and bytes. The constructor has a parameter that declares whether the stream is for input or for both input and output.
RandomAccessFile supports read and write methods of the same names and signatures as the byte streams. For example, read returns a single byte. RandomAccessFile also implements the DataInput and DataOutput interfaces (see page 537) and so can be used to read and write data types supported in those interfaces. Although you don't have to learn a new set of method names and semantics for the same kinds of tasks you do with the other streams, you cannot use a RandomAccessFile where any of the other streams are required.
The constructors for RandomAccessFile are
- public RandomAccessFile(String name, String mode) throws FileNotFoundException
- Creates a random access file stream to read from, and optionally write to, a file with the specified name. The basic mode can be either "r" or "rw" for read or read/write, respectively. Variants of "rw" mode provide additional semantics: "rws" mode specifies that on each write the file contents and metadata (file size, last modification time, etc.) are written synchronously through to the disk; "rwd" mode specifies that only the file contents are written synchronously to the disk. Specifying any other mode will get you an IllegalArgumentException. If the mode contains "rw" and the file does not exist, it will be created or, if that fails, a FileNotFoundException is thrown.
- public RandomAccessFile(File file, String mode) throws FileNotFoundException
- Creates a random access file stream to read from, and optionally write to, the file specified by the File argument. Modes are the same as for the String-based constructor.
Since accessing a file requires a security check, these constructors could throw a SecurityException if you do not have permission to access the file in that mode—see "Security" on page 677.
The "random access" in the name of the class refers to the ability to set the read/write file pointer to any position in the file and then perform operations. The additional methods in RandomAccessFile to support this functionality are:
- public long getFilePointer() throws IOException
- Returns the current location of the file pointer (in bytes) from the beginning of the file.
- public void seek(long pos) throws IOException
- Sets the file pointer to the specified number of bytes from the beginning of the file. The next byte written or read will be the pos-th byte in the file, where the initial byte is the 0th. If you position the file pointer beyond the end of the file and write to the file, the file will grow.
- public int skipBytes(int count) throws IOException
- Attempts to advance the file pointer count bytes. Any bytes skipped over can be read later after seek is used to reposition the file pointer. Returns the actual number of bytes skipped. This method is guaranteed never to throw an EOFException. If count is negative, no bytes are skipped.
- public long length() throws IOException
- Returns the file length.
- public void setLength(long newLength) throws IOException
- Sets the length of the file to newLength. If the file is currently shorter, the file is grown to the given length, filled in with any byte values the implementation chooses. If the file is currently longer, the data beyond this position is discarded. If the current position (as returned by getFilePointer) is greater than newLength, the position is set to newLength.
You can access the FileDescriptor for a RandomAccessFile by invoking its getFD method. You can obtain a FileChannel for a RandomAccessFile by invoking its getChannel method.
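The file pointer methods can be seen together in a short sketch that treats a file as an array of fixed-size int records. The class RandomAccessDemo, its method names, and the record layout are assumptions made for illustration only.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class RandomAccessDemo {
    static final int INT_SIZE = 4; // number of bytes written by writeInt

    // Writes count ints (0, 10, 20, ...) and reads back the one at index.
    static int writeThenReadAt(File file, int count, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);                    // discard any previous contents
            for (int i = 0; i < count; i++)
                raf.writeInt(i * 10);            // DataOutput method
            raf.seek((long) index * INT_SIZE);   // jump directly to the record
            return raf.readInt();                // DataInput method
        }
    }

    // Overwrites one record in place; returns {file pointer, file length}.
    static long[] overwrite(File file, int index, int value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek((long) index * INT_SIZE);
            raf.writeInt(value);                 // advances the pointer by 4 bytes
            return new long[] { raf.getFilePointer(), raf.length() };
        }
    }
}
```

With five records written, writeThenReadAt(file, 5, 3) returns 30, and overwriting record 1 leaves the file pointer at byte 8 while the length stays at 20 bytes.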
Exercise 20.8 : Write a program that reads a file with entries separated by lines starting with %% and creates a table file with the starting position of each such entry. Then write a program that uses that table to print a random entry (see the Math.random method described in "Math and StrictMath" on page 657).
20.7.3 The File Class
The File class (not to be confused with the file streams) provides several common manipulations that are useful with file names. It provides methods to separate pathnames into subcomponents and to ask the file system about the file a pathname refers to.
A File object actually represents a path, not necessarily an underlying file. For example, to find out whether a pathname represents an existing file, you create a File object with the pathname and then invoke exists on that object.
A path is separated into directory and file parts by a char stored in the static field separatorChar and available as a String in the static field separator. The last occurrence of this character in the path separates the pathname into directory and file components. (Directory is the term used on most systems; some systems call such an entity a "folder" instead.)
File objects are created with one of four constructors:
- public File(String path)
- Creates a File object to manipulate the specified path.
- public File(String dirName, String name)
- Creates a File object for the file name in the directory named dirName. If dirName is null, only name is used. If dirName is an empty string, name is resolved against a system dependent default directory. Otherwise, this is equivalent to using File(dirName + File.separator + name).
- public File(File fileDir, String name)
- Creates a File object for the file name in the directory named by the File object fileDir. Equivalent to using File(fileDir.getPath(), name).
- public File(java.net.URI uri)
- Creates a File object for the pathname represented by the given file: URI (Uniform Resource Identifier). If the given URI is not a suitable file URI then IllegalArgumentException is thrown.
Five "get" methods retrieve information about the components of a File object's pathname. The following code invokes each of them after creating a File object for the file "FileInfo.java" in the "ok" subdirectory of the parent of the current directory (specified by ".."):
    File src = new File(".." + File.separator + "ok", "FileInfo.java");
    System.out.println("getName() = " + src.getName());
    System.out.println("getPath() = " + src.getPath());
    System.out.println("getAbsolutePath() = " + src.getAbsolutePath());
    System.out.println("getCanonicalPath() = " + src.getCanonicalPath());
    System.out.println("getParent() = " + src.getParent());
And here is the output:
    getName() = FileInfo.java
    getPath() = ../ok/FileInfo.java
    getAbsolutePath() = /vob/java_prog/src/../ok/FileInfo.java
    getCanonicalPath() = /vob/java_prog/ok/FileInfo.java
    getParent() = ../ok
The canonical path is defined by each system. Usually, it is a form of the absolute path with relative components (such as ".." to refer to the parent directory) resolved and with references to the current directory removed. Unlike the other "get" methods, getCanonicalPath can throw IOException because resolving path components can require calls to the underlying file system, which may fail.
The methods getParentFile, getAbsoluteFile, and getCanonicalFile are analogous to getParent, getAbsolutePath, and getCanonicalPath, but they return File objects instead of strings.
You can convert a File to a java.net.URL or java.net.URI object by invoking toURL or toURI, respectively.
The overriding method File.equals deserves mention. Two File objects are considered equal if they have the same path, not if they refer to the same underlying file system object. You cannot use File.equals to test whether two File objects denote the same file. For example, two File objects may refer to the same file but use different relative paths to refer to it, in which case they do not compare equal. Relatedly, you can compare two files using the compareTo method, which returns a number less than, equal to, or greater than zero as the current file's pathname is lexicographically less than, equal to, or greater than the pathname of the argument File. The compareTo method has two overloaded forms: one takes a File argument and the other takes an Object argument and so implements the Comparable interface.
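The pathname-based behavior of equals can be sketched as follows. Comparing canonical files is shown as one possible workaround, not a guaranteed identity test: it would not, for example, detect two hard links to the same file. The class SameFileDemo and the file name notes.txt are hypothetical.

```java
import java.io.File;
import java.io.IOException;

class SameFileDemo {
    // True if both pathnames resolve to the same canonical pathname.
    static boolean sameCanonical(File a, File b) throws IOException {
        return a.getCanonicalFile().equals(b.getCanonicalFile());
    }

    public static void main(String[] args) throws IOException {
        File direct = new File("notes.txt");
        File roundabout = new File("." + File.separator + "notes.txt");

        // The pathnames differ, so equals is false and compareTo is nonzero.
        System.out.println(direct.equals(roundabout));
        System.out.println(direct.compareTo(roundabout) != 0);

        // Both pathnames resolve to the same place under the current directory.
        System.out.println(sameCanonical(direct, roundabout));
    }
}
```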
Several boolean tests return information about the underlying file:
- exists returns true if the file exists in the file system.
- canRead returns true if a file exists and can be read.
- canWrite returns true if the file exists and can be written.
- isFile returns true if the file is not a directory or other special type of file.
- isDirectory returns true if the file is a directory.
- isAbsolute returns true if the path is an absolute pathname.
- isHidden returns true if the path is one normally hidden from users on the underlying system.
All the methods that inspect or modify the actual file system are security checked and can throw SecurityException if you don't have permission to perform the operation. Methods that ask for the filename itself are not security checked.
File objects have many other methods for manipulating files and directories. There are methods to inspect and manipulate the current file:
- public long lastModified()
- Returns a long value representing the time the file was last modified or zero if the file does not exist.
- public long length()
- Returns the file length in bytes, or zero if the file does not exist.
- public boolean renameTo(File newName)
- Renames the file, returning true if the rename succeeded.
- public boolean delete()
- Deletes the file or directory named in this File object, returning true if the deletion succeeded. Directories must be empty before they are deleted.
There are methods to create an underlying file or directory named by the current File:
- public boolean createNewFile() throws IOException
- Creates a new empty file, named by this File. Returns false if the file already exists or if the file cannot be created. The check for the existence of the file and its subsequent creation is performed atomically with respect to other file system operations.
- public boolean mkdir()
- Creates a directory named by this File, returning true on success.
- public boolean mkdirs()
- Creates all directories in the path named by this File, returning true if all were created. This is a way to ensure that a particular directory is created, even if it means creating other directories that don't currently exist above it in the directory hierarchy. Note that some of the directories may have been created even if false is returned.
However, files are usually created by FileOutputStream or FileWriter objects or RandomAccessFile objects, not using File objects.
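For the cases where File itself is used to create things, a minimal sketch might combine mkdirs and createNewFile as below. The class CreateDemo and the method createIn are hypothetical names.

```java
import java.io.File;
import java.io.IOException;

class CreateDemo {
    // Ensures that dir exists (creating missing parent directories), then
    // tries to create an empty file called name inside it.
    // Returns true only if this call created the file.
    static boolean createIn(File dir, String name) throws IOException {
        if (!dir.isDirectory() && !dir.mkdirs())
            throw new IOException("cannot create directory " + dir);
        return new File(dir, name).createNewFile(); // false if it already exists
    }
}
```

Calling createIn twice with the same arguments returns true the first time and false the second, because the existence check and the creation are a single atomic step.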
Two methods let you change the state of the underlying file, assuming that one exists:
- public boolean setLastModified(long time)
- Sets the "last modified" time for the file or returns false if it cannot do so.
- public boolean setReadOnly()
- Makes the underlying file unmodifiable in the file system or returns false if it cannot do so. The file remains unmodifiable until it is deleted or externally marked as modifiable again—there is no method for making it modifiable again.
There are methods for listing the contents of directories and finding out about root directories:
- public String[] list()
- Lists the files in this directory. If used on something that isn't a directory, it returns null. Otherwise, it returns an array of file names. This list includes all files in the directory except the equivalent of "." and ".." (the current and parent directory, respectively).
- public String[] list(FilenameFilter filter)
- Uses filter to selectively list files in this directory (see FilenameFilter described in the next section).
- public static File[] listRoots()
- Returns the available filesystem roots, that is, roots of local hierarchical file systems. Windows platforms, for example, have a root directory for each active drive; UNIX platforms have a single / root directory. If none are available, the array has zero elements.
The methods listFiles() and listFiles(FilenameFilter) are analogous to list() and list(FilenameFilter), but return arrays of File objects instead of arrays of strings. The method listFiles(FileFilter) is analogous to the list method that uses a FilenameFilter.
Three methods relate primarily to temporary files (sometimes called "scratch files")—those files you need to create during a run of your program for storing data, or to pass between passes of your computation, but which are not needed after your program is finished.
- public static File createTempFile(String prefix, String suffix, File directory) throws IOException
- Creates a new empty file in the specified directory, using the given prefix and suffix strings to generate its name. If this method returns successfully then it is guaranteed that the file denoted by the returned abstract pathname did not exist before this method was invoked, and neither this method nor any of its variants will return the same abstract pathname again in the current invocation of the virtual machine. The prefix argument must be at least three characters long, otherwise an IllegalArgumentException is thrown. It is recommended that the prefix be a short, meaningful string such as "hjb" or "mail". The suffix argument may be null, in which case the suffix ".tmp" will be used. Note that since there is no predefined separator between the file name and the suffix, any separator, such as '.', must be part of the suffix. If the directory argument is null then the system-dependent default temporary-file directory will be used. The default temporary-file directory is specified by the system property java.io.tmpdir.
- public static File createTempFile(String prefix, String suffix) throws IOException
- Equivalent to createTempFile(prefix, suffix, null).
- public void deleteOnExit()
- Requests the system to remove the file when the virtual machine terminates—see "Shutdown" on page 672. This request only applies to a normal termination of the virtual machine and cannot be revoked once issued.
When a temporary file is created, the prefix and the suffix may first be adjusted to fit the limitations of the underlying platform. If the prefix is too long then it will be truncated, but its first three characters will always be preserved. If the suffix is too long then it too will be truncated, but if it begins with a period (.) then the period and the first three characters following it will always be preserved. Once these adjustments have been made the name of the new file will be generated by concatenating the prefix, five or more internally generated characters, and the suffix. Temporary files are not automatically deleted on exit, although you will often invoke deleteOnExit on File objects returned by createTempFile.
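A typical pairing of createTempFile and deleteOnExit might look like the following sketch. The class name, the method name, and the "demo" prefix are arbitrary choices for illustration.

```java
import java.io.File;
import java.io.IOException;

class TempFileDemo {
    // Creates an empty scratch file in the default temporary-file directory
    // and asks the virtual machine to remove it on normal termination.
    static File newScratchFile() throws IOException {
        File tmp = File.createTempFile("demo", ".dat"); // prefix has 3+ characters
        tmp.deleteOnExit();
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        File first = newScratchFile();
        File second = newScratchFile();
        // Each call yields a different pathname.
        System.out.println(first.getName());
        System.out.println(second.getName());
    }
}
```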
Finally, the character File.pathSeparatorChar and its companion string File.pathSeparator represent the character that separates file or directory names in a search path. For example, UNIX separates components in the program search path with a colon, as in ".:/bin:/usr/bin", so pathSeparatorChar is a colon on UNIX systems.
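As a brief sketch, a search path can be assembled and taken apart portably by using File.pathSeparator rather than a hard-coded colon. The class SearchPathDemo and its methods are hypothetical.

```java
import java.io.File;
import java.util.regex.Pattern;

class SearchPathDemo {
    // Joins directory names into a search path for the current platform.
    static String join(String... dirs) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < dirs.length; i++) {
            if (i > 0)
                sb.append(File.pathSeparator);
            sb.append(dirs[i]);
        }
        return sb.toString();
    }

    // Splits a search path into its components. The separator is quoted so
    // that it is matched literally rather than as a regular expression.
    static String[] components(String searchPath) {
        return searchPath.split(Pattern.quote(File.pathSeparator));
    }

    public static void main(String[] args) {
        String path = join(".", "/bin", "/usr/bin");
        System.out.println(path); // on UNIX systems: .:/bin:/usr/bin
        for (String dir : components(path))
            System.out.println(dir);
    }
}
```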
Exercise 20.9 : Write a method that, given one or more pathnames, will print all the information available about the file each pathname represents (if any).
Exercise 20.10 : Write a program that uses a StreamTokenizer object to break an input file into words and counts the number of times each word occurs in the file, printing the result. Use a HashMap to keep track of the words and counts.
20.7.4 FilenameFilter and FileFilter
The FilenameFilter interface provides objects that filter unwanted files from a list. It supports a single method:
- boolean accept(File dir, String name)
- Returns true if the file named name in the directory dir should be part of the filtered output.
Here is an example that uses a FilenameFilter object to list only directories:
    import java.io.*;

    class DirFilter implements FilenameFilter {
        public boolean accept(File dir, String name) {
            return new File(dir, name).isDirectory();
        }

        public static void main(String[] args) {
            File dir = new File(args[0]);
            String[] files = dir.list(new DirFilter());
            System.out.println(files.length + " dir(s):");
            for (String file : files)
                System.out.println("\t" + file);
        }
    }
First we create a File object to represent a directory specified on the command line. Then we create a DirFilter object and pass it to list. For each name in the directory, list invokes the accept method on the filtering object and includes the name in the list if the filtering object returns true. For our accept method, true means that the named file is a directory.
The FileFilter interface is analogous to FilenameFilter, but works with a single File object:
- boolean accept(File pathname)
- Returns true if the file represented by pathname should be part of the filtered output.
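A FileFilter counterpart to DirFilter might look like the sketch below, which keeps only ordinary files. The class PlainFileFilter and the helper plainFiles are hypothetical. The helper also guards against the null that listFiles returns when the File is not a directory.

```java
import java.io.File;
import java.io.FileFilter;

class PlainFileFilter implements FileFilter {
    // Accepts ordinary files; rejects directories and other special files.
    public boolean accept(File pathname) {
        return pathname.isFile();
    }

    // Returns the ordinary files in dir, or an empty array if dir
    // is not a directory or cannot be listed.
    static File[] plainFiles(File dir) {
        File[] found = dir.listFiles(new PlainFileFilter());
        return found != null ? found : new File[0];
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (File file : plainFiles(dir))
            System.out.println(file.getName());
    }
}
```

Because accept receives a complete File object, the filter does not need to rebuild the pathname from a directory and a name as DirFilter does.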
Exercise 20.11 : Using FilenameFilter or FileFilter, write a program that takes a directory and a suffix as parameters and prints all files it can find that have that suffix.