3.6 lseek Function
Every open file has an associated "current file offset." This is a nonnegative integer that measures the number of bytes from the beginning of the file. (We describe some exceptions to the "nonnegative" qualifier later in this section.) Read and write operations normally start at the current file offset and cause the offset to be incremented by the number of bytes read or written. By default, this offset is initialized to 0 when a file is opened, unless the O_APPEND option is specified.
An open file can be explicitly positioned by calling lseek.
#include <sys/types.h>
#include <unistd.h>

off_t lseek(int filedes, off_t offset, int whence);

Returns: new file offset if OK, -1 on error
The interpretation of the offset depends on the value of the whence argument.
- If whence is SEEK_SET, the file's offset is set to offset bytes from the beginning of the file.
- If whence is SEEK_CUR, the file's offset is set to its current value plus the offset. The offset can be positive or negative.
- If whence is SEEK_END, the file's offset is set to the size of the file plus the offset. The offset can be positive or negative.
Since a successful call to lseek returns the new file offset, we can seek zero bytes from the current position to determine the current offset.
off_t currpos;

currpos = lseek(fd, 0, SEEK_CUR);
This technique can also be used to determine if the referenced file is capable of seeking: if the file descriptor refers to a pipe or FIFO, lseek returns -1 and sets errno to ESPIPE.
The three symbolic constants, SEEK_SET, SEEK_CUR, and SEEK_END, were introduced with System V. Before System V, whence was specified as 0 (absolute), 1 (relative to current offset), or 2 (relative to end of file). Much software still exists with these numbers hard coded.
The character l in the name lseek means "long integer." Before the introduction of the off_t data type, the offset argument and the return value were long integers. lseek was introduced with Version 7 when long integers were added to C. (Similar functionality was provided in Version 6 by the functions seek and tell.)
Example
Program 3.1 tests its standard input to see if it is capable of seeking.
Program 3.1 Test if standard input is capable of seeking.
#include <sys/types.h>
#include "ourhdr.h"

int
main(void)
{
    if (lseek(STDIN_FILENO, 0, SEEK_CUR) == -1)
        printf("cannot seek\n");
    else
        printf("seek OK\n");
    exit(0);
}
If we invoke this program interactively, we get
$ a.out < /etc/motd
seek OK
$ cat < /etc/motd | a.out
cannot seek
$ a.out < /var/spool/cron/FIFO
cannot seek
Normally a file's current offset must be a nonnegative integer. It is possible, however, that certain devices could allow negative offsets. But for regular files the offset must be nonnegative. Because a negative offset can therefore be a valid return value, we should be careful to compare the return value from lseek as being equal to or not equal to -1 and not test if it's less than 0.
The /dev/kmem device on SVR4 for the 80386 supports negative offsets.
Since the offset (off_t) is a signed data type (Figure 2.8), we lose a factor of 2 in the maximum file size. For example, if off_t is a 32-bit integer, the maximum file size is 2^31 bytes.
lseek only records the current file offset within the kernel—it does not cause any I/O to take place. This offset is then used by the next read or write operation.
The file's offset can be greater than the file's current size, in which case the next write to the file will extend the file. This is referred to as creating a hole in a file and is allowed. Any bytes in a file that have not been written are read back as 0.
Example
Program 3.2 creates a file with a hole in it.
Program 3.2 Create a file with a hole in it.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "ourhdr.h"

char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";

int
main(void)
{
    int fd;

    if ((fd = creat("file.hole", FILE_MODE)) < 0)
        err_sys("creat error");

    if (write(fd, buf1, 10) != 10)
        err_sys("buf1 write error");
    /* offset now = 10 */

    if (lseek(fd, 40, SEEK_SET) == -1)
        err_sys("lseek error");
    /* offset now = 40 */

    if (write(fd, buf2, 10) != 10)
        err_sys("buf2 write error");
    /* offset now = 50 */

    exit(0);
}
Running this program gives us
$ a.out
$ ls -l file.hole                    check its size
-rw-r--r-- 1 stevens 50 Jul 31 05:50 file.hole
$ od -c file.hole                    let's look at the actual contents
0000000   a   b   c   d   e   f   g   h   i   j  \0  \0  \0  \0  \0  \0
0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000040  \0  \0  \0  \0  \0  \0  \0  \0   A   B   C   D   E   F   G   H
0000060   I   J
0000062
We use the od(1) command to look at the actual contents of the file. The -c flag tells it to print the contents as characters. We can see that the 30 unwritten bytes in the middle are read back as zero. The seven-digit number at the beginning of each line is the byte offset in octal. In this example we call the write function (Section 3.8). We'll have more to say about files with holes in Section 4.12.