- The Theory
- The UNIX Way
- Message Passing
- Join the Queue
- Plug It In
- Don’t Tie Me Down
The UNIX Way
In UNIX, everything is a file, even when that approach doesn’t make sense. A file, to a UNIX system, is a stream of bytes. Therefore, it’s no surprise that the simplest form of UNIX interprocess communication (IPC) resembles a file. If you’ve spent any time on a *NIX command line, almost certainly you’ll have seen this metaphor. For example, the standard way of listing the contents of a directory, one screen at a time, is this:
$ ls | more
The ls command writes the directory listing to a file descriptor identified symbolically as stdout. In this case, however, stdout is not a file; it’s a pipe connected to the stdin of the more command. If you want to split your programs into a series of commands that process data sequentially, you can join them together in this way with shell scripts.
This technique works, but has some limitations. The most obvious is that POSIX defines only three standard streams, one for input and two for output. If a piece of your code needs two kinds of input, this setup won’t work. Fortunately, POSIX provides a very simple mechanism for creating pipes, as shown in Listing 1. The pipe(2) system call takes an array of two integers as an argument and stores a pair of file descriptors in it. The first descriptor is the read end of the pipe, which Listing 1 calls IN; the second is the write end, which it calls OUT. Anything written to the second can be read back from the first.
Listing 1 pipe.c.
    #include <stdio.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdlib.h>

    #define IN  0   // read end of the pipe
    #define OUT 1   // write end of the pipe

    // Allocate an array of two descriptors and open a pipe into it.
    // Returns NULL on failure.
    int *pipepair(void)
    {
        int *pipes = calloc(2, sizeof(int));
        if (pipe(pipes) != 0) {
            free(pipes);
            pipes = NULL;
        }
        return pipes;
    }

    int main(void)
    {
        int *pipes = pipepair();
        if (pipes == NULL) {
            return -1;
        }
        pid_t pid = fork();
        if (pid == 0) {
            // Child: read the message from the pipe
            char buffer[1024];
            read(pipes[IN], buffer, 1024);
            printf("Child process received: %s\n", buffer);
        } else {
            // Parent: send the message, including its terminating null
            char *message = "A simple message";
            printf("Parent process sending %s\n", message);
            write(pipes[OUT], message, strlen(message) + 1);
        }
        return 0;
    }
The example shown here contains a few things that you should never do in production code. The first is reading data from a pipe into a buffer on the stack. If you’re careful with your sizes, you’ll be fine, but it takes only a small typo to add a buffer overflow. Generally it’s better to avoid this possibility by keeping buffers on the heap. If you’re really paranoid, you can allocate buffers over two memory pages and mark the second one as read-only, so your program will crash if you go over the end. Some malloc(3) implementations will do this for you. The second obvious flaw comes from passing raw data to printf(3). In this program, I know that everything received will be null-terminated, but in the real world it’s more difficult to be certain.
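The advice above can be sketched as a small helper. This is only one possible shape, not a replacement for Listing 1: the name read_message and its max_len parameter are hypothetical. It keeps the buffer on the heap, checks the return value of read(2), and adds the terminating null itself instead of trusting the sender.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Listing 1): read up to max_len bytes
 * from fd into a heap buffer and always null-terminate the result.
 * Returns NULL on error; the caller frees the returned string.
 */
char *read_message(int fd, size_t max_len)
{
    char *buffer = malloc(max_len + 1);   /* one extra byte for '\0' */
    if (buffer == NULL) {
        return NULL;
    }

    size_t used = 0;
    while (used < max_len) {
        ssize_t n = read(fd, buffer + used, max_len - used);
        if (n > 0) {
            used += (size_t)n;
        } else if (n == 0) {
            break;                        /* EOF: every write end is closed */
        } else if (errno != EINTR) {
            free(buffer);                 /* a real error, so give up */
            return NULL;
        }
        /* n < 0 with EINTR: interrupted by a signal, so retry */
    }

    buffer[used] = '\0';                  /* terminate it ourselves */
    return buffer;
}
```

In Listing 1, the child could call read_message(pipes[IN], 1023) and pass the result to printf(3), provided it first closes its own copy of pipes[OUT]. Without that close, the loop would wait for an end-of-file that never arrives, because the child itself would still hold a write end open.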
Note that it’s possible to use existing programs in this way from within your own code. Just as the shell redirects stdin and stdout, so can you. To do this, you take advantage of the fact that UNIX always hands out the lowest-numbered free file descriptor. If you close stdin (descriptor 0) and then open another file, this second file will become stdin. For doing this with pipes, POSIX provides the dup2(2) call, which closes the target descriptor if it is open and makes it a copy of another descriptor, all in a single operation (see Listing 2).
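To see the descriptor-reuse rule in isolation, here is a minimal sketch of both approaches applied to stdin. The two function names are hypothetical and exist only for illustration. The first relies on open(2) returning the lowest free descriptor; the second uses dup2(2), which does the close and the copy atomically and so is the safer choice when other threads might open files in between.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: redirect stdin by closing it and relying on
 * open(2) handing back the lowest free descriptor, which is now 0.
 * Returns 0 on success, -1 on failure.
 */
int redirect_stdin_by_reuse(const char *path)
{
    close(STDIN_FILENO);
    int fd = open(path, O_RDONLY);
    if (fd == STDIN_FILENO) {
        return 0;
    }
    if (fd >= 0) {
        close(fd);            /* landed on some other number; undo */
    }
    return -1;
}

/*
 * Hypothetical helper: the same redirection with dup2(2), which
 * closes descriptor 0 and copies fd onto it in one step.
 * Returns 0 on success, -1 on failure.
 */
int redirect_stdin_with_dup2(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    if (fd != STDIN_FILENO) {
        if (dup2(fd, STDIN_FILENO) < 0) {
            close(fd);
            return -1;
        }
        close(fd);            /* the copy on descriptor 0 stays open */
    }
    return 0;
}
```

After either call succeeds, anything that reads from descriptor 0, including a program started later with one of the exec functions, reads from the named file instead of the original stdin.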
Listing 2 ls_child.c.
    #include <stdio.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdlib.h>

    #define IN  0   // read end of the pipe
    #define OUT 1   // write end of the pipe

    int *pipepair(void)
    {
        int *pipes = calloc(2, sizeof(int));
        if (pipe(pipes) != 0) {
            free(pipes);
            pipes = NULL;
        }
        return pipes;
    }

    int main(void)
    {
        int *pipes = pipepair();
        if (pipes == NULL) {
            return -1;
        }
        pid_t pid = fork();
        if (pid == 0) {
            // Make the write end of the pipe the child's stdout (fd 1)
            dup2(pipes[OUT], 1);
            // Close the old file descriptors, since we aren't using them anymore.
            close(pipes[OUT]);
            close(pipes[IN]);
            char *arg[2] = {"/bin/ls", 0};
            execv(arg[0], arg);
            // execv returns only if it failed
            _exit(127);
        } else {
            unsigned int i = 0;
            char buffer[1024];
            // Turn the file descriptor into a FILE
            FILE *ls_pipe = fdopen(pipes[IN], "r");
            if (ls_pipe == NULL) {
                return -1;
            }
            // Close the out end of the pipe
            close(pipes[OUT]);
            while (fgets(buffer, 1024, ls_pipe)) {
                printf("%u: %s", i++, buffer);
            }
        }
        return 0;
    }
This example spawns a child process that then runs the ls command. The output from this command is read by the parent process, which prepends a line number and prints it. Notice that I had to close the "out" end of the pipe in the parent. A reader sees end-of-file only once every descriptor for the write end has been closed. If the parent kept its copy open, the pipe would still have a writer after the child terminated, and the loop that reads from it would never end.
Note that the same effect can be achieved by using the popen(3) function, although POSIX guarantees only unidirectional access; some implementations also offer a bidirectional mode.
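As a rough illustration, the line-numbering loop from Listing 2 might look like this when built on popen(3). The function name number_lines is hypothetical, and the feature-test macro is only there so that popen is declared when compiling in a strict ISO C mode.

```c
#define _POSIX_C_SOURCE 200809L   /* declare popen/pclose under strict -std */
#include <stdio.h>

/*
 * Hypothetical helper: run a shell command with popen(3), print each
 * line of its output with a line number, and return the number of
 * lines read, or -1 on failure. As in Listing 2, a line longer than
 * the buffer is counted once per chunk.
 */
int number_lines(const char *command)
{
    FILE *stream = popen(command, "r");   /* "r": read the command's stdout */
    if (stream == NULL) {
        return -1;
    }

    char buffer[1024];
    int count = 0;
    while (fgets(buffer, sizeof buffer, stream) != NULL) {
        printf("%d: %s", count++, buffer);
    }

    /* pclose waits for the child, so no zombie process is left behind */
    if (pclose(stream) == -1) {
        return -1;
    }
    return count;
}
```

Calling number_lines("/bin/ls") does the same job as Listing 2. The trade-off is that popen runs the command through the shell, so it should not be handed untrusted input.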