The Node.js Way of Doing Things
To understand how Node.js changes the method demonstrated in the preceding section into a nonblocking, asynchronous model, first look at the setTimeout function in JavaScript. This function takes a function to call and a timeout after which it should be called:
setTimeout(function () {
    console.log("I've done my work!");
}, 2000);

console.log("I'm waiting for all my work to finish.");
If you run the preceding code, you see the following output:
I'm waiting for all my work to finish.
I've done my work!
I hope this is not a surprise to you: The program sets the timeout for 2000ms (2s), giving it the function to call when it fires, and then continues with execution, which prints out the “I’m waiting...” text. Two seconds later, you see the “I’ve done...” message, and the program then exits.
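The same ordering holds even when the delay is zero. The following is a minimal sketch, not taken from the original example; the order array is a hypothetical helper used only to record what happened when.

```javascript
// Hypothetical helper: records the order in which things happen.
var order = [];

setTimeout(function () {
    order.push('timer fired');
    console.log(order.join(' -> '));
}, 0);   // a 0ms delay still defers the callback until the script has finished

order.push('main script finished');
// prints "main script finished -> timer fired"
```

The callback is only queued by setTimeout; it cannot run until the code that is currently executing has returned.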
Now, look at a world where any time you call a function that needs to wait for some external resource (database server, network request, or file system read/write operation), it has a similar signature. That is, instead of calling fopen(path, mode) and waiting, you would instead call fopen(path, mode, function callback(file_handle) { ... }).
Now rewrite the preceding synchronous script using the new asynchronous functions. You can actually enter and run this program with node from the command line. Just make sure you also create a file called info.txt that can be read.
var fs = require('fs');            // this is new, see explanation

var file;
var buf = Buffer.alloc(100000);

fs.open(
    'info.txt', 'r',
    function (err, handle) {
        file = handle;
    }
);

fs.read(                           // this will generate an error.
    file, buf, 0, 100000, null,
    function () {
        console.log(buf.toString());
        fs.close(file, function () { /* don't care */ });
    }
);
The first line of this code is something you haven’t seen just yet: The require function is a way to include additional functionality in your Node.js programs. Node comes with a pretty impressive set of modules, each of which you can include separately as you need functionality. You will work further with modules frequently from now on; you learn about consuming them and writing your own in Chapter 5, “Modules.”
If you run this program as it is, it throws an error and terminates. How come? Because the fs.open function runs asynchronously; it returns immediately, before the file has been opened and before the handle value has been passed back to you. The file variable is not set until the file has been opened and its handle has been delivered to the callback specified as the third parameter to the fs.open function. Thus, it is still undefined when you try to call the fs.read function with it immediately afterward.
Fixing this program is easy:
var fs = require('fs');

fs.open(
    'info.txt', 'r',
    function (err, handle) {       // we'll see more about the err param in a bit
        var buf = Buffer.alloc(100000);
        fs.read(
            handle, buf, 0, 100000, null,
            function (err, length) {
                console.log(buf.toString('utf8', 0, length));
                fs.close(handle, function () { /* don't care */ });
            }
        );
    }
);
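The timing problem in the first version can also be observed directly. The following is a minimal sketch under a few assumptions: demoPath and the seen array are hypothetical names, and the sketch writes its own scratch file in the system temporary directory so that it does not depend on info.txt.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical scratch file, created here so the sketch is self-contained.
var demoPath = path.join(os.tmpdir(), 'async-open-demo.txt');
fs.writeFileSync(demoPath, 'hello');

var file;        // assigned only inside the callback
var seen = [];   // hypothetical log of what `file` looked like at each point

fs.open(demoPath, 'r', function (err, handle) {
    if (err) { throw err; }
    file = handle;
    seen.push('in callback: ' + typeof file);
    console.log(seen.join('\n'));
    // Clean up: close the handle, then remove the scratch file.
    fs.close(handle, function () {
        fs.unlink(demoPath, function () { /* ignore cleanup errors */ });
    });
});

// fs.open has already returned here, but its callback has not run yet.
seen.push('after fs.open returns: ' + typeof file);
```

Running it should show the "after fs.open returns: undefined" line first and the "in callback: number" line second, which is why any code that needs the handle has to live inside the callback.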
One way to think about how these asynchronous functions work is along the following lines:
- Check and validate parameters.
- Tell the Node.js core to queue the call to the appropriate function for you (in the preceding example, the operating system's open or read function), and to notify (call) the provided callback function when there is a result.
- Return to the caller.
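The three steps above can be sketched in plain JavaScript. This is only an illustration: doubleLater is a hypothetical function, not part of Node.js, and setImmediate stands in for the queuing that the Node.js core performs for real I/O.

```javascript
// Hypothetical function, not part of Node.js: doubles a number "later".
function doubleLater(n, callback) {
    // 1. Check and validate parameters.
    if (typeof n !== 'number') {
        throw new TypeError('n must be a number');
    }
    if (typeof callback !== 'function') {
        throw new TypeError('callback must be a function');
    }

    // 2. Queue the work and arrange for the callback to receive the result.
    //    (Assumption: setImmediate stands in for the real queuing mechanism.)
    setImmediate(function () {
        callback(null, n * 2);
    });

    // 3. Return to the caller; no result is available yet.
}

var results = [];
doubleLater(21, function (err, value) {
    results.push(value);
    console.log('callback got', value);
});
console.log('doubleLater has returned; results so far:', results.length);
```

The "has returned" line is printed first with a count of 0, and the callback runs afterward with the value 42.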
You might be asking: If the open function returns right away, why doesn’t the node process exit immediately after that function has returned? The answer is that Node operates with an event queue; if there are pending events for which you are awaiting a response, it does not exit until your code has finished executing and there are no events left on that queue. If you are waiting for a response (to either the open or the read function call), it waits. See Figure 3.2 for an idea of how this scenario looks conceptually.
Figure 3.2. As long as there is code executing or somebody is waiting for something, Node runs.
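The behavior described above can be observed with a pending timer and the process object's exit event. This is a minimal sketch; the events array is a hypothetical helper used only to record the sequence.

```javascript
// Hypothetical helper: records what happened, in order.
var events = [];

// The 'exit' event fires once Node has nothing left to wait for.
process.on('exit', function () {
    events.push('exit');
    console.log(events.join(' -> '));
});

// A pending timer counts as something Node is still waiting on.
setTimeout(function () {
    events.push('timer fired');
}, 100);

events.push('end of script');
```

Run on its own, the script reaches its last line almost at once, yet the process stays alive until the timer has fired and only then exits.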