Taming Networks with JavaScript and Node.js
You've probably had some exposure to using JavaScript in the browser, perhaps in the days when scrolling marquees were all the rage, or more recently, as libraries like jQuery have made client-side JavaScript more palatable. Many experienced programmers, particularly those with a computer science background, still see JavaScript as a "toy" language for showing and hiding parts of web pages. But with JavaScript becoming the language of the Web, and with the emergence of Node.js, I'd suggest that their opinion is wrong.
What Is Node.js?
Node.js is a network programming platform that Ryan Dahl created on top of V8, the JavaScript engine that powers Google's Chrome web browser. It gives developers a JavaScript API for programming networks. Dahl wanted to create servers that supported many concurrent users, without the need for lots of hardware or complicated programming techniques. Just as Perl has a reputation for excelling at dealing with data, Node.js is developing a reputation for moving data around networks, and doing it fast.
Node.js is not a framework like Ruby on Rails or Django, although some frameworks have been created using it. Rather, Node.js is a programming platform that gives developers a toolkit for dealing with networked data. This might mean working with third-party APIs, combining multiple data sources, or connecting many individual clients.
Programming Around Events
JavaScript is an event-driven language. Rather than stepping through procedural code, you structure code around events, adding event handlers to respond to these events. For example, your program might "listen" for a user to do any of the following in a browser:
- Click a button
- Scroll the page
- Hover over a list item
It's almost impossible to know when these events will occur, so you create event handlers that will be triggered when the specified event is fired. In the following example, an alert is displayed when an element with the ID trigger is clicked:
var trigger = document.getElementById('trigger');
trigger.onclick = function() {
  alert("trigger clicked");
};
When a user clicks the trigger, the onclick event is fired, and a function is called that shows an alert. This is a simple example of programming around events.
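The same listen-and-respond pattern is available outside the browser. As a minimal sketch, Node.js ships an EventEmitter class in its built-in events module; the event name 'greet' and the handler below are made up for illustration.

```javascript
var EventEmitter = require('events').EventEmitter;

var emitter = new EventEmitter();
var received = [];

// Register a handler; it runs every time a 'greet' event is emitted.
emitter.on('greet', function (name) {
  received.push('Hello, ' + name);
});

// Emitting the event calls each registered handler with the given arguments.
emitter.emit('greet', 'Ada');
emitter.emit('greet', 'Grace');

console.log(received);
```

As in the browser example, the code that registers the handler does not need to know when, or how often, the event will fire.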
Node.js runs your JavaScript in a single process on a single thread, meaning that it can execute only one piece of code at a time. For programmers coming from languages or platforms that support threading, the immediate reaction is that this is a weakness, particularly in the context of network programming, and the design has provoked some strong reactions to Node.js. But the constraint of programming on a single thread and using events has some strong upsides. A single process uses comparatively few resources, so Node.js offers the ability to run high-traffic sites on modest hardware. By the same token, Node.js is not a good fit for operations that are computationally expensive, because a long-running computation stops every other event from being handled until it finishes.
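To see why computationally expensive work is a poor fit, consider this rough sketch. A timer is asked to fire after about 10 ms, but a busy loop (standing in for heavy computation; the 100 ms figure is arbitrary) keeps the process occupied, so the timer callback cannot run until the loop ends.

```javascript
var start = Date.now();
var timerDelay = null;

// Ask for a callback in about 10 ms.
setTimeout(function () {
  timerDelay = Date.now() - start;
  console.log('Timer fired after ' + timerDelay + ' ms');
}, 10);

// Simulate CPU-heavy work by spinning for roughly 100 ms.
// While this loop runs, no other JavaScript, including the timer
// callback, gets a chance to run.
while (Date.now() - start < 100) {
  // busy-wait
}
console.log('Busy loop finished');
```

The timer reports a delay of at least 100 ms rather than 10 ms, because its callback had to wait for the loop to give up control.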
Using Callbacks
If Node.js is running on a single process, how can you ensure that it doesn't lock up? By using a key design pattern known as a callback. Callbacks are an attractive feature of JavaScript, made possible by the fact that functions are first-class objects that can be passed as arguments to other functions. This may be confusing at first, but it's actually quite simple to use, and if you've ever used jQuery you're probably already familiar with the pattern. In this jQuery example, every paragraph in the Document Object Model (DOM) is hidden, and an alert is shown as each paragraph finishes hiding:
$('p').hide('slow', function() {
  alert("The paragraph is now hidden");
});
The hide method accepts an optional function as an argument that is called once hiding is complete. Essentially, a callback allows you to say, "When you're finished doing that, do this." In Node.js, the pattern is the same. In the following example, a file is read from disk. When the read finishes, the callback function is fired, allowing the data to be used or an error to be thrown (if one exists):
var fs = require('fs');

// Passing an encoding makes data a string; without one, data is a raw Buffer.
fs.readFile('somefile.txt', 'utf8', function (err, data) {
  if (err) throw err;
  console.log(data);
});
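You can write your own functions in the same style. The following is a minimal sketch, not a standard API: countLines is a hypothetical helper that wraps fs.readFile and reports its result through an error-first callback, and the temporary file name is invented so that the example is self-contained.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: counts the lines in a file and reports the result
// through an error-first callback, the convention fs.readFile itself uses.
function countLines(file, callback) {
  fs.readFile(file, 'utf8', function (err, text) {
    if (err) return callback(err);
    callback(null, text.split('\n').length);
  });
}

// Create a small throwaway file so the example does not depend on your disk.
var sample = path.join(os.tmpdir(), 'callback-sketch-' + process.pid + '.txt');
fs.writeFileSync(sample, 'one\ntwo\nthree');

var log = [];
countLines(sample, function (err, count) {
  if (err) throw err;
  log.push('callback: ' + count + ' lines');
  console.log(log);
  fs.unlinkSync(sample);
});

// This line runs first: readFile returns immediately, and the callback
// fires later, once the file has been read.
log.push('readFile requested');
```

The order of the log entries shows why the process doesn't lock up: the read is handed off, the rest of the script carries on, and the callback runs when the data is ready.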
Moving Data Quickly with Streams
Streams are another important feature of Node.js. They let you move data in chunks and use each chunk as soon as it's received, rather than waiting for the whole payload to arrive.
To help you understand streams, let's consider an example. Suppose you want to get some water from a tap. You can choose one of two approaches:
- Place a bucket below the tap and turn on the water.
- Attach a hose or affix a pipe to the tap and turn on the water.
Which method is better?
- Using a bucket, you get all the water you want, but you have to wait for the water to finish pouring into the bucket before you can use it.
- Using a hose or pipe, you get water immediately, and it continues to pour out the end of the hose or pipe until you turn off the tap. This approach lets you get on with your work immediately.
Streams in Node.js allow developers to use data as soon as it's ready. There's no need to wait until data has been fully buffered before using it.
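The hose approach can be sketched with the built-in stream module. The source below is hypothetical: it hands out three hard-coded chunks, standing in for a file or network connection that produces data over time.

```javascript
var Readable = require('stream').Readable;

// A made-up source that delivers three chunks and then ends.
var chunks = ['first ', 'second ', 'third'];
var tap = new Readable({
  read: function () {
    // push(null) signals that there is no more data.
    this.push(chunks.length ? chunks.shift() : null);
  }
});
tap.setEncoding('utf8');

var seen = [];

// Each chunk is handled as soon as it arrives (the hose, not the bucket).
tap.on('data', function (chunk) {
  seen.push(chunk);
  console.log('got chunk: ' + chunk);
});

tap.on('end', function () {
  console.log('done after ' + seen.length + ' chunks');
});
```

Nothing in the data handler waits for the stream to end, so work on the first chunk can begin before the last one exists.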
You may not have realized it, but streams are everywhere in Node.js. Here's an example of fetching the Google home page:
var http = require('http');

// host takes a bare hostname, without the protocol.
var options = { host: 'www.google.com' };

http.get(options, function (res) {
  var data = '';
  res.on('data', function (chunk) {
    data += chunk;
  });
  res.on('end', function () {
    console.log(data);
  });
}).on('error', function (err) {
  console.error(err.message);
});
This short script creates an HTTP client to fetch the Google home page. Notice that the response emits a data event whenever a chunk arrives, meaning that the data can be used immediately. In this example, the chunks are concatenated into a variable so that the full response can be printed once everything has been received, but you could start parsing chunks the moment they arrive, without buffering the entire response. Streams offer the same capability for sending data, so you can send and receive chunks of data very quickly.
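Receiving and sending can be joined with pipe(), which forwards chunks from a readable stream to a writable one. This is a sketch under simple assumptions: the source, the upper-casing step, and the collecting sink are all invented for illustration, where a real program might pipe a file into an HTTP response.

```javascript
var stream = require('stream');

// Source: a made-up readable that emits two chunks and then ends.
var parts = ['hello ', 'world'];
var source = new stream.Readable({
  read: function () {
    this.push(parts.length ? parts.shift() : null);
  }
});

// Transform: upper-cases each chunk as it passes through.
var upperCase = new stream.Transform({
  transform: function (chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  }
});

// Destination: collects whatever it is sent. A socket or a file
// would be written to in the same way.
var output = '';
var sink = new stream.Writable({
  write: function (chunk, encoding, callback) {
    output += chunk.toString();
    callback();
  }
});

sink.on('finish', function () {
  console.log(output);
});

// pipe() connects the stages and forwards each chunk as it becomes available.
source.pipe(upperCase).pipe(sink);
```

Each stage handles one chunk at a time, so memory use stays small regardless of how much data flows through.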
Node.js as Network Glue
So what can you do with Node.js? In callbacks and streams, we've seen two pieces of Node.js architecture that support connecting networks and help move data around quickly without expensive hardware. If you need to move data around the Web (or any other network) and deal with high volumes of requests, chances are that Node.js will be a great fit for you. Some examples of good use cases:
- Proxy servers
- JSON APIs
- URL shorteners
- Services that mash up third-party APIs
- Browser-based games
Because JavaScript is at the center of Node.js, it's straightforward to expose data to the browser and to create applications in which the browser also sends data back to the server. This leads to the possibility of creating highly scalable browser-based games and applications that can send data bidirectionally using JavaScript. Historically, achieving this for browser-based applications most likely meant using a server-side language that supports threading and dealing with locking issues. Node.js dramatically reduces the amount of code that you need to write for this problem space.
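As one illustration of the use cases listed above, here is a minimal sketch of a JSON API built on the http module. The payload, the /status path, and the choice to call the server from the same process are assumptions made to keep the example self-contained.

```javascript
var http = require('http');

var lastResponse = null;

// A tiny JSON API: it echoes the requested path back in a JSON document.
var server = http.createServer(function (req, res) {
  var body = JSON.stringify({ message: 'hello', path: req.url });
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(body);
});

// Port 0 asks the operating system for any free port.
server.listen(0, '127.0.0.1', function () {
  var port = server.address().port;

  // Call the API once from the same process, then shut the server down.
  http.get({ host: '127.0.0.1', port: port, path: '/status' }, function (res) {
    var data = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { data += chunk; });
    res.on('end', function () {
      lastResponse = JSON.parse(data);
      console.log(lastResponse);
      server.close();
    });
  });
});
```

Because the server and the browser both speak JavaScript and JSON, the object built on the server is the same shape the client-side code receives.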
Summary
There has never been a better time to be a JavaScript developer. The browser is emerging as the primary delivery point for software, and Node.js gives you a first-class platform for programming servers to power those applications. If you still think JavaScript is a "toy" language, it's time to think again.