Network Protocols
At the base of a network system is the physical topology. On top of that is the logical topology. And on top of the logical topology are protocols. If the idea of "on top of" or "beneath" doesn't make sense, don't worry; it's based on a system for describing how networks work called the OSI model, which is described in the following section.
Network protocols consist of sets of rules for sending and receiving data across a physical network and the software that puts these rules into practice. Logical topologies instruct the hardware how to packetize (or "frame") and transmit data across the physical topology; protocols handle the translation of data from applications (that is, software) to the logical topology.
If that all sounds confusing, don't worry. The next couple of pages discuss how protocols work, what some of the most popular protocols are, and how they're organized. Here is a list of the protocols you are most likely to run across:
TCP/IP
IPX
NetBIOS/NetBEUI
To understand what network protocols are, you have to understand what they do and their function in relation to the rest of the network. To begin, let's examine the most popular theoretical model of networking: the OSI model.
The OSI Model (And Why You Should Be Familiar with It)
During the 1980s, the International Organization for Standardization (ISO) attempted to create a logical arrangement for the various parts that make up a network, under the name Open Systems Interconnection, or OSI for short. In the long term, the effort to deploy OSI protocols was futile (practically no one runs them), but it did produce a great model to explain how a network should work. The model is called the OSI seven-layer model, and it's a tremendously useful theoretical picture of how network functions should be distributed among the various bits of a network. It's worth memorizing because it helps in debugging network problems ranging from design issues to snafus with connections (see Figure 3.3).
FIGURE 3.3 The OSI model shows how data is moved in a network.
The OSI model is not particularly complicated. The trick is to remember that as the OSI layer numbers increase from 1 to 7, so does the level of abstraction. The lower the layer, the less abstract and more concrete it is. Each layer communicates only with the layer directly above or below it while moving data from electrical impulses on a wire into data on your screen. If we return to the postal metaphor from Hour 2, "Why Build a Network?" as an analogy, OSI becomes even easier to understand:
Layer 7 (Application) is the software applications that you use on your screen. Layer 7 is concerned with file access and file transfer. If you have ever used applications such as FTP or Telnet, you have interacted with an example of Layer 7. In the postal model, the Application layer corresponds to writing a letter. This is where applications such as Microsoft Word and Excel run.
Layer 6 (Presentation) deals with the way different systems represent data. For example, Layer 6 defines what happens when a system tries to display UNIX-style data on an MS-DOS screen.
Layer 6 doesn't really have an analogue in the postal model, but if it did, it would be like rewriting the letter so that anyone could read it (which, as you can see, doesn't make much sense in a physical context). Probably the best analogy is to a translator; using the postal model again, assume that your letter is being sent to Mexico. A translator (equivalent to Presentation-layer software) can translate the data in your envelope into the local lingua mexicana. Like the letter in the example, data is mutable and protean and can be rearranged to fit the kind of computer on which it needs to run.
Layer 5 (Session) handles the actual connections between systems. Layer 5 handles the order of data packets and bidirectional (two-way) communications. In a postal metaphor, the Session layer is similar to breaking a single large document into several smaller documents, packaging them, and labeling the order in which the packages should be opened. This is where streams of data get turned into packets.
Layer 4 (Transport) is like the registered-mail system. Layer 4 is concerned with ensuring that mail gets to its destination. If a packet fails to get to its destination, Layer 4 handles the process of notifying the sender and requesting that another packet be sent. In effect, Layer 4 ensures that the three layers below it (that is, Layers 1, 2, and 3) are doing their jobs properly. If they are not, Layer 4 software can step in and handle error correction, such as resending packets if they are corrupted or missing. (Dropped is the usual term for a lost packet.) For what it's worth, this is the layer where the TCP in TCP/IP works.
Layer 3 (Network) provides an addressing scheme. If you send someone a letter, you use a street address that contains a ZIP code because that's what the post office understands. When a computer sends a data packet, it sends the packet to a logical address, which is like a street address. This layer is where Internet Protocol, the IP in TCP/IP, and Novell's internetwork packet exchange, or IPX, work.
Layer 3 works with Layer 2 to translate data packets' logical network addresses (these are similar to IP addresses, about which you'll learn in a few pages) into hardware-based MAC addresses (which are similar to ZIP codes), data frame numbers, and so forth and move the packets toward their destination. Layer 3 is similar to the mail-sorting clerks at the post office who aren't concerned with ensuring that mail gets to its destination, per se. Instead, the clerks' concern is to sort mail so that it keeps getting closer to its destination. Layer 3 is also the lowest layer that typically is part of an operating system.
Layer 2 (DataLink) is a set of rules burned into chips on network interface cards, hubs, switches, routers, and whatever else works on the network. In our postal model, this layer represents a set of rules governing the actual delivery of physical mail: pick up here, drop off here, and so forth. This is where the rules for Ethernet, Token Ring, FDDI, ATM, and so on are stored. It's concerned with finding a way for Layer-1 stuff (the cards, hubs, wire, and so forth) to talk to Layer 3. Layer 2 is where network card addresses become important. Layer 2 also re-packetizes data inside frames, which are for all intents and purposes the packet type used by hardware devices to send and receive below the Layer 3 threshold.
Layer 1 (Physical) is similar to the trucks, trains, planes, rails, and whatnot that move the mail. From a network perspective, this layer is concerned only with the physical aspects of the network: the cards, wire, and concentrators that move data packets. Layer 1 specifies what the physical aspects are, what they must be capable of doing, and (basically) how they accomplish those things. This condenses to cable specs, physical jack specifications, and so on.
If you refer back to the description of packet data in Hour 2, you'll realize that if data packets are to pass over the network, the network (like the postal service) has to accomplish several tasks successfully:
It has to be capable of transmitting data across a physical medium (copper wire, optical fiber, or, in the case of wireless networks, air).
It must route data to the correct location by MAC address (Media Access Control address, a unique 48-bit address assigned to each network device).
It must be capable of decoding the type of data received when it arrives at the destination.
It must be capable of checking the correctness of the transmitted data.
It must be capable of sending messages to acknowledge that a particular packet has been received.
It must be capable of interacting with users through an interface that displays the data.
As you can see, the various layers of the OSI model accomplish these goals admirably. OSI, however, has seldom been implemented as a network protocol; instead, the existing protocols (mostly TCP/IP) were refined using the powerful OSI reference model as a guide.
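The way each layer wraps the data handed down from the layer above can be sketched in a few lines of Python. This is a toy illustration only: the bracketed header strings and the function names are hypothetical placeholders, not real protocol headers or any library's API.

```python
# Toy sketch of layered encapsulation. The bracketed header strings are
# hypothetical placeholders, not real protocol headers.
LAYERS = [
    (4, "Transport", "[TCP]"),
    (3, "Network", "[IP]"),
    (2, "DataLink", "[FRAME]"),
]


def encapsulate(payload: str) -> str:
    """Wrap application data in one header per layer, highest layer first."""
    for _number, _name, header in LAYERS:
        payload = header + payload
    return payload


def decapsulate(frame: str) -> str:
    """Strip the headers in the opposite order, as the receiving side would."""
    for _number, _name, header in reversed(LAYERS):
        if not frame.startswith(header):
            raise ValueError(f"expected {header} header")
        frame = frame[len(header):]
    return frame


wire_data = encapsulate("Hello")
print(wire_data)               # [FRAME][IP][TCP]Hello
print(decapsulate(wire_data))  # Hello
```

Each layer on the receiving side removes only its own header and hands the rest upward, which is why a layer needs to talk only to its immediate neighbors.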
TCP/IP
If you've read anything about the Internet that's deeper than a newsweekly's puff piece, you've probably heard of TCP/IP, or Transmission Control Protocol/Internet Protocol. TCP/IP is the protocol that carries data traffic over the Internet. Of all the network protocols in the marketplace, TCP/IP is far and away the most popular.
The reasons for TCP/IP's success, however, do not stem from the popularity of the Internet. Even before the current Internet boom, TCP/IP was gaining popularity among business networkers, college computer-science majors, and scientific organizations. TCP/IP has gained popularity because it is an open standard: no single company controls it. Instead, TCP/IP is part of a set of standards created by a body called the Internet Engineering Task Force (IETF). IETF standards are created by committees and are submitted to the networking community through a set of documents called Requests for Comments (RFCs).
RFCs are draft documents freely available on the Internet that explain a standard to the networking community. All RFCs are considered "draft" documents because any document can be superseded by a newer RFC. The reason for this focus on RFCs is that they form a large part of the basis for the various standards that make up Internet networking today, including TCP/IP.
If you're interested in examining an RFC to see how standards are defined, just do a Google search for "RFC1918" (a common RFC referring to private networks); it will generally lead you to a list of other RFCs. They aren't always easy reading, but if you understand the fundamentals, they offer a tremendous amount of information. And who knows? You might see a way they can be improved; there's no reason why you can't suggest changes if you see the need.
TCP/IP Defined
But what exactly is TCP/IP? It is many things. For one thing, the name TCP/IP is a bit misleading: TCP/IP is just shorthand notation for a full protocol suite, or set of protocols that have standard ways of interacting with each other. TCP and IP share the name of the whole protocol suite because they form the bedrock of the whole protocol suite; they are respectively the transport (OSI Layer 4, which regulates traffic) and the network (OSI Layer 3, which handles addressing) layers of the TCP/IP protocol suite. The suite includes, but is by no means limited to, the ways of transmitting data across networks listed in Table 3.1.
Table 3.1 Some TCP/IP Suite Members and Their Functions
Name | Function
TCP | Transmission Control Protocol. Ensures that connections are made and maintained between computers.
IP | Internet Protocol. Handles software computer addresses.
ARP | Address Resolution Protocol. Relates IP addresses with hardware (MAC) addresses. RARP, or Reverse ARP, does the opposite.
RIP | Routing Information Protocol. Chooses the route with the fewest "hops" between routers. Allows a maximum of 15 hops; a destination 16 hops away is treated as unreachable. Good for smallish networks, not so good on the Internet.
OSPF | Open Shortest Path First. A link-state routing protocol that improves on RIP's speed and reliability. It imposes no hop-count limit, so it scales to much larger networks than RIP.
ICMP | Internet Control Message Protocol. Handles errors and sends error messages for TCP/IP. The most common use of this is the ping command, which is used to determine whether one network device can communicate with another network device.
BGP/EGP | Border Gateway Protocol/Exterior Gateway Protocol. Handles how data is passed between networks. Used at the edge of networks.
SNMP | Simple Network Management Protocol. Allows network administrators to connect to and manage network devices.
PPP | Point-to-Point Protocol. Provides for dial-up networked connections to networks. PPP is commonly used by Internet service providers as the dial-up protocol customers use to connect to their networks.
SMTP | Simple Mail Transfer Protocol. How email is passed between servers on a TCP/IP network.
POP3/IMAP4 | Post Office Protocol version 3/Internet Message Access Protocol version 4. Both set up ways for clients to connect to servers and collect email.
As you can see, there are quite a few pieces in the TCP/IP protocol suite, and this is just the beginning; there are a whole bunch more that we're not going to discuss here. If you want to see the whole of TCP/IP in all its glory, read the RFCs. All these protocols are necessary at some point or another to ensure that data gets where it's supposed to be going. The pieces listed in Table 3.1 are standards at this point, but the process of defining standards is far from over.
In contrast to the OSI reference model's seven layers, TCP/IP uses only four layers, some of which amalgamate several OSI layer functions into one TCP/IP layer. Table 3.2 compares OSI and TCP/IP layers.
Table 3.2 Contrast Between TCP/IP and the OSI Model
OSI Layer | TCP/IP Layer | TCP/IP Applications and Protocols Running at This Level
7 (Application) | TCP/IP Layer 4 (Application) | FTP (File Transfer Protocol)
6 (Presentation) | | Telnet (terminal program)
4 (Transport) | TCP/IP Layer 3 (also called Host-to-Host; a host is any system running TCP/IP) | TCP (Transmission Control Protocol)
3 (Network) | TCP/IP Layer 2 (Internet) | IP (Internet Protocol)
2 (DataLink) | TCP/IP Layer 1 (Network Interface) | Hardware (network cards, cables, concentrators, and so on)
From this table, you can see that TCP/IP accomplishes the functions required in the OSI reference model.
IP Addresses
TCP/IP got its start as part of the UNIX operating system in the mid-1970s. Networkers who had previously relied on UUCP (UNIX-to-UNIX copy) to copy files and mail between computers decided that there had to be a better, more interactive way to network, and TCP/IP was born. Given the academic heritage of placing material in front of the academic community for critical review and discussion, it was a natural progression to include TCP/IP in the RFC process, where its standards have been set ever since.
The original specification for TCP/IP was open ended, or so the designers thought. They created an address space, or standard way of writing addresses, which set up 2 to the 32nd power addresses (4,294,967,296 separate addresses). In the days when TCP/IP was still young, the thought that four billion computers could exist was a bit of a stretch, especially because computers (even cheap ones) cost $5,000 to $10,000 each. However, with the increased popularity of the Internet, these IP addresses have been disappearing at a tremendous clip.
More Space!
Why are IP addresses so important? Well, in the postal-mail metaphor we've been using for our network, every person has a unique name and address. The Internet likewise requires unique names and addresses, and once the current IP address space of four billion-plus addresses is used up, there won't be any more addresses. That's why the next generation of Internet Protocol, called IPv6, is so important: it increases the number of addresses to such a great number that it will be a while before we're in danger of running out of addresses again.
IP addresses have disappeared so fast because of the way the addressing scheme is designed. All IP addresses are written in dotted decimal notation, with one byte (eight bits) between each dot. A dotted decimal IP address looks like this:
Because each number is described by one byte, and because each byte is 8 bits (or binary 1s and 0s), each number can have a value of anything from 0 to 255. Because there are 4 numbers with 8 bits each, the total address space is said to be 32 bits long (4*8 = 32). So the preceding address, in binary, looks like this:
There are 32 characters in the binary address, divided into four eight-bit groups, or octets. This is where the 32-bit title comes from: it's literally 32 binary ones and zeroes. T.S. Eliot wrote the famous Four Quartets; networkers have the not-so-famous, but undeniably more useful four octets in every IP address that exists.
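The relationship between dotted decimal notation and the four binary octets can be sketched in Python. The address 192.168.1.10 is an illustrative value chosen for this sketch, and the function name is hypothetical.

```python
def to_binary_octets(address: str) -> str:
    """Convert a dotted decimal IPv4 address to four 8-bit binary groups."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid dotted decimal address: {address!r}")
    # Format each octet as exactly 8 binary digits, padding with zeroes.
    return ".".join(f"{octet:08b}" for octet in octets)


example = "192.168.1.10"  # illustrative address, an assumption for this sketch
binary = to_binary_octets(example)
print(binary)                        # 11000000.10101000.00000001.00001010
print(len(binary.replace(".", "")))  # 32
```

Counting the digits without the dots gives 32, which is where the 32-bit address space gets its name.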
With a 32-bit address space that can handle four billion addresses, you might think that the Internet would never run out of IP addresses (or that it would take a while at any rate). Unfortunately, that's not the case. IP addresses are allocated to organizations that request them in what are called address blocks. Address blocks come in three sizes, based on the class of address. And once you've read about IP address allocation in the following sections, you'll agree that the present method of allocating IP addresses is inefficient given the way the Internet has grown.
Class A Addresses
A Class A network has up to 16,777,216 addresses, and very few (if any) Class A networks are left unassigned. The rightmost 24 of the 32 bits in the address space identify individual addresses within the network. A Class A address looks like this:
X.0.0.0
The number represented by the X is one fixed number from 1 to 126 (0 and 127 fall in the same range but are reserved). The first octet (that X again) in a Class A address always begins with binary 0. This number is used as the first number before the leftmost dot by all the IP addresses in a Class A address space.
The other three octets, or all the numbers represented by the 0s in the preceding example, can range from 0 to 255. Because three of the four available numbers are used to create unique IP addresses, and three-quarters of 32 is 24, a Class A network has a 24-bit address space. Collectively, Class A addresses use up 50 percent of the available addresses of the IPv4 address space, or 2,147,483,648 of the 4,294,967,296 total available addresses.
Class B Addresses
Class A addresses provide 16 million IP addresses per network. The next increment, Class B, has a total of 65,536 IP addresses per network. A Class B address looks like this:
X.X.0.0
All Class B addresses begin with a binary 10 in the first octet. Class B addresses compose 25 percent of the available IP address space. This means that Class B addresses account for 1,073,741,824 of the 4,294,967,296 available IP addresses.
The numbers represented by the Xs are fixed numbers ranging from 0 to 255. The numbers represented by the 0s (the other two octets) can range from 0 to 255. Because the two rightmost dotted numbers are used to create unique IP addresses, and because one-half of 32 is 16, a Class B network has a 16-bit address space.
Class C Addresses
The smallest increment of IP addresses available to an organization is Class C. In a Class C network, only the rightmost dotted decimal number can be used for a total of 256 IP addresses.
The first octet of all Class C addresses begins with a binary 110. Class C addresses compose 12.5 percent of the available IP address space. This means that Class C addresses account for 536,870,912 of the 4,294,967,296 available IP addresses.
Here's an example of a Class C address:
X.X.X.0
As with the Class A and B examples just presented, the numbers represented by the Xs are fixed numbers that range from 0 to 255; the rest of the octets, represented by the 0, can range from 0 to 255.
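The per-network sizes and address-space shares quoted for Classes A, B, and C follow directly from the number of fixed leading bits and the number of bits left for addresses within a network. Here is a minimal sketch of that arithmetic; the names are hypothetical and chosen only for this illustration.

```python
TOTAL_IPV4_ADDRESSES = 2 ** 32  # 4,294,967,296

# class name -> (fixed leading bits in the first octet, host bits per network)
CLASS_LAYOUT = {"A": (1, 24), "B": (2, 16), "C": (3, 8)}


def class_summary(name: str) -> tuple[int, int, float]:
    """Return (addresses per network, addresses in the class, percent of IPv4)."""
    leading_bits, host_bits = CLASS_LAYOUT[name]
    per_network = 2 ** host_bits
    # Each fixed leading bit halves the part of the address space available.
    class_total = TOTAL_IPV4_ADDRESSES // 2 ** leading_bits
    percent = 100 * class_total / TOTAL_IPV4_ADDRESSES
    return per_network, class_total, percent


for name in CLASS_LAYOUT:
    per_network, class_total, percent = class_summary(name)
    print(f"Class {name}: {per_network:,} per network, "
          f"{class_total:,} in total ({percent}% of the address space)")
```

Running it reproduces the figures in the text: 50 percent for Class A, 25 percent for Class B, and 12.5 percent for Class C.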
Other Network Classes
In addition to Classes A, B, and C, there are two other network classes:
Class D. The leftmost octet always begins with binary 1110. Class D addresses are used for multicasting, or sending messages to many systems at once. This isn't commonly used, but there are applications in which many computers need to receive the same data in order to provide redundant systems. Some 911 systems use multicast because it helps ensure that all systems receive all messages and thereby leads to greater uptime and redundant behavior (albeit at greater cost).
Class E. The leftmost octet always begins with binary 1111. Class E addresses are reserved for experimental purposes.
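Because each class is identified by the leading bits of the first octet, the class of any address can be worked out by inspecting those bits. The following is a minimal sketch; the function name and the sample addresses are assumptions made for illustration.

```python
def address_class(address: str) -> str:
    """Return the classful network class (A to E) of a dotted decimal address."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError(f"first octet out of range: {first_octet}")
    if first_octet >> 7 == 0b0:     # leading bit 0
        return "A"
    if first_octet >> 6 == 0b10:    # leading bits 10
        return "B"
    if first_octet >> 5 == 0b110:   # leading bits 110
        return "C"
    if first_octet >> 4 == 0b1110:  # leading bits 1110
        return "D"
    return "E"                      # leading bits 1111


for sample in ("10.1.2.3", "172.16.0.1", "192.168.1.1", "224.0.0.1", "240.0.0.1"):
    print(sample, "is Class", address_class(sample))
```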
Why IP Address Allocation Is Wasteful
Under the current 32-bit Internet address scheme, organizations must select a network class that will provide enough IP addresses for their needs.
The few remaining Class A addresses could potentially be assigned to organizations that need more than 65,536 (Class B-size) IP addresses, even if the organization doesn't require anywhere close to 16 million addresses.
Class B addresses are likewise assigned to organizations that require more than 256 IP addresses, whether or not they require anywhere near 65,536 addresses.
Class C addresses are, fortunately, available for small networks. However, keep in mind that if you take a full Class C, you have 256 addresses, even if you require only 20 addresses.
Fortunately, several solutions are on the horizon. The first is CIDR, or Classless Inter-Domain Routing, which enables several Class C addresses to be combined. As an example, using CIDR, if you need a thousand network addresses, you can get four 256-address Class Cs and combine them for a total of 1,024 addresses (256*4=1024), rather than tying up a whole Class B address of 65,536 addresses. CIDR, or supernetting, as it's been called, has become a means of efficiently allocating network addresses without wasting large chunks of class B address space.
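The supernetting arithmetic can be tried out with Python's standard `ipaddress` module. This sketch assumes four adjacent 256-address blocks starting at 192.168.0.0, chosen purely for illustration.

```python
import ipaddress

# Four adjacent 256-address (Class C-sized) blocks; illustrative values only.
class_c_blocks = [ipaddress.ip_network(f"192.168.{n}.0/24") for n in range(4)]

# collapse_addresses merges adjacent networks into the smallest covering set.
combined = list(ipaddress.collapse_addresses(class_c_blocks))

print(combined)                   # [IPv4Network('192.168.0.0/22')]
print(combined[0].num_addresses)  # 1024
```

The `/22` suffix means that the first 22 bits are fixed, leaving 10 bits (1,024 addresses) for the combined block.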
Also on the horizon (and getting closer, but not being implemented as fast as we'd like) is IPv6, or the next generation IP protocol. IPv6, in contrast to current IP (IPv4), has a 128-bit address space (versus a 32-bit address space for IPv4) and is laid out in a slightly different way than IPv4. The following listing compares an IPv4 address with an IPv6 address:
IPv4 Address: | X.X.X.X | Each X represents 8 bits, written in dotted decimal notation (0 through 255)
IPv6 Address: | x:x:x:x:x:x:x:x | Each x represents 16 bits, written in hex notation (0000 through FFFF)
Binary Versus Hex
IPv6 addresses are written in hexadecimal, or base-16, numbers. Hex is used because if each 16-bit number between the colons were written out in decimal, the address would be huge. (Remember that 16 bits can represent any number from 0 to 65,535.)
If you're not familiar with hex notation, don't worry. There's an old conundrum that will mnemonically help you remember how it works:
How does a programmer count to 16? Zero, One, Two, Three, Four, Five, Six, Seven, Eight, Nine, A, B, C, D, E, F.
By using hex notation, it's possible to represent an IPv6 number in something approaching comprehensibility: FEDC:BA98:7654:3210:FEDC:BA98:7654:3210. Fortunately, IPv6 also does a lot of self-configuration, so you won't have to worry about this.
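To see how the eight hex groups relate to 16-bit numbers, the sample address can be parsed with Python's standard `ipaddress` module. This is a small sketch for illustration, not something you need in order to configure IPv6.

```python
import ipaddress

addr = ipaddress.IPv6Address("FEDC:BA98:7654:3210:FEDC:BA98:7654:3210")

# .exploded writes out all eight groups in full (in lowercase hex).
groups = addr.exploded.split(":")
values = [int(group, 16) for group in groups]  # each group as a decimal number

print(len(groups))             # 8
print(values)                  # every value lies between 0 and 65,535
print(int(addr).bit_length())  # 128
```

Written in decimal, each group could need up to five digits, which is why the more compact hex form is used.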
IPv6 will essentially eradicate the address-space problem with IPv4. Recall that 32 bits represent an address space of over 4 billion different addresses. Now, let's extrapolate. If 32 bits is equal to 4,294,967,296 different addresses, we add one more bit (to make a total of 33 bits) and we have 8,589,934,592 addresses. Make it 34 bits, and you have 17,179,869,184 addresses.... Now keep doubling that number with each additional bit until you get to 128 bits, and you'll see that the number continues to get larger very quickly. To kill the suspense, I'll tell you that if you continue doubling the quantity represented by a string of bits until you hit 128 bits, you'll wind up with 340 billion billion billion billion (roughly 3.4 times 10 to the 38th power). This means that there would be 67 billion billion addresses for every square centimeter on the earth's surface. In other words, we're not going to run out of addresses anytime soon if we use IPv6.
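The doubling argument is easy to check with Python's arbitrary-size integers. The figure used here for the earth's surface area (about 510 million square kilometers) is an approximation assumed for this sketch, not a number taken from the text.

```python
IPV4_ADDRESSES = 2 ** 32
IPV6_ADDRESSES = 2 ** 128

print(f"{IPV4_ADDRESSES:,}")  # 4,294,967,296
print(f"{2 ** 33:,}")         # 8,589,934,592
print(f"{2 ** 34:,}")         # 17,179,869,184
print(f"{IPV6_ADDRESSES:.3e}")

# Assumption: the earth's surface is roughly 510 million square kilometers.
EARTH_SURFACE_KM2 = 510_000_000
CM2_PER_KM2 = 10 ** 10  # 1 km = 100,000 cm, so 1 km^2 = 10^10 cm^2

per_square_cm = IPV6_ADDRESSES / (EARTH_SURFACE_KM2 * CM2_PER_KM2)
print(f"{per_square_cm:.2e} addresses per square centimeter")
```

Under that assumption the result comes out near 6.7 times 10 to the 19th power, in line with the 67 billion billion quoted above.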
Currently, the most popular method of getting around the IP address squeeze is Network Address Translation, or NAT. With NAT, an organization doesn't need a lot of Internet addresses, but rather needs addresses for systems that must be accessed from the Internet. The NAT device translates internal addresses (which might be on private network spaces, discussed later) to Internet addresses, and acts as the Internet face of the internal network. If you have a cable or DSL router that uses the 192.168.1.0 network for the computers on the "inside" (non-Internet) side of the network, your router is doing NAT. Over time, NAT, in conjunction with firewalls for security, has proven to be the easiest way to stretch the limited address space of the Internet.
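The bookkeeping a NAT device performs can be sketched as a simple lookup table. This is a toy model under stated assumptions: the public address 203.0.113.5 is a made-up value from a range reserved for documentation, the class and method names are hypothetical, and the port rewriting shown here (which home routers typically rely on) is a detail the text above does not cover.

```python
import ipaddress

PUBLIC_ADDRESS = "203.0.113.5"  # hypothetical Internet-facing address
INSIDE_NETWORK = ipaddress.ip_network("192.168.1.0/24")


class ToyNat:
    """Map (inside address, inside port) pairs to ports on one public address."""

    def __init__(self, first_port: int = 40000) -> None:
        self._next_port = first_port
        self._outbound: dict[tuple[str, int], int] = {}
        self._inbound: dict[int, tuple[str, int]] = {}

    def translate_out(self, inside_ip: str, inside_port: int) -> tuple[str, int]:
        """Return the public (address, port) that stands in for an inside host."""
        if ipaddress.ip_address(inside_ip) not in INSIDE_NETWORK:
            raise ValueError(f"{inside_ip} is not on the inside network")
        key = (inside_ip, inside_port)
        if key not in self._outbound:
            # First time this inside pair is seen: allocate a new public port.
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return PUBLIC_ADDRESS, self._outbound[key]

    def translate_in(self, public_port: int) -> tuple[str, int]:
        """Find which inside host a reply arriving on public_port belongs to."""
        return self._inbound[public_port]


nat = ToyNat()
print(INSIDE_NETWORK.is_private)                # True
print(nat.translate_out("192.168.1.20", 5000))  # ('203.0.113.5', 40000)
print(nat.translate_in(40000))                  # ('192.168.1.20', 5000)
```

Many inside computers can share the single public address this way, which is how NAT stretches the limited IPv4 address space.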
IPv4 is currently the world's most popular protocol. It's the backbone of the Internet, and most large networks rely on its standardization, interoperability, and reliability. If you elect to run your network on it, there will initially be an added dimension of complexity. However, once your network is set up to use IP, it will be capable of talking to any other computer of any type, from a personal computer to a mainframe, that can speak TCP/IP. It's the universal solvent of networking.
IPX
Internetwork Packet Exchange (IPX) is Novell's answer to the complexity of IP. Novell designed IPX in the early 1980s before the current furor over IP and the Internet, and it shows. IPX is a relatively efficient protocol that does several things for which network administrators are duly grateful:
Unlike IP, IPX can configure its own address. This is very useful, particularly when there are a lot of systems to install.
IPX is a "chatty" protocol. That is, it advertises its presence on the network. This characteristic is okay on networks with finite boundaries because the bandwidth it uses is not too bad. On a huge network (a WAN, for example), the chatty nature of IPX can become quite troublesome because it can overwhelm low-bandwidth WAN connections.
On the whole, IPX is easy to install and simple to use. Unfortunately, it's not an open standard; it's controlled by Novell. In spite of its ease of use, even Novell has acknowledged that IPX will eventually bow out in favor of IP.
IPX has lost in the face of the IP onslaught. The only network operating system that continues to use IPX is Novell's NetWare, and even NetWare now supports IP natively.
NetBIOS and NetBEUI
Network Basic Input/Output System (NetBIOS) and Network BIOS Extended User Interface (NetBEUI) are single-site network protocols. NetBEUI is based on a way of passing data called server message block (SMB), which relies on computer names to resolve destination addresses.
Although NetBIOS is still an integral part of Windows operating systems, NetBEUI has really fallen by the wayside, and Microsoft no longer supports this network protocol in the newer releases of its desktop OS and NOS (Windows XP and Windows Server 2003, respectively). You can, however, still implement NetBEUI (although it isn't supported) by downloading the protocol from Microsoft. NetBEUI is not the most secure of the network protocols. For small LANs, even Windows peer-to-peer LANs that don't require Internet access, you can implement NWLink (IPX/SPX) or use a set of private IP addresses. (For information about configuring TCP/IP, see Hour 14, "TCP/IP.")