Dissecting the Various Packets
The Internet Protocol offers several packet protocols that range from very fast to very reliable. All of them rest on the lowest layer: the basic IP packet. However, each layer has evolved to solve specific problems. To select the correct packet type, you must know what you're transmitting.
The packet types most likely to be of interest are TCP, UDP, ICMP, and raw. Knowing the advantages and disadvantages of each type can help you choose the most appropriate for your application. Each packet type has different benefits, as summarized in Table 3.2.
Table 3.2 Packet Type Benefits
| | Raw | ICMP | UDP | TCP |
|---|---|---|---|---|
| Overhead (bytes) | 20-60 | 20-60+[4] | 20-60+[8] | 20-60+[20-60] |
| Message Size (bytes) | 65,535 | 65,535 | 65,535 | (unlimited) |
| Reliability | Low | Low | Low | High |
| Message Type | Datagram | Datagram | Datagram | Stream |
| Throughput | High | High | Medium | Low |
| Data Integrity | Low | Low | Medium | High |
| Fragmentation | Yes | Yes | Yes | Low |
Notice that every entry in this table is a comparison among the packet types. A reliability value of Low only means that you cannot rely on the protocol to provide reliability. While the differences may seem extreme, remember that they are merely comparisons.
Considering the Packet's Issues
Each protocol addresses issues in the transmission. The following sections define each issue and associated category from Table 3.2. This information can help you see why certain protocols implement some features and skip others.
Protocol Overhead
Protocol overhead includes both the header size in bytes and the amount of interaction the protocol requires. High packet overhead can reduce throughput, because the network has to spend more time moving headers and less time moving data.
Strong protocol synchronization and handshaking increase interaction overhead. This is more expensive on WANs because of the propagation delays. Table 3.2 does not include this measurement.
Protocol Message Size
To calculate network throughput, you need to know the packet size and the protocol's overhead. The transmission size gives you the maximum size of a sent message. Since all but TCP use a single-shot message, this limitation is typically due to the size limit of an IP packet (65,535 bytes). The amount of data your program transmits per packet is the transmission size less the headers.
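The arithmetic behind these payload limits can be sketched in a few lines of C. This is only an illustration: the names max_payload() and IP_MAX_PACKET are hypothetical, and the header sizes you pass in are assumptions (20 bytes for an IP header without options, 4 bytes for the ICMP header as this chapter defines it, 8 bytes for UDP).

```c
#include <assert.h>
#include <stddef.h>

/* Largest IP packet: the total-length field in the IP header is 16 bits. */
#define IP_MAX_PACKET 65535u

/* Hypothetical helper: bytes left for your data after the IP header
   (20-60 bytes) and any higher-protocol header are subtracted. */
static size_t max_payload(size_t ip_header_len, size_t proto_header_len)
{
    assert(ip_header_len >= 20 && ip_header_len <= 60);
    assert(ip_header_len + proto_header_len <= IP_MAX_PACKET);
    return IP_MAX_PACKET - ip_header_len - proto_header_len;
}
```

With the minimum IP header, max_payload(20, 0) gives 65,515 bytes for a raw packet, max_payload(20, 4) gives 65,511 for ICMP, and max_payload(20, 8) gives 65,507 for UDP. Any IP options shrink the payload further.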
Protocol Reliability
Part of the problem with networks is the possibility of lost messages. A message could be corrupted or dropped as it moves from one host or router to another, or the host or router could crash or fail. In each case, a message may simply be lost, and your program may need to follow up.
Also, you may need to make sure that the destination processes the packets in the correct order. For example, you may compose a message that does not fit in one packet. If the second packet arrives before the first, the receiver must know how to recognize and correct the problem. However, the order is not important when each message is independent and self-contained.
The packet's reliability indicates the certainty of safe arrival of messages and their order. Low reliability means that the protocol can't guarantee that the packet gets to the destination or that the packets are in order.
Protocol Message Type
Some messages are self-contained and independent from other messages. Pictures, documents, email messages, and so on are a few examples that may fit the size of the packet. Others are more in the form of a flowing stream, such as Telnet sessions, HTTP's open channel [RFC2616], large documents, pictures, or files. The message type defines which style best fits each protocol.
HTTP's Protocol
HTTP 1.0 could effectively use UDP for transferring messages instead of TCP. The client simply sends the request for a specific document, and the server replies with the file. Effectively, no conversation occurs between client and server.
Protocol Throughput
The most noticeable aspect of data transmission is network throughput. Getting the most out of your network is the best way to make your users happy. To get the best performance, you need to know the throughput. Often, the bits-per-second is a small part of the whole equation; it only indicates how the network could perform under ideal circumstances.
The protocol throughput measures how much real data the originator can send to the destination within a period of time. If the headers are large and the data small, the result is low throughput. Requiring acknowledgment for each message dramatically reduces throughput. By default, high reliability and integrity result in low throughput and vice versa.
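To see how header size alone affects throughput, consider the fraction of each packet that carries real data. The sketch below is a rough model only: payload_efficiency() is a hypothetical name, and the calculation deliberately ignores acknowledgments, retransmissions, and link-level framing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fraction of each packet's bytes that is your data.
   It ignores acknowledgments, retransmissions, and link-level framing. */
static double payload_efficiency(size_t data_bytes, size_t header_bytes)
{
    assert(data_bytes + header_bytes > 0);
    return (double)data_bytes / (double)(data_bytes + header_bytes);
}
```

Under this model, a single byte of data behind 40 bytes of headers yields 1/41, or roughly 2.4 percent, while 1,000 bytes of data behind the same headers yields roughly 96 percent.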
Protocol Data Integrity
Current networking technology has many safeguards for data integrity. Some network interfaces include a checksum or cyclic redundancy check (CRC) for each low-level message. They also include special hardware technology that can filter out noise and get to the real message. Additionally, each protocol includes measures to detect errors in the data. These errors may or may not be important to you.
The importance of data integrity depends on the data; that is, some data requires very careful oversight, while less important data is less critical. Here are some types of data:
- Fault-Intolerant: Life-critical data. Anything that can affect public or private health/life. For example, life signs and vital signs from medical equipment and missile launch commands.
- Critical: Important and reliable data. Data that if out of sequence or faulty can cause harm to property or security. For example, financial transactions, credit cards, PIN numbers, digital signatures, electronic money, trade secrets, virus scanner updates, and product updates.
- Important: Data that requires proper functionality. Any loss can cause malfunction. For example, X11 connections, FTP downloads, Web pages, server/router addresses, and Telnet connections.
- Informational: Data that can be less than 100% reliable for proper functionality. For example, email, news feeds, advertisements, and Web pages.
- Temporal: Data that is date/time bound. Unless the program uses the information within a specific time, its importance lessens. For example, weather data, surveillance data, and time.
- Lossy: Data that can degrade without loss of usefulness. These are typically audio or visual. For example, movies, audio files, photos, and spam (of course).
Prior to choosing the packet type or protocol, try to categorize data according to this list. Also include the additional (or external) constraints of the program. These may be regulatory constraints as well.
Protocol Fragmentation
Large messages on slow networks can frustrate other users. All networks impose a maximum frame size so that large messages don't dominate the network. Keep in mind that the routing host may still carve up, or fragment, large messages that go through a constricted network.
Each protocol has a different likelihood of fragmentation. Since reassembling fragmented messages is part of IP, the reassembly may be transparent to the higher protocols. Certain circumstances, however, may require the packet's wholeness. This is particularly important for network performance. When a router carves up the packet into smaller chunks, it has to take the time to chop up the message, and the resulting packet overhead increases. If you block fragmentation, the network drops a packet that is too large and returns a message-too-big error to your program.
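One way to block fragmentation on Linux is the IP_MTU_DISCOVER socket option. The following is a minimal sketch under that assumption: the option and the IP_PMTUDISC_DO value are Linux-specific, and block_fragmentation() is a hypothetical helper name, not something defined elsewhere in this book.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Ask Linux to set the Don't Fragment bit on packets sent from sd.
   Returns 0 on success or -1 with errno set, like setsockopt(). */
static int block_fragmentation(int sd)
{
    int mode = IP_PMTUDISC_DO;   /* Linux-specific: always set DF */

    assert(sd >= 0);
    return setsockopt(sd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode));
}
```

After this call, a datagram that is larger than the path allows is expected to fail at send time with errno set to EMSGSIZE instead of being fragmented along the way.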
Packet Types
The following sections describe each packet, showing its statistics and header definition (if there is one). Each section uses a quick-reference style to help you quickly see the features of each protocol. Use this style to help you choose the right packet for your applications.
The Raw Packet
A raw packet has direct access to an IP packet and header. It is useful in writing special or custom protocols. Its attributes are listed in Table 3.3.
Table 3.3 Raw Packet Attributes
| Attribute | Value |
|---|---|
| Message Size (bytes) | 65,535 (65,515 max data payload) |
| Overhead (bytes) | 20-60 |
| Reliability | Low (network may drop or rearrange packets) |
| Message Type | Datagram |
| Throughput | High (low system overhead) |
| Data Integrity | Low (system does not validate message) |
| Fragmentation | Yes |
Linux provides the option to work with different layers in the Internet Protocol stack (refer to Chapter 5, "Understanding the Network Layering Model," for a complete definition of the layers and IP stack). The most basic TCP/IP message is the raw IP message. It has no information other than the most basic.
You can use the IP packet itself as the most basic layer on which to build your own custom protocols. Access the IP packet by selecting SOCK_RAW in the socket() system call. For security, you must have root privileges to run a raw socket program.
The raw socket lets you play with the guts of the IP packet. You can configure the socket to work on two levels of detail: data only or data and header manipulation. Data manipulation is like UDP data transfers but does not support ports. In contrast, header manipulation lets you set the header fields directly.
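The two levels of detail can be sketched as a single helper that opens a raw socket and optionally turns on header manipulation with the IP_HDRINCL option. Treat this as an illustration under stated assumptions: open_raw_socket() is a hypothetical name, the protocol number is whatever your custom protocol uses, and the program needs root privileges for the socket() call to succeed.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a raw socket for the given IP protocol number (for example,
   IPPROTO_ICMP). If with_header is nonzero, turn on IP_HDRINCL so the
   program supplies the IP header itself; otherwise the kernel builds it.
   Returns the descriptor, or -1 with errno set. */
static int open_raw_socket(int protocol, int with_header)
{
    int sd, on = 1, saved;

    assert(protocol > 0 && protocol <= 255);
    sd = socket(AF_INET, SOCK_RAW, protocol);
    if (sd < 0)
        return -1;
    if (with_header &&
        setsockopt(sd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
        saved = errno;           /* close() may overwrite errno */
        close(sd);
        errno = saved;
        return -1;
    }
    return sd;
}
```

Calling open_raw_socket(IPPROTO_ICMP, 0) selects data-only access, while a nonzero second argument selects data and header manipulation. Without sufficient privileges, the call is expected to return -1 with errno set to EPERM.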
Using this message has both advantages and disadvantages. As a datagram message, it offers no guarantees of arrival or data integrity. However, you can send and receive messages nearly at network speed. For more information on raw packet manipulation, see Chapter 18, "The Power of Raw Sockets."
IP Control and Error Messaging (ICMP)
The Internet Control Message Protocol (ICMP) is one of the layers built on top of the basic IP packet. All Internet-connected computers (hosts, clients, servers, and routers) use ICMP for control or error messages. Some user programs also employ this protocol, such as traceroute and ping. ICMP's attributes are listed in Table 3.4.
Table 3.4 ICMP's Attributes
| Attribute | Value |
|---|---|
| Message Size (bytes) | 65,535 (65,511 max data payload) |
| Overhead (bytes) | 24-64 |
| Reliability | Low (same as raw IP) |
| Message Type | Datagram |
| Throughput | High (same as raw IP) |
| Data Integrity | Low (same as raw IP) |
| Fragmentation | Yes (but unlikely) |
If you employ ICMP in your own program, you can reuse your socket to send messages to different hosts without reopening it. Send messages using the sendmsg() or sendto() system call (described in the next chapter). These calls require the address of the destination. With a single socket, you can send messages to as many peers as you want.
The advantages and disadvantages of an ICMP packet are essentially the same as raw IP (and other datagrams). However, the packet includes a checksum for data validation. Also, the likelihood that the network may fragment an ICMP packet is very small. The reason is the nature of ICMP messages: They are for statuses, errors, or control. The message is not going to be very large, so it may never require reassembly.
While you can use ICMP for your own messages, it is usually for error messages and control. All networking errors travel the network through an ICMP packet. The packet has a header that holds the error codes, and the data part may contain a more specific message describing the error.
Part of the IP protocol, ICMP gets an IP header and adds its own header. Listing 3.2 shows a definition of the structure.
Listing 3.2 ICMP Structure Definition
/************************************************************/
/*** ICMP structure definition                            ***/
/*** Formal definition in netinet/ip_icmp.h               ***/
/************************************************************/
typedef unsigned char ui8;
typedef unsigned short int ui16;

struct ICMP_header {
    ui8  type;        /* Error type */
    ui8  code;        /* Error code */
    ui16 checksum;    /* Message checksum */
    ui8  msg[];       /* Additional data description */
};
Type and code define what error occurred. msg can be any additional information to help detail what went wrong. For a complete list of types and codes, see Appendix A.
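ICMP fills its checksum field with the standard Internet checksum described in RFC 1071. A minimal sketch of that calculation follows; the function name inet_checksum() is hypothetical, and the code assumes the buffer holds the bytes exactly as they appear on the wire, with the checksum field set to zero before the sender computes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): the one's complement of the
   one's-complement sum of the data taken as 16-bit big-endian words.
   The result is in host order; store its high byte first in the packet.
   Assumes len fits an IP packet (at most 65,535 bytes). */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert((data != NULL || len == 0) && len <= 65535);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                       /* odd length: pad with a zero byte */
        sum += (uint32_t)data[i] << 8;
    while (sum >> 16)                  /* fold carries back into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

The receiver can run the same function over the whole message, checksum field included. A result of zero means the message passed the check.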
User Datagram Protocol (UDP)
The User Datagram Protocol (UDP) is used mostly for connectionless (independent messages) communications. It can send messages to different destinations without re-creating new sockets and is currently the most common connectionless protocol. UDP's attributes are listed in Table 3.5.
Table 3.5 UDP Attributes
| Attribute | Value |
|---|---|
| Message Size (bytes) | 65,535 (65,507 max data payload) |
| Overhead (bytes) | 28-68 |
| Reliability | Low |
| Message Type | One-shot |
| Throughput | Medium |
| Data Integrity | Medium |
| Fragmentation | Yes |
Each layer up the IP stack provides more focus on data and less on the network. UDP hides some of the details about error messages and how the kernel transmits messages. Also, it reassembles a fragmented message.
A message you send via UDP is like an email message: The destination, origin, and data are all the information it needs. The kernel takes the message and drops it on the network but does not verify its arrival. As with the ICMP packet, you can send to multiple destinations from a single socket, using different send system calls. However, without the verification, you can experience near-maximum throughput.
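Sending to several destinations through one UDP socket might look like the following sketch. The helper name send_to_all() is hypothetical, and the code assumes the caller has already filled in each IPv4 destination address.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send the same message to each address in dests[] through one UDP
   socket. Returns the number of destinations for which sendto() accepted
   the whole message; that says nothing about whether it arrived. */
static size_t send_to_all(int sd, const struct sockaddr_in *dests,
                          size_t count, const void *msg, size_t len)
{
    size_t i, accepted = 0;

    assert(sd >= 0 && (dests != NULL || count == 0));
    for (i = 0; i < count; i++) {
        ssize_t n = sendto(sd, msg, len, 0,
                           (const struct sockaddr *)&dests[i],
                           sizeof(dests[i]));
        if (n == (ssize_t)len)
            accepted++;
    }
    return accepted;
}
```

The return value only counts messages the kernel accepted for transmission. As the text notes, nothing here verifies arrival.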
Without arrival verification, you give up reliability. The network can lose packets or fragments and corrupt the message. Programs that use UDP either track the message themselves or don't care if something gets lost or corrupted. (Please note that, while datagrams are unreliable, it does not mean that something will go wrong. It just means that the protocol makes no guarantees.)
Of the different data types (previously defined), Informational, Temporal, and Lossy best fit the UDP services. The primary reason is their tolerance for loss. If your Web camera fails to update every client, the end user is unlikely to either notice or care. Another possible use is a correct time service. Because correct time is Temporal, a host may drop a couple of clock ticks without losing integrity.
UDP offers the advantage of high speed. Moreover, you can increase its reliability yourself in the following ways:
- Break up large packets. Take each message and divide it into portions and assign a number (such as 2 of 5). The peer on the other end reassembles the message. Bear in mind that more overhead and less sent data decrease throughput.
- Track each packet. Assign a unique number to each packet. Force the peer to acknowledge each packet because, without an acknowledgment, your program resends the last message. If the peer does not get an expected packet, it requests a resend with the last message number or sends a restart message.
- Add a checksum or CRC. Verify the data of each packet with a data summation. A CRC is more reliable than a checksum, but the checksum is easier to calculate. If the peer discovers that data is corrupted, it asks your program to resend the message.
- Use timeouts. You can assume that an expired timeout means failure. Your originator could retransmit the message, and your receiver could send a reminder to the sender.
The Critical and Important data types require the reliability found in TCP or better. Fault-Intolerant requires much more than any of these protocols offer. These outlined steps mimic the reliability of TCP.
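Three of the steps listed above (part numbering, packet tracking, and a checksum) can be combined in a small application-level header placed in front of each UDP payload. The sketch below is one possible layout, not a standard: the 10-byte header format and the names frame_pack(), frame_ok(), and sum16() are all hypothetical, and acknowledgments and timeouts are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HEADER_LEN 10   /* seq(4) + part(2) + total(2) + checksum(2) */

/* Simple 16-bit additive checksum over a byte range (easy to compute,
   but weaker than a CRC). */
static uint16_t sum16(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;

    while (n--)
        sum += *p++;
    return (uint16_t)(sum & 0xFFFF);
}

/* Build one frame in out[]: header fields in network byte order, then
   the data. part and total express "part 2 of 5". Returns the frame
   length, or 0 if out[] is too small. */
static size_t frame_pack(uint8_t *out, size_t out_size, uint32_t seq,
                         uint16_t part, uint16_t total,
                         const void *data, size_t len)
{
    uint16_t sum;

    assert(out != NULL && (data != NULL || len == 0));
    if (out_size < FRAME_HEADER_LEN || out_size - FRAME_HEADER_LEN < len)
        return 0;
    out[0] = (uint8_t)(seq >> 24);
    out[1] = (uint8_t)(seq >> 16);
    out[2] = (uint8_t)(seq >> 8);
    out[3] = (uint8_t)seq;
    out[4] = (uint8_t)(part >> 8);
    out[5] = (uint8_t)part;
    out[6] = (uint8_t)(total >> 8);
    out[7] = (uint8_t)total;
    out[8] = out[9] = 0;               /* checksum field is zero while summing */
    if (len > 0)
        memcpy(out + FRAME_HEADER_LEN, data, len);
    sum = sum16(out, FRAME_HEADER_LEN + len);
    out[8] = (uint8_t)(sum >> 8);
    out[9] = (uint8_t)(sum & 0xFF);
    return FRAME_HEADER_LEN + len;
}

/* Receiver side: returns 1 if the stored checksum matches, else 0. */
static int frame_ok(const uint8_t *frame, size_t frame_len)
{
    uint16_t stored, actual;

    if (frame == NULL || frame_len < FRAME_HEADER_LEN)
        return 0;
    stored = (uint16_t)((frame[8] << 8) | frame[9]);
    /* The sender summed the frame with bytes 8 and 9 set to zero, so
       remove their contribution before comparing. */
    actual = (uint16_t)(sum16(frame, frame_len) - frame[8] - frame[9]);
    return stored == actual;
}
```

A receiver that gets a frame for which frame_ok() returns 0, or that notices a gap in the sequence numbers, would ask the sender to retransmit. A sender would pair this with a timeout so that a missing acknowledgment also triggers a resend.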
UDP relies on IP's features and services. Each UDP datagram packet receives an IP and a UDP header. Listing 3.3 defines how the UDP structure appears.
Listing 3.3 UDP Structure Definition
/************************************************************/
/*** UDP (datagram) structure definition                  ***/
/*** (Formal definition in netinet/udp.h)                 ***/
/************************************************************/
typedef unsigned char ui8;
typedef unsigned short int ui16;

struct UDP_header {
    ui16 src_port;    /* Originator's port number */
    ui16 dst_port;    /* Destination's port number */
    ui16 length;      /* Message length */
    ui16 checksum;    /* Message checksum */
    ui8  data[];      /* Data message */
};
UDP creates a virtual network receptacle for each message in the form of ports. With the port, IP can rapidly shuffle the messages to the correct owner. Even if you don't define a port with bind(), the IP subsystem creates a temporary one for you from the ephemeral port list (see Chapter 2).
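You can observe the temporary port with getsockname(). The following sketch assumes an IPv4 socket; local_port() is a hypothetical helper name.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Return the local port (host byte order) currently assigned to sd,
   0 if none has been assigned yet, or -1 on error. */
static int local_port(int sd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    assert(sd >= 0);
    if (getsockname(sd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

On Linux, a UDP socket that never called bind() typically reports port 0 until its first sendto() or connect(). After that, local_port() returns the ephemeral port the kernel chose.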
Transmission Control Protocol (TCP)
Transmission Control Protocol (TCP) is the most common socket protocol used on the Internet. It can use read() and write() and requires re-creating a socket for each connection. TCP's attributes are listed in Table 3.6.
Table 3.6 TCP Attributes
| Attribute | Value |
|---|---|
| Message Size (bytes) | (unlimited) |
| Overhead (bytes) | 40-120 |
| Reliability | High (data receipt checked) |
| Message Type | Stream |
| Throughput | Low (compared to other protocols) |
| Data Integrity | High (includes checksums) |
| Fragmentation | Unlikely |
Taking reliability one step further requires ensuring that the destination gets the exact message the originator sent. UDP has the speed but does not have the reliability that many programs require. TCP solves the reliability problem.
The network, however, has several fundamental problems that make it unreliable. These problems are not limitations. Instead, they are inherent in the design of the network. To get reliable, streamable messages through the tangled Web, TCP/IP has to incorporate many of the ideas to improve reliability suggested in the section on UDP. The Internet has three hurdles: dynamic connections, data loss, and constricted paths, as discussed in the following sections.
Dynamic Connections
One host sends a message to another host. That message travels the networks, going through various routers and gateways. Each message sent may use a different path. Networking segments (connections between computers) often appear and disappear as servers come up and go down. The power of the Internet is its capability to adapt to these changes and route the information accordingly.
Adaptability is one of the driving forces behind the Internet. Your computer can make a request, and the network tries possible avenues to fill the order. Unfortunately, this advantage means that the path between your computer and the server or peer can change, lengthening and shortening the distance.
As the path lengthens, propagation times increase. This means that your program could send successive messages and many would arrive at different times, often out of order.
TCP ensures that the destination has correctly received the last message before it sends the next message. Compare this to a series of numbered messages (this is really how TCP works). Your program may send 10 messages in succession. TCP takes each message, attaches a unique number, and sends it off. The destination accepts the message and replies with an acknowledgment. Upon receiving the acknowledgment, TCP lets your program send the next message.
Sliding Window Protocol
TCP uses a better technique than the send/wait (or ACK/NACK) protocol, which is too slow for anyone's patience. Instead, it uses a sliding window: It gauges when and how often to reply with an ACK. Slower or dirtier connections may increase the acknowledge messages. Connections that are faster and lose less allow more messages to ship before expecting acknowledgments. A related technique, the Nagle algorithm, holds back small messages while earlier data is still unacknowledged so that they can be combined into fewer packets. You can disable it using socket options (see Chapter 9).
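The socket option that controls the Nagle algorithm is TCP_NODELAY. A minimal sketch of setting it follows; set_nagle_disabled() is a hypothetical wrapper name, and whether disabling the algorithm helps depends on your program's traffic pattern.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn the Nagle algorithm off (disable != 0) or back on (disable == 0)
   for a TCP socket. Returns 0 on success or -1 with errno set. */
static int set_nagle_disabled(int sd, int disable)
{
    int flag = disable ? 1 : 0;

    assert(sd >= 0);
    return setsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}
```

Programs that send many small, time-sensitive messages are the usual candidates for this option. Bulk transfers generally gain nothing from it.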
Data Loss
When the destination gets your message, it determines the integrity of the data. The data may travel along less-than-optimal communication paths that may drop or corrupt message bits. Remember that the network sends every message one bit at a time. TCP sends a checksum with the message to verify the data. TCP is the last layer that can detect and remedy corrupted data.
If the destination detects any errors, it sends back an error, requesting a retransmittal from your program. Likewise, if your computer does not get an acknowledgment within a specific amount of time, the TCP subsystem automatically resends the message without your program's intervention.
Constricted Paths
Going back to the single message sent to a particular host, suppose that the message is too long for the intervening segments. The problem that the packet encounters as it passes through the network is the different technologies and transmission carriers. Some networked computers permit lengthy packets; others place limits on the size.
UDP tries to send the largest message that it can. This can be a problem with the constricted data paths. The IP algorithms anticipate that the routers may fragment data. Likewise, IP expects that it has to reassemble the incoming message.
TCP, on the other hand, limits every packet to small chunks. TCP breaks up longer messages, before the network has the chance to touch them. The size TCP chooses is one that a majority of networks can accept intact. By default, TCP uses 536 bytes and typically negotiates up to 1,500. To increase that size manually, set the MSS (maximum segment size) TCP socket option (see Chapter 9).
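To get a feel for what the segment size means for a long message, the following sketch counts how many segments a message needs. The helper name segment_count() is hypothetical, and the calculation counts data bytes only, ignoring headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of segments needed to carry len bytes of
   data when each segment holds at most mss bytes (rounds up). */
static size_t segment_count(size_t len, size_t mss)
{
    assert(mss > 0);
    return len / mss + (len % mss != 0);
}
```

A 10,000-byte message needs 19 segments at 536 bytes each but only 7 at 1,500 bytes each. Every extra segment carries its own set of headers.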
The receiver may find that the message's packets are out of order. TCP reorders them before handing the message to your program.
Solving all these network problems adds protocol and header overhead to TCP's algorithm. Of course, the added overhead of all TCP's techniques slows performance noticeably.
The TCP Header Definition
TCP had to add a lot of information to its header to support all the features that it offers you. The size, in bytes, of the TCP header is about three times that of the UDP header. See Listing 3.4 for a definition of the structure.
Listing 3.4 TCP Structure Definition
/************************************************************/
/*** TCP (streaming socket) structure definition          ***/
/*** (Formal definition in netinet/tcp.h)                 ***/
/************************************************************/
typedef unsigned char ui8;
typedef unsigned short int ui16;
typedef unsigned int ui32;
typedef unsigned int uint;

struct TCP_header {
    ui16 src_port;      /* Originator's port number */
    ui16 dst_port;      /* Destination's port number */
    ui32 seq_num;       /* Sequence number */
    ui32 ack_num;       /* Acknowledgment number */
    uint data_off:4;    /* Data offset */
    uint __res:6;       /* (reserved) */
    uint urg_flag:1;    /* Urgent, out-of-band message */
    uint ack_flag:1;    /* Acknowledgment field valid */
    uint psh_flag:1;    /* Immediately push message to process */
    uint rst_flag:1;    /* Reset connection due to errors */
    uint syn_flag:1;    /* Open virtual connection (pipe) */
    uint fin_flag:1;    /* Close connection */
    ui16 window;        /* How many bytes receiver allows */
    ui16 checksum;      /* Message checksum */
    ui16 urg_pos;       /* Last byte of an urgent message */
    ui8  options[];     /* TCP options, then padding (needed for */
                        /* aligning the data), then the data message */
};
The header may have a variable size, so the data_off field points to the beginning of the data. To save header space, this field acts like the IP's header_len field: It gives the count of 32-bit words that physically precede your data.
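Converting the data offset into a byte count is a shift and a multiplication. The sketch below reads it from the raw bytes of a segment rather than through the bitfields in the listing, because bitfield layout depends on the compiler; tcp_header_len() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given the first bytes of a TCP segment as they arrived from the
   network, return the header length in bytes, or 0 if the buffer is too
   short or the offset is invalid. The data offset is the upper 4 bits of
   byte 12 and counts 32-bit words. */
static size_t tcp_header_len(const uint8_t *segment, size_t len)
{
    size_t words;

    assert(segment != NULL);
    if (len < 20)                 /* minimum TCP header */
        return 0;
    words = segment[12] >> 4;
    if (words < 5 || words * 4 > len)
        return 0;
    return words * 4;             /* data starts at segment + this */
}
```

A data offset of 5 means a 20-byte header with no options, and the maximum value of 15 means a 60-byte header.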
TCP uses some of the fields exclusively for opening a connection, flow control, and connection closure. During a communication session, some of the header is empty. The following paragraphs describe a few interesting fields.
The TCP header uses the same port numbers found in UDP. But seq_num and ack_num provide traceability to the stream. When you send a message, the TCP subsystem attaches a sequence number (seq_num). The receiver replies that it got the message with an acknowledgment number (ack_num) that names the next byte it expects: the sequence number plus the number of bytes received (a SYN or FIN counts as one byte). Because the acknowledgment is just a header field, acknowledgment packets can carry data as well.
Looking at the TCP Interactions
When you open a streaming connection, your program and server exchange the messages listed in Table 3.7.
Table 3.7 The Three-Way Handshake
| Client Sends | Server Sends | Description |
|---|---|---|
| SYN=1 (syn_flag), ACK=0 (ack_flag) | | Request a virtual connection (pipe). Set sequence number. |
| | SYN=1 (syn_flag), ACK=1 (ack_flag) | Permit and acknowledge a virtual connection. |
| SYN=0 (syn_flag), ACK=1 (ack_flag) | | Establish a virtual connection. |
This is called the three-way handshake. During the transfers, the client and server specify the size of their receiving buffers (windows).
On the other hand, closing a connection is not as simple as it may appear, because there may be data in transit. When your client closes a connection, the interaction shown in Table 3.8 may occur.
Table 3.8 TCP Connection Closure
| Client | Server | Description |
|---|---|---|
| FIN=1 (fin_flag) | Transmits data / Receives data | Client requests close. |
| ACK=1 | Transmits more / Receives more | Server channels flushed. |
| ACK=1 | FIN=1 | Close accepted. Server closes and awaits client ACK. |
| ACK=1 | | Client closes its side. |
Closing the TCP connection makes it impossible to reuse the socket for other connections. For example, if you connect to a server, the only way to sever the connection is to close the channel, which closes the socket as well. If you then want to connect to another server, you must create a new socket. The other protocols do not have that limitation.