Compression Techniques
Various types of compression algorithms are in use in the world today.
For compression, a scope needs to be set ahead of time. There are compression methods for data (entire files), links (data that travels between routers), hard drives (data stored on a hard drive), and so on. This section of the chapter focuses on compression across WAN links (links).
It is also true that too much compression is a bad thing. If data is already compressed when WAN links begin to process it, the ability of the router to further compress that data is affected. Data that is already compressed can actually become larger by recompressing it. The discussion in this chapter focuses on what happens at the WAN interface, regardless of the type of data being transported.
Compression is only one technique for squeezing every possible bit of bandwidth from an existing internetwork deployment. Compression, like queuing, is meant to provide critical time to plan and deploy network upgrades and to reduce overall utilization of a WAN link. However, nothing is free. The execution of the compression algorithm adds a significant number of CPU cycles. Unfortunately, the additional load on the CPU might not be something it can handle.
With compression enabled, CPU utilization of the router increases considerably. On the bright side, the WAN link utilization drops considerably. Thus, compression is a trade-off, because all that has been accomplished is the displacement of utilization from the WAN to the router. Obviously, the effects of compression vary based on the algorithm implemented.
As technology advances, compression will move from a software function to a hardware function. This is already a reality in some router models, with the addition of newly available modules specifically geared toward performing data compression in hardware. Not only is this much faster than software compression, it is less costly for the CPU (generally).
The effects of compression must be taken into account prior to any implementation. If the routers are already running at 80 percent or more CPU utilization (show cpu process command), it is not a good idea to implement compression. Doing so can result in the router literally running out of CPU to do any further processing.
Data compression makes efficient use of bandwidth and increases WAN throughput by reducing the size of the frame being transported. Compression is best utilized on slower WAN links. There comes a point when the router can send the data faster if it is uncompressed than if the router compresses the data, sends it, and then decompresses the data at the other end.
Cisco supports a number of compression types:
Link
Payload
TCP Header
Microsoft Point-to-Point
Microsoft Point-to-Point compression, which is an algorithm, is beyond the scope of this book and thus is not discussed further.
Figure 15-8 illustrates where various types of compression make transmission more efficient.
Figure 15-8 Compression Methods
Link Compression
Link compression (also known as per-interface compression) compresses the entire frame (Layer 2 through Layer 7) as it leaves the interface. In other words, both the header and the data are compressed. Because the entire frame is compressed, it must be uncompressed immediately upon receipt at the other end of the link so that the header information can be used to further forward the frame. Thus, a link-compressed frame is only compressed for the trip across one link.
Link compression is not dependent on any particular protocol function. Every frame that leaves the interface that is configured for compression is reduced in sizeno questions asked, no exceptions to the rule. Cisco supports two algorithms on its router chassis to compress traffic: Stac and Predictor. For HDLC links, Stac is the only available choice.
Because the header and data are unusable after being compressed, link compression should be used only for data transmission over point-to-point dedicated connections.
It is also important to remember that if compression is used on one side of a link, compression must also be enabled on the other side, and the same compression algorithm must be used at both ends. Compression can be compared to a language. If English is spoken on one end, then English must be present on the other end to decipher the communication.
Stac
Stac is based on an algorithm known as Lempel-Ziv (LZ), which searches the data stream for redundant strings and replaces them with a token. The token is an information pointer that is significantly shorter than the string it replaces. If LZ cannot find any duplicated strings in the data, no compression occurs and transmission occurs as if the link had no compression activated. Stac is an open-standard compression algorithm and is used by many different vendors.
There are cases, such as the sending of encrypted data or data that has already been compressed, in which compression actually expands the size of a transmission. In such cases, the original transmission is sent untouched. The Stac compression algorithm tends to be quite CPU intensive and should not be implemented on routers with an already high CPU utilization. Stac might also be a poor selection on smaller routers that do not have much CPU to begin with.
Predictor
The Predictor compression method is rightly named. This Cisco-proprietary algorithm attempts to predict the incoming character sequences by implementing an indexing system that is based on a compression dictionary. It is essentially a code book that is based on possible data sequences. If a character string is found that matches an entry in the dictionary, the string is replaced with the dictionary entry. That entry is a much shorter sequence of characters. At the remote end, the incoming characters are compared to the data dictionary once again to be decoded. The incoming dictionary strings are replaced with the appropriate (and original) information.
The Predictor compression method is like sign language. Rather than spelling out each individual word (no compression), a single hand motion utilizes an entire word or concept (compression). Because both parties understand the hand motions, successful communication occurs. Conversely, when one of the people involved in the communication does not understand sign language, communication does not occur.
Predictor-like algorithms are used in some voice-compression standards. For example, G.729 and G.729a (CSA-CELP) implementations compress a 64-kbps voice stream into an 8-kbps data stream. These implementations are directly based on the code book/dictionary prediction methodology. One big difference between Predictor (link compression) and voice-compression algorithms (payload compression) is that the voice-compression routines compress only the voice data (payload), not the entire frame.
Remember that Stac is CPU intensive. Predictor, on the other hand, tends to be extremely memory intensive. If the router has not been outfitted with a good amount of RAM, Predictor should not be implemented. However, if RAM is plentiful, Predictor is a compression consideration that can be beneficial.
Payload Compression
Payload compression is exactly what its name implies. Also known as per-VC compression, payload compression compresses only the data portion of the transmission. All L2 headers are left intact.
It cannot be assumed that customer WAN links are all dedicated point-to-point connections (PPP or HDLC). For such circuits, link compression can be used because the provider's WAN switches do not examine any portion of the data being transmitted.
However, payload compression is needed if the WAN switches must examine the data sent from a customer location. WAN technologies such as Frame Relay require that the L2 header information be untouched so that the provider's switches can read it and make forwarding decisions based on it. Any implementation of VCs disallows link compression. In these cases, payload compression is appropriate.
TCP Header Compression
RFC 1144 defines the Van Jacobson algorithm. In doing so, it also defines the algorithm for TCP/IP header compression. The 20-byte IP header and the 20-byte TCP header combination (a total of 40 bytes) is compressed to 2 or 4 (typically 4) bytes to reduce overhead across the network. The L2 header remains intact so that it can be utilized by the appropriate L2 transport.
This type of compression is most beneficial when used with implementations that transmit small packets, such as Voice over IP, Telnet, and so forth. This type of compression can be done on just about any WAN implementation (X.25, Frame Relay, ISDN, and so on).
TCP header compression, as suggested in the name, compresses only the TCP and IP headers. If the data payload is 1000 bytes, then the total package (excluding the L2 frame) would be 1000 + 40 (data + TCP + IP) bytes. TCP header compression would bring this down to 1000 + 2 bytes. However, if the payload is 20 bytes, then 20 + 40 becomes 20 + 2 (a big improvement). Table 15-4 summarizes the benefits. This table does not consider the L2 headers.
Table 15-4 TCP Header Compression
Data |
TCP Header |
IP Header |
Total Without Compression |
% Overhead |
Total With Compression |
% Overhead |
1000 |
20 |
20 |
1040 |
3.8% |
1002 |
.2% |
20 |
20 |
20 |
60 |
66.7% |
22 |
9.1% |
Note that for packets with larger data portions, the compression provides noticeable improvement (3.8 percent overhead compared to .2 percent). However, for smaller data portions, the improvement is dramatic (66.7 percent overhead compared to 9.1 percent).
As with other forms of compression, TCP header compression must be configured on both ends of the link to ensure a connection. Because the L2 headers are not compressed, TCP header compression can be used on any serial interface, and across WAN clouds that must be able to read the L2 headers during transit.
If any form of compression is used in a Cisco router, the exact same form of compression must be implemented on the other end of the link. Failure to apply the same compression algorithm on each causes all data to fail across the link.
Compression Issues
Compression is not a feature that is simply turned on or off. When selecting the algorithm that is to be utilized for a particular deployment, you should consider the following:
Modem compressionSome modems implement compression. Modems that use MNP5 and V.42bis are not compatible. Although each offers 2 and 4 times compression, they cannot communicate with each other. If you use modem compression, make sure that the modems at both ends of the connection are using a common protocol. If compression is being performed by the modem, do not attempt to configure compression at the router level.
***no bullet***
If modem compression is successfully enabled, then data compression (from the router for example) should not be enabled. Remember that compressing a compressed string might increase the size. Conversely, if compression is performed on the router, then the modem should not attempt any further compression.
Data encryptionEncryption occurs at the network layer where compression is an L2 function. The main purpose of encryption is security. Encryption removes common patterns in data streams. In other words, when Stac tries to find redundant strings, there are none. When Predictor looks into the dictionary for common patterns, there are none. Therefore, the compression is unsuccessful and can actually expand the traffic it was attempting to compress. In such a case, the traffic is sent uncompressed.
CPU and memorySome algorithms are memory intensive and some are CPU intensive. Thus, before you plan or implement compression, you must know the physical configuration of your router (that is, its RAM and CPU) before ordering additional hardware.
Configuring Compression
To configure compression, there are several commands. Most are technology-specific and fairly intuitive. The compress configuration command is used at the interface level (normally a slow serial interface) to select the link-compression algorithm. Remember to configure the same compression type on both ends of the point-to-point link.
Router(config-if)# compress [predictor | stac | mppc]
For Frame Relay connections, use the frame-relay payload-compress interface-level configuration command to enable Stac compression on an interface or a subinterface (payload compression). There are no additional configuration parameters for use with this command, as shown by the following command structure:
Router(config-if)# frame-relay payload-compress
To enable TCP header compression for a given interface, use the ip tcp header-compression command. The command structure is as follows:
Router(config-if)# ip tcp header-compression [passive]
The passive keyword at the end of the command specifies that compression be performed only if packets received on that interface are compressed on arrival.