- TCP Tuning Domains
- TCP State Model
- TCP Congestion Control and Flow Control Sliding Windows
- TCP and RDMA Future Data Center Transport Protocols
TCP and RDMA Future Data Center Transport Protocols
TCP is ideally suited for reliable end-to-end communications over disparate distances. However, it is less than ideal for intra-data center networking, primarily because its overly conservative reliability processing drains CPU and memory resources, which impacts performance. During the last few years, networks have grown faster in speed while dropping in cost. This implies that the computing systems are now the bottleneck, not the network, which was not the case prior to the mid-1990s. Multi-gigabit network speeds have created two issues:
- Interrupts generated to the CPU: The CPU must be fast enough to service all incoming interrupts to prevent losing any packets. Multi-CPU machines can be used to scale, but the PCI bus then introduces limitations of its own. It turns out that the real bottleneck is memory.
- Memory speed: An incoming packet must be copied from the NIC into the operating system's kernel address space and then into the user address space. You can reduce the number of memory-to-memory copies, approaching zero-copy TCP, by using techniques such as page flipping, direct data placement, and scatter-gather I/O (see the sketch following this list). However, as we approach 10-gigabit Ethernet interfaces, memory speed continues to be a source of performance issues. The main problem is that over the last few years memory densities have increased, but memory speed has not. Dynamic random access memory (DRAM) is cheap but slow. Static random access memory (SRAM) is fast but expensive. New technologies such as reduced latency DRAM (RLDRAM) show promise, but these gains seem to be dwarfed by the increases in network speeds.
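The following minimal C sketch illustrates one of the copy-reduction techniques named above, scatter-gather I/O: writev() hands the kernel an ordered list of buffers in a single call, so the application does not have to coalesce a protocol header and its payload into one contiguous buffer first. The function name and the assumption of an already-connected TCP socket are illustrative, not taken from the text.

```c
/* Sketch of scatter-gather I/O with writev(): the kernel gathers the
 * header and payload buffers in one system call, avoiding an extra
 * application-level copy into a single contiguous buffer.
 * Assumes `sock` is an already-connected TCP socket descriptor. */
#include <sys/uio.h>

ssize_t send_message(int sock, const void *hdr, size_t hdr_len,
                     const void *payload, size_t payload_len)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;      /* protocol header */
    iov[0].iov_len  = hdr_len;
    iov[1].iov_base = (void *)payload;  /* application data */
    iov[1].iov_len  = payload_len;

    /* One system call transmits both buffers, in order. */
    return writev(sock, iov, 2);
}
```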
To address this concern, there have been some innovative approaches to increasing speed and reducing network protocol processing latencies in the areas of remote direct memory access (RDMA) and InfiniBand. Startup companies such as Topspin are developing high-speed server interconnect switches based on InfiniBand, along with network cards whose drivers and libraries support RDMA, the Direct Access Programming Library (DAPL), and the Sockets Direct Protocol (SDP). TCP was originally designed for systems where the networks were relatively slow compared to the CPU processing power. As networks grew at a faster rate than CPUs, TCP processing became a bottleneck. RDMA removes some of that latency.
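As a rough illustration of the RDMA programming model, the sketch below registers a buffer with an RDMA-capable NIC using the libibverbs API. The choice of libibverbs is an assumption for illustration only; the text above does not name a specific library. Once a buffer is registered, the NIC can DMA directly into or out of it, and a peer holding the returned rkey can read or write it without involving the host CPU in per-byte copies. Connection setup, work-request posting, and most error handling are omitted.

```c
/* Hedged sketch: register a buffer for RDMA access with libibverbs
 * (illustrative library choice). Connection setup and the actual RDMA
 * read/write work requests are omitted for brevity. */
#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int num;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* Buffer the remote side will be allowed to access directly. */
    size_t len = 4096;
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_WRITE |
                                   IBV_ACCESS_REMOTE_READ);

    /* A peer needs this rkey (plus the buffer address) to target the
     * buffer in its RDMA read/write work requests. */
    printf("registered %zu bytes, lkey=0x%x rkey=0x%x\n",
           len, mr->lkey, mr->rkey);

    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```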
FIGURE 3-14 shows the difference between the current network stack and the new-generation stack. The main bottleneck in the traditional TCP stack is the number of memory copies. A DRAM access takes approximately 50 ns to set up and then 9 ns for each subsequent 64-bit read or write cycle. This is orders of magnitude longer than the CPU processing cycle time, so we can neglect the TCP processing time. Saving one memory access on every 64 bits results in huge savings in message transfers. InfiniBand is well suited for data center local networking architectures, as both sides must support the same RDMA technology.
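A back-of-the-envelope calculation makes the copy cost concrete. The sketch below uses the DRAM figures quoted above (50 ns setup, 9 ns per 64-bit cycle); the 1500-byte packet size and the two-copy versus one-copy comparison are illustrative assumptions, not figures from the text.

```c
/* Rough cost model for copying a packet through DRAM, using the figures
 * quoted above: ~50 ns access setup plus ~9 ns per 64-bit cycle.
 * The 1500-byte packet size and the copy-count comparison are assumptions. */
#include <stdio.h>

int main(void)
{
    const double setup_ns    = 50.0;  /* DRAM access setup time        */
    const double per_word_ns = 9.0;   /* each subsequent 64-bit cycle  */
    const int    packet_bytes = 1500;
    const int    words = packet_bytes / 8;  /* 64-bit words per packet */

    /* One copy = read every word from the source buffer plus write every
     * word to the destination buffer, each with its own setup. */
    double one_copy_ns  = 2.0 * (setup_ns + words * per_word_ns);

    /* Traditional stack: NIC -> kernel buffer -> user buffer (two copies).
     * Zero-copy / RDMA-style path: a single placement into user memory. */
    double two_copies_ns = 2.0 * one_copy_ns;

    printf("one copy  : %.0f ns per packet\n", one_copy_ns);
    printf("two copies: %.0f ns per packet\n", two_copies_ns);
    printf("saved by eliminating one copy: %.0f ns per packet\n",
           two_copies_ns - one_copy_ns);
    return 0;
}
```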
FIGURE 3-14 Increased Performance of InfiniBand/RDMA Stack