Bulk Transfer Traffic Performance
This section investigates the performance of bulk transfer traffic using the Sun GigaSwift Ethernet MMF adapter (driver name ce). The goal is to achieve maximum throughput while reducing the overhead associated with data transfer.
Experiment Setup
Before the data is discussed, let's look at the hardware and software environment for this study. The system under test (SUT) is a domain of a Sun Fire 6800 midframe server. It is configured as follows:
CPU Power: Eight 900 MHz UltraSPARC III+ processors
Memory: Eight gigabytes
Operating System: Solaris 8 OE Release 02/02
Network Interface: Sun GigaSwift Ethernet MMF adapter using a 66 MHz PCI interface
Five client machines, equipped as follows, drive the workload:
Client 1: A second domain of the Sun Fire 6800 server with eight 900 MHz UltraSPARC III+ processors, using Sun GigaSwift Ethernet MMF adapter hardware
Client 2: A Sun Enterprise 6500 server with twelve 400 MHz UltraSPARC II processors, using Sun Gigabit Ethernet P2.0 adapter hardware
Client 3: A Sun Enterprise 4500 server with eight 400 MHz UltraSPARC II processors, using Sun Gigabit Ethernet P2.0 adapter hardware
Client 4: A Sun Enterprise 4500 server with eight 400 MHz UltraSPARC II processors, using Sun Gigabit Ethernet P2.0 adapter hardware
Client 5: A Sun Enterprise 450 server with four 400 MHz UltraSPARC II processors, using Sun Gigabit Ethernet P2.0 adapter hardware
All of the gigabit Ethernet cards on the client machines use the 66 MHz PCI bus interface, and all of the client machines run Solaris 8 OE Release 02/02. Clients are connected to the server through a gigabit switch. MC-Netperf v0.6.1 [3], a tool developed internally, is used to generate the workload. MC-Netperf extends the publicly available Netperf [2] to handle synchronous multiconnection measurements using multicast-based synchronization. Two types of experiments were conducted:
Single-connection test: The second domain of the Sun Fire 6800 server (Client 1) is used as the client.
10-connection test: Each of the five clients drives two connections.
Runs of 10 minutes are carried out to obtain the numbers discussed in the following section.
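MC-Netperf itself is internal to Sun, but a comparable single-connection run can be set up with the publicly available Netperf. The following sketch is illustrative only; the host name sut and the 64-kilobyte socket buffers are assumptions, not the exact MC-Netperf invocation used in this study:

    # run a 10-minute TCP bulk transfer to the system under test,
    # setting 64-Kbyte socket buffers on both the local and remote ends
    netperf -H sut -t TCP_STREAM -l 600 -- -s 65536 -S 65536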
Getting an Appropriate Window Size
As described in "Bulk Transfer Traffic Performance," in a switch-based LAN the size of the TCP receive window determines the amount of unacknowledged data the pipe can hold at any given time, and the bandwidth-delay product determines the pipe size. Since the bandwidth is known to be one Gbps, and the minimal latency is calculated to be 70 to 150 microseconds in "Gigabit Ethernet Latency on a Sun Fire 6800 Server," the pipe must hold at least 1,000,000,000 × 0.000150 / 8 = 18,750 bytes in transit to achieve one gigabit per second for packets with a 1,460-byte payload. Excluding Ethernet, IP, and TCP headers, that rate corresponds to 964 Mbps of payload throughput, because a 1,460-byte payload travels in a 1,514-byte packet and 1,460/1,514 × 1,000 ≈ 964. However, whether this 964 Mbps can be achieved depends on many other factors, and the receive window size is one of the first that needs to be addressed.
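As a quick sanity check, both figures can be reproduced with the bc calculator (the 1,514-byte figure is the 1,460-byte payload plus the 14-byte Ethernet, 20-byte IP, and 20-byte TCP headers):

    # bytes that must be in flight at 1 Gbps with 150 microseconds of latency
    echo '1000000000 * 0.000150 / 8' | bc        # prints 18750
    # payload portion of the gigabit rate for 1,460-byte payloads
    echo 'scale=3; 1460 / 1514 * 1000' | bc      # prints about 964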
TCP Receive Window Tuning Parameters
When a TCP connection is established, both parties advertise their maximum receive window. The current receive window, which is the actual receive window in the middle of a transmission, is adjusted dynamically based on the receiver's capability. The current send window, although initially small to avoid congestion, ramps up to match the current receive window after the slow-start process [4] if the system is tuned properly. A few parameters should be considered in the tuning process. On the transmit side, they are:
tcp_xmit_hiwat: This parameter is the high watermark for transmission flow control, tunable by using the ndd command. When the amount of unsent data reaches this level, no more data is accepted from the application layer until the amount drops below tcp_xmit_lowat (also tunable by using the ndd command). The default value for this parameter is 24,576 in the Solaris 8 OE.
Sending socket buffer size: The sending socket buffer is where the application puts data for the kernel to transmit. The size of this buffer determines the maximum amount of data the kernel and the application can exchange in each attempt.
For the reception side, these parameters are considered:
tcp_recv_hiwat: This parameter is the high watermark for reception flow control, tunable by using the ndd command. When the application starts to lag behind in reading data, data accumulates in the streamhead. When TCP detects this situation, it starts reducing the TCP receive window by the amount of incoming data on each incoming TCP segment. This process continues until the amount of accumulated data drops below tcp_xmit_lowat (also tunable by using the ndd command). The default value for this parameter is 24,576 in the Solaris 8 OE.
Receiving socket buffer size: The counterpart of the sending socket buffer on the reception side; the receiving socket buffer is where the kernel puts data for the application to read.
The parameter tcp_xmit_hiwat determines the default size of the sending socket buffer, as tcp_recv_hiwat does for the receiving socket buffer. However, applications can override these defaults by requesting different socket buffer sizes through the socket library (for example, with the SO_SNDBUF and SO_RCVBUF socket options). Essentially, tuning the receive window is equivalent to selecting the socket buffer size.
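For example, the system-wide defaults can be inspected and raised with the ndd command; the 65,536-byte value below is illustrative:

    # display the current default (24,576 bytes in the Solaris 8 OE)
    ndd -get /dev/tcp tcp_xmit_hiwat
    # raise the default socket buffer sizes for both sides to 64 Kbytes
    ndd -set /dev/tcp tcp_xmit_hiwat 65536
    ndd -set /dev/tcp tcp_recv_hiwat 65536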
At first glance, it appears that the socket buffers should be set to the largest possible value. However, larger socket buffers mean that more resources are allocated for each connection. As discussed earlier, a window of 18,750 bytes may be sufficient to achieve 964 Mbps, so setting socket buffers beyond a certain size produces no additional benefit. Furthermore, socket buffer sizes determine only the maximum window size. In the middle of a TCP transmission, the current receive window is the available buffer space at the receiver, and the current send window equals MIN(receive window, send socket buffer). Hence, the send socket buffer should be no smaller than the receive socket buffer so that the send window can match the receive window. In the experiments, the sizes of the send and receive socket buffers are equal.
The window size applies only to each individual connection. For multiple simultaneous connections, the pressure on the window size may not be as great as for a single connection. But what exactly is the value needed for gigabit Ethernet on Sun Fire servers?
Impact of Socket Buffer Size on Single and 10-Connection Throughput
Socket buffers from 16 kilobytes to one megabyte (the size of the send socket buffer always matching that of the receive socket buffer) were investigated using the ce interface card, and throughput was measured for both one TCP connection and 10 TCP connections. As FIGURE 1 shows, for a 10-connection sending operation, socket buffers of 48 kilobytes to one megabyte have a significant throughput advantage (up to 20 percent improvement) over smaller socket buffers. For a 10-connection receiving operation, however, only the 24-kilobyte socket buffer is an underperformer. Although 16 kilobytes is quite small, the deficiency of each individual connection is more than compensated for by the presence of multiple connections. For the single-connection case, 48-kilobyte or larger buffers appear necessary, but 128-kilobyte or larger buffers do not seem to bring additional benefit. In summary, 64-kilobyte socket buffers appear to be the best compromise to accommodate both single-connection and 10-connection traffic, and the experiments that follow use them.
FIGURE 1 Impact of Socket Buffer Size on the Throughput of a ce Card
Reducing Transmission Overhead
Assuming the reception side can receive as fast as the sender can transmit, the sender must minimize the overhead it incurs to transmit packets. This overhead lies mostly in moving data between the socket buffer and the kernel modules, and in processing the acknowledgment packets the sender must handle to keep data flowing smoothly.
Reducing Overhead to Move Data
Because the Solaris OE must copy data from the application address space to the kernel address space for transmission, there is overhead associated with each copy operation. The ndd-tunable parameter tcp_maxpsz_multiplier helps control the amount of data each copy operation can move (FIGURE 2). Because the TCP module in the kernel must process all of the pending data before passing it down to the IP module, this amount should not be so large that packets stop arriving at the hardware in a continuous flow. The default value for this parameter is two in the Solaris 8 OE, and its unit is the TCP MSS. But what exactly is the best value for this parameter to support bulk transfer traffic?
FIGURE 2 Effect of tcp_maxpsz_multiplier on Sending-Side Performance
To answer this question, the values 1, 2, 4, 8, 10, 12, 16, 22, and 44 were tried for this parameter using 64-kilobyte socket buffers. Although the maximum allowed value is 100, the tests stopped at 44 because 64 kilobytes (the socket buffer size) divided by 1,460 bytes (the MSS) yields roughly 44.88. As FIGURE 2 shows, this parameter has almost no effect when there are 10 connections. For the single-connection case, values of eight or larger deliver about 25 percent higher throughput than values of 1, 2, and 4. The best throughput is reached when this parameter is set to 10, so this value is recommended.
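This recommendation can be applied to a running system with the ndd command; the new value affects connections established afterward:

    # allow each copy operation to move up to 10 TCP MSS of data
    ndd -set /dev/tcp tcp_maxpsz_multiplier 10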
Potential Impact of ACK Packets
Another important factor affecting transmission overhead is the number of acknowledgment (ACK) packets the sender receives for the data packets it sends. An ACK packet is treated no differently than a regular data packet until it reaches the TCP module, so a system must invest a considerable amount of resources to process ACK packets. The more ACK packets there are, the higher the overhead per data packet.
The ratio of data packets to ACK packets is used to measure this overhead. In the Solaris 8 OE, the parameter tcp_deferred_acks_max controls the initial maximum amount of data the receiver (in the same local subnet as the sender) can hold before it must emit an ACK packet. Although the unit of this parameter is the TCP MSS, it is equivalent to a number of packets for bulk transfer traffic. Hence, setting tcp_deferred_acks_max to eight (using ndd) means the receiver can send one ACK packet for every eight data packets it receives, provided that the timer set by tcp_deferred_ack_interval does not expire. The effect of tcp_deferred_acks_max also depends on the link quality and the status of the network. If ACK packets are lost for some reason, the sender will eventually retransmit data. When the receiver sees this, it adapts by sending ACK packets more frequently, reducing the amount of data it can hold without ACKing by one MSS; this adjustment is triggered for every retransmitted segment. In the worst case, however, the receiver will send an ACK packet for every other data packet it receives, as suggested by RFC 1122.
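Both tunables can be adjusted with the ndd command. The values below are the Solaris 8 OE defaults described above, shown here only to illustrate the syntax:

    # allow up to eight data segments to go unacknowledged
    ndd -set /dev/tcp tcp_deferred_acks_max 8
    # force a deferred ACK after at most 100 milliseconds
    ndd -set /dev/tcp tcp_deferred_ack_interval 100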
FIGURE 3 shows how the data packet rate changes as the number of ACK packets rises during the first 38 seconds of a connection with poor link quality. This experiment uses the default values of eight (MSS) for tcp_deferred_acks_max and 100 milliseconds for tcp_deferred_ack_interval. The packet rate is highest when the connection first starts (at a one-second resolution, the behavior of the slow-start phase cannot be observed). Even though one ACK packet for every eight data packets is the desirable ratio, the figure starts with a ratio of four; this, too, is an artifact of the graph's low resolution, because the packet rate reported is the average over the preceding second. Using the snoop utility, four retransmitted segments can be observed, forcing the receiver's Solaris 8 OE to reduce the amount of data it can hold without ACKing four times. Two more jumps in the ACK packet rate can be observed in the 6th and 7th seconds. The ratio of data packets to ACK packets remains about 2.0 for the rest of the connection.
FIGURE 3 Impact of ACK Packets on Data Packet Rate
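Observations of this kind can be reproduced with the snoop utility. In the sketch below, ce0 and the client1 host name are placeholders for the actual interface and client:

    # capture the bulk transfer traffic to a file for offline analysis
    snoop -d ce0 -o /tmp/bulk.cap host client1
    # replay the capture with relative timestamps to spot retransmitted
    # sequence numbers and changes in the ACK packet rate
    snoop -i /tmp/bulk.cap -t r tcp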
CPU Utilization
Data transfer is only one part of any running user application. For an application to run fast, the CPU time spent on data transfer must be minimized. Hence, knowing how much CPU time is dedicated to data transfer helps in planning capacity requirements.
The CPU utilization of ce cards was evaluated as the number of CPUs goes from one to eight (FIGURE 4 through FIGURE 7). The utilization numbers shown in these figures are reported as the percentage of time that all available CPUs are engaged, so the same number can mean different things depending on the underlying number of CPUs. For instance, 50 percent utilization on a system with four CPUs means two CPUs are busy on average, whereas 50 percent utilization on a system with eight CPUs means four CPUs are busy on average.
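Utilization figures of this kind can be collected during a run with the mpstat(1M) utility, which reports per-processor statistics; the kernel-mode time spent on network processing appears in the sys column:

    # report per-CPU statistics every 5 seconds for the duration of a run
    mpstat 5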
FIGURE 4 Throughput and Amount of Kernel Mode CPU Time Required To Support One TCP Sending Operation By One ce Card
FIGURE 5 Throughput and Amount of Kernel Mode CPU Time Required To Support One TCP Receiving Operation By One ce Card
FIGURE 6 Throughput and Amount of Kernel Mode CPU Time Required To Support 10 Simultaneous Sending Operations By One ce Card
FIGURE 7 Throughput and Amount of Kernel Mode CPU Time Required To Support 10 Simultaneous Receiving Operations By One ce Card
One-Card, One-Connection CPU Utilization
In the one-connection case (FIGURE 4 and FIGURE 5), two CPUs are sufficient to drive one ce card to its maximum capacity for send-only operations. For reception, three CPUs are necessary for one ce card to deliver 600 Mbps, and five CPUs are needed to deliver 700 Mbps. The best reception performance is achieved with six CPUs (736 Mbps); seven or eight CPUs do not seem to bring additional benefit. Even though the overall CPU utilization drops to 15 percent when there are eight CPUs (FIGURE 4), that still amounts to about 8 × 0.15 = 1.2 CPUs handling network traffic. When there are five CPUs, the number of CPUs handling network traffic is 5 × 0.26 = 1.3, not far from the 1.2 CPUs needed in the eight-CPU case. In summary, two CPUs are necessary to obtain good performance from one ce card, and diminishing returns are observed with three or more CPUs.
One-Card, Ten-Connection CPU Utilization
For the 10-connection scenario (FIGURE 6 and FIGURE 7), each additional CPU brings higher sending performance, with eight CPUs achieving 830 Mbps; about 8 × 0.35 = 2.8 CPUs are dedicated to network traffic. For receiving, the ce card can reach close to line speed (920 Mbps) with six CPUs, using the power of 6 × 0.70 = 4.2 CPUs. Diminishing returns are observed once four or more CPUs are present, which suggests a recommendation of three CPUs for one ce card.
Note that the preceding numbers were measured using ce driver version 1.115. The Sun engineering team devotes continuous effort to improving both throughput and utilization, so you can expect to observe better results than those reported here in your own experiments.