NIC Teaming
Let’s take a well-deserved break from networking math for a moment and shift into the fun world of NIC teaming. The concept of teaming goes by many different names: bonding, grouping, and trunking to name a few. Really, it just means that we’re taking multiple physical NICs on a given ESXi host and combining them into a single logical link that provides bandwidth aggregation and redundancy to a vSwitch. You might think that this sounds a little bit like port channels from earlier in the book. And you’re partially right—the goal is very similar, but the methods are vastly different.
Figure 8.8 shows all the configuration options for teaming and failover.
Figure 8.8 Configuration options for teaming and failover, as viewed from the vSphere Web Client
Let’s go over all of the configuration options for NIC teaming within a vSwitch. These options are a bit more relevant when your vSwitch is using multiple uplinks but are still valid configuration points no matter the quantity of uplinks.
Load Balancing
The first point of interest is the load-balancing policy. This is basically how we tell the vSwitch to handle outbound traffic, and there are four choices on a standard vSwitch:
- Route based on the originating virtual port
- Route based on IP hash
- Route based on source MAC hash
- Use explicit failover order
Keep in mind that we’re not concerned with the inbound traffic because that’s not within our control. Traffic arrives on whatever uplink the upstream switch decided to put it on, and the vSwitch is only responsible for making sure it reaches its destination.
The first option, route based on the originating virtual port, is the default selection for a new vSwitch. Every VM and VMkernel port on a vSwitch is connected to a virtual port. When the vSwitch receives traffic from either of these objects, it assigns the virtual port an uplink and uses it for traffic. The chosen uplink will typically not change unless there is an uplink failure, the VM changes power state, or the VM is migrated around via vMotion.
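To make the idea of pinning a virtual port to an uplink a little more concrete, here is a minimal sketch. The modulo placement and the names `uplink_for_port` and `vmnic0`/`vmnic1` are our own assumptions for illustration; the real selection logic lives inside ESXi and may not look like this.

```python
def uplink_for_port(port_id, uplinks):
    """Pin a virtual port to one uplink (hypothetical modulo placement)."""
    if not uplinks:
        raise ValueError("no usable uplinks")
    return uplinks[port_id % len(uplinks)]


uplinks = ["vmnic0", "vmnic1"]
for port_id in range(4):
    # The same port ID always maps to the same uplink while the team is unchanged.
    print(port_id, uplink_for_port(port_id, uplinks))

# If vmnic0 fails, the ports that used it are re-pinned to a surviving uplink.
print(uplink_for_port(0, ["vmnic1"]))
```

The point to take away is that a single virtual port uses one uplink at a time, so one VM with one virtual NIC never gets more than a single uplink's worth of bandwidth under this policy.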
The second option, route based on IP hash, is used in conjunction with a link aggregation group (LAG), also called an EtherChannel or port channel. When traffic enters the vSwitch, the load-balancing policy will create a hash value of the source and destination IP addresses in the packet. The resulting hash value dictates which uplink will be used.
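As a rough sketch of the hashing idea, the snippet below XORs the source and destination addresses and takes the result modulo the number of uplinks. Treat it as a simplified model under that assumption rather than the exact computation ESXi performs; the function name and addresses are made up for the example.

```python
import ipaddress


def ip_hash_uplink(src_ip, dst_ip, uplinks):
    """Choose an uplink from a hash of the source and destination IP addresses."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    # Simplified hash: XOR the two addresses, then wrap to the uplink count.
    return uplinks[(src ^ dst) % len(uplinks)]


uplinks = ["vmnic0", "vmnic1"]
# Same VM, two different destinations: the conversations can land on different uplinks.
print(ip_hash_uplink("10.0.0.10", "10.0.0.20", uplinks))  # vmnic0
print(ip_hash_uplink("10.0.0.10", "10.0.0.21", uplinks))  # vmnic1
```

Notice that a given source and destination pair always hashes to the same uplink, so a single conversation is still limited to one link; the benefit shows up when a VM talks to many destinations.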
The third option, route based on source MAC hash, is similar to the IP hash idea, except the policy examines only the source MAC address in the Ethernet frame. To be honest, we have rarely seen this policy used in a production environment, but it can be handy for a nested hypervisor VM to help balance its nested VM traffic over multiple uplinks.
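The nested hypervisor case is easier to see with a small sketch. Here we assume the hash is simply the MAC address treated as a number, modulo the uplink count; that is an illustrative stand-in, not the documented ESXi algorithm, and the MAC addresses are invented.

```python
def mac_hash_uplink(src_mac, uplinks):
    """Choose an uplink from the source MAC address only (illustrative hash)."""
    mac_value = int(src_mac.replace(":", ""), 16)
    return uplinks[mac_value % len(uplinks)]


uplinks = ["vmnic0", "vmnic1"]
# Two nested VMs sit behind the same virtual port but have different MAC addresses.
for mac in ["00:50:56:00:00:01", "00:50:56:00:00:02"]:
    print(mac, mac_hash_uplink(mac, uplinks))
```

Because the nested VMs present different source MAC addresses, their traffic can be spread over multiple uplinks even though it all enters the vSwitch through one virtual port.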
The fourth and final option, use explicit failover order, really doesn’t do any sort of load balancing. Instead, the first Active NIC on the list is used. If that one fails, the next Active NIC on the list is used, and so on, until you reach the Standby NICs. Keep in mind that if you select the Explicit Failover option and you have a vSwitch with many uplinks, only one of them will be actively used at any given time. Use this policy only in circumstances where using only one link rather than load balancing over all links is desired or required.
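The explicit failover behavior described above boils down to walking an ordered list. The sketch below assumes a hypothetical `pick_uplink` helper and a simple dictionary of link states; it models the selection order only, not anything ESXi exposes.

```python
def pick_uplink(active, standby, link_up):
    """Return the first NIC that is up, checking Active NICs before Standby NICs."""
    for nic in active + standby:
        if link_up.get(nic, False):
            return nic
    return None  # every Active and Standby NIC is down


active = ["vmnic0", "vmnic1"]
standby = ["vmnic2"]

print(pick_uplink(active, standby, {"vmnic0": True, "vmnic1": True, "vmnic2": True}))
print(pick_uplink(active, standby, {"vmnic0": False, "vmnic1": True, "vmnic2": True}))
print(pick_uplink(active, standby, {"vmnic0": False, "vmnic1": False, "vmnic2": True}))
```

Only one NIC is ever returned, which is exactly why this policy provides redundancy but no load balancing.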
Network Failure Detection
When a network link fails (and they definitely do), the vSwitch is aware of the failure because the link status reports the link as being down. This can usually be verified by seeing if anyone tripped over the cable or mistakenly unplugged the wrong one. In most cases, the default network failure detection setting of “link status only” is good enough.
But what if you want to determine a failure further up the network, such as a failure beyond your upstream connected switch? This is where beacon probing might be able to help you out. Beacon probing is actually a great term because it does roughly what it sounds like it should do. A beacon is regularly sent out from the vSwitch through its uplinks to see if the other uplinks can “hear” it.
Figure 8.9 shows an example of a vSwitch with three uplinks. If Uplink1 sends out a beacon that Uplink2 receives but Uplink3 does not, the likely cause is that upstream aggregation switch 2 is down, so the traffic is unable to reach Uplink3.
Figure 8.9 An example where beacon probing finds upstream switch failures
Are you curious why we use an example with three uplinks? Imagine you only had two uplinks and sent out a beacon that the other uplink did not hear. Does the sending uplink have a failure, or does the receiving uplink have a failure? It’s impossible to know who is at fault. Therefore, you need at least three uplinks in order for beacon probing to work.
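The two-uplink ambiguity can be shown with a toy model. We assume a hypothetical `suspect_uplinks` function that is handed the set of (sender, receiver) pairs for beacons that actually arrived, and flags any uplink that never exchanged a beacon with another uplink. This is a simplification for reasoning about the problem, not a description of how ESXi processes beacons.

```python
def suspect_uplinks(uplinks, heard):
    """Flag uplinks that exchanged no beacon with any other uplink.

    heard is a set of (sender, receiver) pairs for beacons that arrived.
    """
    suspects = []
    for uplink in uplinks:
        others = [other for other in uplinks if other != uplink]
        exchanged = any((uplink, o) in heard or (o, uplink) in heard for o in others)
        if not exchanged:
            suspects.append(uplink)
    return suspects


# Three uplinks: Uplink1 and Uplink2 hear each other, Uplink3 hears nothing.
three = ["Uplink1", "Uplink2", "Uplink3"]
print(suspect_uplinks(three, {("Uplink1", "Uplink2"), ("Uplink2", "Uplink1")}))

# Two uplinks: no beacon gets through, so both look equally guilty.
two = ["Uplink1", "Uplink2"]
print(suspect_uplinks(two, set()))
```

With three uplinks, the two that can still hear each other vouch for one another and single out the third. With two uplinks, there is no tiebreaker.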
Notify Switches
The Notify Switches configuration is a bit mystifying at first. Notify the switches about what, exactly? By default, it’s set to “Yes,” and as we cover here, that’s almost always a good thing.
Remember that all of your upstream physical switches have a MAC address table that they use to map ports to MAC addresses. This avoids the need to flood their ports—which means sending frames to all ports except the port they arrived on (which is the required action when a frame’s destination MAC address doesn’t appear in the switch’s MAC address table).
But what happens when one of your uplinks in a vSwitch fails and all of the VMs begin using a new uplink? The upstream physical switch would have no idea which port the VM is now using and would have to resort to flooding the ports or wait for the VM to send some traffic so it can re-learn the new port. Instead, the Notify Switches option speeds things along by sending Reverse Address Resolution Protocol (RARP) frames to the upstream physical switch on behalf of the VM or VMs so that the upstream switch updates its MAC address table. This is all done before frames start arriving from the newly vMotioned VM, the newly powered-on VM, or from the VMs that are behind the uplink port that failed and was replaced.
These RARP announcements are just a fancy way of saying that the ESXi host will send out a special update letting the upstream physical switch know that the MAC address is now on a new uplink so that the switch will update its MAC address table before actually needing to send frames to that MAC address. It’s sort of like ESXi is shouting to the upstream physical switch and saying, “Hey! This VM is over here now!”
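Here is a small sketch of what the upstream switch's MAC address table goes through. The `PhysicalSwitch` class, the port numbers, and the MAC address are all invented for illustration, and we assume the switch drops entries for a port when its link goes down.

```python
class PhysicalSwitch:
    """Toy model of a switch MAC address table (MAC address -> port)."""

    def __init__(self):
        self.mac_table = {}

    def learn(self, src_mac, port):
        # Any frame, including a RARP frame, teaches the switch where a MAC lives.
        self.mac_table[src_mac] = port

    def link_down(self, port):
        # Assumption: entries learned on a dead port are removed.
        self.mac_table = {m: p for m, p in self.mac_table.items() if p != port}

    def forward(self, dst_mac):
        # Unknown destinations are flooded to all ports.
        return self.mac_table.get(dst_mac, "flood")


switch = PhysicalSwitch()
vm_mac = "00:50:56:aa:bb:cc"

switch.learn(vm_mac, 1)        # VM traffic normally arrives on port 1
switch.link_down(1)            # the uplink behind port 1 fails
print(switch.forward(vm_mac))  # flood

switch.learn(vm_mac, 2)        # ESXi sends a RARP frame out the new uplink
print(switch.forward(vm_mac))  # 2
```

The RARP frame does nothing special beyond carrying the VM's MAC address as its source, which is all the switch needs to update its table.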
Failback
Since we’re already on the topic of an uplink failure, let’s talk about Failback. If you have a Standby NIC in your NIC Team, it will become Active if there are no more Active NICs in the team. Basically, it will provide some hardware redundancy while you go figure out what went wrong with the failed NIC. When you fix the problem with the failed Active NIC, the Failback setting determines if the previously failed Active NIC should now be returned to Active duty.
If you set this value to Yes, the now-operational NIC will immediately go back to being Active again, and the Standby NIC returns to being Standby. Things return to the way they were before the failure.
If you choose the No value, the replaced NIC will simply remain inactive until either another NIC fails or you return it to Active status.
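The difference between the two Failback values is easiest to see as a tiny state machine. The `Team` class below is a hypothetical model with one preferred (Active) NIC and one Standby NIC; it is a sketch of the behavior described above, not ESXi code.

```python
class Team:
    """Toy NIC team with one preferred Active NIC and one Standby NIC."""

    def __init__(self, preferred, standby, failback=True):
        self.preferred = preferred
        self.standby = standby
        self.failback = failback
        self.up = {preferred, standby}
        self.in_use = preferred

    def link_down(self, nic):
        self.up.discard(nic)
        if nic == self.in_use:
            # Move traffic to whichever NIC is still up, if any.
            candidates = (self.preferred, self.standby)
            self.in_use = next((n for n in candidates if n in self.up), None)

    def link_up(self, nic):
        self.up.add(nic)
        # With Failback set to Yes, the repaired preferred NIC takes over again.
        if self.in_use is None or (self.failback and nic == self.preferred):
            self.in_use = nic


for failback in (True, False):
    team = Team("vmnic0", "vmnic1", failback=failback)
    team.link_down("vmnic0")   # Active NIC fails, Standby takes over
    team.link_up("vmnic0")     # Active NIC is repaired
    print("Failback", failback, "->", team.in_use)
```

With Failback set to No, the repaired NIC sits idle until the NIC currently in use fails or an administrator intervenes.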
Failover Order
The final section in a NIC team configuration is the failover order. It consists of three different adapter states:
- Active adapters: Adapters that are actively used to pass along traffic.
- Standby adapters: These adapters will only become Active if the defined Active adapters have failed.
- Unused adapters: Adapters that will never be used by the vSwitch, even if all the Active and Standby adapters have failed.
While the Standby and Unused statuses do have value for some specific configurations, such as with balancing vMotion and management traffic on a specific pair of uplinks, it’s common to just set all the adapters to Active and let the load-balancing policy do the rest. We get more into the weeds on adapter states later on in the book, especially when we start talking about iSCSI design and configuration in Part 3, “You Got Your Storage in My Networking: IP Storage.”
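As an illustration of the vMotion and management example, the sketch below mirrors the Active and Standby roles across a pair of uplinks and marks a third uplink as Unused. The port group names, `vmnic` names, and the `carrying_uplink` helper are assumptions made up for this example.

```python
def carrying_uplink(policy, link_up):
    """Return the uplink a port group would use; Unused adapters are never considered."""
    for nic in policy["active"] + policy["standby"]:
        if link_up.get(nic, False):
            return nic
    return None


port_groups = {
    "Management": {"active": ["vmnic0"], "standby": ["vmnic1"], "unused": ["vmnic2"]},
    "vMotion": {"active": ["vmnic1"], "standby": ["vmnic0"], "unused": ["vmnic2"]},
}

all_up = {"vmnic0": True, "vmnic1": True, "vmnic2": True}
vmnic0_down = {"vmnic0": False, "vmnic1": True, "vmnic2": True}

for name, policy in port_groups.items():
    print(name, carrying_uplink(policy, all_up), carrying_uplink(policy, vmnic0_down))
```

In normal operation each traffic type gets its own uplink, and the two only share a link when one of the pair fails. The Unused uplink stays out of the picture no matter what.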