QoS Components
To provide enough background on the fundamentals and an implementation perspective, this section describes the overall network and systems architecture, identifies the sources of delay, and explains why QoS is essentially about controlling network and system resources to achieve more predictable delays for preferred applications. It presents a generic QoS system overview and describes the following high-level internal functional components.
Implementation Functions
The following are necessary implementation functions as well as challenges that may be experienced in practice:
Traffic Rate Limiting and Traffic Shaping - Token and Leaky Bucket Algorithms. Network traffic is always bursty, and the level of burstiness observed depends on the time resolution of the measurements. Rate limiting controls the burstiness of the traffic coming into a switch or server. Shaping smooths the egress traffic. Although these two functions act in opposite directions, the same class of algorithms is used to implement both.
Packet Classification - Individual flows must be identified and classified at line rate. Fast packet classification algorithms are crucial, because every packet must be inspected and matched against a set of rules that determine the class of service the packet should receive. Packet classification has serious scalability issues: as the number of rules increases, it takes longer to classify a packet.
Packet Scheduling - To provide differentiated services, the packet scheduler must decide quickly which packet to schedule and when. The simplest packet scheduling algorithm is strict priority. However, it often does not work well, because low-priority packets are starved and may never get scheduled.
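The classification and scheduling functions above can be sketched in a few lines of Python. This is a minimal illustration, not how a switch implements them: the rule table, the class names (voice, web, bulk), and the weights are all hypothetical, and weighted round robin is used here only as one common alternative to strict priority. The classifier is a linear first-match search, which shows why lookup time grows with the number of rules.

```python
from collections import deque

# Hypothetical rule table: (header field, value, class). First match wins.
RULES = [
    ("dst_port", 5060, "voice"),
    ("dst_port", 443, "web"),
]
DEFAULT_CLASS = "bulk"


def classify(packet, rules=RULES):
    """Linear first-match classification; cost grows with the rule count."""
    for field, value, cls in rules:
        if packet.get(field) == value:
            return cls
    return DEFAULT_CLASS


def build_queues(packets):
    """Place each packet id on the queue of its class."""
    queues = {"voice": deque(), "web": deque(), "bulk": deque()}
    for packet in packets:
        queues[classify(packet)].append(packet["id"])
    return queues


def strict_priority(queues, order, budget):
    """Send up to `budget` packets, always from the highest non-empty class."""
    sent = []
    for _ in range(budget):
        for cls in order:
            if queues[cls]:
                sent.append(queues[cls].popleft())
                break
    return sent


def weighted_round_robin(queues, weights, budget):
    """Send up to `budget` packets, at most weights[cls] per class per round.

    Assumes every queued class appears in `weights` with a positive weight.
    """
    sent = []
    while len(sent) < budget and any(queues.values()):
        for cls, weight in weights.items():
            for _ in range(weight):
                if queues[cls] and len(sent) < budget:
                    sent.append(queues[cls].popleft())
    return sent


if __name__ == "__main__":
    packets = [{"id": f"v{i}", "dst_port": 5060} for i in range(6)]
    packets += [{"id": f"b{i}", "dst_port": 21} for i in range(6)]
    print(strict_priority(build_queues(packets), ["voice", "web", "bulk"], 6))
    print(weighted_round_robin(
        build_queues(packets), {"voice": 2, "web": 1, "bulk": 1}, 6))
```

With a sending budget of six packets, strict priority spends the whole budget on the voice queue, while the weighted scheduler still lets some bulk packets through.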
QoS Metrics
QoS is defined by a multitude of metrics. The simplest is bandwidth, which can be envisioned as a smaller logical pipe within a larger pipe. But since actual network traffic is bursty, a fixed bandwidth allocation would be wasteful: at a given instant one flow might use 1% of its pipe, while another customer might need 110% of the allocated pipe. To reduce waste, burst metrics are used to determine how large a burst and how long a burst can be tolerated. Other important metrics that directly affect the quality of service include packet loss rate, delay, and jitter (variation in delay). The network and computing components that control these metrics are described later in this article.
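One common way to express a bandwidth allocation together with a burst tolerance is a token bucket, the class of algorithm named earlier for rate limiting and shaping. The sketch below is illustrative only. The class name, the parameter names, and the choice to pass the clock in explicitly (so the behavior is reproducible) are assumptions, not part of any particular product.

```python
class TokenBucket:
    """Token bucket: refills at `rate` bytes/second, holds at most `burst` bytes."""

    def __init__(self, rate, burst):
        self.rate = float(rate)     # sustained rate, bytes per second
        self.burst = float(burst)   # bucket depth: largest tolerated burst, bytes
        self.tokens = float(burst)  # the bucket starts full
        self.last = 0.0             # time of the previous update, seconds

    def allow(self, size, now):
        """Return True if a packet of `size` bytes conforms at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, but never beyond the bucket depth.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=1000, burst=1500)
    for size, now in [(1500, 0.0), (500, 0.25), (500, 0.5)]:
        print(now, size, bucket.allow(size, now))
```

A rate limiter would drop or mark the packets for which `allow` returns False, while a shaper would hold them in a queue until enough tokens accumulate.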
Network and Systems Architecture Overview
To fully understand where QoS fits into the overall picture of network resources, it is useful to look at the details of the complete network path traversal, starting from the point where a client sends a request, continuing as the request traverses various network devices, and ending when it arrives at the destination, where the server processes it.
Applications fall into different classes with different characteristics and requirements (see the section "The Need for QoS" for additional details). The combination of several federated networks and differing traffic characteristics is what makes end-to-end QoS a complex issue.
FIGURE 1 illustrates a high-level overview of the components involved in an end-to-end packet traversal for an enterprise that relies on a service provider. Two different paths are shown; both originate at the client and end at a server.
FIGURE 1 Overview of End-to-End Network and Systems Architecture
Path A-H is a typical scenario, where the client and server are connected to different local Internet Service Providers (ISPs) and traffic must traverse different ISP networks. Multiple Tier 1 ISPs can be traversed, connected together by peering points such as Metropolitan Area Exchange (MAE)-East or private peering points such as Sprint's Network Access Point (NAP).
Path 1-4 is an example of a client and server connected to the same local Tier 2 ISP, which occurs when both are physically located in the same geographical area.
In either case, the majority of the delays are attributed to the switches in the Tier 2 ISP. The links from the end-user customers to the Tier 2 ISP tend to be slow links, but the Tier 2 ISP aggregates many links, hoping that not all subscribers will use the links at the same point in time. If they do, packets get buffered up and eventually get dropped.
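The bet that not all subscribers are active at once can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that each subscriber is active independently with the same probability. Real traffic is correlated (for example, at peak hours), so the result should be read as an optimistic bound rather than a prediction. The function name and parameters are hypothetical.

```python
from math import comb


def overload_probability(subscribers, p_active, capacity):
    """Probability that more than `capacity` subscribers are active at once.

    Assumes each of `subscribers` is active independently with probability
    `p_active` (a binomial model).
    """
    p_within_capacity = sum(
        comb(subscribers, k) * p_active**k * (1 - p_active) ** (subscribers - k)
        for k in range(capacity + 1)
    )
    return 1.0 - p_within_capacity


if __name__ == "__main__":
    # 100 subscribers, each active 10% of the time, on an uplink sized for
    # 10, 15, or 20 simultaneous users.
    for capacity in (10, 15, 20):
        print(capacity, overload_probability(100, 0.10, capacity))
```

The overload probability falls quickly as the uplink is sized further above the average demand, which is why moderate oversubscription works most of the time and fails during peaks, when packets are buffered and eventually dropped.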
Implementing QoS
The previous section explained where a QoS-capable device is positioned. The device can be a network switch or router, or a server. A server can implement QoS on the network interface card or in the protocol stack; in either case, QoS is applied between the application socket end points. This section describes how such a device actually implements QoS, with a focus on network traffic.
You can implement QoS in many different ways. Each domain has control over its resources and can implement QoS on its portion of the end-to-end path using different technologies. Two particular domains of implementation are:
Enterprise - Enterprises can control their own networks and systems. From a local Ethernet or token ring LAN perspective, IEEE 802.1p can be used to mark frames according to priorities. These marks allow the switch to offer preferential treatment to certain flows across Virtual Local Area Networks (VLANs). For computing devices, there are facilities that allow processes to run at higher priorities, thus obtaining differentiated service from a process computing perspective.
Network Service Provider (NSP) - The NSP, in general, aggregates traffic and either forwards it within its own network or hands it off to another NSP. The NSP can use technologies such as DiffServ or IntServ to prioritize the handling of traffic within its network. Service Level Agreements (SLAs) are required between NSPs to obtain a certain level of QoS for transit traffic.
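Both marking schemes mentioned above come down to setting a few bits in a header. As a rough sketch, the IEEE 802.1p priority is carried in the 3-bit Priority Code Point (PCP) of the 16-bit 802.1Q tag control field, next to a 1-bit drop-eligible flag and a 12-bit VLAN ID, and the DiffServ code point (DSCP) occupies the upper six bits of the IP TOS byte. The helper names below are hypothetical.

```python
def make_tci(pcp, vid, dei=0):
    """Pack an 802.1Q Tag Control Information field: PCP(3) | DEI(1) | VID(12)."""
    if not (0 <= pcp <= 7 and dei in (0, 1) and 0 <= vid <= 0xFFF):
        raise ValueError("field out of range")
    return (pcp << 13) | (dei << 12) | vid


def parse_tci(tci):
    """Unpack a 16-bit TCI into (pcp, dei, vid)."""
    return (tci >> 13) & 0x7, (tci >> 12) & 0x1, tci & 0xFFF


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP in the upper bits of the 8-bit TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


if __name__ == "__main__":
    tci = make_tci(pcp=5, vid=100)   # priority 5 on VLAN 100
    print(hex(tci), parse_tci(tci))
    print(hex(dscp_to_tos(46)))      # 46 is the Expedited Forwarding code point
```

A switch or router reads these bits during classification and maps them to a queue, so the marking itself does nothing unless the devices along the path are configured to honor it.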
Asynchronous Transfer Mode (ATM)
This section takes a quick look at ATM from a QoS perspective. After 1995, ATM started taking off in data networks, one of its advantages being that it provided QoS. Because most ISPs still have ATM networks that carry IP traffic, NSPs implement QoS at both the IP layer and the ATM layer. ATM itself offers six types of QoS services:
Constant Bit Rate (CBR) - Provides constant bandwidth, delay, and jitter throughout the life of the ATM connection.
Variable Bit Rate-Real Time (VBR-rt) - Provides constant delay and jitter, but variations in bandwidth.
Variable Bit Rate-Non Real Time (VBR-nrt) - Provides variable bandwidth, delay, and jitter, but has a low cell loss rate.
Unspecified Bit Rate (UBR) - Provides "Best Effort" service, with no guarantees.
Available Bit Rate (ABR) - Provides no guarantees and expects the applications to adapt according to network availability.
Guaranteed Frame Rate (GFR) - Provides some minimum frame rate and delivers either the entire frame or none of it. Used for ATM Adaptation Layer 5 (AAL5).
One of the main difficulties in providing an end-to-end QoS solution is that so many private networks must be traversed, and each network has its own QoS implementation and business objectives. The Internet is constructed such that networks interconnect, or "peer," with other networks. One network may need to forward the traffic of other networks. Depending on the arrangements, competitors may not forward one another's traffic in the optimal manner. This is what is meant by business objectives.
Sources of Unpredictable Delay
From the perspective of a non-real-time computing system, unpredictable delays are often due to limited CPU resources or disk I/O latencies, both of which degrade under heavy load. From a network perspective, many components add up to the cumulative end-to-end delay. This section describes some of the important components that contribute to delay. Its aim is to explain that the choke points are at the access networks, where traffic is aggregated and forwarded to a backbone or core. Service providers oversubscribe their networks to increase profits, hoping that not all subscribers will want network access at the same instant.
FIGURE 2 was constructed by taking path A-G from FIGURE 1 and projecting it onto a time-distance plane. It represents a typical web client accessing the Internet site of an enterprise. The vertical axis indicates the time that elapses as a packet travels a given link segment. The horizontal axis indicates the link segment that the packet traverses. At the top are the network devices, with vertical lines that project down to the distance axis to show the corresponding link segments. In this illustration, an IP packet's journey starts when a user clicks on a web page. The Hypertext Transfer Protocol (HTTP) request first triggers a TCP three-way handshake to create a socket connection. The first TCP packet is the initial SYN packet, which traverses segment 1. This segment is usually quite slow, since the link typically delivers 30 Kbit/sec using a 56 Kbit/sec modem, depending on the quality and distance of the last-mile wiring.
FIGURE 2 One-Way End-to-End Packet Data Path Traversal
Network delay is composed of two components:
Propagation delay, which depends on the media and the distance.
Line-rate (transmission) delay, which depends primarily on the link rate and on the loss rate or Bit Error Rate (BER).
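As a rough worked example of the two components, the sketch below computes the delay of a single packet over one link. It makes several simplifying assumptions that are not from the article: a signal speed of about 2 x 10^8 m/s (roughly two thirds of the speed of light, a common rule of thumb for copper and fiber), nominal link rates of 56 kbit/s for a modem and 1.544 Mbit/s for a T1, and no retransmissions caused by bit errors. The function names are hypothetical.

```python
def propagation_delay(distance_m, speed_mps=2.0e8):
    """Seconds for a signal to cover `distance_m` meters on the medium."""
    return distance_m / speed_mps


def transmission_delay(size_bytes, rate_bps):
    """Seconds needed to clock `size_bytes` onto a link of `rate_bps` bits/s."""
    return size_bytes * 8 / rate_bps


def link_delay(size_bytes, rate_bps, distance_m, speed_mps=2.0e8):
    """Total one-way delay for one packet on one link, ignoring queueing."""
    return (transmission_delay(size_bytes, rate_bps)
            + propagation_delay(distance_m, speed_mps))


if __name__ == "__main__":
    packet = 1500  # bytes, a full-size Ethernet payload
    for name, rate, distance in [
        ("56 kbit/s modem, 5 km", 56_000, 5_000),
        ("T1, 50 km", 1_544_000, 50_000),
    ]:
        print(name, round(link_delay(packet, rate, distance) * 1000, 3), "ms")
```

On slow access links the line-rate term dominates by orders of magnitude, while on fast long-haul links the propagation term becomes the larger share.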
The odd-numbered links of FIGURE 2 represent the link delays. Note that the terms segment and link are used interchangeably.
Link 1, in a typical deployment, is the copper wire, or the "last mile" connection from the home or Small Office/Home Office (SOHO) to the Regional Bell Operating Company (RBOC). This is how a large portion of consumer clients connect to the Internet.
Link 3 is an ATM link inside the carrier's internal network, usually a Metropolitan Area Network link.
Link 5 connects the Tier 2 ISP to the Tier 1 ISP.
This provides a backbone network. This link is a larger pipe, which can range from T1 to Optical Carrier 3 (OC-3) and continues to grow.
Link 7 is the Core Network (POTS, Plain old telephone system) of the backbone Tier 1 provider.
This core is typically extremely fast, consisting of DS3 links (the same ones used by International Discount Telecommunication (IDT)) or more modern links (such as the OC-48 links used by vBNS). Some providers are beta testing OC-192 links that run Packet over Synchronous Optical Network (SONET), eliminating the inefficiencies of ATM altogether.
Links 9 and 11 are a reflection of links 5 and 3.
Link 13 is a typical leased-line T1 link to the enterprise. This is how most enterprises connect to the Internet. However, after the Telecommunications Act of 1996, Competitive Local Exchange Carriers (CLECs) emerged. CLECs provide superior service offerings at lower prices. Providers such as Qwest and Telseon provide Gigabit Ethernet connectivity at prices that are often below OC-3 costs (based on prices at the time of writing).
Link 15 is the enterprise's internal network.
There should be a channel service unit (on the Time Division Multiplexing (TDM) side) and a data service unit (on the data side) that terminate the T1 line and convert it to Ethernet.
The even-numbered links of FIGURE 2 represent the delays experienced in switches. These delays are composed of switching delays, route lookups, packet classification, queueing, packet scheduling, and internal switch forwarding delays, such as sending a packet from the ingress unit through the backplane to the egress unit.
As FIGURE 2 illustrates, QoS is needed to control access to shared resources during episodes of congestion. The shared resources are servers and specific links. Not every link is shared: Link 1, for example, is a dedicated point-to-point link, where a dedicated voice channel is set up at call time with a fixed bandwidth and delay. Link 13 is also dedicated, but it is a permanent digital circuit rather than a switched circuit. QoS is usually implemented in front of a congestion point, where it restricts the traffic that is injected into the congestion point. Enterprises have QoS functions that restrict the traffic injected into their service provider's network. The ISP has QoS functions that restrict the traffic injected into its core.
Tier 2 ISPs oversubscribe their bandwidth capacity, hoping that not all their customers will need bandwidth at the same time. During episodes of congestion, switches buffer packets until they can be transmitted. Links 5 and 9 are boundary links that connect two untrusted parties. The Tier 2 ISP must control the traffic it injects into the Tier 1 ISP's core network, and the Tier 1 ISP polices the traffic that customers inject into its network at Links 5 and 9. At the enterprise, many clients need to access the servers.