SLA Support in the ASP Environment
To meet the business SLA requirements within an ASP environment, an ASP needs to apportion the resources available within the environment among its customers. The two key resources available within a server farm are the network bandwidth and the server processing capacity assigned to a specific customer.
The main aspect of dealing with SLAs for service providers is deciding what proportion of the network bandwidth to allocate to each of the service levels. In most cases, the bandwidth of concern is the outbound traffic generated by the servers, rather than the traffic coming into the servers. The allocation of access link bandwidth can be done in one of the following ways:
The access router maintains separate queues for each of the service levels. Most access routers can be configured to support a limited number of queues, and they use the weighted round robin queuing discipline to service the queues. The queuing discipline allows the access link bandwidth to be shared in the specified proportions under high load, and if a customer is not using its allocated capacity, the remainder is shared among the active customers. The drawback of this approach is that most access routers permit only a relatively limited number of queues (four or eight) to be created in this manner; if a service provider is supporting many customers, this approach is likely to be inadequate. The other drawback of rate control at the access router is that packet drops at the router interact with TCP's congestion control, resulting in erratic application performance.
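The weighted round robin discipline described above can be sketched in a few lines. The code below is a minimal illustration, not a router implementation; the customer names, weights, and per-pass packet budget are all made up for the example. Note how a queue that runs empty is simply skipped, so an idle customer's share is absorbed by the active ones.

```python
from collections import deque

def weighted_round_robin(queues, weights, budget):
    """Serve per-customer packet queues in weighted round robin order.

    queues  -- dict mapping customer -> deque of queued packets
    weights -- dict mapping customer -> packets served per round
    budget  -- total packets the link can transmit in this pass

    An empty queue is skipped, so the unused share of an idle
    customer is effectively redistributed to the active customers.
    """
    sent = []
    while budget > 0 and any(queues.values()):
        for customer, weight in weights.items():
            q = queues[customer]
            for _ in range(min(weight, budget)):
                if not q:
                    break          # idle customer: skip to the next queue
                sent.append((customer, q.popleft()))
                budget -= 1
    return sent

# Illustrative configuration: "gold" gets 3 packets per round, "bronze" 1.
queues = {"gold": deque(["g1", "g2", "g3", "g4"]), "bronze": deque(["b1", "b2"])}
weights = {"gold": 3, "bronze": 1}
order = [c for c, _ in weighted_round_robin(queues, weights, 6)]
# Under load, gold receives roughly 3x the bronze service rate.
```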
The access router implements a partial TCP stack and manipulates the TCP header information to control how much traffic flows into the network. The amount of data that TCP sends into the network is governed by its congestion control algorithm: the sender bounds the number of bytes in flight and releases new data only as acknowledgments arrive from the other side. By splitting an acknowledgment of, for example, 1024 bytes into two acknowledgments of 512 bytes each, the router can smooth the sender's transmission rate. Similarly, capping the rate at which these bytes are acknowledged caps the maximum bandwidth that any application can use in the network.
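The two manipulations above reduce to simple arithmetic, sketched below under the same assumptions as the text (the function names are illustrative, not part of any real router interface). Splitting an ACK produces smaller acknowledgment increments, and because the sender's output is clocked by incoming ACKs, the ACK release rate bounds its throughput.

```python
def split_ack(ack_bytes, parts):
    """Split one cumulative acknowledgment into `parts` smaller ACKs
    that together acknowledge the same number of bytes, smoothing the
    sender's transmission clock."""
    base, rem = divmod(ack_bytes, parts)
    return [base + (1 if i < rem else 0) for i in range(parts)]

def max_sender_rate(acks_per_sec, bytes_per_ack):
    """Upper bound on sender throughput under ACK pacing: the sender
    releases new data only as acknowledgments arrive (ACK clocking),
    so its rate cannot exceed ACK rate times bytes acknowledged."""
    return acks_per_sec * bytes_per_ack

# The 1024-byte ACK from the text becomes two 512-byte ACKs,
# and releasing 100 such ACKs per second caps the sender at
# 100 * 512 = 51200 bytes/s.
halves = split_ack(1024, 2)
cap = max_sender_rate(100, 512)
```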
Rates are allocated to different servers, each of which implements packet shaping and rate control at the server. This shaping can be done at the TCP connection level and is therefore more effective at meeting the desired throughput on a per-connection basis. The advantage of this approach is that unnecessary packet losses (and the resulting TCP rate reduction) are avoided, so server-based rate control tends to produce more predictable application performance than network-based rate control. However, because the bandwidth is partitioned among the different customers, excess capacity assigned to a customer that is not using it cannot be used by any other customer. This approach also requires that the server's TCP/IP stack support packet shaping and rate control, and not all commercial servers provide this support.
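A common way to implement the server-side shaping just described is a token bucket, sketched below. This is an illustrative model rather than a stack implementation; the rate and burst figures in the example are arbitrary. The key property matching the text is that a packet exceeding the rate is held rather than dropped, so TCP never sees a loss and never cuts its rate.

```python
import time

class TokenBucket:
    """Per-connection traffic shaper (token bucket model).

    rate  -- sustained sending rate in bytes per second
    burst -- bucket depth, i.e. the maximum burst in bytes
    now   -- clock function (injectable for testing)

    Tokens accumulate at `rate`; a packet is released only when
    enough tokens are present, so the connection stays within its
    allocated rate without any packets being dropped.
    """
    def __init__(self, rate, burst, now=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.now = now
        self.last = now()

    def allow(self, packet_bytes):
        t = self.now()
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True   # transmit now
        return False      # hold the packet: delay, not loss
```

Because the shaper delays rather than discards, the sender's congestion window is never collapsed by an artificial drop, which is exactly the predictability advantage claimed for server-based rate control.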
Apart from bandwidth, the other resources in the server farm can also be allocated to the various customers. Depending on the hosting environment, these resources could take the form of the space within the hosting complex, the number of servers, the disk space allocated from a common pool, or the number of access links available on the back end. The service provider must put into place policies that dictate how many servers will be allocated to each customer and under what conditions. Similarly, other resources in the server farm may be reallocated among the different customers.
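A server-allocation policy of the kind described above might be expressed as a simple rule over utilization thresholds. The sketch below is purely illustrative: the policy fields, threshold values, and function name are assumptions for the example, not a standard interface.

```python
def servers_to_allocate(policy, utilization, current_servers):
    """Decide how many servers a customer should hold under a simple
    utilization-based reallocation policy.

    policy          -- dict with 'min'/'max' server counts and
                       'scale_up'/'scale_down' utilization thresholds
                       (all field names are illustrative)
    utilization     -- current average per-server load, 0.0 to 1.0
    current_servers -- servers currently assigned to the customer
    """
    if utilization > policy["scale_up"] and current_servers < policy["max"]:
        return current_servers + 1   # busy: grant one server from the pool
    if utilization < policy["scale_down"] and current_servers > policy["min"]:
        return current_servers - 1   # idle: return one server to the pool
    return current_servers           # within bounds: no change

# Illustrative policy for a premium customer: between 2 and 8 servers,
# grow above 80% utilization, shrink below 30%.
gold = {"min": 2, "max": 8, "scale_up": 0.8, "scale_down": 0.3}
decision = servers_to_allocate(gold, 0.9, 4)
```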