Scalability
Scalability describes how well an architecture performs as the size of the system increases. Scalability is usually divided into two types: vertical and horizontal. Vertical scaling means adding resources to an existing server to increase performance. Horizontal scaling means adding more servers to the configuration.
For solutions based on the Sun ONE Portal Server software, scalability is one of the factors that will define an architecture. This section describes how each component of the Sun ONE Portal Server scales, and the common problems and limitations to overcome.
Portal Server Instance Scalability
The Sun ONE Portal Server software can scale up to four CPUs for a single instance. This limit is due to the garbage collection system of the underlying Java VM, which becomes a bottleneck under heavy loads. If the Sun ONE Web Server is used as a web container, the Sun ONE Portal Server software can support multiple instances of the web server on the same machine. In this way, it can scale vertically.
When an Application Server is used as a web container, multiple instances on the same machine are not supported, so the only way to scale is horizontally by adding more servers to the configuration. In both cases, a load balancer or gateway servers should be used to provide a single system image.
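The two constraints above (a four-CPU limit per instance, and multiple instances per machine only with the Web Server container) can be turned into a rough planning rule. The following is a minimal sketch, not a supported sizing formula: the function name, the container labels, and the assumption of one instance per four CPUs are all illustrative.

```python
import math

# Per-instance scaling limit stated in the text.
CPUS_PER_INSTANCE = 4


def planned_instances(machine_cpus, container):
    """Rough number of Portal Server instances to plan for on one machine.

    `container` is a hypothetical label: "web_server" or "application_server".
    """
    if machine_cpus < 1:
        raise ValueError("machine_cpus must be at least 1")
    if container == "web_server":
        # Multiple instances per machine are supported, so assume
        # one instance for every four CPUs (rounded up).
        return math.ceil(machine_cpus / CPUS_PER_INSTANCE)
    if container == "application_server":
        # Only one instance per machine; further scaling is horizontal.
        return 1
    raise ValueError("unknown container: " + repr(container))


web_server_instances = planned_instances(12, "web_server")
app_server_instances = planned_instances(12, "application_server")
```

Under these assumptions, a 12-CPU machine would be planned with three instances behind the Web Server container, but only one behind an Application Server, which leaves the remaining capacity to be added as separate servers.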
The Sun ONE Portal Server software distribution contains a tuning utility that automates the tuning of a Portal Server node. The perftune script tunes the Solaris OS kernel and TCP parameters, and it modifies the Sun ONE Web Server software, the Sun ONE Directory Server software, the Sun ONE Identity Server software, and the Sun ONE Portal Server software configuration files.
The script implements two tuning strategies:
Production optimum for a high level of user requests from a small number of users
Production large for a low level of requests from a large number of users
The perftune script changes the application's configuration parameters to values that have been found to generally increase portal throughput, but further tuning might be required to achieve optimum performance. These changes must be tested and validated on each portal installation. Such tuning should be done by someone who has an in-depth understanding of the Portal Server component products.
Gateway Scalability
The gateways can scale both vertically, by running multiple instances of the Java technology-based process that implements the gateway, and horizontally, by adding more servers to the configuration. The best option is to use multiple small servers with up to four CPUs for the gateways, accessed through a load balancer. In this way, not only is it easy to scale the system by adding another server to the configuration when more power is required, but the load balancer can also fail over requests if a gateway server goes down. The same applies to the Netlet and Rewriter proxies, except that a load balancer is not required because the gateways will distribute the connections in a round-robin manner among all of the available Netlet and Rewriter proxies.
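The round-robin distribution mentioned above can be illustrated with a short sketch. This is not the gateway's actual implementation; the class name and the proxy host names are hypothetical, and the sketch ignores proxy failures.

```python
from itertools import cycle


class RoundRobinDistributor:
    """Hands out proxies in a fixed rotating order (illustrative only)."""

    def __init__(self, proxies):
        proxies = list(proxies)
        if not proxies:
            raise ValueError("at least one proxy is required")
        # cycle() repeats the list endlessly in its original order.
        self._rotation = cycle(proxies)

    def next_proxy(self):
        """Return the proxy that should receive the next connection."""
        return next(self._rotation)


# Hypothetical proxy host names, not taken from any real deployment.
distributor = RoundRobinDistributor(
    ["netlet-proxy-1", "netlet-proxy-2", "netlet-proxy-3"]
)
assignments = [distributor.next_proxy() for _ in range(5)]
```

Each new connection goes to the next proxy in the list and wraps around at the end, so adding a proxy to the list spreads new connections across one more server without any external load balancer.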
The only case in which it is recommended to use multiple gateway instances on the same server is when the gateway must be accessed through different URLs. In this case, each URL can have its own X.509 certificate associated with it. The only way to implement this is with multiple gateway instances because each instance uses a single certificate.
The gateways have been tested with most commercial load balancers, such as Alteon, Cisco CSS, BigIP, and Central Dispatch. However, the load-balancing requirement for the gateway is very basic: load balancing should be based on the SSL session ID. There is no need for the load balancer to analyze the HTTP header or use complex content management operations. The most cost-effective solution is to use an inexpensive load balancer that supports persistence based on the SSL session ID. The use of a software load balancer is probably the best option because it can be installed on the same server that is supporting the gateway; thus, there is no requirement to purchase, monitor, or maintain additional hardware.
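Persistence based on the SSL session ID can be sketched as a simple lookup table: the first connection for a session ID picks a gateway, and every later connection with the same ID returns to it. The class below is a hypothetical illustration of that idea, not the behavior of any of the products named above; the hash-based initial choice and the gateway names are assumptions.

```python
import hashlib


class SessionIdBalancer:
    """Keeps each SSL session ID pinned to one gateway (illustrative only)."""

    def __init__(self, gateways):
        self._gateways = list(gateways)
        if not self._gateways:
            raise ValueError("at least one gateway is required")
        self._sticky = {}  # SSL session ID -> chosen gateway

    def route(self, ssl_session_id):
        """Return the gateway for this session ID, choosing one if it is new."""
        gateway = self._sticky.get(ssl_session_id)
        if gateway is None:
            # Assumption: spread new sessions by hashing the session ID.
            digest = hashlib.sha256(ssl_session_id).digest()
            index = int.from_bytes(digest[:4], "big") % len(self._gateways)
            gateway = self._gateways[index]
            self._sticky[ssl_session_id] = gateway
        return gateway


# Hypothetical gateway names and session IDs.
balancer = SessionIdBalancer(["gateway-a", "gateway-b"])
first = balancer.route(b"\x01\x02\x03\x04")
again = balancer.route(b"\x01\x02\x03\x04")
```

Note that the balancer never inspects HTTP headers or content; the session ID alone is enough to keep a user on the same gateway.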
Directory Server Scalability
To avoid a bottleneck in access to the Directory Server, especially when multiple Portal Server instances are used, you should install a Directory Server instance on each Portal Server node. Without it, the requests from the Identity Server to the LDAP server could saturate the network connection between the Portal Server and the Directory Server. If a consumer replica is used on each node, keeping it up to date through the replication protocol is more efficient than pointing the Identity Server at an LDAP server on a different machine. The only caveat is that the Identity Server administration console must be used against an LDAP master server because the Identity Server is affected by the propagation delay of the replication protocol.
SSL Accelerators
SSL encryption and decryption can use a lot of processor cycles, limiting the number of SSL transactions per second that the system can sustain. To increase the ability to handle SSL transactions, as well as to reduce processor overhead, it is common to offload SSL processing from the server to a dedicated board installed on the system (internal SSL accelerator) or to a network device (external SSL accelerator).
External SSL accelerators, such as Cisco's CSS SSL module, are not supported on the Portal Server instances. Internal SSL accelerators, such as the Sun™ Crypto Accelerator 1000 board, will only accelerate the establishment of an SSL session, not the bulk encryption. Thus, SSL accelerators are a good option for portals with many short-lived sessions (that is, sessions lasting only a few minutes). If sessions are kept open for long periods of time, the accelerators will provide very little relief.
The Netlet component of the Sun ONE Portal Server software also does not support external accelerators, but you can use internal SSL accelerators. However, because the Netlet is used to proxy TCP sessions that tend to be of long duration, it is very unlikely that an SSL accelerator will provide significant gains in performance.
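The reasoning about session length can be made concrete with some arithmetic. Because an internal accelerator offloads only session establishment, the share of SSL processor time spent on the handshake is an upper bound on what it can save. The sketch below assumes one handshake per session, and the cost figures are invented placeholders, not measurements of any product.

```python
def handshake_share(handshake_cpu_ms, bulk_cpu_ms_per_minute, session_minutes):
    """Fraction of a session's SSL CPU time spent establishing the session."""
    if min(handshake_cpu_ms, bulk_cpu_ms_per_minute, session_minutes) < 0:
        raise ValueError("inputs must be non-negative")
    bulk_cpu_ms = bulk_cpu_ms_per_minute * session_minutes
    total_cpu_ms = handshake_cpu_ms + bulk_cpu_ms
    if total_cpu_ms == 0:
        raise ValueError("total SSL cost must be greater than zero")
    return handshake_cpu_ms / total_cpu_ms


# Hypothetical costs chosen only to illustrate the trend.
HANDSHAKE_CPU_MS = 10.0
BULK_CPU_MS_PER_MINUTE = 2.0

short_share = handshake_share(HANDSHAKE_CPU_MS, BULK_CPU_MS_PER_MINUTE, 2)
long_share = handshake_share(HANDSHAKE_CPU_MS, BULK_CPU_MS_PER_MINUTE, 120)
```

With these placeholder numbers, the handshake accounts for roughly 71 percent of the SSL cost of a two-minute session but only 4 percent of a two-hour session, which is why long-lived Netlet sessions gain little from an accelerator.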
Sizing
The sizing of a Portal Server is an extremely difficult task because it is impossible to test and benchmark all of the applications that can be sent through the portal solution. Sizing is also difficult because each customer might want to integrate applications in different ways, possibly using their own customizations. There is a Sun internal Portal Server sizing tool, as mentioned in the Sun ONE Portal Server Deployment Guide. This sizing tool was built using benchmarks that simulated a limited number of possible scenarios.
To adapt the results from the tool to the reality of a given Portal Server, the sizing tool requires you to use a SHARP factor to propose the number of CPUs that are needed to support the Portal Server software. The SHARP factor summarizes in a single number the differences between your environment and the environment used as the baseline when the sizing tool was built. Because the performance details of the desired production environment are not known and the scenarios used to build the sizing tool are not fully disclosed, a certain amount of guesswork is needed to obtain a final configuration using this tool.
A more scientific approach is to use the concept of building blocks, as explained in the Sun ONE Portal Server Deployment Guide. You can use the sizing tool to determine a building block size (that is, what kind of servers should be used as the Portal Server nodes). Then, you can use that building block size to implement a pilot that integrates most of the applications that will be accessed through the Portal Server and test this configuration with a limited number of real users.
The usage patterns of the users can be collected and used to create a load test script for any commercial load-testing tool. The test script enables you to obtain a detailed load curve for the building block. With this information, you can size your production architecture, based on the applications to be integrated and the expected number of users of the system. You can also use the load curve to predict when a building block must be added to the production architecture, for example when the number of users reaches a certain limit.
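As a rough illustration of sizing from a load curve, the sketch below reads the capacity of one building block off a measured curve and divides the expected user count by it. The data points, the response-time target, and the function names are hypothetical; a real curve would come from the pilot's load test.

```python
import math


def block_capacity(load_curve, max_response_s):
    """Largest tested user count whose response time meets the target.

    `load_curve` is a list of (concurrent_users, response_time_seconds)
    points measured on a single building block.
    """
    acceptable = [users for users, resp in load_curve if resp <= max_response_s]
    if not acceptable:
        raise ValueError("no measured point meets the response-time target")
    return max(acceptable)


def blocks_needed(expected_users, capacity_per_block):
    """Number of building blocks required, rounded up to a whole block."""
    if capacity_per_block <= 0:
        raise ValueError("capacity_per_block must be positive")
    return math.ceil(expected_users / capacity_per_block)


# Hypothetical load curve for one building block.
load_curve = [(100, 0.4), (200, 0.6), (400, 1.1), (800, 2.9), (1600, 9.5)]

capacity = block_capacity(load_curve, max_response_s=3.0)
blocks = blocks_needed(expected_users=5000, capacity_per_block=capacity)
```

With this invented curve, one block sustains 800 users within a three-second target, so 5000 expected users would call for seven blocks. The same calculation, repeated as the user count grows, indicates when the next block should be added.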