Web Applications
Web applications use enabling technologies to make their content dynamic and to allow users of the system to affect business logic on the server. The distinction between Web sites and Web applications is subtle and relies on the ability of a user to affect the state of the business logic on the server. Certainly, if no business logic exists on a server, the system should not be termed a Web application. For those systems on which the Web server, or an application server that uses a Web server for user input, allows business logic to be affected via Web browsers, the system is considered a Web application. For all but the simplest Web applications, the user needs to impart more than just navigational request information; typically, Web application users enter a varied range of input data: simple text, check box selections, or even binary and file information.
The distinction becomes even more subtle in the case of search engines, on which users do enter relatively sophisticated search criteria. Search engines that are Web sites simply accept this information, use it in some form of database SELECT statement, and return the results. When the user finishes using the system, there is no noticeable change in the state of the search engine, except, of course, in the usage logs and hit counters. This is contrasted with Web applications that, for example, accept online registration information. A Web site that accepts course registration information from a user has a different state when the user finishes using the application.
The architecture for a Web application is straightforward. It contains the same principal components as a Web site: a Web server, a network connection, and client browsers. Web applications also include an application server. The addition of the application server enables the system to manage business logic and state. A more detailed discussion of Web application architectures is given in Chapter 7, Defining the Architecture.
Client State Management
One common challenge of Web applications is managing client state on the server. Owing to the connectionless nature of client and server communications, a server doesn't have an easy way to keep track of each client request and to associate it with the previous request, since each and every Web page request establishes and breaks a completely new set of connections.
Managing state is important for many applications; a single use case scenario often involves navigating through a number of Web pages. Without a state management mechanism, you would have to continually supply all previous information entered for each new Web page. Even for the most simple applications, this can get tedious. Imagine having to reenter the contents of your shopping cart every time you visit it or to enter in your user name and password for each and every screen you visit while checking your Web-based e-mail.
To address this common problem, the IETF has proposed an HTTP state management mechanism.6 This mechanism, more commonly known as "cookies," has received quite a bit of attention from privacy advocates in the past few years and will most likely continue to do so as more interesting uses of this mechanism are found. This book isn't about privacy concerns but rather is focused on the technology around Web applications, so I'll focus on describing the technology and leave the philosophy to you.
Cookies
A cookie is a piece of data that a Web server can ask a Web browser to hold on to, and to return every time the browser makes a subsequent request for an HTTP resource to that server. Typically, the size of the data is small, between 100 and 1K bytes; however, the official limit is around 4K. Initially, a cookie is sent from a server to a browser by adding a line to the HTTP headers:
Content-type: text/html
Set-Cookie: sessionid=12345; path=/; expires=Mon, 09-Dec-2002 11:21:00 GMT; secure
If the browser is configured to accept cookies, the line is accepted and stored somewhere on the client's machine, depending on the browser vendor. After that, the browser sends the values of these cookies back with each and every HTTP request it makes to that server.
When it is sent to a client, a cookie can have up to six parameters passed with it:
Name (required)
Value (required)
Expiration date
Path
Domain
Requires a secure connection
The Set-Cookie header as a whole is a string that may contain white space, commas, and semicolons. The name and value parameters are required and must not contain white space, commas, or semicolons. The expiration date tells the browser how long to keep this information. The path and the domain determine which servers, and which resources on those servers, the cookie is sent back to. If the domain is not set explicitly, it defaults to the full domain of the document creating the cookie. The path helps organize cookies within a domain. Only when a resource is requested in the domain under the path will the cookie be sent back to the server.
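As a rough illustration of the parameters just described, the following Python sketch assembles a Set-Cookie value with the standard library's http.cookies module. The helper name build_set_cookie and its check on the name and value are assumptions made for this example, not part of any particular server environment.

```python
from http.cookies import SimpleCookie

# Characters the text says must not appear in a cookie name or value.
FORBIDDEN = set(" \t,;")


def build_set_cookie(name, value, expires=None, path=None, domain=None, secure=False):
    """Return the text that follows 'Set-Cookie: ' (hypothetical helper)."""
    for text in (name, value):
        if not text or FORBIDDEN & set(text):
            raise ValueError("name and value are required and must not "
                             "contain white space, commas, or semicolons")
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    # The remaining four parameters are optional.
    if expires:
        morsel["expires"] = expires
    if path:
        morsel["path"] = path
    if domain:
        morsel["domain"] = domain
    if secure:
        morsel["secure"] = True
    return morsel.OutputString()


header_value = build_set_cookie(
    "sessionid", "12345",
    expires="Mon, 09-Dec-2002 11:21:00 GMT", path="/", secure=True)
print("Set-Cookie: " + header_value)
```

The library chooses the order and capitalization of the attributes, so the printed line may not match the earlier example character for character, but it carries the same six kinds of information.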
The server sending the cookie must be a member of the domain that is specified. Thus, a server in the domain http://www.myserver.com cannot set a cookie for the domain http://www.otherserver.com. If it could, one company would be able to set cookies in another company's domain.
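The domain rule can be sketched as a simple suffix comparison. The function name may_set_cookie is hypothetical, and the check is deliberately simplified: real browsers apply further restrictions (for example, refusing a domain as broad as "com") that are omitted here.

```python
def may_set_cookie(server_host, cookie_domain):
    """Return True if server_host belongs to cookie_domain (simplified check)."""
    host = server_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    # The host must equal the domain or be a subdomain of it.
    return host == domain or host.endswith("." + domain)


print(may_set_cookie("www.myserver.com", "myserver.com"))
print(may_set_cookie("www.myserver.com", "www.otherserver.com"))
```

Comparing against "." plus the domain, rather than the bare domain, keeps a host such as evilmyserver.com from passing as a member of myserver.com.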
The server can send multiple Set-Cookie headers with an HTTP response. When the browser makes a later request, it returns all cookies that match the domain and the path in a single Cookie header. For example, the server could have included the following Set-Cookie headers:
Set-Cookie: sessionid=12345; path=/; expires=Mon, 09-Dec-2002 11:21:00 GMT
Set-Cookie: colorPref=Blue; path=/; expires=Mon, 09-Dec-2002 11:21:00 GMT
When it requests a URL in the path / on this same server, the client sends with its HTTP request the following:
Cookie: sessionid=12345; colorPref=Blue
When the response is returned, the server might set another cookie with:
Set-Cookie: rateCode=B; path=/order
When it requests a URL in path /order, a client sends:
Cookie: sessionid=12345; colorPref=Blue; rateCode=B
Note that all three cookies are sent with the request because the first two are in a higher path and are "inherited" in the mapping.
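The path "inheritance" in this exchange can be sketched in a few lines of Python. The function names and the list-of-tuples cookie store are assumptions for illustration; the sketch ignores domains and expiration and returns cookies in the order they were stored.

```python
def path_matches(cookie_path, request_path):
    """True if request_path is the cookie path or lies beneath it."""
    prefix = cookie_path.rstrip("/")
    return request_path == prefix or request_path.startswith(prefix + "/")


def cookie_header(stored, request_path):
    """Build the Cookie header a client would send for request_path."""
    pairs = [f"{name}={value}"
             for name, value, path in stored
             if path_matches(path, request_path)]
    return "Cookie: " + "; ".join(pairs)


# The three cookies from the example: (name, value, path).
stored = [
    ("sessionid", "12345", "/"),
    ("colorPref", "Blue", "/"),
    ("rateCode", "B", "/order"),
]

print(cookie_header(stored, "/catalog.html"))
print(cookie_header(stored, "/order/confirm.html"))
```

A request outside /order carries only the two cookies set on the root path, whereas a request under /order carries all three.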
Cookie values can be set not only by the server but also by JavaScript running in the browser. Chapter 3, Dynamic Clients, describes the capabilities of client-side scripting in more detail; here, however, it is sufficient to say that cookies can be set and obtained in multiple ways. The specific mechanisms for setting and accessing cookies are typically provided by the development environment and architecture, by a single function call to an accessible object.
This mechanism is not without faults. Privacy advocates point to cookies as the primary mechanism supporting the tracking of unknowing users across multiple Web sites. In fact, while writing this chapter, I wanted to look at some sample cookies on my machine. When I scanned the list, I was surprised to find a few cookies from domains that I know I had never visited. Of course, this piqued my curiosity, and as I looked at the data in the cookies, I noticed name/value pairs that included URLs from sites I do remember visiting. Investigating a little further, I found out that these cookies were placed on my machine through the use of banner ads that appeared in the sites that I did visit.
The reality here is that the images in most banner ads are not hosted by the sites that reference them. Rather, companies specializing in banner ads sell a service to Web sites. When someone visits one of those sites, the site provides most of the content of the Web page, as well as a reference to an image stored on the advertisement company's server. Because the image is obtained with a standard HTTP request, the exchange of cookies also happens with this "other" server. So when you visit a Web page that has banner ads in it, the ads most likely are coming from another company's server, and the cookies that accompany them are being collected and managed by that company. After a while, you will visit enough Web sites using the same advertiser's server that the banner ad company can start to build a profile of the sites you visit most and begin to target more appropriate advertisements at you.
Using cookies in this way is very controversial and has led to the heated debate on the use of cookies, privacy, and the Internet. But we won't focus on that type of usage here. Instead, we'll look at how cookies were intended to be used: to manage client state in the context of a single use case or set of use cases.
Sessions
A session represents a single cohesive use of the system. A session usually involves many executable Web pages and a lot of interaction with the business logic on the application server. Because achieving a use case goal often requires the successful execution of a number of executable Web pages, it is often useful to keep track of a client's progress throughout the use case session.
The most common example of keeping client state on the server can be found on the Internet at any e-commerce site. The use of virtual shopping carts is a nice feature of an online store. A shopping cart contains all the items an online customer has selected from the store's catalog. In most sites, the shopper can check the contents of the cart at any time during the session. This feature requires that the server be capable of maintaining some state about the client across a series of Web page requests.
Session state in a Web application can be maintained in four common ways, two of which require the use of cookies:
Place all state values in cookies.
Place a unique key in the cookie and use with a server-managed dictionary or map.
Include all state values as parameters in every URL of the system.
Include a unique key as a parameter in every URL of the system and use with a server-managed dictionary or map.
When you place all state values in cookies, you are limited first by size (4K per cookie) and by count (at most 20 cookies per domain). All state data must be encoded into simple text: no white space, semicolons, and so on. You can't directly use higher-level objects in the session state. The real limitation, however, is that many clients' security settings don't allow the automatic storing of cookies. If the application is an Internet application targeting the consumer market, you don't want to automatically turn away a significant number of potential customers without a good reason.
When a unique key is used in a cookie and then used on the server as a key into a dictionary or a map, any type of server-side object can be part of the session state. This is the default mechanism used by most Web application-enabling environments, such as ASP and JSP. It is very effective and flexible; however, like any cookie-based method, it depends on the willingness of clients to accept cookies.
URL redirection is the other class of session management. In this mechanism, all URLs in the system are dynamically constructed to include parameters that contain either the entire session state or only one key into a server-side dictionary.
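A minimal sketch of the key-in-URL variant, using Python's standard urllib.parse module, might look like the following. The function name add_session_key and the parameter name sessionid are assumptions for this example; a real environment would apply the same rewrite to every link and form action it generates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_session_key(url, session_key, param="sessionid"):
    """Return url with the session key added as a query parameter."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any stale copy of the session key.
    query = [(name, value)
             for name, value in parse_qsl(parts.query, keep_blank_values=True)
             if name != param]
    query.append((param, session_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_session_key("/order/cart.html?item=42", "12345"))
```

Because the key travels in the URL rather than in a cookie, this approach works even when the client refuses cookies, at the cost of having to construct every URL in the system dynamically.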
Each mechanism has tradeoffs. Keeping a dictionary in memory for every user of the system could be very expensive if it never expired. For practical reasons, most session dictionaries are removed when the Web application user either finishes the process or stops using the system for a set period of time. A session timeout value of 15 minutes is typical. No matter what technique is used, the management of client state on the server is almost always an issue in Web applications.
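The server-managed dictionary and its timeout can be sketched as follows. The class name SessionStore and its methods are hypothetical; environments such as ASP and JSP supply their own session objects, and this sketch shows only the general idea of a key that maps to arbitrary server-side state and expires after a period of inactivity.

```python
import secrets
import time


class SessionStore:
    """In-memory map from session key to state, with an idle timeout."""

    def __init__(self, timeout_seconds=15 * 60, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock      # injectable so the timeout can be tested
        self._sessions = {}      # key -> (time of last access, state dict)

    def create(self):
        """Start a new session and return its unique key."""
        key = secrets.token_hex(16)
        self._sessions[key] = (self._clock(), {})
        return key

    def get(self, key):
        """Return the state for key, or None if unknown or timed out."""
        entry = self._sessions.get(key)
        if entry is None:
            return None
        last_access, state = entry
        now = self._clock()
        if now - last_access > self._timeout:
            del self._sessions[key]   # expired: discard the state
            return None
        self._sessions[key] = (now, state)   # refresh the idle timer
        return state


store = SessionStore()
key = store.create()                   # value to place in the cookie or URL
store.get(key)["cart"] = ["UML book"]  # any server-side object can be stored
print(store.get(key))
```

The key is the only thing the client ever holds; whether it travels in a cookie or as a URL parameter makes no difference to the store.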
Enabling Technologies
The enabling technologies for Web applications are varied and differentiated principally by the vendor. Enabling technologies are, in part, the mechanism by which Web pages become dynamic and respond to user input. Of the several approaches to enabling a Web application, the earliest involved the execution of a separate module by a Web server. Instead of requesting an HTML-formatted page from the file system, the browsers would request the module, which the Web server interpreted as a request to load and to run the module. The module's output is usually a properly formatted HTML page but could be image, audio, video, or other data.
The original mechanism for processing user input in a Web system is the Common Gateway Interface (CGI), a standard way to allow Web users to execute applications on the server. Because letting users run applications on your Web server might not be the safest thing in the world, most CGI-enabled Web servers require CGI modules to reside in a special directory, typically named cgi-bin. CGI modules can be written in any language and can even be scripted. In fact, the most common language for small-scale CGI modules is Perl (Practical Extraction and Report Language), which is interpreted each time it is executed.
Even though HTML documents are the most common output of CGI modules, they can return any number of document types. They can send to the client an image, plain text (an ASCII document with no special formatting), audio, or even a video clip. They can also return references to other documents. In order for it to interpret the information properly, the browser must know what kind of document it is receiving. In order for the browser to know this, the CGI module must tell the server what type of document it is returning.
In order to tell the server what kind of document is being sent back (a full document or a reference to one), CGI requires a short header on the output. This header is ASCII text, consisting of separate lines followed by a single blank line. For HTML documents, the line would be
Content-type: text/html
If it does not build the returning HTML Web page, the CGI module can redirect the Web server to another Web page on the server or even another CGI module. To accomplish this, the CGI module simply outputs a header similar to
Location: /responses/default.html
In this example, the Web server is told to return the page default.html from the responses directory.
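The two kinds of CGI output described above can be sketched in Python. The function names are assumptions for this example; the essential points are the header line, the single blank line that ends the header, and the fact that a CGI module delivers its output by writing to standard output.

```python
import sys


def cgi_document(body, content_type="text/html"):
    """Return CGI output for a full document: header, blank line, body."""
    return f"Content-type: {content_type}\n\n{body}"


def cgi_redirect(location):
    """Return CGI output that refers the Web server to another resource."""
    return f"Location: {location}\n\n"


if __name__ == "__main__":
    # A CGI module writes its response to standard output.
    sys.stdout.write(cgi_document("<html><body>Order received.</body></html>"))
```

A module that does not build its own page would write cgi_redirect("/responses/default.html") instead.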
The two biggest problems with CGI are that it doesn't automatically provide session management services and that every execution of the CGI module requires a new and separate process on the application/Web server. Creating a lot of processes can be expensive on the server.
The solutions that followed CGI overcome its multiprocess problem by adding plug-ins to the Web server. The plug-ins allow the Web server to concentrate on servicing standard HTTP requests and to defer executable pages to another, already running process. Some solutions, such as Microsoft's Active Server Pages, can even be configured to run in the same process and address space as the Web server itself, although this is not recommended.
Two major approaches to Web application-enabling technologies are used today: compiled modules and interpreted scripts. Compiled-module solutions are CGI-like modules that are compiled loadable binaries executed by the Web server. These modules have access to APIs that provide the information submitted by the request, including the values and names of all the fields in the form and the parameters on the URL. These modules produce HTML output that is sent to the requesting browser. Some popular implementations of this approach are Microsoft's Internet Server API (ISAPI), Netscape Server API (NSAPI), and Java servlets.
ISAPI and NSAPI server extensions can also be used to manage user authentication, authorization, and error logging. These extensions to the Web server are essentially a filter placed in front of the normal Web server's processing.
Compiled modules are an efficient, suitable solution for high-volume applications. The biggest drawbacks are related to development and maintenance. These modules usually combine business logic with HTML page construction. The modules often contain many print lines of HTML tags and values, which can be confusing and difficult for a programmer to read.
The other problem is that each time the module needs to be updated, or fixed, the Web application has to be shut down and the module unloaded. For most mission-critical applications, this is not much of a problem; the rate of change in the application should be small. Also, it's likely that a significant effort would have been made by the QA/test team to ensure that the delivered application was free of bugs. For smaller, internal intranet applications, however, the rate of change might be significant. For example, the application might provide sets of financial or administrative reports. The logic in these reports might change over time, or additional reports might be requested.
The other category of solutions is scripted pages. Whereas the compiled-module solution looks like a business logic program that happens to output HTML, the scripted-page solution looks like an HTML page that happens to process business logic. A scripted page, a file in the Web server's file system, contains scripts to be interpreted by the server; the scripts interact with objects on the server and ultimately produce HTML output. The page is centered on a standard HTML Web page but includes special tags, or tokens, that are interpreted by an application server. Typically, the file name's extension tells the Web server which application server or filter should be used to preprocess the page. Some popular vendor offerings in this category are JavaServer Pages, Microsoft's Active Server Pages, and PHP.
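To make the idea of "an HTML page that happens to process business logic" concrete, here is a toy page processor in Python. The token syntax resembles the expression tags used by ASP and JSP, but the function render_page and its behavior are assumptions for illustration only; real engines interpret full scripts, not just named values.

```python
import re
from html import escape

# A token such as <%= user %> names a value to be filled in by the server.
TOKEN = re.compile(r"<%=\s*(\w+)\s*%>")


def render_page(page_text, values):
    """Replace each token in page_text with its HTML-escaped value."""
    def substitute(match):
        return escape(str(values[match.group(1)]))
    return TOKEN.sub(substitute, page_text)


page = "<html><body><p>Hello, <%= user %>! Items in cart: <%= count %></p></body></html>"
print(render_page(page, {"user": "Ann & Bob", "count": 3}))
```

Everything outside the tokens passes through untouched, which is why the page can be written and maintained as ordinary HTML.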
Figure 2-5 shows the relationship between components of the enabling technology and the Web server. The database in the figure, of course, could be any server-side resource, including external systems and other applications. This figure shows how the compiled-module solution almost intercepts the Web page requests from the Web server and in a sense acts as its own Web server. In reality, the compiled module must be registered with the Web server before it can function. Nonetheless, the Web server plays only a small role in the fulfillment of these requests.
FIGURE 2-5 Web server-enabling technologies
The scripted-page solution, however, is invoked by the Web server only after it has determined that the page does indeed have scripts to interpret. Typically, this is indicated by the file name extension: .aspx, .jsp, .php. When it receives a request for one of these pages, the Web server first locates the page in the specified directory and then hands that page over to the appropriate application server engine, or filter. The application server preprocesses the page, interpreting any server-side scripts in the page and interacting with server-side resources, if necessary. The result is a properly formatted HTML page that is sent to the requesting client browser.
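The extension-based handoff can be sketched as a lookup table. The engine labels and the function name choose_handler are placeholders invented for this example; an actual Web server reads such mappings from its own configuration.

```python
import posixpath

# Hypothetical mapping from file name extension to the engine that
# preprocesses pages of that type.
SCRIPT_ENGINES = {
    ".aspx": "asp-engine",
    ".jsp": "jsp-engine",
    ".php": "php-engine",
}


def choose_handler(request_path):
    """Return the engine for a scripted page, or 'static-file' otherwise."""
    extension = posixpath.splitext(request_path)[1].lower()
    return SCRIPT_ENGINES.get(extension, "static-file")


for path in ("/index.html", "/order/cart.jsp", "/reports/monthly.php"):
    print(path, "->", choose_handler(path))
```

Requests that match no scripted-page extension fall through to the Web server's normal handling of static files.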
Even though JavaServer Pages are scripted, they get compiled and loaded as a servlet the first time they are invoked. As long as the server page doesn't change, the Web server will continue to use the already compiled server page/servlet. This gives JavaServer Pages some performance benefits over the other scripted-page offerings.
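The compile-once behavior can be approximated with a small cache. This is only a sketch of the general idea, not how any JSP container is implemented: the class name, the version argument (which could be a file modification time), and the compile_page callback are all assumptions.

```python
class CompiledPageCache:
    """Compile a page on first use; recompile only when its version changes."""

    def __init__(self, compile_page):
        self._compile_page = compile_page   # turns page source into a callable
        self._cache = {}                    # path -> (version, compiled page)

    def get(self, path, version, source):
        cached = self._cache.get(path)
        if cached is not None and cached[0] == version:
            return cached[1]                # page unchanged: reuse it
        compiled = self._compile_page(source)
        self._cache[path] = (version, compiled)
        return compiled


compile_count = 0


def toy_compile(source):
    """Stand-in for a real page compiler; counts how often it runs."""
    global compile_count
    compile_count += 1
    return lambda: source.upper()


cache = CompiledPageCache(toy_compile)
cache.get("/cart.jsp", 1, "<p>cart</p>")
cache.get("/cart.jsp", 1, "<p>cart</p>")      # served from the cache
cache.get("/cart.jsp", 2, "<p>new cart</p>")  # page changed: compiled again
print(compile_count)
```

Only the first request for each version of a page pays the cost of compilation; later requests reuse the compiled form.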
The real appeal of scripted pages, however, is not their speed of execution but their ease of development and deployment. Typically, scripted pages don't contain most of the application's business logic, which instead is often found in compiled business objects that are accessed by the pages. Scripted pages are used mostly as the glue that connects the HTML user interface aspects of the system with the business logic components.
In any Web application, the choice of technologies depends on the nature of the application, the organization, and even the development team itself. On the server, a wealth of technologies and approaches may be used, many of them together. Regardless of the choices, they need to be expressed in the larger model of the system. The central theme in this book is that all the architecturally significant components of a Web application need to be present in the system's models. Servers, browsers, Web pages, and enabling technologies are architecturally significant elements and must be part of the model.