- Intended Audience
- Deployment Assumptions
- How the Gateway Works
- Concepts of the Rewriter
- Adding and Removing Rewriter Rules
- Methodology for Rule Extraction
- Out-Of-Box Rule Set
- Rewriting HTML Attributes
- Rewriting FORM Tag Input
- Rewriting JavaScript Content
- Rewriting Applet Parameters
- Rewriting Cascading Style Sheets
- Rewriting XML
- Performance
- Order Importance
- Case Studies: How to Configure the Gateway to Rewrite a Web-Based JavaScript Navigation Bar
- Third Party Application Cookbooks
- Exchange
- How to Get Hot Patches
- Glossary
- Acknowledgements
How the Gateway Works
The mainstay of the Sun ONE Portal Server gateway component is the ability to present content from backend web servers and application servers through a single interface to a remote end user in a secure fashion. This can be done in one of two ways using the gateway component.
The first is by using a netlet connection and tunneling the data to the client. The netlet is usually used for tunneling fixed-port TCP/IP protocols (such as telnet and IMAP) to specialized applications running on the remote client. The second is by redirecting content requests through the gateway by using the bookmark provider on the Portal Server desktop or by specifying the URL to the internal resource with the gateway URL prepended in the location field of the web browser. The second method is used primarily for securely accessing Intranet web content.
Gateway Details
To present Intranet content to a remote user, the gateway must first know what URLs are contained in the content itself. For HTML, this is a relatively easy task because the gateway knows which HTML tags and tag attributes represent URLs. One example is the Anchor tag with an HREF attribute. If the HREF value is an absolute URL, the gateway prepends its own URL to the original URL. When an end user selects that link, the request goes directly to the gateway, which then retrieves the content on behalf of the browser.
The following is an example of a URL:
<A HREF="http://www.internal.iplanet.com">
The above URL would become the following:
<A HREF="https://ips-gateway.iplanet.com/http://www.internal.iplanet.com">
This result is known as URL rewriting or URL translation. Using the same example, if the HREF value were relative to the server root, the gateway would first resolve the server URL based on the HTTP header information. It would then concatenate the gateway URL and the absolute Intranet URL to produce the final result.
The following is an example of a relative URL:
<A HREF="/pages/page2.html">
After passing through the gateway, the above would become the following:
<A HREF="https://ips-gateway.iplanet.com/http://www.internal.iplanet.com/pages/page2.html">
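The two translations above can be approximated in a few lines of Python. This is only a sketch of the idea, not the gateway's actual implementation: the function name rewrite_href and the page URL used as the resolution base are assumptions made for illustration, and the gateway host is taken from the examples above.

```python
from urllib.parse import urljoin

# Gateway host taken from the examples in the text.
GATEWAY_URL = "https://ips-gateway.iplanet.com"


def rewrite_href(href, page_url):
    """Resolve href against the page it came from, then prepend the gateway URL.

    rewrite_href is a hypothetical helper, not a Portal Server API.
    """
    # urljoin returns an absolute href unchanged and expands a
    # server-root-relative href using the scheme and host of page_url.
    absolute = urljoin(page_url, href)
    return GATEWAY_URL + "/" + absolute


# Assumed URL of the page that contains the links.
page = "http://www.internal.iplanet.com/index.html"

print(rewrite_href("http://www.internal.iplanet.com", page))
print(rewrite_href("/pages/page2.html", page))
```

The first call reproduces the absolute URL case and the second reproduces the server-root-relative case shown above.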
If no prepended path information is given for a URL, the gateway does one of the following to expand the relative path into an absolute URL:
It uses a BASE tag if one exists in the document.
It attempts to resolve the absolute URL by using the host and path information from the HTTP request header.
It does nothing, and the browser uses the URL in the location field as the URL to resolve the relative path when the page is rendered.
The third possibility is described in more detail in "Rewriter Versus Browser" on page 6 and applies more to pre-SP3 deployments. In addition, Portal Server releases earlier than SP3 (with the exception of SP2 Hot Patch 4) ignored the BASE tag altogether. This behavior also assumes that the HREF rule is still listed in the Rewrite HTML Attributes field of the gateway profile (it is there by default, out-of-box). How to add, remove, and view rules in the gateway profile is discussed in "Adding and Removing Rewriter Rules" on page 8.
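The three possibilities listed above can be sketched as a simple fallback chain. The text says only that the gateway does one of the three; treating them as an ordered priority list is an assumption of this sketch, and resolve_relative, base_href, and request_url are hypothetical names, not gateway settings.

```python
from urllib.parse import urljoin


def resolve_relative(href, base_href=None, request_url=None):
    """Return an absolute URL, or None to leave the link for the browser.

    Hypothetical helper; the priority order is an assumption.
    """
    if base_href:
        # 1. A BASE tag in the document supplies the base URL.
        return urljoin(base_href, href)
    if request_url:
        # 2. Otherwise use the host and path of the HTTP request.
        return urljoin(request_url, href)
    # 3. Do nothing; the browser resolves the path when it renders the page.
    return None


print(resolve_relative("page2.html", base_href="http://www.internal.iplanet.com/docs/"))
print(resolve_relative("page2.html", request_url="http://www.internal.iplanet.com/pages/index.html"))
print(resolve_relative("page2.html"))
```

Whenever an absolute URL is returned, the gateway URL would then be prepended to it as in the earlier examples.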
With this basic model in mind, there needs to be some way to tell the gateway that a particular string or piece of code actually represents a URL. The purpose of the rewriter is to provide a human-computer interface that supplies context where the gateway sees only syntax. Adding rules to different sections of the gateway profile changes the way in which the gateway parses, interprets, and modifies the content it returns.
Rule entries are simply a list of substring matches or regular expressions that the gateway uses to determine whether a string, or a portion of scripted code, needs to be rewritten. The rule entries are stored in LDAP and are part of the gateway profile. In SP3, the gateway can rewrite most of the HTML 4.0 tags and tag attributes. The exception is the STYLE attribute, which can contain a background URL parameter. JavaScript code is handled using a variety of methods described in "Rewriting JavaScript Content" on page 39. As of SP3, Form Input Data and Java Applet parameters can be rewritten with the appropriate gateway profile configuration. SP3 Hot Patch 1 adds the ability to rewrite XML data, inlined CSS, imported CSS, and imported JavaScript code (to some extent). SP3 Hot Patch 3 includes case-insensitive rewriting of the background-URL CSS function within a STYLE tag. This release also adds the ability to use wildcards with the JavaScript content, which is particularly useful for JavaScript document object arrays.
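To make the idea of rule matching concrete, the following sketch checks a name found in page content against a list of rule entries that may contain a * wildcard. The rule strings, the function names, and the choice to match the whole name are all assumptions made for illustration; the real rule syntax and matching behavior are defined by the gateway profile and are described in the later sections.

```python
import re

# Hypothetical rule entries; real entries live in the gateway profile in LDAP.
RULES = ["imgsrc", "document.images[*].src"]


def rule_to_regex(rule):
    """Turn a rule with '*' wildcards into a compiled regular expression."""
    # Escape everything literally, then let each '*' match any run of characters.
    return re.compile(re.escape(rule).replace(r"\*", ".*"))


def needs_rewriting(name, rules=RULES):
    """Return True if name matches any rule entry in full."""
    return any(rule_to_regex(rule).fullmatch(name) for rule in rules)


print(needs_rewriting("document.images[3].src"))
print(needs_rewriting("imgsrc"))
print(needs_rewriting("document.title"))
```

A wildcard entry of this kind lets one rule cover every element of a document object array, rather than requiring one rule per index.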
Rewriter Versus Netlet
As stated earlier, the Netlet is typically used to provide secure remote access to specialized, fixed-port TCP/IP applications that talk to their own client application residing on the end-user machine. The netlet does this by establishing a secure tunnel between the client and server, using a preconfigured local port that communicates directly with the gateway. Telnet, Citrix, and IMAP are just a few of the programs and protocols that leverage Netlet functionality.
The netlet is not used for web surfing because of the difficulty in configuring it to actually work in that manner. The browser would have to be configured so that all HTTP requests are redirected to the local host port on which the Netlet is listening. There would also have to be a web proxy configured on the corporate side that knows how to handle the incoming netlet HTTP requests. Those requests might be for content outside of the corporate Intranet and have to be handed off to another proxy. While this extends the Portal Server to include more VPN-like functionality, it would be difficult to implement, put undue strain on the gateway, and require client customizations that may not be feasible, depending on the client type, the end-user location, and the intended Portal Server audience.
Take, for example, a business-to-business portal that provides a parts ordering interface. The company providing the parts interface would not be able to dictate web browser configuration requirements to the parts ordering company. The netlet would be used in this case if the parts ordering interface was a TCP/IP application that had a separate client application to interact with it, rather than a Web-based interface.
In contrast to the netlet, the rewriter allows remote access through the use of a Netscape Navigator™ 4.x or Internet Explorer 5.x browser. The rewriter provides functionality similar to that of a full-featured reverse web proxy, with the added benefit of rewriting the URLs so that no browser configuration is required to ensure that requests for Intranet content are routed back to the gateway. This prevents the browser from trying to make direct requests for content that is not available outside of the corporate firewall.