Types of Proxies
Proxies can be used for several purposes. The classic use is as a proxy firewall located on the perimeter between the Internet and your private network. Proxies are not limited to this role though. Proxies can be used to accelerate web performance, provide remote access to internal servers, and provide anonymity for network conversations. In this section, we will highlight these other uses that can be made of proxy technology.
Web Proxies
Proxies are not just used to implement firewalls. One of their most popular uses inside a network is increasing web performance. Web conversations make up a large percentage of the traffic on many networks, so making the Web more efficient can have a dramatic impact on network operations. Proxies can help by monitoring web conversations and eliminating redundant requests. Web traffic is often characterized by frequent transmissions of nearly identical information. Some studies have shown that as much as half the requests for information across the Web are duplicates of other recent requests. Caching frequently requested web pages can dramatically speed up web browsing.
Proxy servers that provide web caching are often referred to as proxy caches or web caches. When a proxy cache is used, browsers are directed to make their HTTP requests to the proxy cache instead of directly to the destination web server (see Figure 4.3). The proxy then has the opportunity to determine whether it already has a copy of the requested information or if it needs to request a copy from the destination server. If the HTTP request is new, the proxy will make a TCP connection and HTTP request to the destination server, returning the resulting information to the browser and also storing a copy of the returned result for future use. Whenever any client of the proxy requests the same information, the proxy can reply using its local copy, eliminating the need to make a request from the destination server. This reduces network traffic as well as the load on the web server. However, it can introduce problems.
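The hit-or-miss decision described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any particular product works. The ProxyCache class and the fetch_from_origin callback are hypothetical names, the cache is keyed on the URL alone, and freshness is ignored entirely.

```python
from typing import Callable, Dict


class ProxyCache:
    """Toy in-memory proxy cache keyed by URL (hypothetical; no expiry handling)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin stands in for the TCP connection and HTTP request
        # the proxy would make to the destination server.
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Already cached: answer locally, no request to the origin server.
            self.hits += 1
            return self._store[url]
        # New request: fetch from the origin and keep a copy for future clients.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body
```

Two clients asking for the same URL would cause one miss followed by one hit, so the origin server sees only a single request.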
Caching works best when the information being retrieved does not change rapidly. However, some information is very time sensitive, such as stock quotes. This can cause problems if the client receives old information from the cache when newer, more relevant data is available on the web server. The term for this is freshness. A file is "fresh" if the version on the cache is the same as the version on the web server. Web servers can specify when a file should no longer be considered fresh by placing an "Expires:" header in the returned response. This tells any caches being used (whether proxy or browser based) when to discard the file and request a new one. Many web servers do not provide good expiration guidance, though. Because of this, it is important during the configuration of a proxy cache to establish good freshness policies.
Figure 4.3 Web caches accelerate performance by eliminating unnecessary server requests.
Freshness policies are normally developed using several values associated with the file. The most important, if supplied by the web server, is the "Expires:" field. This field is part of the HTTP protocol and, if configured by the web administrator, is provided in the server's response to a browser request. It allows the website to provide specific guidance concerning when a file should be disregarded. When this information is not available, though, the web proxy server will need to look at other data to make a freshness decision. One simple method would be to set a fixed time to cache all files that lack "Expires:" headers. The problem with this approach is that many sites with dynamic content that do not support "Expires:" will not work correctly when cached. A better approach is to use the age of the file to determine how long to cache. If a file is received that is seconds old, you might not want to cache it because it is much more likely that it was dynamically generated. A file that is weeks old, though, is much less likely to change while its copy is held in the cache. Even with files that have not been modified for a long time, it is still a good idea to periodically refresh the cached files, so most web proxy servers set a maximum time a file can be considered fresh.
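One way to picture such a policy is as a single function that returns how long a file may be served from the cache. The sketch below is only an illustration under stated assumptions: the function name and the three thresholds (MIN_AGE, AGE_DIVISOR, MAX_FRESH) are invented for this example, and the file's age is assumed to come from a last-modified timestamp supplied by the server. Real proxy products expose their own, differently named settings.

```python
from datetime import datetime, timedelta
from typing import Optional

# All three values are assumptions chosen for illustration, not recommendations.
MIN_AGE = timedelta(minutes=1)   # files younger than this are treated as dynamic
AGE_DIVISOR = 10                 # cache a file for one tenth of its current age
MAX_FRESH = timedelta(days=1)    # ceiling on any age-based lifetime


def freshness_lifetime(now: datetime,
                       expires: Optional[datetime] = None,
                       last_modified: Optional[datetime] = None) -> timedelta:
    """Return how long a response may be served from the cache."""
    if expires is not None:
        # Explicit server guidance takes priority; a past date means "do not cache".
        return max(expires - now, timedelta(0))
    if last_modified is None:
        # Nothing to base a decision on, so this sketch declines to cache.
        return timedelta(0)
    age = now - last_modified
    if age < MIN_AGE:
        # Very recent files are likely to have been generated dynamically.
        return timedelta(0)
    # Older files earn longer lifetimes, up to the configured maximum.
    return min(age / AGE_DIVISOR, MAX_FRESH)
```

With these assumed values, a file last modified ten hours ago would be cached for one hour, while a file untouched for sixty days would be held to the one-day ceiling.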
Another benefit that can be gained through web proxies is control over where users can browse. Security and productivity can be increased by limiting access to non-organization-related web browsing. It is not uncommon for viruses, worms, and other types of malicious code to be introduced into a protected network based on files downloaded by users from inappropriate websites. By limiting what sites users can reach, you can decrease the chance that this will happen to your network. Placing restrictions on browsing has also been shown to increase productivity by taking away the temptation to spend excessive time surfing the Web. However, not all organizations will want to or be able to place restrictions on user web behavior. Before considering web filtering, you must examine your site's policies and procedures regarding user web access. Often your Human Resources and Legal departments will need to be involved.
One last item to discuss with web proxies is the logging they can provide. As we showed earlier with RingZero, proxy logging can be very useful in detecting malicious activity on your network. With a web proxy, all the URLs that browsers request can be used for intrusion analysis. Looking for requests that do not appear normal can be a powerful method to discover attacks against your network. Often your proxy logs will contain the first indications that your network is under attack. Things to look for include excessive requests for files that do not exist on your web servers (such as those that return 404 errors). This can indicate that someone is scanning your websites looking for vulnerable software. Excessively long URL requests, or requests that contain special characters, can also indicate that someone is attacking your site. If you do discover that someone has successfully attacked your site, these logs can also be invaluable in discovering what weakness led to the compromise, how extensive the damage is, and (rarely) who is responsible.
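The three warning signs mentioned here (many 404 errors from one client, very long URLs, and special characters) lend themselves to a simple script. The following is a rough sketch only. It assumes the log has already been parsed into (client, url, status) records, which is a hypothetical simplification because real proxy log formats vary, and the thresholds and character pattern are arbitrary values picked for illustration.

```python
import re
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class LogEntry(NamedTuple):
    """Hypothetical, already-parsed proxy log record."""
    client: str
    url: str
    status: int


MAX_URL_LENGTH = 1024        # assumed threshold for an "excessively long" URL
MAX_404_PER_CLIENT = 20      # assumed threshold for "excessive" missing files
# Assumed pattern: encoded NUL, directory traversal, and shell/markup characters.
SUSPICIOUS_CHARS = re.compile(r"%00|\.\./|[<>'`;|]")


def find_suspicious(entries: Iterable[LogEntry]) -> List[Tuple[str, str]]:
    """Return (client, reason) pairs for requests that deserve a closer look."""
    findings: List[Tuple[str, str]] = []
    not_found: Counter = Counter()
    for entry in entries:
        if entry.status == 404:
            not_found[entry.client] += 1
        if len(entry.url) > MAX_URL_LENGTH:
            findings.append((entry.client, "long URL"))
        if SUSPICIOUS_CHARS.search(entry.url):
            findings.append((entry.client, "special characters"))
    # 404 counts are judged per client once the whole log has been read.
    for client, count in not_found.items():
        if count > MAX_404_PER_CLIENT:
            findings.append((client, "excessive 404s"))
    return findings
```

A report like this is only a starting point for analysis. Legitimate applications sometimes produce long or unusual URLs, so the thresholds would need tuning against what is normal on your own network.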
Reverse Proxies
Firewalls are frequently thought of as devices that restrict access, not enable it. However, proxy techniques can be used for both. If you have a need to support remote Internet users, reverse proxies can be the answer.
Reverse proxies are used to provide controlled access for external (normally Internet-based) users to internal servers. They act as a trusted intermediary that external users must use to gain access to internal servers that would not normally be Internet accessible. An external user attempting to gain access to an internal server first connects and authenticates to the reverse proxy. Normally this is done over a Secure Sockets Layer (SSL) connection to provide confidentiality and integrity for the session. If authentication is successful, the proxy will check its policy to see whether the user is allowed to access the requested server. If so, it will begin proxying the connection for the user.
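The authenticate-then-check-policy sequence can be sketched as a small decision function. Everything named here is hypothetical: the USERS and POLICY tables, the host names, and the authorize function exist only for this illustration. A real reverse proxy would verify credentials against a directory or hashed password store and would run this check inside the encrypted session.

```python
import hmac
from typing import Dict, Set

# Hypothetical data for illustration only. Plaintext passwords are used here
# to keep the sketch short; a real system would store salted hashes.
USERS: Dict[str, str] = {"alice": "s3cret-passphrase"}
POLICY: Dict[str, Set[str]] = {"alice": {"mail.internal.example"}}


def authorize(user: str, password: str, server: str) -> str:
    """Decide whether the proxy should relay this user's connection."""
    expected = USERS.get(user)
    # Step 1: authenticate. compare_digest avoids leaking timing information.
    if expected is None or not hmac.compare_digest(
            expected.encode(), password.encode()):
        return "deny: authentication failed"
    # Step 2: check policy for the requested internal server.
    if server not in POLICY.get(user, set()):
        return "deny: not permitted by policy"
    # Step 3: both checks passed, so the proxy may begin relaying traffic.
    return "proxy"
```

Under these assumptions, alice would be relayed to mail.internal.example but refused for any other internal host, even with a correct password.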
The types of internal servers that can be accessed using a reverse proxy vary depending on the sophistication of the proxy. Simple reverse proxies can only support web-based services. These products are basically normal web proxies that have been enhanced to support user authentication. In many cases, they are sufficient because many sites provide a significant amount of their network content using web systems. If you are trying to grant access to other applications that do not have a web interface, you may need to work harder.
One approach is placing a web interface on top of the application you are trying to proxy. Once the application is web enabled, normal reverse proxy techniques can be used to grant remote access. An example of this is Microsoft's Outlook Web Access (OWA). OWA is part of Microsoft Exchange and provides a web version of the Outlook mail and calendaring application. Any client that can make a web connection to the OWA application will be able to use most Outlook functions. In fact, it can be difficult to recognize that you're accessing Outlook through a browser because the interface inside the browser so closely resembles the desktop version of Outlook. OWA combined with a reverse proxy provides a secure mail and calendaring solution for remote users.
Alternatively, you can roll the web-enabling technology together with a reverse proxy. This is the approach taken by Citrix MetaFrame. Citrix allows common desktop and server applications to be accessed by web browsers, including applications such as Microsoft Word and Adobe Acrobat. In fact, Citrix can proxy an entire user desktop through a browser, giving a user experience that is highly similar to sitting in front of the actual computer. Citrix also provides extensive management controls, including role-based access to internal applications. Although a capable product, it is not necessarily cheap and simple to implement. If you're considering technologies such as Citrix, make sure to include acquisition and operational costs in your analysis. In some cases, though, Citrix-like products can actually save you money by allowing shared access to products too expensive to place on every user's desktop.
Anonymizing Proxies
Privacy can be an important security service but can be a hard commodity to come by on the Internet. Almost all actions taken on a computer leave a digital trail. If you don't want someone else following that digital trail back to you, an anonymizing proxy may be the answer.
Anonymizing proxies work exactly like normal proxies, but are used for the purpose of protecting your identity while you use services across the Internet. Your requests are forwarded to the anonymizing proxy (usually over an SSL connection), which hides your identifying details (such as IP address) by making the request on your behalf. The destination server you are using only learns about the proxy's information and does not learn who actually made the request. This assumes that you do not pass anything identifying in the actual request.
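To make the last point concrete, HTTP requests often carry headers that can identify the sender even when the IP address is hidden. The sketch below shows one way a proxy, or a cautious client, might drop such headers before forwarding a request. The list of headers and the scrub_headers name are assumptions made for this example, and the source does not state that any given anonymizer does this. Identifying data can also appear in the URL or request body, which this sketch does not touch.

```python
from typing import Dict

# Assumed list of headers that commonly reveal something about the requester.
IDENTIFYING_HEADERS = {
    "cookie", "referer", "from", "user-agent", "x-forwarded-for", "via",
}


def scrub_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the request headers without the identifying ones."""
    # Header names are case-insensitive, so compare in lowercase.
    return {name: value for name, value in headers.items()
            if name.lower() not in IDENTIFYING_HEADERS}
```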
Also assumed is that no one is monitoring the anonymizing proxy. If they were, they might be able to match incoming requests to outgoing requests, breaching an important aspect of the connection's privacy. This is especially easy to do if the proxy is not busy. If yours is the only IP address connected to the proxy, it's not terribly hard to guess who is making requests through the proxy!
Various approaches have been used to solve this problem. One of the most popular is proxy chaining. Tools such as SocksChain (http://www.ufasoft.com/socks) can be used to build connections through multiple anonymizing proxies. An observer at the first proxy in the chain will only see that you are sending a request to the anonymizer, but will not learn the destination because the next hop will only be another anonymizer. In this way, the ultimate destination of your request is hidden from any outside observers (see Figure 4.4). Another approach along the same lines is Onion routing (http://www.onion-router.net), which combines proxy chaining with multiple layers of encryption to ensure that a conversation cannot be followed through the proxy nodes.
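A small model helps show why chaining works. In the sketch below, each link in the chain is represented by the pair of addresses an eavesdropper on that link would see. The function and the node names are hypothetical, and the model ignores encryption, timing analysis, and observers who can watch several links at once.

```python
from typing import List, Sequence, Tuple


def link_views(client: str, proxies: Sequence[str],
               destination: str) -> List[Tuple[str, str]]:
    """Return the (sender, receiver) pair visible on each link of the chain."""
    path = [client, *proxies, destination]
    # Pair every node with the next one: that is all a single link reveals.
    return list(zip(path, path[1:]))


def link_exposes_both(views: Sequence[Tuple[str, str]],
                      client: str, destination: str) -> bool:
    """True if any single link shows the client and the destination together."""
    return any(set(view) == {client, destination} for view in views)
```

With at least one proxy in the path, no single link shows both endpoints. With only one proxy, though, an observer who watches both of that proxy's links sees the client on one side and the destination on the other, which is the correlation problem described earlier. Longer chains force the observer to watch more nodes.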
Figure 4.4 Proxy chains allow private communications by hiding the true source and destination of a packet from network eavesdroppers.
If you are in need of an anonymizer service, but do not want to set your own up, preexisting services are available on the Internet. Searching on Google for "anonymizers" will return many sites offering this privacy service. However, caveat emptor: You should trust that they maintain your privacy slightly less than you trust them.
A perfect case in point is the Java Anonymous Proxy (JAP). JAP is an anonymizer service run as a joint effort of the Dresden University of Technology, the Free University of Berlin, and the Independent Centre for Privacy Protection, Schleswig-Holstein, Germany. It is available at http://anon.inf.tu-dresden.de/index_en.html. Back in July of 2003, it was discovered that they had, as a result of a court order, added code to JAP that was designed to monitor access to certain IP addresses. Whenever a user of the service accessed one of these forbidden sites, a message was generated recording the who, what, and when and sent to the police. This hidden behavior was uncovered several days later by an observant user of the service, but until this discovery was made, users of the JAP service were getting less privacy than they thought. For the record, the current version of JAP is supposed to be free of any tattletale code.