17.5 The URL Class
The URL class provides simple access to URLs. The class automatically parses a string for you, letting you retrieve the protocol (e.g., http), host (e.g., java.sun.com), port (e.g., 80), and filename (e.g., /reports/earnings.html) separately. The URL class also provides an easy-to-use interface for reading remote files.
Reading from a URL
Although writing a client to explicitly connect to an HTTP server and retrieve a URL was quite simple, this task is so common that the Java programming language provides a helper class: java.net.URL. We saw this class when we looked at applets (see Section 9.5, "Other Applet Methods"): an object of this type had to be passed to getAppletContext().showDocument. However, the URL class can also be used to parse a string representing a URL and to read the contents of the resource it names. An example of reading from a URL is shown in Listing 17.11.
Listing 17.11 UrlRetriever2.java
import java.net.*;
import java.io.*;

/** Read a remote file using the standard URL class
 *  instead of connecting explicitly to the HTTP server.
 */

public class UrlRetriever2 {
  public static void main(String[] args) {
    checkUsage(args);
    try {
      URL url = new URL(args[0]);
      BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream()));
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println("> " + line);
      }
      in.close();
    } catch(MalformedURLException mue) { // URL constructor
      System.out.println(args[0] + " is an invalid URL: " + mue);
    } catch(IOException ioe) { // Stream constructors
      System.out.println("IOException: " + ioe);
    }
  }

  private static void checkUsage(String[] args) {
    if (args.length != 1) {
      System.out.println("Usage: UrlRetriever2 <URL>");
      System.exit(-1);
    }
  }
}
Here is the UrlRetriever2 in action:
Prompt> java UrlRetriever2 http://www.whitehouse.gov/
> <HTML>
> <HEAD>
> <TITLE>Welcome To The White House</TITLE>
> </HEAD>
> ... Remainder of HTML document omitted ...
> </HTML>
This implementation just prints out the resultant document, not the HTTP response lines included in the original "raw" UrlRetriever class. However, another Java class called URLConnection will supply this information. Create a URLConnection object by calling the openConnection method of an existing URL, then use methods such as getContentType and getLastModified to retrieve the response header information. See the on-line API for java.net.URLConnection for more details.
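As a minimal sketch of that approach, the following class prints two header values instead of the document body. The class name UrlHeaders and the helper method describe are our own illustrative names, not part of the original listings; openConnection, getContentType, and getLastModified are the standard methods named above. Note that getLastModified returns 0 when the server does not supply the value.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.Date;

/** Print selected response headers for a URL instead of its body.
 *  (UrlHeaders and describe are hypothetical names used for illustration.)
 */
public class UrlHeaders {

  /** Build a short report from the headers of a connection.
   *  Asking for a header value connects to the resource if needed.
   */
  public static String describe(URLConnection connection) {
    long lastModified = connection.getLastModified(); // 0 if not known
    String modified =
      (lastModified == 0) ? "unknown" : new Date(lastModified).toString();
    return "Content-Type: " + connection.getContentType() + "\n" +
           "Last-Modified: " + modified;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.out.println("Usage: UrlHeaders <URL>");
      return;
    }
    URL url = new URL(args[0]);
    URLConnection connection = url.openConnection();
    System.out.println(describe(connection));
  }
}
```

The same call sequence also works for non-HTTP URLs such as file: URLs, in which case the values are derived from the local file rather than from response headers.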
Other Useful Methods of the URL Class
The most valuable use of a URL object is to use the constructor to parse a string representation and then to use openStream to provide an InputStream for reading. However, the class is useful in a number of other ways, as outlined in the following sections.
public URL(String absoluteSpec)
public URL(URL base, String relativeSpec)
public URL(String protocol, String host, String file)
public URL(String protocol, String host, int port, String file)
These four constructors build a URL in different ways. All of them can throw a MalformedURLException.
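The following sketch exercises each of the four constructors. The class name UrlConstructors, the helper method resolve, the file name summary.html, and port 8080 are illustrative assumptions; the host and file come from the example at the start of this section. The second constructor is the one most worth trying: it resolves a relative specification against a base URL, the way a browser resolves a relative hyperlink.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Build URLs with each of the four constructors.
 *  (UrlConstructors and resolve are hypothetical names used for illustration.)
 */
public class UrlConstructors {

  /** Resolve relativeSpec against base and return the result as text. */
  public static String resolve(String base, String relativeSpec)
      throws MalformedURLException {
    return new URL(new URL(base), relativeSpec).toExternalForm();
  }

  public static void main(String[] args) throws MalformedURLException {
    String base = "http://java.sun.com/reports/earnings.html";

    // 1. Absolute specification.
    System.out.println(new URL(base));

    // 2. Relative specification resolved against a base URL.
    //    Prints http://java.sun.com/reports/summary.html
    System.out.println(resolve(base, "summary.html"));

    // 3. Separate protocol, host, and file; no explicit port.
    System.out.println(
      new URL("http", "java.sun.com", "/reports/earnings.html"));

    // 4. Separate protocol, host, port, and file.
    //    Prints http://java.sun.com:8080/reports/earnings.html
    System.out.println(
      new URL("http", "java.sun.com", 8080, "/reports/earnings.html"));

    // A string with no protocol is rejected by the constructor.
    try {
      new URL("java.sun.com");
    } catch(MalformedURLException mue) {
      System.out.println("Bad URL: " + mue.getMessage());
    }
  }
}
```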
public String getFile()
This method returns the filename (URI) part of the URL. See the output following Listing 17.12.
public String getHost()
This method returns the hostname part of the URL. See the output following Listing 17.12.
public int getPort()
This method returns the port if one was explicitly specified. If not, it returns -1 (not 80). See the output following Listing 17.12.
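Because getPort reports only an explicitly specified port, code that needs the port actually used for a connection has to supply the protocol's default itself. One way to do so, sketched here, is the getDefaultPort method, which was added to the URL class in JDK 1.4 and so is assumed to be available. The class name UrlPort and the method effectivePort are illustrative names only.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Determine the port a URL refers to, whether or not it was given explicitly.
 *  (UrlPort and effectivePort are hypothetical names used for illustration.)
 */
public class UrlPort {

  /** Return the explicit port if present, else the protocol's default port. */
  public static int effectivePort(URL url) {
    int port = url.getPort();      // -1 when no port appears in the URL
    return (port == -1) ? url.getDefaultPort() : port;
  }

  public static void main(String[] args) throws MalformedURLException {
    // Prints 80: no explicit port, so the HTTP default is used.
    System.out.println(effectivePort(new URL("http://java.sun.com/")));
    // Prints 8080: the explicit port wins.
    System.out.println(effectivePort(new URL("http://java.sun.com:8080/")));
  }
}
```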
public String getProtocol()
This method returns the protocol part of the URL (e.g., http). See the output following Listing 17.12.
public String getRef()
The getRef method returns the "reference" (i.e., section heading) part of the URL. See the output following Listing 17.12.
public final InputStream openStream()
This method returns the input stream that can be used for reading, as used in the UrlRetriever2 class. The method can also throw an IOException.
public URLConnection openConnection()
This method yields a URLConnection that can be used to retrieve header lines and (for POST requests) to supply data to the HTTP server. The POST method is discussed in Chapter 19 (Server-Side Java: Servlets).
public String toExternalForm()
This method gives the string representation of the URL, useful for printouts. This method is identical to toString.
Listing 17.12 gives an example of some of these methods.
Listing 17.12 UrlTest.java
import java.net.*;

/** Read a URL from the command line, then print
 *  the various components.
 */

public class UrlTest {
  public static void main(String[] args) {
    if (args.length == 1) {
      try {
        URL url = new URL(args[0]);
        System.out.println
          ("URL: " + url.toExternalForm() + "\n" +
           " File: " + url.getFile() + "\n" +
           " Host: " + url.getHost() + "\n" +
           " Port: " + url.getPort() + "\n" +
           " Protocol: " + url.getProtocol() + "\n" +
           " Reference: " + url.getRef());
      } catch(MalformedURLException mue) {
        System.out.println("Bad URL.");
      }
    } else
      System.out.println("Usage: UrlTest <URL>");
  }
}
Here's UrlTest in action:
Prompt> java UrlTest http://www.irs.gov/mission/#squeeze-them-dry
URL: http://www.irs.gov/mission/#squeeze-them-dry
 File: /mission/
 Host: www.irs.gov
 Port: -1
 Protocol: http
 Reference: squeeze-them-dry