SOAP
SOAP is the new standard for network communication between software services. It is a general-purpose technology for sending messages between endpoints, and may be used for RPC or straightforward document transfer. SOAP messages are represented using XML and can be sent over any transport layer. HTTP is the most common transport layer, with implementations also available for Simple Mail Transfer Protocol (SMTP), Java Message Service (JMS), and IBM MQSeries (see Figure 1.9).
FIGURE 1.9 SOAP messages are XML documents, usually sent over HTTP
The easiest way to publish a software component as a web service is to use a SOAP container which accepts incoming requests and dispatches them to published components, automatically translating between SOAP and the component's native language interface. SOAP containers are available for most programming languages, including Java, C++, Perl, and C# (see Figure 1.10).
FIGURE 1.10 A SOAP container converts XML messages into native calls
Once a component has been published as a web service, any SOAP-enabled client that knows the network address of the service and the messages that it understands can send a SOAP request and get back a SOAP response. To get the address and message information, SOAP clients read a WSDL file that describes the web service. Fortunately, most SOAP containers will automatically generate WSDL for the web services that they host, so developers don't have to write WSDL manually unless they really want to. Once the WSDL file is read, the client can start sending SOAP messages to the web service (see Figure 1.11).
FIGURE 1.11 A client needs WSDL before invoking the service
Publishing a Web Service
Before delving into the details of the SOAP protocol, I'll show you how easy it is to create and invoke a web service using a modern language like Java. The main thing to note is that no knowledge of SOAP or WSDL is necessary to immediately become a productive web services developer.
The following example shows the steps that are necessary to publish an object as a web service and then invoke it from a SOAP client. Although most examples in this book are written in Java, it is important to note that SOAP is language neutral and can support any combination of languages on the client and server. Some examples of Java programs talking to C# programs using SOAP are presented in the .NET chapter.
The object in this example is a simple stock trading service that defines a single method for buying stock. The buy() method returns the cost of purchasing a specified quantity of a particular stock. Here is the source code for the ITrader interface.
wsbook\src\book\soap\ITrader.java
package book.soap;

/**
 * An interface for buying stock.
 */
public interface ITrader
  {
  /**
   * @param quantity The number of shares to purchase.
   * @param symbol The ticker symbol of the company.
   * @throws TradeException If the symbol is not recognized.
   * @return The cost of the purchase.
   */
  float buy( int quantity, String symbol ) throws TradeException;
  }
The Trader class is a simple implementation of ITrader that uses hard-coded stock prices and throws an exception if it doesn't recognize a particular ticker symbol.
wsbook\src\book\soap\Trader.java
package book.soap;

/**
 * Simple implementation of ITrader.
 */
public class Trader implements ITrader
  {
  public float buy( int quantity, String symbol ) throws TradeException
    {
    if( symbol.equals( "IBM" ) )
      return 117.4F * quantity;
    else if( symbol.equals( "MSFT" ) )
      return 68.1F * quantity;
    else
      throw new TradeException( "symbol " + symbol + " not recognized" );
    }
  }
Notice that neither the interface nor the source code for the trader service contains any code related to SOAP or web services. Most SOAP containers are able to publish unmodified software components, which is good because domain objects should not be coupled to details of distributed computing.
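The listings refer to TradeException without showing it. A minimal sketch follows, assuming it is an ordinary checked exception that carries only a message; the class that ships with the book may differ, and the package declaration is omitted here so that the fragment compiles on its own.

```java
/**
 * Hypothetical sketch of the TradeException used by ITrader and Trader.
 * Assumption: a plain checked exception with a message-only constructor.
 */
class TradeException extends Exception {
    TradeException(String message) {
        super(message);
    }
}
```

Because it extends Exception rather than RuntimeException, callers of buy() must either catch it or declare it, which matches the throws clause in ITrader.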
Each SOAP container has different Application Programming Interfaces (APIs) for starting up an in-process HTTP server and for publishing objects as web services. Here is the way that you would start an HTTP server on http://localhost:8003/soap and export an instance of Trader using GLUE, the web services platform included with this book. GLUE is described in more detail in the next chapter.
wsbook\src\book\soap\TraderServer.java
package book.soap;

import electric.registry.Registry;
import electric.server.http.HTTP;

public class TraderServer
  {
  public static void main( String[] args ) throws Exception
    {
    // start a web server on port 8003, accept messages via /soap
    HTTP.startup( "http://localhost:8003/soap" );

    // publish an instance of Trader
    Registry.publish( "trader", new Trader() );
    }
  }
Binding to a Web Service
Once an object is published as a web service, a SOAP client can bind to it and invoke it. For example, here's what a SOAP client written using GLUE looks like. Fortunately, from a Java developer's viewpoint, a web service can be invoked as if it were a local object, with all the details of SOAP and WSDL hidden by the underlying infrastructure. Microsoft .NET provides a similar mechanism for C# and Visual Basic developers.
wsbook\src\book\soap\TraderClient.java
package book.soap;

import electric.registry.Registry;

public class TraderClient
  {
  public static void main( String[] args ) throws Exception
    {
    // the URL of the web service WSDL file
    String url = "http://localhost:8003/soap/trader.wsdl";

    // read the WSDL file and bind to its associated web service
    ITrader trader = (ITrader) Registry.bind( url, ITrader.class );

    // invoke the web service as if it was a local object
    float ibmCost = trader.buy( 54, "IBM" );
    System.out.println( "IBM cost is " + ibmCost );
    float tmeCost = trader.buy( 32, "TME" );
    System.out.println( "TME cost is " + tmeCost );
    }
  }
The binding process returns a proxy that implements a Java interface whose methods mirror those of the remote service. A message sent to the proxy is automatically converted into a SOAP request and delivered across the network; the SOAP response is then converted back into a regular Java result.
FIGURE 1.12 A client proxy hides the communication details from the application
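The proxy idea can be sketched with the JDK's own java.lang.reflect.Proxy. This is an illustration of the general technique only, not GLUE's implementation: the nested Trader interface is a local stand-in for ITrader, the argument element names (arg0, arg1) are placeholders because parameter names are not available through reflection by default, and the network round trip is replaced by a canned return value.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

/** Illustrative only: shows the proxy idea, not GLUE's actual implementation. */
class ProxySketch {
    /** Local stand-in for the book's ITrader interface (exception omitted). */
    interface Trader {
        float buy(int quantity, String symbol);
    }

    /** Returns a proxy that renders each call as XML into 'wire'. */
    static Trader bind(StringBuilder wire) {
        InvocationHandler handler = (proxy, method, args) -> {
            // This sketch does not support toString(), equals() or hashCode().
            if (method.getDeclaringClass() == Object.class) {
                throw new UnsupportedOperationException(method.getName());
            }
            // Turn the method call into a SOAP-like request element.
            wire.append("<n:").append(method.getName()).append(">");
            for (int i = 0; args != null && i < args.length; i++) {
                wire.append("<arg").append(i).append(">")
                    .append(args[i])
                    .append("</arg").append(i).append(">");
            }
            wire.append("</n:").append(method.getName()).append(">");
            // A real proxy would send the request and decode the response here;
            // this sketch returns a canned value instead.
            return 0.0F;
        };
        return (Trader) Proxy.newProxyInstance(
            Trader.class.getClassLoader(), new Class<?>[] { Trader.class }, handler);
    }

    public static void main(String[] args) {
        StringBuilder wire = new StringBuilder();
        Trader trader = bind(wire);
        trader.buy(54, "IBM");
        System.out.println(wire); // <n:buy><arg0>54</arg0><arg1>IBM</arg1></n:buy>
    }
}
```

The application code only ever sees the Trader interface, which is why a remote call reads exactly like a local one.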
When the TraderClient is executed, SOAP messages fly back and forth between the client and server, translated automatically between XML and native calls by the SOAP container. The first call succeeds and returns a value, whereas the second call throws an exception because the symbol TME is not recognized.
Here is the server output:
> java book.soap.TraderServer
GLUE 1.2 (c) 2001 The Mind Electric
startup server on http://199.174.20.117:8003/soap
Here is the client output:
> java book.soap.TraderClient
IBM cost is 6339.6
Exception in thread "main" book.soap.TradeException: symbol TME not recognized
> _
I hope this example has convinced you that web services programs can be written without any detailed knowledge of SOAP or WSDL. Now let's examine the SOAP messages in detail.
Anatomy of a SOAP Request
Here's what the SOAP request looks like when the example client sends a buy() message, with the method and arguments highlighted for clarity.
POST /soap/trader HTTP/1.1
Host: 199.174.18.220:8004
Content-Type: text/xml
User-Agent: GLUE/1.0
Connection: Keep-Alive
SOAPAction: "buy"
Content-Length: 525

<?xml version='1.0' encoding='UTF-8'?>
<soap:Envelope
    xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'
    xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
    xmlns:xsd='http://www.w3.org/2001/XMLSchema'
    xmlns:soapenc='http://schemas.xmlsoap.org/soap/encoding/'
    soap:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'>
  <soap:Body>
    <n:buy xmlns:n='http://tempuri.org/book.soap.Trader'>
      <quantity xsi:type='xsd:int'>54</quantity>
      <symbol xsi:type='xsd:string'>IBM</symbol>
    </n:buy>
  </soap:Body>
</soap:Envelope>
Even without an explanation of the SOAP format, you can probably figure out what most of it means. Contrast this with the CORBA and DCOM protocols, which are binary, not self-describing, and tough to trace. I know this firsthand, having written a CORBA ORB in a previous lifetime.
The first part of the SOAP request is a standard HTTP header that indicates that the request is an HTTP POST operation whose Uniform Resource Identifier (URI) is /soap/trader. The Content-Type field shows that the HTTP payload is XML, and the SOAPAction field tells the remote host that the content is a SOAP message. SOAPAction is often set to the name of the method to invoke so that the host web server or firewall can perform some high-level message filtering.
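To make the header fields concrete, here is a minimal sketch that builds (but does not send) an equivalent POST using the JDK's java.net.http API, which requires Java 11 or later. The class and method names are illustrative, and the endpoint URL assumes the example server from this section. The Host and Content-Length fields are not set explicitly because the HTTP client supplies them when the request is sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class SoapRequestSketch {
    /** Builds (but does not send) an HTTP POST carrying a SOAP envelope. */
    static HttpRequest build(String endpoint, String soapAction, String envelope) {
        return HttpRequest.newBuilder()
            .uri(URI.create(endpoint))
            .header("Content-Type", "text/xml")
            // The SOAPAction value is quoted, as in the request listing.
            .header("SOAPAction", "\"" + soapAction + "\"")
            .POST(HttpRequest.BodyPublishers.ofString(envelope))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request =
            build("http://localhost:8003/soap/trader", "buy", "<soap:Envelope/>");
        System.out.println(request.method() + " " + request.uri().getPath());
        System.out.println(request.headers().map());
    }
}
```

Passing the built request to HttpClient.send() would perform the actual exchange; that step is left out so the sketch runs without a server.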
The second part of the SOAP request is an XML document that consists of three main portions:
Envelope
The envelope defines the various XML namespaces that are used by the rest of the SOAP message, and typically includes xmlns:soap (SOAP envelope namespace), xmlns:xsi (XML Schema for instances), xmlns:xsd (XML Schema for data types) and xmlns:soapenc (SOAP encoding namespace). More information about these namespaces is presented later in this book.
Header
The header is an optional element for carrying auxiliary information for authentication, transactions, routing, and payments. Any element in a SOAP processing chain can add or delete items from the header; elements can also choose to ignore items if they are unknown. If a header is present, it must be the first child of the envelope. Because our example is simple and does not invoke routers, the header is absent.
Body
The body is the main payload of the message. When SOAP is used to perform an RPC call, the body contains a single element that contains the method name and arguments. The namespace of the method name is specified by the web service, and in this case is equal to http://tempuri.org/ followed by the type of the target web service. The type of each argument can be optionally supplied using the xsi:type attribute; in this example, the first argument is flagged as an xsd:int, and the second argument as an xsd:string. If a header is present, the body must be its immediate sibling; otherwise it must be the first child of the envelope.
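The envelope and body structure described above can be explored with the JDK's DOM parser. The following is a rough sketch, not a full SOAP processor: it assumes an RPC-style message, ignores any header, and simply returns the first element inside the body. The class and method names are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class EnvelopeParseSketch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Returns the first element child of the SOAP Body (the RPC method element). */
    static Element methodElement(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace lookups
            doc = factory.newDocumentBuilder()
                         .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("payload is not well-formed XML", e);
        }
        Node body = doc.getElementsByTagNameNS(SOAP_NS, "Body").item(0);
        if (body == null) {
            throw new IllegalArgumentException("no SOAP Body found");
        }
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        throw new IllegalArgumentException("SOAP Body is empty");
    }

    public static void main(String[] args) {
        // Abbreviated version of the request shown earlier.
        String xml =
            "<soap:Envelope xmlns:soap='" + SOAP_NS + "'>"
          + "<soap:Body>"
          + "<n:buy xmlns:n='http://tempuri.org/book.soap.Trader'>"
          + "<quantity>54</quantity><symbol>IBM</symbol>"
          + "</n:buy></soap:Body></soap:Envelope>";
        Element method = methodElement(xml);
        System.out.println(method.getLocalName() + " in " + method.getNamespaceURI());
        for (Node n = method.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                System.out.println("  " + n.getLocalName() + " = " + n.getTextContent());
            }
        }
    }
}
```

Looking elements up by namespace URI rather than by the soap: prefix matters, because the prefix is only a local alias that a sender is free to change.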
A SOAP request is typically accepted by a servlet, CGI or standalone daemon running on the remote web server. In this example, the GLUE SOAP container started a servlet running on localhost:8003/soap. When the servlet gets a request, it checks that the request has a SOAPAction field, and if it does, forwards it to the SOAP container. The container uses the POST URI to look up the target web service, parses the XML payload, and then invokes the method on the component.
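The lookup-and-invoke step can be sketched with plain Java reflection. This is a simplified illustration under several assumptions, not GLUE's code: the registry is a map keyed by URI path, the arguments are assumed to have already been converted from XML text into Java values, and methods are matched by name and argument count only. The nested Trader class is a cut-down stand-in for the component published earlier.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

class DispatchSketch {
    /** Cut-down stand-in for the book's Trader component. */
    public static class Trader {
        public float buy(int quantity, String symbol) {
            if (symbol.equals("IBM")) {
                return 117.4F * quantity;
            }
            throw new IllegalArgumentException("symbol " + symbol + " not recognized");
        }
    }

    private final Map<String, Object> services = new HashMap<>();

    /** Registers a component under a URI path such as /soap/trader. */
    void publish(String path, Object component) {
        services.put(path, component);
    }

    /** Finds the component by URI path and invokes the named method reflectively. */
    Object dispatch(String path, String methodName, Object... args) {
        Object target = services.get(path);
        if (target == null) {
            throw new IllegalArgumentException("no service at " + path);
        }
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(target, args);
                } catch (ReflectiveOperationException e) {
                    // A real container would convert this into a SOAP fault.
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no method " + methodName + " at " + path);
    }

    public static void main(String[] args) {
        DispatchSketch container = new DispatchSketch();
        container.publish("/soap/trader", new Trader());
        System.out.println(container.dispatch("/soap/trader", "buy", 54, "IBM"));
    }
}
```

Because the component is reached only through reflection, it needs no SOAP-specific code of its own, which is the decoupling the section describes.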
Anatomy of a SOAP Response
The result of the invocation is translated by the SOAP container into a SOAP response and returned back to the sender within the HTTP reply. Here's the SOAP response from the buy() message sent to the Trader service, with the result name and value highlighted for clarity.
HTTP/1.1 200 OK
Date: Sat, 19 May 2001 06:58:38 GMT
Content-Type: text/xml
Server: GLUE/1.0
Content-Length: 489

<?xml version='1.0' encoding='UTF-8'?>
<soap:Envelope
    xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'
    xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
    xmlns:xsd='http://www.w3.org/2001/XMLSchema'
    xmlns:soapenc='http://schemas.xmlsoap.org/soap/encoding/'
    soap:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'>
  <soap:Body>
    <n:buyResponse xmlns:n='http://tempuri.org/book.soap.Trader'>
      <Result xsi:type='xsd:float'>6339.6</Result>
    </n:buyResponse>
  </soap:Body>
</soap:Envelope>
The XML document is structured just like the request except that the body contains the encoded method result. By convention, the name of the result is equal to the name of the method followed by "Response", and the namespace of the result is the same as the namespace of the original method.
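The naming convention can be expressed in a few lines. The sketch below assembles a response body by string concatenation, which is enough to show the "method name plus Response" rule; it performs no XML escaping and omits the encoding attributes, so treat it as an illustration rather than a usable serializer. The class and method names are hypothetical.

```java
class ResponseSketch {
    /** Builds a response envelope whose body element is named method + "Response". */
    static String response(String method, String namespace, String xsdType, String result) {
        return "<soap:Envelope"
            + " xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'"
            + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
            + " xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
            + "<soap:Body>"
            // Same namespace as the request's method element.
            + "<n:" + method + "Response xmlns:n='" + namespace + "'>"
            + "<Result xsi:type='xsd:" + xsdType + "'>" + result + "</Result>"
            + "</n:" + method + "Response>"
            + "</soap:Body>"
            + "</soap:Envelope>";
    }

    public static void main(String[] args) {
        System.out.println(
            response("buy", "http://tempuri.org/book.soap.Trader", "float", "6339.6"));
    }
}
```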
SOAP Exceptions
If an exception occurs at any time during the processing of a message, a SOAP fault is generated and encoded in a manner similar to a regular SOAP response. Here is the SOAP response that is returned when our example client attempts to buy stock for a ticker symbol that is not recognized.
HTTP/1.1 500 Internal Server Error
Content-Type: text/xml
Content-Length: 244

<soap:Fault>
  <faultcode>soap:Server</faultcode>
  <faultstring>symbol TME not recognized</faultstring>
  <detail>
    <stacktrace>
      book.soap.TradeException: symbol TME not recognized
        at book.soap.Trader.buy(Trader.java:16)
        at java.lang.reflect.Method.invoke(Native Method)
    </stacktrace>
  </detail>
</soap:Fault>
The standard HTTP reply header indicates an exception by using status code 500. The XML payload contains an envelope and body just like a regular response (the listing above shows only the fault element), except that the content of the body is a soap:Fault structure whose fields are defined as follows:
faultcode
A code that indicates the type of the fault. The valid values are soap:Client (the message was incorrectly formed or lacked required information), soap:Server (the server could not process the message for reasons unrelated to its content), soap:VersionMismatch (invalid namespace for the Envelope element) and soap:MustUnderstand (a mandatory header entry was not understood).
faultstring
A human-readable description of the fault.
faultactor
An optional field that indicates the URI of the source of the fault.
detail
An application-specific XML document that contains detailed information about the fault.
Some SOAP implementations add an extra element that encodes information about remote exceptions, such as their type, data, and stack trace, so that they can be rethrown automatically on the client.
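On the client side, turning a fault back into an exception might look like the following sketch. It uses only the standard faultcode and faultstring fields; SoapFaultException is a hypothetical class invented for this illustration, and rethrowing the original exception type (as some implementations do) would need the extra, implementation-specific element mentioned above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FaultSketch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Hypothetical client-side exception carrying the standard fault fields. */
    static class SoapFaultException extends RuntimeException {
        final String faultCode;

        SoapFaultException(String faultCode, String faultString) {
            super(faultString);
            this.faultCode = faultCode;
        }
    }

    /** Throws SoapFaultException if the payload contains a Fault; otherwise returns. */
    static void throwIfFault(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                         .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("payload is not well-formed XML", e);
        }
        if (doc.getElementsByTagNameNS(SOAP_NS, "Fault").getLength() == 0) {
            return; // regular response, nothing to rethrow
        }
        throw new SoapFaultException(text(doc, "faultcode"), text(doc, "faultstring"));
    }

    /** Returns the text of the first element with the given unprefixed name, or null. */
    private static String text(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }
}
```

A client would call throwIfFault() on every reply before decoding the result, so that a fault surfaces as an exception at the call site instead of as a malformed return value.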
Performance
Now that you've seen how SOAP messages are passed back and forth using HTTP and XML, it is time to contemplate performance issues.
CORBA and DCOM use binary encoding for arguments and return values. In addition, they assume that both the sender and the receiver have full knowledge of the message context and do not encode any meta-information such as the names or types of the arguments. This approach results in good performance, but makes it hard for intermediaries to process messages. And since each system uses a different binary encoding, it's hard to build systems that interoperate.
Because SOAP uses XML to encode messages, it's very easy to process messages at every step of the invocation process. In addition, the ease of debugging SOAP messages is leading to a quick convergence of the various SOAP implementations, which is important because large-scale interoperability is what SOAP is all about.
On the surface, it seems that an XML-based scheme would be intrinsically slower than that of a binary-based model, but it's not as straightforward as that.
First, when SOAP is used for sending messages across the Internet, the time to encode and decode the messages at each endpoint is tiny compared with the time to transfer bytes between endpoints, so the overhead of using XML in this case is not significant.
Second, when SOAP is used to send messages between endpoints in a closed environment, such as between departments within the same company, it's likely that the endpoints will be running the same implementation of SOAP. In this case, there are opportunities for optimizations that are unique to that particular implementation. For example, a SOAP client could add an HTTP header tag to a SOAP request that indicates that it supports a particular optimization. If the SOAP server also supports that optimization, it could return an HTTP header tag in the first SOAP response that tells the client that it's okay to use that optimization in subsequent communications. At that point, both the client and the server could start using the optimization.
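The handshake described above can be sketched as two small functions. Everything specific here is an assumption made for illustration: the header name X-Soap-Optimization, the idea that an optimization is identified by a simple token, and the use of plain maps in place of real HTTP header objects (which would also need case-insensitive name matching).

```java
import java.util.Map;
import java.util.Set;

class NegotiationSketch {
    // Hypothetical header name; the text does not specify one.
    static final String HEADER = "X-Soap-Optimization";

    /** Server side: echo the offered optimization back only if it is supported. */
    static Map<String, String> serverReplyHeaders(
            Map<String, String> requestHeaders, Set<String> supported) {
        String offered = requestHeaders.get(HEADER);
        if (offered != null && supported.contains(offered)) {
            return Map.of(HEADER, offered);
        }
        return Map.of();
    }

    /** Client side: enable the optimization only after the server confirms it. */
    static boolean clientMayOptimize(String offered, Map<String, String> replyHeaders) {
        return offered.equals(replyHeaders.get(HEADER));
    }

    public static void main(String[] args) {
        Map<String, String> reply =
            serverReplyHeaders(Map.of(HEADER, "binary-xml"), Set.of("binary-xml"));
        System.out.println(clientMayOptimize("binary-xml", reply)); // true
    }
}
```

A server that does not know the header simply never echoes it, so the client keeps sending ordinary SOAP and interoperability is preserved.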
The fastest SOAP implementations typically get at least 500 messages/second on a 600MHz desktop PC when the client and the server are in different programs in the same machine, and around 300 messages/second on a fast local area network (LAN).
Other SOAP Features
The example in this section was very simple and demonstrated only a subset of SOAP functionality. Additional features, many of which are covered later in this book, include:
Arrays, objects, and other complex data structures may be sent across the network in a platform- and language-neutral way.
SOAP headers support security, transactions, and routing.
Custom encoding types may be defined.
SOAP supports request-response, one-way, solicit-response, and notification operations.
Now that you've seen what SOAP messaging looks like, it's time to look at WSDL.