- Chapter 3: Simple Object Access Protocol (SOAP)
SOAP Envelope Framework
The most important thing SOAP specifies is the envelope framework. Although it consists of just a few XML elements, it provides the structure and extensibility mechanisms that make SOAP so well suited as the foundation for all XML-based distributed computing. The SOAP envelope framework defines a mechanism for identifying what information is in a message, who should deal with the information, and whether processing that information is optional or mandatory. A SOAP message consists of a mandatory envelope wrapping any number of optional headers and a mandatory body. These concepts are discussed in turn in the following sections.
SOAP Envelope
SOAP messages are XML documents that define a unit of communication in a distributed environment. The root element of the SOAP message is the Envelope element. In SOAP 1.1, this element falls under the http://schemas.xmlsoap.org/soap/envelope/ namespace. Because the Envelope element is uniquely identified by its namespace, processing tools can immediately determine whether a given XML document is a SOAP message.
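As a minimal sketch of that check, the following Python snippet tests whether the root element of a document is an Envelope in the SOAP 1.1 namespace. The function name is_soap11_message and the two sample documents are illustrative assumptions, not part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def is_soap11_message(xml_text):
    """Return True if the document's root element is a SOAP 1.1 Envelope."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not well-formed XML, so not a SOAP message
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    return root.tag == f"{{{SOAP11_NS}}}Envelope"

# Hypothetical sample documents for illustration.
soap_doc = (
    '<e:Envelope xmlns:e="http://schemas.xmlsoap.org/soap/envelope/">'
    "<e:Body/></e:Envelope>"
)
plain_doc = "<Envelope><Body/></Envelope>"  # same local name, no namespace

print(is_soap11_message(soap_doc))   # True
print(is_soap11_message(plain_doc))  # False
```

The second document shows why the namespace matters: an element that merely happens to be called Envelope is not treated as a SOAP message.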
This certainly is convenient, but what do you trade off for this capability? The biggest thing you have to sacrifice is the ability to send arbitrary XML documents and perform simple schema validation on them. True, you can embed arbitrary XML inside the SOAP Body element, but naïve validation will fail when it encounters the Envelope element at the top of the document instead of the top document element of your schema. The lesson is that for seamless validation of arbitrary XML inside SOAP messages, you must integrate XML validation with the Web services engine. In most cases, the Web services engine will have to separate SOAP-specific from application-specific XML before validation can take place.
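One way an engine might separate SOAP-specific from application-specific XML is sketched below in Python: it pulls the children of the Body element out of the envelope so that each can be handed to a validator as a standalone document. The purchase-order payload, the urn:example:po namespace, and the function name extract_body_entries are assumptions made for illustration; the validation step itself is left out.

```python
import xml.etree.ElementTree as ET

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def extract_body_entries(xml_text):
    """Return the application-level elements found inside the SOAP Body."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP11_NS}}}Body")
    if body is None:
        raise ValueError("SOAP message has no Body element")
    return list(body)

# Hypothetical message carrying an application document in its body.
message = (
    '<e:Envelope xmlns:e="http://schemas.xmlsoap.org/soap/envelope/">'
    "<e:Body>"
    '<po:order xmlns:po="urn:example:po"><po:sku>123</po:sku></po:order>'
    "</e:Body>"
    "</e:Envelope>"
)

payload = extract_body_entries(message)[0]
# Serialize the payload on its own; this string no longer mentions Envelope
# and could be passed to a schema validator for the application schema.
standalone = ET.tostring(payload, encoding="unicode")
print(payload.tag)  # {urn:example:po}order
```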
The SOAP envelope can contain an optional Header element and a mandatory Body element. Any number of other XML elements can follow the Body element. This extensibility feature helps with the encoding of data in SOAP messages. We'll discuss it later in this chapter in the section "SOAP Data Encoding Rules."
SOAP Versioning
One interesting note about SOAP is that the Envelope element does not expose any explicit protocol version, in the style of other protocols such as HTTP (HTTP/1.0 vs. HTTP/1.1) or WDDX (<wddxPacket version="1.0"> ... </wddxPacket>). The designers of SOAP explicitly made this choice because experience had shown simple number-based versioning to be fragile. Further, across protocols, there were no consistent rules for determining what changes in major versus minor version numbers truly mean. Instead, SOAP leverages the capabilities of XML namespaces and defines the protocol version to be the URI of the SOAP envelope namespace. As a result, the only meaningful statement that you can make about two SOAP versions is that they are the same or different. It is no longer possible to talk about compatible versus incompatible changes to the protocol.
What does this mean for Web service engines? It gives them a choice of how to treat SOAP messages that have a version other than the one the engine is best suited for processing. Because an engine supporting a later version of SOAP will know about all previous versions of the specification, it has a range of options based on the namespace of the incoming SOAP message:
- If the message version is the same as any version the engine knows how to process, the engine can just process the message.
- If the message version is older than any version the engine knows how to process, the engine can generate a version mismatch error, attempt to negotiate the protocol version with the client by sending some information regarding the versions that it can accept, or both.
- If the message version is newer than any version the engine knows how to process, the engine can attempt to process the message anyway (typically not a good choice), or it can return a version mismatch error combined with some information about the versions it understands.
All in all, the simple versioning based on the namespace URI results in the fairly flexible and accommodating behavior of Web service engines.
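The decision described above can be sketched in a few lines of Python. Because namespace URIs carry no ordering, this sketch only distinguishes envelope namespaces the engine knows from those it does not; telling "older" from "newer" would require the engine to keep its own list of historical namespaces. The names SUPPORTED_VERSIONS and check_version, and the urn:example:future-envelope namespace, are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Envelope namespaces this hypothetical engine can process.
SUPPORTED_VERSIONS = [SOAP11_NS]

def split_qname(tag):
    """Split ElementTree's "{uri}local" tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag

def check_version(xml_text):
    """Return ("process", []) or ("version_mismatch", supported_namespaces)."""
    uri, local = split_qname(ET.fromstring(xml_text).tag)
    if local != "Envelope":
        raise ValueError("root element is not an Envelope")
    if uri in SUPPORTED_VERSIONS:
        return "process", []
    # Unknown version: report a mismatch and advertise what is supported.
    return "version_mismatch", list(SUPPORTED_VERSIONS)

known = f'<e:Envelope xmlns:e="{SOAP11_NS}"><e:Body/></e:Envelope>'
unknown = '<e:Envelope xmlns:e="urn:example:future-envelope"><e:Body/></e:Envelope>'

print(check_version(known))    # ('process', [])
print(check_version(unknown))  # ('version_mismatch', ['http://schemas.xmlsoap.org/soap/envelope/'])
```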
SOAP Headers
Headers are the primary extensibility mechanism in SOAP. They provide the means by which additional facets can be added to SOAP-based protocols. Headers define a very elegant yet simple mechanism to extend SOAP messages in a decentralized manner. Typical areas where headers get involved are authentication and authorization, transaction management, payment processing, tracing and auditing, and so on. Another way to think about this is that you would pass via headers any information orthogonal to the specific information needed to execute a request.
For example, a transfer payment service needs only the source and destination account numbers and a transfer amount to execute. In real-world scenarios, however, a service request is likely to contain much more information, such as the identity of the person making the request, account/payment information, and so on. This additional information is usually handled by infrastructure services (login and security, transaction coordination, billing) outside the main transfer payment service. Encoding this information as part of the body of a SOAP message would only complicate matters. That is why it is passed in headers.
A SOAP message can include any number of header entries (simply referred to as headers). If any headers are present, they will all be children of the SOAP Header element, which, if present, must appear as the first child of the SOAP Envelope element. The following example shows a SOAP message with two headers, Transaction and Priority. Both headers are uniquely identified by the combination of their element name and their namespace URI:
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Header>
    <t:Transaction xmlns:t="some-URI" SOAP-ENV:mustUnderstand="1">
      12345
    </t:Transaction>
    <p:Priority xmlns:p="some-Other-URI">
      <ReallyVeryHigh/>
    </p:Priority>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    ...
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
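To illustrate how a recipient might identify headers by element name plus namespace URI, here is a minimal Python sketch that lists the header entries of a message shaped like the one above. The Body is left empty because its contents are elided in the example, and the function name header_entries is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Same shape as the example message; the elided Body is left empty here.
MESSAGE = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Header>
    <t:Transaction xmlns:t="some-URI" SOAP-ENV:mustUnderstand="1">
      12345
    </t:Transaction>
    <p:Priority xmlns:p="some-Other-URI">
      <ReallyVeryHigh/>
    </p:Priority>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>"""

def header_entries(xml_text):
    """Return the header entries of a SOAP 1.1 message (possibly none)."""
    root = ET.fromstring(xml_text)
    header = root.find(f"{{{SOAP11_NS}}}Header")
    if header is None:
        return []
    # If present, Header must be the first child of Envelope.
    if root[0] is not header:
        raise ValueError("Header must be the first child of Envelope")
    return list(header)

for entry in header_entries(MESSAGE):
    # entry.tag combines namespace URI and element name: "{uri}name".
    print(entry.tag, repr((entry.text or "").strip()))
# {some-URI}Transaction '12345'
# {some-Other-URI}Priority ''
```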
The contents of a header (sometimes referred to as the header value) are determined by the schema of the header element. This allows headers to contain arbitrary XML, another example of the benefits of SOAP being an XML-based protocol. Compare it to protocols such as HTTP, where header values must be simple strings, forcing any structured information to be encoded as a string. For example, cookie values come in a semicolon-delimited format, such as cookie1=value1;cookie2=value2. It is easy to reach the limits of these simple encodings. XML is a much better way to represent this type of structured information.
Also, notice the SOAP mustUnderstand attribute with value 1 that decorates the Transaction element. This attribute indicates that the recipient of the SOAP message must process the Transaction header entry. If a recipient does not know how to process a header tagged with mustUnderstand="1", it must abort processing with a well-defined error. This rule allows for robust evolution of SOAP-based protocols. It ensures that a recipient that might be unaware of certain important protocol extensions does not ignore them.
Note that because the Priority header is not tagged with mustUnderstand="1", it can be ignored during processing. Presumably, this will be OK because a server that does not know how to process message priorities will assume normal priority.
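The mustUnderstand rule can be sketched as a check that runs before any other processing: collect every header entry flagged with mustUnderstand="1" that the recipient does not recognize, and abort if the list is not empty. The function name unsupported_mandatory_headers and the sets of understood headers are hypothetical, and the message reuses the header names from the example above with an empty Body.

```python
import xml.etree.ElementTree as ET

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Namespace-qualified attribute name in ElementTree's "{uri}local" form.
MUST_UNDERSTAND = f"{{{SOAP11_NS}}}mustUnderstand"

MESSAGE = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Header>
    <t:Transaction xmlns:t="some-URI" SOAP-ENV:mustUnderstand="1">
      12345
    </t:Transaction>
    <p:Priority xmlns:p="some-Other-URI">
      <ReallyVeryHigh/>
    </p:Priority>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>"""

def unsupported_mandatory_headers(xml_text, understood):
    """List mustUnderstand="1" headers whose qualified name is not understood."""
    root = ET.fromstring(xml_text)
    header = root.find(f"{{{SOAP11_NS}}}Header")
    if header is None:
        return []
    return [
        entry.tag
        for entry in header
        if entry.get(MUST_UNDERSTAND) == "1" and entry.tag not in understood
    ]

# A recipient that knows only about priorities must refuse the message.
print(unsupported_mandatory_headers(MESSAGE, {"{some-Other-URI}Priority"}))
# ['{some-URI}Transaction']

# A recipient that knows only about transactions may proceed; the
# unflagged Priority header is simply ignored.
print(unsupported_mandatory_headers(MESSAGE, {"{some-URI}Transaction"}))
# []
```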
You might have noticed that the SOAP body can be treated as a well-specified SOAP header flagged with mustUnderstand="1". Although this is certainly true, the SOAP designers thought that having a separation between the headers and body of a message does not complicate the protocol and is convenient for readability.
Before leaving the topic of headers, it is important to point out that, despite the obvious need for header extensions to support basic distributed computing concepts such as authentication credentials or transaction information, there hasn't been a broad standardization effort in this area, with the exception of some security extensions that we'll review in Chapter 5. Some of the leading Web service vendors are doing interesting work, but the industry as a whole is some way away from agreeing on core extensions to SOAP. Two primary forces maintain this unsatisfactory status quo:
- Most current Web service engines do not have a solid extensibility architecture. Therefore, header processing is relatively difficult right now. At the time of this writing, Apache Axis is a notable exception to this rule.
- Market pressure is pushing Web service vendors to innovate in isolation and to prefer shipping software over coordinating extensions with partners and competitors.
Wider Web service adoption will undoubtedly put pressure on the Web services community to think more about interoperability and begin broad standardization in some of these key areas.
SOAP Body
The SOAP Body element immediately surrounds the information that is core to the SOAP message. All immediate children of the Body element are body entries (typically referred to simply as bodies). Bodies can contain arbitrary XML. Sometimes, based on the intent of the SOAP message, certain conventions will govern the format of the SOAP body. The conventions for representing RPCs are discussed later in the section "SOAP-based RPCs." The conventions for communicating error information are discussed in the section "Error Handling in SOAP."