Correlation Identifier
My application is using Messaging (53) to perform a Request-Reply (154) and has received a reply message.
How does a requestor that has received a reply know which request this is the reply for?
When one process invokes another via Remote Procedure Invocation (50), the call is synchronous, so there is no confusion about which call produced a given result. But Messaging (53) is asynchronous, so from the caller’s point of view, it makes the call, and then sometime later a result appears. The caller may not even remember making the request, or it may have made so many requests that it no longer knows which one this is the result for. Now, when the caller finally gets the result, it may not know what to do with it, which defeats the purpose of making the call in the first place.
Figure 5.7 Cannot Match Reply to Request
There are a couple of approaches the caller can use to avoid this confusion. It can make just one call at a time and wait for a reply before sending another request, so there is at most one outstanding request at any given time. This, however, will greatly reduce throughput. The caller could assume that it will receive replies in the same order it sent requests, but messaging does not guarantee the order in which messages are delivered (see Resequencer [283]), and not all requests take the same amount of time to process, so the caller’s assumption would be faulty. The caller could design its requests so that they do not need replies, but this constraint would make messaging useless for many purposes.
What the caller needs is for the reply message to have a pointer or reference to the request message, but messages do not exist in a stable memory space where they can be referenced by variables. However, a message could have some sort of key, a unique identifier like the key for a row in a relational database table. Such a unique identifier could be used to distinguish the message from other messages, to identify it to the clients that use it, and so on.
Figure 5.8 Each reply message should contain a Correlation Identifier, a unique identifier that indicates which request message this reply is for.
There are six parts to Correlation Identifier.
- Requestor—An application that performs a business task by sending a request and waiting for a reply.
- Replier—Another application that receives the request, fulfills it, and then sends the reply. It gets the request ID from the request and stores it as the correlation ID in the reply.
- Request—A Message (66) sent from the requestor to the replier, containing a request ID.
- Reply—A Message (66) sent from the replier to the requestor, containing a correlation ID.
- Request ID—A token in the request that uniquely identifies the request.
- Correlation ID—A token in the reply that has the same value as the request ID in the request.
This is how a Correlation Identifier works: When the requestor creates a request message, it assigns the request a request ID—an identifier that is different from those for all other currently outstanding requests, that is, requests that do not yet have replies. When the replier processes the request, it saves the request ID and adds that ID to the reply as a correlation ID. When the requestor processes the reply, it uses the correlation ID to know which request the reply is for. This is called a Correlation Identifier because of the way the caller uses the identifier to correlate (i.e., match; show the relationship) each reply to the request that caused it.
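The three steps above can be sketched in plain Java without any messaging product. This is a minimal illustration under stated assumptions, not a real messaging API: the SimpleMessage, EchoReplier, and SimpleRequestor classes and the header names requestId and correlationId are all hypothetical, and messages are passed by direct method call rather than over a channel.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical minimal message: a header map plus a body. */
final class SimpleMessage {
    final Map<String, String> headers = new HashMap<>();
    final String body;

    SimpleMessage(String body) {
        this.body = body;
    }
}

/** Replier: copies the request ID into the reply as the correlation ID. */
final class EchoReplier {
    SimpleMessage handle(SimpleMessage request) {
        SimpleMessage reply = new SimpleMessage("result of " + request.body);
        // The replier does not interpret the ID; it only copies it across.
        reply.headers.put("correlationId", request.headers.get("requestId"));
        return reply;
    }
}

/** Requestor: remembers each outstanding request under its request ID. */
final class SimpleRequestor {
    private final Map<String, String> outstanding = new HashMap<>();

    SimpleMessage createRequest(String body) {
        SimpleMessage request = new SimpleMessage(body);
        // Assumption: a random UUID is unique enough among outstanding requests.
        String requestId = UUID.randomUUID().toString();
        request.headers.put("requestId", requestId);
        outstanding.put(requestId, body);
        return request;
    }

    /** Returns the body of the request this reply answers, or null if none is outstanding. */
    String matchReply(SimpleMessage reply) {
        return outstanding.remove(reply.headers.get("correlationId"));
    }
}
```

Because the requestor looks replies up by ID, it does not matter in what order the replies come back, and a reply that arrives twice matches nothing the second time.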
As is often the case with messaging, the requestor and replier must agree on several details. They must agree on the name and type of the request ID property, and they must agree on the name and type of the correlation ID property. Likewise, the request and reply message formats must define those properties or allow them to be added as custom properties. For example, if the requestor stores the request ID in a first-level XML element named request_id and the value is an integer, the replier has to know this so that it can find the request ID value and process it properly. The request ID value and correlation ID value are usually of the same type; if not, the requestor has to know how the replier will convert the request ID to the correlation ID.
This pattern is a simpler, messaging-specific version of the Asynchronous Completion Token pattern [POSA2]. The requestor is the Initiator, the replier is the Service, the consumer in the requestor that processes the reply is the Completion Handler, and the Correlation Identifier the consumer uses to match the reply to the request is the Asynchronous Completion Token.
A correlation ID (and also the request ID) is usually put in the header of a message rather than in the body. The ID is not part of the command or data the requestor is trying to communicate to the replier. In fact, the replier does not really use the ID at all; it just saves the ID from the request and adds it to the reply for the requestor’s benefit. Since the message body is the content being transmitted between the two systems, and the ID is not part of that, the ID goes in the header.
The gist of the pattern is that the reply message contains a token (the correlation ID) that identifies the corresponding request (via its request ID). There are several different approaches for achieving this.
The simplest approach is for each request to contain a unique ID, such as a message ID, and for the response’s correlation ID to be the request’s unique ID. This relates the reply to its corresponding request. However, when the requestor is trying to process the reply, knowing the request message often isn’t very helpful. What the requestor really wants is a reminder of what business task caused it to send the request in the first place so that the requestor can complete the business task using the data in the reply.
The business task, such as needing to execute a stock trade or to ship a purchase order, probably has its own unique business object identifier (such as an order ID), so that the business task’s unique ID can be used as the request-reply correlation ID. Then, when the requestor gets the reply and its correlation ID, it can bypass the request message and go straight to the business object whose task caused the request in the first place. In this case, rather than use the messages’ built-in request message ID and reply correlation ID properties, the requestor and replier should use a custom business object ID property in both the request and the reply that identifies the business object whose task this request-reply message pair is performing.
A compromise approach is for the requestor to keep a map of request IDs and business object IDs. This is especially useful when the requestor wants to keep the object IDs private or when the requestor has no control over the replier’s implementation and can only depend on the replier copying the request’s message ID into the reply’s correlation ID. In this case, when the requestor gets the reply, it looks up the correlation ID in the map to get the business object ID and then uses that to resume performing the business task using the reply data.
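The map-based compromise might look roughly like the following sketch. Everything here is assumed for illustration: the PurchaseOrder and OrderRequestor classes, the status strings, and the sequential req-N request IDs are hypothetical, and the actual sending and receiving of messages is left out so that only the bookkeeping is shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical business object whose task triggers a request. */
final class PurchaseOrder {
    final String orderId;
    String status = "AWAITING_CREDIT_CHECK";

    PurchaseOrder(String orderId) {
        this.orderId = orderId;
    }
}

/** Requestor that keeps a private map from request IDs to business object IDs. */
final class OrderRequestor {
    private final Map<String, PurchaseOrder> ordersById = new HashMap<>();
    private final Map<String, String> orderIdByRequestId = new HashMap<>();
    private int nextRequestNumber = 1;

    /** Records the pending request and returns the request ID placed in the message. */
    String sendRequestFor(PurchaseOrder order) {
        String requestId = "req-" + nextRequestNumber++;
        ordersById.put(order.orderId, order);
        orderIdByRequestId.put(requestId, order.orderId);
        return requestId; // the order ID itself never leaves the requestor
    }

    /** Looks up the business object for a reply and resumes its task; null if unknown. */
    PurchaseOrder onReply(String correlationId, String replyBody) {
        String orderId = orderIdByRequestId.remove(correlationId);
        if (orderId == null) {
            return null; // unknown or already handled correlation ID
        }
        PurchaseOrder order = ordersById.get(orderId);
        order.status = replyBody; // stand-in for resuming the business task
        return order;
    }
}
```

The replier only ever sees the request ID, so this arrangement works even when the replier does nothing more than copy the request's message ID into the reply's correlation ID.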
Messages have separate message ID and correlation ID properties so that request-reply message pairs can be chained. This occurs when a request causes a reply, and the reply is in turn another request that causes another reply, and so on. A message’s message ID uniquely identifies the request it represents; if the message also has a correlation ID, then the message is also a reply for another request message, as identified by the correlation ID.
Figure 5.9 Request-Reply Chaining
Chaining is only useful if an application wants to retrace the path of messages from the latest reply back to the original request. Often, all the application wants to know is the original request, regardless of how many reply steps occurred in between. In this situation, once a message has a non-null correlation ID, it is a reply, and all subsequent replies that result from it should also use the same correlation ID.
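The difference between the two strategies can be shown with a small sketch. The ChainedMessage class and its two reply methods are hypothetical names introduced only for this illustration; the msg-N message IDs are an assumption standing in for whatever unique IDs a messaging system would assign.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical message carrying both a message ID and a correlation ID. */
final class ChainedMessage {
    private static final AtomicInteger COUNTER = new AtomicInteger();

    final String messageId;
    final String correlationId; // null for an original request

    ChainedMessage(String correlationId) {
        this.messageId = "msg-" + COUNTER.incrementAndGet();
        this.correlationId = correlationId;
    }

    /** Chaining: the reply points at the message that directly caused it. */
    ChainedMessage replyChained() {
        return new ChainedMessage(messageId);
    }

    /** Propagation: once a correlation ID exists, every later reply reuses it. */
    ChainedMessage replyPropagating() {
        return new ChainedMessage(correlationId != null ? correlationId : messageId);
    }
}
```

With replyChained, an application has to walk back one message at a time to reach the original request; with replyPropagating, every reply in the conversation names the original request directly.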
While a Correlation Identifier is used to match a reply with its request, the request may also have a Return Address (159) that states what channel to put the reply on. Whereas a Correlation Identifier is used to match a reply message with its request, a Message Sequence’s (170) identifiers are used to specify a message’s position within a series of messages from the same sender.
Example: JMS Correlation-ID Property
JMS messages have a predefined property for correlation identifiers: JMSCorrelationID, which is typically used in conjunction with another predefined property, JMSMessageID [JMS 1.1], [Monson-Haefel]. A reply message’s correlation ID is set from the request’s message ID like this:
Message requestMessage = // Get the request message
Message replyMessage = // Create the reply message
String requestID = requestMessage.getJMSMessageID();
replyMessage.setJMSCorrelationID(requestID);
Example: .NET CorrelationId Property
Each Message in .NET has a CorrelationId property, a string in an acknowledgment message that is usually set to the Id of the original message. MessageQueue also has special peek and receive methods, PeekByCorrelationId(string) and ReceiveByCorrelationId(string), for peeking at and consuming the message on the queue (if any) with the specified correlation ID (see Selective Consumer [515]) [SysMsg], [Dickman].
Example: Web Services Request-Response
Web services standards, as of SOAP 1.1 [SOAP 1.1], do not provide very good support for asynchronous messaging, but SOAP 1.2 starts to plan for it. SOAP 1.2 incorporates the Request-Response Message Exchange pattern [SOAP 1.2 Part 2], a basic part of asynchronous SOAP messaging. However, the request-response pattern does not mandate support for “multiple ongoing requests,” so it does not define a standard Correlation Identifier field, not even an optional one.
As a practical matter, service requestors often do require multiple outstanding requests. “Web Services Architecture Usage Scenarios” [WSAUS] discusses several different asynchronous web services scenarios. Four of them—Request-Response, Remote Procedure Call (where the transport protocol does not support [synchronous] request-response directly), Multiple Asynchronous Responses, and Asynchronous Messaging—use message-id and response-to fields in the SOAP header to correlate a response to its request. This is the request-response example:
SOAP Request Message Containing a Message Identifier
<?xml version="1.0" ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope">
  <env:Header>
    <n:MsgHeader xmlns:n="http://example.org/requestresponse">
      <n:MessageId>uuid:09233523-345b-4351-b623-5dsf35sgs5d6</n:MessageId>
    </n:MsgHeader>
  </env:Header>
  <env:Body>
    ........
  </env:Body>
</env:Envelope>
SOAP Response Message Containing Correlation to Original Request
<?xml version="1.0" ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope">
  <env:Header>
    <n:MsgHeader xmlns:n="http://example.org/requestresponse">
      <n:MessageId>uuid:09233523-567b-2891-b623-9dke28yod7m9</n:MessageId>
      <n:ResponseTo>uuid:09233523-345b-4351-b623-5dsf35sgs5d6</n:ResponseTo>
    </n:MsgHeader>
  </env:Header>
  <env:Body>
    ........
  </env:Body>
</env:Envelope>
As in the JMS and .NET examples, the request message in this SOAP example contains a unique message identifier, and the response message contains a ResponseTo field (i.e., a correlation ID) whose value is the message identifier of the request message.
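A requestor receiving such a response could check the correlation with the standard Java XML parser, along the lines of the sketch below. The SoapCorrelation class and its method names are hypothetical; the namespace URI and the MessageId and ResponseTo element names are taken from the usage-scenario example, not from any SOAP standard, so a real service may use different ones.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper that correlates a SOAP response with its request. */
final class SoapCorrelation {
    // Namespace used by the example's MsgHeader block (an assumption, not a standard).
    static final String HEADER_NS = "http://example.org/requestresponse";

    /** Returns the text of the first element with this local name in HEADER_NS, or null. */
    static String headerValue(String soapXml, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for namespace-based lookup
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(soapXml)));
            NodeList matches = doc.getElementsByTagNameNS(HEADER_NS, localName);
            return matches.getLength() == 0 ? null : matches.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse SOAP message", e);
        }
    }

    /** True if the response's ResponseTo value equals the request's MessageId value. */
    static boolean isReplyTo(String requestXml, String responseXml) {
        String messageId = headerValue(requestXml, "MessageId");
        String responseTo = headerValue(responseXml, "ResponseTo");
        return messageId != null && messageId.equals(responseTo);
    }
}
```

Calling SoapCorrelation.isReplyTo with the request and response envelopes shown above as strings would compare the request's MessageId against the response's ResponseTo.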