Versioning REST Services
Versioning is a perennial issue in the development of multi-tier applications. Whether you are updating a remote procedure call, changing a COM object, or updating a Web service, you need to think about how to support existing consumers while providing new functionality. With each new technology, we have to revisit versioning and come up with new recommendations for how to handle change.
In this article, we look at the types of events that cause you to create a new version in a REST service. We then look at two approaches to deploying versions. We only consider HTTP REST. REST can be implemented as an architectural style on many different protocols. As a practical matter, most of us choose HTTP technology as a cornerstone in our REST architectures.
Creating New Versions
When creating a REST service, the first iteration in the project likely exposes the resource as read-only. In HTTP terms, this means that the project called for you to support GET on the URL. Later on, you might add the ability to add new resources through POST, update resources through PUT, and delete resources through DELETE. Supporting more HTTP methods does not break existing clients. Put another way, the act of supporting DELETE does not change an older client’s ability to GET the resource.
What types of events do break existing clients, then?
- Removing a field from a data type breaks clients. Developers who write client apps should handle missing values, but they frequently don’t. Their code breaks. And they file bugs against you.
- Repositioning existing fields or adding a field to a data type. This will break existing clients that rely on the field position. For example, if the client relies on the field FirstName being in position 0 and a layout changes that field to any other position, clients break and users complain. Again: The client code should do name-based lookups, but code that uses indexes may be easier to write for some folks.
- Renaming an existing field. Changing FirstName to firstName will break clients that rely on case sensitivity. Most of you transmit data as XML or JSON; both of these are case sensitive. If you transmit data in some other (case-insensitive) form, this is even more important to keep in mind.
- Updating your URL structure. If you move resources, existing clients won’t be able to get to them.
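The difference between position-based and name-based lookups can be sketched in a few lines of client code. This is a minimal illustration, not code from any particular client; the payloads, the Id field, and both function names are assumptions made for the example.

```javascript
// Two hypothetical payloads for the same record. Version 2 adds an Id
// field at the front, which shifts every field after it by one position.
const v1 = '{"FirstName": "Ada", "LastName": "Lovelace"}';
const v2 = '{"Id": 12, "FirstName": "Ada", "LastName": "Lovelace"}';

// Fragile: assumes FirstName is always the first field in the payload.
function firstNameByPosition(json) {
  return Object.values(JSON.parse(json))[0];
}

// Robust: looks the field up by name and tolerates a missing value.
function firstNameByName(json) {
  const record = JSON.parse(json);
  return record.FirstName !== undefined ? record.FirstName : null;
}

// firstNameByPosition(v1) returns "Ada", but firstNameByPosition(v2)
// returns 12. firstNameByName returns "Ada" for both payloads.
```

Note that the name-based lookup still returns null if the field is renamed to firstName, which is why a rename remains a breaking change even for careful clients.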
The easiest of these to handle is a change to URL structure. The “right way” to handle this is to use the HTTP status code 301, Moved Permanently. Doing so tells the caller the new home for the resource via the HTTP Location header in the response.
That said, not all HTTP clients are smart enough to automatically follow redirects. If you do not control the existing HTTP clients, some, perhaps many, were written to not follow redirects. In this case, I advise you to keep the old URL structure in place and write the underlying implementation to handle the old and new URL structure with the same code. The old URL structure can be handled with networking equipment as well as with code. If you have an IT department to call on, work with the teams that own the load balancer and web servers to get the right rules in place.
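Both options, redirecting and serving the old structure from the same code, can be sketched as two small functions. The path prefixes, the movedPrefixes table, and the function names below are hypothetical; a real service would plug equivalent logic into its own routing layer or load balancer rules.

```javascript
// Hypothetical mapping from retired URL prefixes to their replacements.
const movedPrefixes = [
  { from: "/people/", to: "/app/contacts/" },
];

// Returns the status and headers for a moved resource, or null when the
// path has not moved and should be handled normally.
function redirectFor(path) {
  for (const { from, to } of movedPrefixes) {
    if (path.startsWith(from)) {
      return {
        status: 301,
        headers: { Location: to + path.slice(from.length) },
      };
    }
  }
  return null;
}

// Alternative for clients that do not follow redirects: rewrite the old
// path internally so that one handler serves both URL structures.
function canonicalPath(path) {
  const redirect = redirectFor(path);
  return redirect ? redirect.headers.Location : path;
}
```

With this table, redirectFor("/people/12") yields a 301 whose Location is /app/contacts/12, while canonicalPath lets the server answer the old URL directly.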
I have two recommendations for renaming, reordering, adding, or deleting fields. They largely depend on who consumes your service. We first look at the RESTafarians: folks who read Roy Fielding’s PhD dissertation and really understand how the Web works. The second recommendation addresses how to handle versioning when your target clients are more interested in simplicity of implementation than in maximizing their use of the architecture of the Internet.
Hypermedia as the Engine of Application State
One school of thought follows what is called HATEOAS: Hypermedia as the Engine of Application State. Practitioners of this school use media types, carried in the HTTP Accept and Content-Type headers, to handle versioning of data as well as describing data. Accept states the type of content the requester would like to get. When an HTTP message contains data, Content-Type states the type of content in that message.
The values in these headers are Multipurpose Internet Mail Extensions (MIME) types, also called media types. The Internet Assigned Numbers Authority (IANA) maintains the registry of accepted MIME types. As a vendor, you can also create your own MIME types by starting the subtype with the vnd. prefix. For example, if you are exposing the foo data type and your company is example.com, you can define the following MIME types for the data:
- application/vnd.example-com.foo+xml for the XML representation of foo data
- application/vnd.example-com.foo+json for the JSON representation of foo data
Then, whenever anyone requests data from your service, they create an HTTP request and set the Accept header to the correct MIME type. The response contains the data in the requested format. As you version the foo data type, allow for the MIME type information to include version data.
For example, for versions 1.0, 1.1, and 2.0 of the foo data type as JSON, set the Accept/Content-Type header as follows:
- 1.0: application/vnd.example-com.foo+json; version=1.0
- 1.1: application/vnd.example-com.foo+json; version=1.1
- 2.0: application/vnd.example-com.foo+json; version=2.0
Most HTTP stacks have a mechanism to read and set the HTTP Accept and Content-Type headers. For example, in jQuery I would write the following to request version 1.1 of the foo object as JSON:

    $.ajax({
      // Override the Accept header to request version 1.1 of foo as JSON.
      beforeSend: function (req) {
        req.setRequestHeader("Accept", "application/vnd.example-com.foo+json; version=1.1");
      },
      type: "GET",
      url: "http://www.example.com/foo/12",
      success: function (data) { /* code elided */ },
      dataType: "json"
    });
On the server, your code needs to look at the Accept header and write out only the fields that the client expects, depending on which version of foo was requested. The server also has to set the HTTP Vary header to tell caches that the response depends on the request's Accept header as well as the URL. Vary names request headers, so the value is Accept rather than Content-Type:
Vary: Accept
By using the Vary header, you make sure that the Internet architecture can accurately cache the response. Otherwise, caches between the client and your server could store the XML representation and return it when the caller asks for JSON. Using the MIME type, you can handle any version of a resource representation by supporting a well-known set of MIME types on a single URL. As you change the implementation, the receiving endpoint needs to know how to read and write the representations as requested.
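As a rough sketch of the server side, the function below writes out only the fields a given version defines and labels the response so that caches keep the versions apart. The field lists, the renderFoo name, and the choice of 406 Not Acceptable for an unknown version are assumptions for illustration. The media type is written with the application/ top-level type that a complete media type requires, and Vary names Accept because Vary lists request headers.

```javascript
// Hypothetical field lists for each version of the foo representation.
const fooFields = new Map([
  ["1.0", ["id", "name"]],
  ["1.1", ["id", "name", "email"]],
  ["2.0", ["id", "firstName", "lastName", "email"]],
]);

// Builds the response for one version: only the fields that version
// defines, a Content-Type naming the version, and Vary: Accept so caches
// store a separate entry per requested representation.
function renderFoo(foo, version) {
  const fields = fooFields.get(version);
  if (!fields) {
    // Unknown version: no acceptable representation is available.
    return { status: 406, headers: { Vary: "Accept" }, body: "" };
  }
  const view = {};
  for (const field of fields) {
    if (field in foo) view[field] = foo[field];
  }
  return {
    status: 200,
    headers: {
      "Content-Type": `application/vnd.example-com.foo+json; version=${version}`,
      Vary: "Accept",
    },
    body: JSON.stringify(view),
  };
}
```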
Unfortunately, this last bit can be fairly code intensive. The receiving application needs to dig into the HTTP Accept header and determine which formatting should be used to write the response. Before decoding data in a request, such as in a PUT or POST, the receiver needs to look at the Content-Type.
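To give a sense of the work involved, here is a minimal sketch of digging the format and version out of an Accept header. The function names and the default of version 1.0 are assumptions, and the sketch deliberately ignores q-value weighting, which a production implementation would need to honor.

```javascript
// Parses one media type such as
// "application/vnd.example-com.foo+json; version=1.1" into its parts.
// The application/ prefix is optional here so that the short form
// "vnd.example-com.foo+json" also matches.
// Returns null when the value is not a foo media type.
function parseFooMediaType(value) {
  const [type, ...params] = value.split(";").map((part) => part.trim());
  const match = /^(?:application\/)?vnd\.example-com\.foo\+(json|xml)$/i.exec(type);
  if (!match) return null;
  let version = "1.0"; // assumed default when no version parameter is sent
  for (const param of params) {
    const [name, raw] = param.split("=").map((s) => s.trim());
    if (name.toLowerCase() === "version" && raw) {
      version = raw.replace(/^"|"$/g, ""); // strip optional quotes
    }
  }
  return { format: match[1].toLowerCase(), version };
}

// An Accept header may list several media types; take the first one the
// service understands. This sketch ignores q-value weighting.
function negotiate(acceptHeader) {
  for (const candidate of acceptHeader.split(",")) {
    const parsed = parseFooMediaType(candidate);
    if (parsed) return parsed;
  }
  return null;
}
```

The same parseFooMediaType function can be applied to the Content-Type of a PUT or POST before decoding the request body.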
Many popular web frameworks such as Django, Microsoft ASP.NET, Microsoft WCF, and those built on PHP do not have mechanisms to handle serialization based on Content-Type automatically. Instead, the developer has to write that code. The client frameworks to send and receive messages also make it possible, but not always simple, to set the HTTP Accept header. They do make it easy to set the URL.
This brings us to my second recommendation – where ease of use is paramount.
The URL is King
A second way to create Web services is to observe that The URL is King (TUK). Developers who follow this pattern call it REST because they heard that REST and HTTP are somehow related, but they couldn’t be bothered to read the Fielding dissertation. Fortunately, the only thing this school is guilty of is calling what they do REST. So as to not incite the RESTafarians, I call it TUK.
In TUK, you still identify resources by their URLs. When manipulating resources, use the standard HTTP methods:
- GET to read
- POST to create
- PUT to update
- DELETE to remove
At this point, we depart from REST. We have the same breaking changes as before. Adding, removing, reordering, and renaming fields each constitute a breaking change for someone somewhere. When you create a new version in this world, change the URL structure. Typically, you version large chunks of your objects at a time. The URLs for versions 1.0, 1.1, and 2.0 of our foo object look like this instead:
- 1.0: http://www.example.com/app/1.0/foo
- 1.1: http://www.example.com/app/1.1/foo
- 2.0: http://www.example.com/app/2.0/foo
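On the server, routing for this scheme starts by pulling the version out of the path. The following is a small sketch under the assumption that every URL follows the /app/{version}/... layout shown above; the function name is hypothetical.

```javascript
// Splits a path such as "/app/1.1/foo/12" into its version and the
// remainder that identifies the resource. Returns null when the path
// does not follow the /app/{major.minor}/... layout.
function splitVersionedPath(path) {
  const match = /^\/app\/(\d+\.\d+)(\/.*)$/.exec(path);
  if (!match) return null;
  return { version: match[1], resource: match[2] };
}

// splitVersionedPath("/app/1.1/foo/12") returns
// { version: "1.1", resource: "/foo/12" }.
```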
Date stamps are an acceptable alternative to version numbers. If your organization handles versioning by date, the following would also work:
- June 2008: http://www.example.com/app/2008/06/foo
- October 2009: http://www.example.com/app/2009/10/foo
- February 2010: http://www.example.com/app/2010/02/foo
If you version by date, always put the year before the month, and use two-digit months. This makes it easy to sort the versions visually.
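The same property helps tooling as well as people: with the year first and zero-padded months, a plain string sort yields chronological order. A short demonstration, using the example paths from above plus one deliberately unpadded pair:

```javascript
const paths = [
  "/app/2010/02/foo",
  "/app/2008/06/foo",
  "/app/2009/10/foo",
];

// The default string sort is chronological because the year precedes the
// month and months are zero-padded.
const oldestFirst = [...paths].sort();
// oldestFirst is
// ["/app/2008/06/foo", "/app/2009/10/foo", "/app/2010/02/foo"]

// Without padding, October sorts before September because "1" < "9".
const unpadded = ["/app/2009/9/foo", "/app/2009/10/foo"].sort();
// unpadded is ["/app/2009/10/foo", "/app/2009/9/foo"]
```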
The TUK style has another characteristic: Accept is used rarely, if ever. Instead, endpoints rely on a format parameter in the query string to determine the content type of the request and the desired response content type.
By convention, the default value for format is json. Examples:
- Request the resource as XML: http://www.example.com/app/1.0/foo?format=xml
- Request the resource as JSON: http://www.example.com/app/1.0/foo?format=json
Both Portable Contacts and OpenSocial use this pattern because it is so easy for people to understand.
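Reading the format parameter takes very little code, which is part of the appeal. A minimal sketch follows, assuming only json and xml are supported; the function name and the decision to return null for an unknown format are choices made for this example.

```javascript
// Formats this hypothetical service supports, with the Content-Type to
// send back for each.
const supportedFormats = new Map([
  ["json", "application/json"],
  ["xml", "application/xml"],
]);

// Reads the format query parameter, defaulting to json by convention.
// Returns null for a format the service does not support.
function formatFromUrl(url) {
  const requested = new URL(url).searchParams.get("format") || "json";
  const format = requested.toLowerCase();
  if (!supportedFormats.has(format)) return null;
  return { format, contentType: supportedFormats.get(format) };
}
```

For http://www.example.com/app/1.0/foo?format=xml this returns the xml entry, and for the same URL without a query string it falls back to json.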
Developers tend to prefer versioning by URL. Versioning by URL allows them to figure out which version of the service is in use at a glance. Just look at the HTTP request URL, and you know everything!
When implementing your code, you should keep the business logic in one central location. The various listeners for each version should know how to transpose between the business logic representation of the object and the external representation. In general, keep the HTTP part fairly thin and simple so that you can fix code centrally and support disparate clients easily.
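One way to picture this layering is a single business-logic function plus a thin adapter per version. The sketch below is illustrative only: loadFoo stands in for real business logic, the field names are invented, and returning 404 for an unknown version is an assumption.

```javascript
// Central business logic: one internal representation, no HTTP concerns.
// The data here is a hard-coded stand-in for a real lookup.
function loadFoo(id) {
  return { id, firstName: "Ada", lastName: "Lovelace", email: "ada@example.com" };
}

// Per-version adapters translate the internal object to each external
// representation. A fix in loadFoo reaches every version at once.
const toExternal = new Map([
  ["1.0", (foo) => ({ id: foo.id, name: `${foo.firstName} ${foo.lastName}` })],
  ["1.1", (foo) => ({ id: foo.id, name: `${foo.firstName} ${foo.lastName}`, email: foo.email })],
  ["2.0", (foo) => ({ id: foo.id, firstName: foo.firstName, lastName: foo.lastName, email: foo.email })],
]);

// Thin HTTP-facing layer: pick the adapter for the requested version and
// apply it to the result of the shared business logic.
function getFoo(version, id) {
  const adapt = toExternal.get(version);
  if (!adapt) return { status: 404, body: null };
  return { status: 200, body: adapt(loadFoo(id)) };
}
```

Adding a version then means adding one adapter entry rather than copying the business logic into a new listener.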
Summary
You have a new version of your service whenever you reorder, rename, add, or delete fields. By changing the representation, you invalidate the assumptions consumers have already made about how to interpret the data. If your audience is architecturally minded and aware of REST, you should version data representations in the MIME types your application accepts. If your clients view the URL as the most important facet, make the URL the center of your versioning efforts. Folks who are familiar with versioning with WS-* Web services tend to be more comfortable with changing the URL when versions change.
Both mechanisms are valid. You need to know your consumer to know which path to follow. In general, working with enterprises and academically-minded folks tends to point developers towards REST versioning. If your clients are smaller businesses and users with a hacker mentality, follow the TUK approach.