Digital Signatures
A digital signature is a guarantee of data integrity for electronic documents. The analog in the non-digital world is writing your name across the face of a document so that a modified version can’t be substituted for the original. In the electronic world, digital signatures are used in a variety of applications ranging from online credit card purchases to the verification of complex legal documents.
To illustrate, let’s consider the electronic delivery of your last will and testament to your attorney in another city. You want assurance that Bob, your evil twin, will not be able to modify the will before it’s deposited in your attorney’s vault. Digital signature technology provides this assurance of your electronic document’s data integrity. It relies on algorithms that reduce the document to a short, fixed-length series of bytes known as a message digest, which satisfies the following important criterion:
If a single bit in the original document changes, the resulting message digest will be vastly different.
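This criterion can be observed directly with a short Python sketch. It assumes SHA-256 as the digest algorithm (the text does not name one) and uses a made-up sentence as the document; the helper `bit_difference` is a hypothetical name introduced here for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Hypothetical document content, used only for illustration.
original = b"I leave my entire estate to my niece."
# Flip the lowest bit of the first byte to simulate a one-bit modification.
tampered = bytes([original[0] ^ 0x01]) + original[1:]

# Assumption: SHA-256 stands in for whatever digest algorithm is actually used.
digest_original = hashlib.sha256(original).digest()
digest_tampered = hashlib.sha256(tampered).digest()

print(bit_difference(original, tampered), "document bit changed")
print(bit_difference(digest_original, digest_tampered), "of 256 digest bits changed")
```

With a well-behaved digest algorithm, roughly half of the digest bits should change, even though only one bit of the input did.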
The XML digital signature specification defines an XML vocabulary that contains all the information needed by a receiving party to test whether the signed data has been compromised. It includes either the signed data itself (optionally encrypted) or a URI reference to the data. Additionally, it includes details of the algorithm used, any transformations carried out prior to signing, and the message digest resulting from the application of the digital signature algorithm.
Upon receipt of the digital signature XML, the recipient has sufficient information to re-create the digest value. If the recomputed digest matches the digest carried in the signature, the document is considered valid. If the digests don’t match, the document has been modified since it was signed. Figure 3 illustrates the process of creating a digital signature, showing some of the elements that appear in digital signature XML.
Figure 3 The process of digital signature creation, indicating some of the elements that define an XML digital signature document.
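The recipient's check can be sketched in a few lines of Python. This is a simplified illustration, not the processing model of the specification: it covers only the digest comparison, leaves out the key-based signing of the digest, and stores the values in a plain dictionary rather than in XML. The keys `DigestMethod` and `DigestValue` borrow the names of the corresponding XML signature elements; the functions `make_reference` and `verify_reference` are hypothetical.

```python
import base64
import hashlib

def make_reference(data: bytes, algorithm: str = "sha256") -> dict:
    """Sender side: record the algorithm name and the base64-encoded digest."""
    digest = hashlib.new(algorithm, data).digest()
    return {
        "DigestMethod": algorithm,
        "DigestValue": base64.b64encode(digest).decode("ascii"),
    }

def verify_reference(data: bytes, reference: dict) -> bool:
    """Recipient side: recompute the digest and compare it with the stored one."""
    recomputed = hashlib.new(reference["DigestMethod"], data).digest()
    return base64.b64encode(recomputed).decode("ascii") == reference["DigestValue"]

# Hypothetical document content, used only for illustration.
will = b"<will><heir>Carol</heir></will>"
reference = make_reference(will)

print(verify_reference(will, reference))                               # True
print(verify_reference(b"<will><heir>Bob</heir></will>", reference))   # False
```

The recipient needs only the data and the recorded algorithm name to repeat the computation, which is why the signature XML carries both.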
XML Is Special
XML poses special problems for digital signatures. When XML is processed using standard XML parsers, some surface representation information may legitimately be removed or modified. When this occurs, the digital signature may no longer validate, because changing even a single bit of the signed data produces a different digest. In short, standard XML processing can invalidate a digital signature. For example, the following parser actions can have this effect:
- Removal of ignorable white space by XML processors
- Replacement of entity references with the replacement text of the declared entity
- Removal of the XML declaration and DTD
- Insertion of default attributes, defined in a DTD but not present in the document
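The effect of such surface changes can be seen in a small Python sketch. It assumes SHA-256 as the digest algorithm and uses two made-up serializations that differ in the XML declaration, attribute order, quoting style, and white space inside the start tag. The function `summarize` is a hypothetical helper that extracts the element and attribute data an application would see.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two hypothetical serializations carrying the same element and attribute data.
doc_a = b'<?xml version="1.0"?><will id="1" owner="Alice"><heir>Carol</heir></will>'
doc_b = b"<will owner='Alice'   id='1'><heir>Carol</heir></will>"

def summarize(xml_bytes: bytes):
    """Return the parsed content: root tag, attributes, and child tag/text pairs."""
    root = ET.fromstring(xml_bytes)
    return (root.tag, dict(root.attrib), [(child.tag, child.text) for child in root])

same_data = summarize(doc_a) == summarize(doc_b)
same_digest = hashlib.sha256(doc_a).digest() == hashlib.sha256(doc_b).digest()

print(same_data, same_digest)  # True False
```

A parser treats the two documents as equivalent, but a digest computed over the raw bytes does not.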
To prevent these actions from invalidating a digital signature, a stripped-down representation of the XML document is required. This is the function of the Canonical XML Recommendation, a W3C standard that defines an essential representation of an XML document.
Canonical XML
The Canonical XML Recommendation outlines the requirements for the generation of canonical XML from regular XML. Canonical XML is a normalized form of an XML document that does not depend on the surface variations introduced by parser operations. Figure 4 shows how several different XML documents with the same element and attribute data, but different surface characteristics, map to a single canonical representation.
Figure 4 XML canonicalization.
Once we have a Canonical XML version of our XML, we can then guarantee data integrity by applying an XML digital signature, assured that any non-essential processing of the XML after the signing will not invalidate the signature.
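A minimal sketch of this idea in Python follows. It assumes Python 3.8 or later, whose standard library function `xml.etree.ElementTree.canonicalize` implements version 2.0 of Canonical XML, which may differ in detail from the version of the Recommendation discussed here. SHA-256 and the sample documents are likewise assumptions, and `canonical_digest` is a hypothetical helper.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # Python 3.8+

# Two hypothetical serializations with the same element and attribute data.
doc_a = '<?xml version="1.0"?><will id="1" owner="Alice"><heir>Carol</heir><note/></will>'
doc_b = "<will owner='Alice'   id='1'><heir>Carol</heir><note></note></will>"

def canonical_digest(xml_text: str) -> str:
    """Canonicalize the document first, then digest the canonical bytes."""
    canonical = canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# The surface forms differ, so digests over the raw text differ.
raw_match = (hashlib.sha256(doc_a.encode("utf-8")).hexdigest()
             == hashlib.sha256(doc_b.encode("utf-8")).hexdigest())
# After canonicalization both documents reduce to the same bytes.
canonical_match = canonical_digest(doc_a) == canonical_digest(doc_b)

print(raw_match, canonical_match)  # False True
```

Digesting the canonical form rather than the raw text is what lets a signature survive re-serialization by an intermediate parser.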