Message Digest
Message digests, also known as message fingerprints or secure hashes, are computed by applying a one-way hash function to the data bits comprising the message. Any modification of the original message, whether intentional or unintentional, will almost certainly change the digest value. It is also computationally infeasible to derive the original message from the digest value. These properties make digests ideal for detecting changes in a given message: compute the digest before storing or transmitting the message, and compute it again after loading or receiving the message. If the two digest values match, one can be confident that the message has not changed. However, this scheme fails if a malicious interceptor has access to both the original message and its digest. In that case the interceptor could alter the message, compute the digest of the modified message, and replace the original digest with the new one. The solution, as we see in the next section, is to secure the message digest by encrypting it with a secret key.
A common use of message digests is to securely store and validate passwords. The basic idea is that you never store the password in clear-text. Compute the message digest of the password and store the digest value. To verify the password, compute its digest and match it with the stored value. If both values are equal, the verification succeeds. This way no one, not even the administrator, gets to know your password. A side effect of this mechanism is that you cannot get back a forgotten password. This is not really as bad as it sounds, for you can always get it changed to a temporary password by an administrator, and then change it to something that only you know.
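The store-and-verify idea described above can be sketched in a few lines. This is a minimal illustration only: the class name PasswordDigestSketch and its methods are hypothetical, the choice of SHA-1 and UTF-8 encoding is an assumption, and the sketch deliberately shows nothing beyond the plain digest scheme the paragraph describes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of the book's example sources.
class PasswordDigestSketch {

    // Returns the digest that would be stored in place of the clear-text password.
    static byte[] digestOf(String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            return md.digest(password.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Verification: digest the candidate password and compare with the stored value.
    static boolean verify(String candidate, byte[] storedDigest) {
        return MessageDigest.isEqual(digestOf(candidate), storedDigest);
    }

    public static void main(String[] args) {
        byte[] stored = digestOf("s3cret");          // only this value is kept
        System.out.println(verify("s3cret", stored)); // correct password
        System.out.println(verify("guess", stored));  // wrong password
    }
}
```

Only the stored digest is needed to check a password, which is why a forgotten password cannot be recovered from it.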
Message digests of messages stored in byte arrays are computed using the engine class java.security.MessageDigest. The following program illustrates this.
Listing 3-7. Computing message digest
// File: src\jsbook\ch3\ComputeDigest.java
import java.security.MessageDigest;
import java.io.FileInputStream;

public class ComputeDigest {
    public static void main(String[] unused) throws Exception {
        String datafile = "ComputeDigest.java";
        MessageDigest md = MessageDigest.getInstance("SHA1");
        FileInputStream fis = new FileInputStream(datafile);

        // Feed the file to the digest in fixed-size chunks.
        byte[] dataBytes = new byte[1024];
        int nread = fis.read(dataBytes);
        while (nread > 0) {
            md.update(dataBytes, 0, nread);
            nread = fis.read(dataBytes);
        }
        fis.close();

        // Complete the computation and print the result.
        byte[] mdbytes = md.digest();
        System.out.println("Digest(in hex):: " + Util.byteArray2Hex(mdbytes));
    }
}
A concrete, algorithm-specific MessageDigest object is created following the general pattern of all engine classes. Each invocation of the update() method feeds more data into the digest computation, and the digest() call completes the computation and returns the digest value. It is possible to make multiple invocations of update(byte[] bytes) before calling the digest() method, which avoids the need to accumulate the complete message in a single buffer when the original message is fragmented over more than one buffer or cannot be kept completely in main memory. This is likely to be the case when the data bytes are read from a huge file in fixed-size buffers. In fact, the convenience classes DigestInputStream and DigestOutputStream, both in the package java.security, exist to compute the digest as the bytes flow through the associated streams.
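As a rough illustration of the stream-based approach, the sketch below wraps an arbitrary InputStream in a DigestInputStream and lets the reads drive the digest. The class StreamDigestSketch and its helper methods are hypothetical names introduced here, and the in-memory input stands in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of the book's example sources.
class StreamDigestSketch {

    // Reads the stream to its end and returns the digest of all bytes read.
    static byte[] digestOf(InputStream in, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            try (DigestInputStream dis = new DigestInputStream(in, md)) {
                byte[] buffer = new byte[1024];
                while (dis.read(buffer) != -1) {
                    // Nothing to do: each read() updates md as a side effect.
                }
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hex formatting, similar in spirit to the Util.byteArray2Hex helper used in the listing.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        byte[] digest = digestOf(new ByteArrayInputStream(data), "SHA-1");
        System.out.println("Digest(in hex):: " + toHex(digest));
    }
}
```

The calling code never invokes update() itself; the stream does so on every read, which suits code that already processes the data for some other purpose.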
The integrity of a message is verified by computing its digest value and comparing it with the original digest for size and content equality. Class MessageDigest even includes the static method isEqual(byte[] digestA, byte[] digestB) to perform this task.
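A small sketch of such an integrity check follows. The class IntegrityCheckSketch and its methods are hypothetical, and SHA-1 is assumed only because the earlier listing uses it. The example flips a single bit of the message to show the comparison failing.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of the book's example sources.
class IntegrityCheckSketch {

    static byte[] sha1(byte[] message) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(message);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // True if the received bytes still produce the digest computed for the original.
    static boolean isUnchanged(byte[] received, byte[] originalDigest) {
        return MessageDigest.isEqual(sha1(received), originalDigest);
    }

    public static void main(String[] args) {
        byte[] message = "transfer 100 to account 42".getBytes(StandardCharsets.UTF_8);
        byte[] originalDigest = sha1(message);

        byte[] tampered = message.clone();
        tampered[0] ^= 0x01; // flip a single bit

        System.out.println("intact copy passes:   " + isUnchanged(message, originalDigest));
        System.out.println("tampered copy passes: " + isUnchanged(tampered, originalDigest));
    }
}
```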
Theoretically, because a much larger set of messages gets mapped to a much smaller set of digest values, it is possible that two or more messages will have the same digest value. For example, there are a total of $2^{8 \times 1024}$ distinct 1 KB messages. If the size of the digest value is 128 bits, then only $2^{128}$ different digest values are possible. This means that there are, on average, $2^{8 \times 1024 - 128}$ different 1 KB messages with the same digest value. However, a brute-force search for a message that results in a given digest value would still require examining, on average, $2^{127}$ messages. The problem becomes somewhat simpler if one looks for any pair of messages that give rise to the same digest value, requiring, on average, only about $2^{64}$ attempts. This is known as the birthday attack, deriving its name from a famous mathematics puzzle, whose result can be stated as: you need 253 other people at a party before there is more than a 50 percent chance that one of them shares your birthday, but only 23 people for there to be a better-than-even chance that some pair among them shares a birthday.
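The birthday puzzle is easy to check numerically. The sketch below, using hypothetical names and assuming 365 equally likely birthdays, finds the smallest group size at which each probability exceeds 50 percent. The gap between the two results mirrors the gap between searching for a message with a given digest and searching for any colliding pair.

```java
// Hypothetical helper for illustration; assumes 365 equally likely birthdays.
class BirthdaySketch {
    static final int DAYS = 365;

    // Probability that at least two people in a group of n share a birthday.
    static double anyPairShares(int n) {
        double allDistinct = 1.0;
        for (int i = 0; i < n; i++) {
            allDistinct *= (DAYS - i) / (double) DAYS;
        }
        return 1.0 - allDistinct;
    }

    // Probability that at least one of the given number of other people
    // has one specific birthday (yours).
    static double someoneMatchesMine(int others) {
        return 1.0 - Math.pow((DAYS - 1) / (double) DAYS, others);
    }

    // Smallest group in which a shared birthday is more likely than not.
    static int smallestGroupForAnyPair() {
        int n = 1;
        while (anyPairShares(n) <= 0.5) {
            n++;
        }
        return n;
    }

    // Smallest number of other people for a match with your own birthday
    // to be more likely than not.
    static int smallestOthersForMine() {
        int n = 1;
        while (someoneMatchesMine(n) <= 0.5) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("Any pair:      " + smallestGroupForAnyPair());
        System.out.println("Match on mine: " + smallestOthersForMine());
    }
}
```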
The providers bundled with J2SE v1.4 support two message digest algorithms: SHA (Secure Hash Algorithm) and MD5. SHA, also known as SHA-1, produces a message digest of 160 bits. It is a FIPS (Federal Information Processing Standard) approved standard. In August 2002, NIST announced three more FIPS-approved standards for computing message digests: SHA-256, SHA-384 and SHA-512. These algorithms produce digest values of 256, 384 and 512 bits respectively, and hence provide much better protection against brute-force attacks. MD5 produces a message digest of only 128 bits, and is considerably weaker.
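The digest sizes quoted above can be confirmed by querying each MessageDigest instance for its length. This is a sketch with a hypothetical class name. It assumes a runtime whose installed providers include all five algorithms, as current JDKs do, and reports -1 for any algorithm that is missing.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of the book's example sources.
class DigestLengthsSketch {

    // Digest size in bits, or -1 if no installed provider supports the algorithm.
    static int digestBits(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm).getDigestLength() * 8;
        } catch (NoSuchAlgorithmException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        String[] algorithms = {"MD5", "SHA-1", "SHA-256", "SHA-384", "SHA-512"};
        for (String algorithm : algorithms) {
            System.out.println(algorithm + ": " + digestBits(algorithm) + " bits");
        }
    }
}
```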