2.3 Applying Cryptography
Now that you’ve freshened your recollection of database terminology and surveyed the basics of modern cryptography, we examine how cryptography can help secure your databases against the classes of threats covered in Chapter 1.
As we discuss the types of solutions offered by cryptography, we’ll also consider the threats that cryptography is expected to mitigate. This threat analysis, as discussed previously, is an essential component of any cryptographic project, and the answers significantly shape the cryptographic solution. Unfortunately, in practice, a requirement to encrypt data is rarely supported with a description of the relevant threats. Encrypting to protect confidentiality from external attackers launching SQL injection attacks is different from protecting against internal developers with read-only access to the production database. The precise nature of the threat determines the protection.
2.3.1 Protecting Confidentiality
A breach of confidentiality occurs when sensitive data is accessed by an unauthorized individual. Encrypting that sensitive data, then, seems to make excellent sense: if the data is encrypted, you’ve secured it against unauthorized access. Unfortunately, the solution is not this simple. Cryptography only changes the security problem; it doesn’t remove it.
The initial problem was to protect the confidentiality of the business data. Encrypting that data changes the problem to one of protecting the confidentiality of the key used for the encryption. The key must be protected with very strong access controls, and those controls must cover both direct and indirect access.
Direct access is access to the key itself. An attacker with direct access may copy the key and use it without fear of detection. Indirect access is access to an application or service that has access to the key. With indirect access, an attacker can feed encrypted data to the application or service and view the decrypted information. An attacker exploiting indirect access faces additional risk, because the application or service, due to its sensitivity, is generally well monitored. From an attacker’s point of view, the advantage that might make the indirect access worth the additional risk is that the application or service will continue to provide decryption even after the key is changed. An attacker who has a copy of the key will find the key useless as soon as the data is encrypted with a different key.
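The difference between direct and indirect access after a key change can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended design: `toy_encrypt` and `toy_decrypt` are hypothetical stand-ins built from the standard library's HMAC so that the sketch runs without third-party packages (a real system would use a vetted cipher), and `DecryptionService` with its request counter is an invented name standing in for a monitored application that holds the key.

```python
import hashlib
import hmac
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC subkeys from one key.
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # HMAC-SHA256 in counter mode; a toy stand-in for a real cipher.
    blocks = [
        hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or modified ciphertext")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class DecryptionService:
    """Hypothetical application that holds the current key."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)
        self.requests = 0  # stands in for monitoring of the service

    def encrypt(self, plaintext: bytes) -> bytes:
        return toy_encrypt(self._key, plaintext)

    def decrypt(self, blob: bytes) -> bytes:
        self.requests += 1
        return toy_decrypt(self._key, blob)

    def rotate(self, stored: list) -> list:
        # Re-encrypt every stored value under a fresh key.
        plain = [toy_decrypt(self._key, blob) for blob in stored]
        self._key = secrets.token_bytes(32)
        return [toy_encrypt(self._key, p) for p in plain]


service = DecryptionService()
stored = [service.encrypt(b"4111 1111 1111 1111")]

# Direct access: the attacker copies the key and decrypts offline, unobserved.
stolen_key = service._key
offline = toy_decrypt(stolen_key, stored[0])

# The key is changed and the data is re-encrypted under the new key.
stored = service.rotate(stored)
try:
    toy_decrypt(stolen_key, stored[0])
    stolen_key_still_works = True
except ValueError:
    stolen_key_still_works = False

# Indirect access still yields plaintext, but the request is counted.
via_service = service.decrypt(stored[0])
```

In this sketch the copied key decrypts silently until the key changes and then fails, while the service keeps answering after the change but records each request it serves.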
The problem of securing access to the key lies behind much of the complexity of key management systems. Cryptography is often said to transform the problem of protecting many secrets into the problem of protecting one secret. Protecting this one secret, despite the complexity, is generally easier than protecting the many secrets. Because of this, encryption is a strong and preferred method of protecting confidentiality in a database.
Consider the confidentiality threats identified in the previous chapter. The potential attackers included individuals with a copy of the database, privileged administrators, IT troubleshooters, development staff using nonproduction environments, individuals with access to backup media (often stored off-site), and attackers exploiting application weaknesses. In each of these cases, encryption protects the data so long as access to the keys is tightly controlled.
While tightly controlling direct access to keys is a relatively solvable problem (this book recommends dedicated key storage hardware but offers suggestions if such protection is unavailable), controlling indirect access is more difficult. Information is stored because the business is likely to need it at a later date, and to use it at that later date, encrypted information will need to be decrypted. The application that provides the decryption service is a weak link in the security chain. Rather than attack the encryption or the controls protecting direct access to the key, a smart attacker targets the decrypting application. We’ll consider this issue in more detail in section 2.5.1, "Indirect Access of Keys."
Protecting against attackers with access to a copy of the database, whether stolen from a production machine or a backup tape, requires that the key not be stored within the database or, ideally, anywhere on the same machine as the database. In the case of attacks exploiting backup media, backups of keys must be stored separately from the backups of encrypted data. Access to those backups must also be restricted to different individuals. How deep that separation goes depends on the threat model. If the relevant threat is the off-site backup staff, two different staffs (perhaps two different backup companies) are all that is necessary. If the relevant threats extend all the way to internal system administrators, separate administrators should manage the backups of each system.
Protecting against administrators with full access to database or application servers is primarily a matter of strong indirect access controls to the keys. However, even with just moderate access controls protecting the keys, encryption prevents casual attacks from administrators. In particular, encryption significantly increases the amount of effort required for an administrator to compromise confidentiality and keep the risk of detection low. Such protection is often described as "keeping the honest, honest." If the threat model rates the risk of administrator compromise sufficiently low, keys may need only a moderate level of protection from indirect access.
Most threat models should identify the presence of sensitive production data in nonproduction environments as a significant threat. Because encryption does such a fine job of preserving confidentiality, encrypted production data can’t be decrypted in a nonproduction environment. While this is good from a security perspective, the failure in decryption typically results in a malfunctioning environment.
The best solution is to replace all the encrypted data with mock data. Ideally, the mock data reflects the entire data model specified by the application’s design, including common, general-use scenarios as well as edge cases. Depending on resource availability, the mock data might be encrypted after it is written to the database. In some cases, it might be possible to encrypt the mock data first and then update the table with the encrypted mock data wherever it is needed. This latter strategy avoids the row-by-row encryption of the mock data.
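The second strategy, encrypting a mock value once and then writing it wherever it is needed, might look like the following sketch. It assumes a hypothetical `customer` table with a `card` column and uses an in-memory SQLite database for illustration. The `placeholder_encrypt` function is not encryption at all; it is a labeled stand-in for whatever encryption routine the nonproduction environment provides, and it counts its calls only to show that a single call suffices.

```python
import sqlite3


def load_mock_cards(conn, encrypt, mock_card: bytes) -> int:
    """Encrypt the mock value once, then write it to every row."""
    ciphertext = encrypt(mock_card)  # the only encryption call
    cursor = conn.execute("UPDATE customer SET card = ?", (ciphertext,))
    return cursor.rowcount  # number of rows updated


calls = 0


def placeholder_encrypt(plaintext: bytes) -> bytes:
    """NOT encryption: a stand-in for the environment's real routine."""
    global calls
    calls += 1
    return b"<ciphertext of " + plaintext + b">"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, card BLOB)")
conn.executemany(
    "INSERT INTO customer (id, card) VALUES (?, ?)",
    [(i, b"production ciphertext %d" % i) for i in range(1, 4)],
)

updated = load_mock_cards(conn, placeholder_encrypt, b"4111 1111 1111 1111")
distinct = conn.execute("SELECT COUNT(DISTINCT card) FROM customer").fetchone()[0]
```

Every row ends up holding the same ciphertext, which is the trade-off of this shortcut. Mock data meant to exercise edge cases would need several distinct values, each encrypted once and assigned to the rows that need it.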
2.3.2 Assuring Integrity
Cryptography can help detect and prevent integrity attacks, which are unauthorized modifications of data. In some cases, both integrity and confidentiality protection are desired, while in other cases just integrity protection may be needed. When integrity protection alone is called for, the data itself remains unencrypted, and some other operation protects integrity.
The naive solution for both confidentiality and integrity is to simply encrypt the information with a symmetric cipher. Later, if it doesn’t decrypt properly, someone has tampered with the information.
Unfortunately, the naive solution is not very robust. A clever attacker will attack integrity in a less obvious fashion. For instance, an attacker might move encrypted fields around so that the rows containing my information now have someone else’s encrypted credit card number. Or the attacker might swap blocks within the ciphertext or between ciphertexts. In such an attack, much of the field could decrypt to the correct value, but selected portions of it would decrypt to something else. This attack might result in some garbled data, but the rest of the field would look fine.
A better solution, and one that works even if confidentiality protection is not needed, is to use a message authentication code (MAC). A MAC is generated from the plaintext and a unique ID for that row (the ID thwarts attacks that move entire fields around). To confirm the integrity of the data, we check to make sure that the MAC still corresponds to the data.
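As a minimal sketch of this idea, the following Python uses HMAC-SHA256 from the standard library as the MAC. The choice of HMAC-SHA256, the function names, the 8-byte encoding of the row ID, and the in-memory `table` are assumptions made for illustration. In practice the MAC key would be stored outside the database.

```python
import hashlib
import hmac
import secrets

MAC_KEY = secrets.token_bytes(32)  # assumption: real keys live outside the database


def row_mac(key: bytes, row_id: int, plaintext: bytes) -> bytes:
    # A fixed-width row ID keeps the ID/plaintext boundary unambiguous.
    message = row_id.to_bytes(8, "big") + plaintext
    return hmac.new(key, message, hashlib.sha256).digest()


def row_is_intact(key: bytes, row_id: int, plaintext: bytes, mac: bytes) -> bool:
    # Constant-time comparison of the stored MAC with a freshly computed one.
    return hmac.compare_digest(row_mac(key, row_id, plaintext), mac)


# Two rows, each holding a value and the MAC for that value and row ID.
table = {
    1: [b"4111 1111 1111 1111", row_mac(MAC_KEY, 1, b"4111 1111 1111 1111")],
    2: [b"5500 0000 0000 0004", row_mac(MAC_KEY, 2, b"5500 0000 0000 0004")],
}
intact_before = all(
    row_is_intact(MAC_KEY, rid, value, mac) for rid, (value, mac) in table.items()
)

# Attack 1: copy row 2's value and its MAC into row 1.
table[1] = list(table[2])
moved_field_detected = not row_is_intact(MAC_KEY, 1, *table[1])

# Attack 2: alter a value in place while leaving its MAC untouched.
altered_value_detected = not row_is_intact(
    MAC_KEY, 2, b"5500 0000 0000 0005", table[2][1]
)
```

Because the row ID is part of the MAC input, a value moved together with its MAC fails verification in its new row, even though the pair was valid where it came from.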
While this check detects an integrity attack after the fact, a MAC can also prevent integrity attacks. When data and its MAC are inserted into a table, the database can first check to ensure that the MAC is the correct MAC for that data. If it is the wrong MAC (or the MAC is not included), the database can reject the change.
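The preventive check can be sketched as a gate in front of the insert. In this illustration the gate is an ordinary Python function, `guarded_insert`, standing in for whatever database-side mechanism (a trigger or stored procedure, for example) would perform the check. The function names, the `customer` table, and the use of HMAC-SHA256 over the row ID and data are all assumptions for the sake of the example.

```python
import hashlib
import hmac
import secrets
import sqlite3

MAC_KEY = secrets.token_bytes(32)  # assumption: held by the checking component


def expected_mac(row_id: int, data: bytes) -> bytes:
    message = row_id.to_bytes(8, "big") + data
    return hmac.new(MAC_KEY, message, hashlib.sha256).digest()


def guarded_insert(conn, row_id: int, data: bytes, mac) -> None:
    """Insert the row only if it arrives with the correct MAC."""
    if mac is None or not hmac.compare_digest(expected_mac(row_id, data), mac):
        raise ValueError("change rejected: MAC missing or incorrect")
    conn.execute(
        "INSERT INTO customer (id, card, mac) VALUES (?, ?, ?)",
        (row_id, data, mac),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, card BLOB, mac BLOB)")

# A change stamped with the correct MAC is accepted.
guarded_insert(conn, 1, b"4111 1111 1111 1111", expected_mac(1, b"4111 1111 1111 1111"))

# Changes with a missing or wrong MAC are rejected before reaching the table.
rejected = []
for bad_mac in (None, b"\x00" * 32):
    try:
        guarded_insert(conn, 2, b"5500 0000 0000 0004", bad_mac)
    except ValueError:
        rejected.append(bad_mac)

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Note that whoever can compute `expected_mac` can stamp arbitrary changes, so the gate is only as strong as the protection around the component holding the MAC key.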
Every database threat model should consider integrity threats, but as described in the previous chapter, cryptographic integrity protection is typically not a good fit for databases. The threat model will help make this clear. Integrity threats against the database may be carried out by attackers directly targeting the database or by attackers targeting the application providing access to the database, which also stamps changes with the MAC. In general, attacks against the application are more likely to be successful than attacks directly against the database, so, in this context, the risk posed by the application is greater than the risk posed by the database itself. To be effective, security should be applied to the higher-risk items. A MAC makes direct attacks on the database, already the lower risk, harder still, but it does little against an attacker who compromises the application, because the application stamps that attacker’s changes with valid MACs. The resources a MAC would consume are therefore better applied to securing the application, thus reducing the overall risk. Refer to section 1.1.3, "Integrity Attacks," for a more detailed discussion.