Database Encryption
Database encryption is the process of converting data stored in a database into a format that is unreadable without the proper decryption keys. It uses cryptographic algorithms to protect sensitive information from unauthorized access, helping to ensure data confidentiality and integrity. This makes it critical for safeguarding personal, financial, and operational data within IT systems, which is why all major database and data warehouse systems offer encryption capabilities and why encryption is a core component of any database security strategy.
What is Encryption?
Cryptography (from Greek kryptos “hidden” + graphia “writing”) has been one of the key enablers of the rapid growth of the digital economy over the past three decades. Even most laypeople today understand the word encryption to refer to the technique of transforming data so it can be hidden in plain sight. Many are unfamiliar, though, with the much larger set of cryptographic techniques, such as hash functions and digital signatures, that are commonly used today to solve a wider range of security problems, including data integrity, entity and data authentication, and non-repudiation.
Benefits | Description |
---|---|
Confidentiality | Allow only authorized users to see the data |
Authentication | Verify the user is who they claim to be before they access the data |
Integrity | Ensure the data has not been altered |
Non-repudiation | Ensure the sender cannot deny responsibility for a message, update, or action. |
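As a minimal sketch of how techniques other than encryption serve these goals, the Python snippet below uses a hash function and a keyed hash (HMAC) from the standard library to detect tampering with a record. The record contents and the helper names `make_tag` and `check_tag` are illustrative assumptions, not part of any particular database product.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = secrets.token_bytes(32)          # hypothetical shared secret
record = b"account=42;balance=100"     # hypothetical record
tag = make_tag(key, record)

# A plain hash detects accidental change but can be recomputed by anyone.
digest = hashlib.sha256(record).hexdigest()

tampered = b"account=42;balance=900"
ok_original = check_tag(key, record, tag)
ok_tampered = check_tag(key, tampered, tag)
```

An HMAC covers integrity and data authentication between parties that share the key. It does not provide non-repudiation, because either holder of the key could have produced the tag; that goal calls for digital signatures based on asymmetric keys.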
Over time, cryptographic algorithms previously considered unbreakable have been shown to have weaknesses. The fast-growing power of computing systems has also made older algorithms obsolete. Thankfully, advances in cryptography have kept pace to ensure that breaking the security of the latest algorithms remains beyond the realm of practical computing.
Strategies for Data Encryption
To protect sensitive data from theft, it needs to be protected at rest, in transit, and during use. This section gives a brief overview of the encryption techniques available at each of these levels.
Encryption of Data at Rest
Data at rest is data held in persistent storage. An attacker with access to the physical storage infrastructure can read the data stored on it unless that data has been encrypted. This is why all major database and data warehouse systems offer encryption capabilities. Encryption of data at rest can be achieved in multiple ways.
Disk or file system level encryption:
Encryption is performed by the virtual storage layer itself. This is completely transparent to all application software and can be deployed with any database engine, regardless of its own encryption capabilities.
Server-side encryption:
The database server is responsible for encrypting and decrypting data. The cryptographic keys used for encryption are known only to the server, so encryption is transparent to the database clients.
Client-side encryption:
The database client is responsible for encrypting data before sending it to the server for storage. Similarly, during retrieval, the client needs to decrypt the data. Such encryption is usually performed by the database drivers or other client-side libraries. This makes the design of application software more difficult. Another disadvantage of this approach is that the server cannot perform many actions on encrypted data as part of query processing (integer comparison is a simple example). An advantage of client-side encryption is that if not every bit of stored data needs to be encrypted, then the database client can be configured to encrypt only the sensitive parts of the data (field level encryption).
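The field-level idea can be sketched in Python as follows. This is an illustration under stated assumptions, not a production design: the field names, the `SENSITIVE_FIELDS` set, and all helper functions are hypothetical, and the cipher is a stand-in built from HMAC-SHA256 in counter mode so that the example runs with the standard library alone. It provides no ciphertext authentication; a real client would use a vetted authenticated cipher (for example AES-GCM) from an established cryptography library or the database driver's own feature.

```python
import hashlib
import hmac
import secrets

SENSITIVE_FIELDS = {"ssn"}  # hypothetical: only these fields are encrypted


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream (illustration only): HMAC-SHA256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_field(key: bytes, plaintext: str) -> bytes:
    """Encrypt one value on the client; a fresh random nonce is prepended."""
    nonce = secrets.token_bytes(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt_field(key: bytes, blob: bytes) -> str:
    """Reverse encrypt_field using the nonce stored with the ciphertext."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


def encrypt_row(key: bytes, row: dict) -> dict:
    """Encrypt only the sensitive fields before the row is sent to the server."""
    return {k: encrypt_field(key, v) if k in SENSITIVE_FIELDS else v
            for k, v in row.items()}


def decrypt_row(key: bytes, row: dict) -> dict:
    """Decrypt the sensitive fields after the row is read back."""
    return {k: decrypt_field(key, v) if k in SENSITIVE_FIELDS else v
            for k, v in row.items()}


key = secrets.token_bytes(32)                  # held by the client, never by the server
row = {"name": "Alice", "ssn": "123-45-6789"}  # hypothetical record
stored = encrypt_row(key, row)                 # what the server would see
restored = decrypt_row(key, stored)
```

Because the server only ever sees the opaque bytes in `stored["ssn"]`, it cannot sort, compare, or index that column meaningfully, which is the query-processing limitation described above.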
Encryption of data at rest is now considered an almost mandatory best practice. However, one must be aware of its limitations and challenges. These relate mostly to how and where the encryption keys are stored and who can gain access to them:
- It’s non-trivial to set up key management correctly. Weaknesses in key management are, unfortunately, far too common and are much likelier to lead to confidentiality breaches than someone breaking a modern encryption algorithm.
- Key rotation (the recommended practice of periodically changing secret keys) is extremely disruptive and costly since large volumes of data need to be decrypted and then re-encrypted. Until a key is rotated, that key can pose a risk if an employee with access to it leaves the organization or the key otherwise becomes compromised.
- Encryption of data at rest doesn’t solve the problem of access control. Since the same key is used to encrypt large volumes of data, anyone who can decrypt part of the data can decrypt all of it.
Encryption of Data in Transit
Clearly, data traveling in plain text over the network is open to compromise in a variety of ways. Thankfully, this problem is easily solved using secure transport protocols such as TLS. These protocols are almost universally deployed today to protect most communication over the network. While the use of TLS also has its caveats (specifically relating to use of untrusted certificates and potential man-in-the-middle attacks), well understood best practices are usually in place to mitigate these issues. Given this, we will not explore in-transit encryption in this paper.
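For readers who want a concrete picture of those best practices, the following Python sketch builds a client-side TLS configuration that verifies the server certificate and hostname, which is what guards against the untrusted-certificate and man-in-the-middle caveats mentioned above. The function name `make_client_context` and the choice of TLS 1.2 as the minimum version are assumptions made for illustration.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Return a TLS client context with certificate and hostname checks enabled."""
    # create_default_context() loads the system trust store and turns on
    # certificate verification and hostname checking for client use.
    ctx = ssl.create_default_context()
    # Assumption: refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
```

A connection would then be wrapped with `ctx.wrap_socket(sock, server_hostname=host)`. How a given database driver accepts such settings varies, so its documentation is the place to check.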
Encryption of Data in Use
Data needs to be available in plaintext form while it is being processed by an application. We describe below two relatively recent techniques that address threats to data security while it is in use.
- Confidential Computing:
Advanced features in recent CPU chipsets have enabled real-time encryption and decryption of data held in the RAM of a computer system even as it is being processed by an application. This is completely transparent to the application. This technique can protect against attacks on physical or virtual compute infrastructure. For example, in the absence of this technique, someone with administrative access to a virtual machine or its hypervisor could maliciously access the sensitive data in memory as it’s being processed by an application.
- Homomorphic Encryption:
Homomorphic encryption is a class of encryption algorithms that allow certain limited kinds of computations to be performed on the encrypted data itself. These are usually limited to a small set of arithmetic operations. There have been some recent attempts to derive analytics information or insights from homomorphically encrypted data. It remains to be seen how practical this will be, given the limited set of possible operations.
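The idea of computing on ciphertexts can be illustrated with a toy example. Textbook RSA (without padding) happens to be multiplicatively homomorphic: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. The Python sketch below uses deliberately tiny primes chosen for readability; it is insecure, supports only multiplication, and is not how practical homomorphic encryption schemes are built.

```python
# Toy textbook RSA parameters (insecure, for illustration only).
P, Q = 61, 53
N = P * Q                          # public modulus
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with the public key."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(c, D, N)


a, b = 7, 6
# The party holding only ciphertexts multiplies them without seeing a or b.
product_cipher = (encrypt(a) * encrypt(b)) % N
# The key holder decrypts the result and obtains a * b (mod N).
result = decrypt(product_cipher)
```

Here `result` equals `a * b` as long as the product is smaller than `N`. Supporting only one kind of operation is exactly the sort of restriction that limits what analytics can be run on homomorphically encrypted data.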
When Database Encryption Alone is Not Enough
While database encryption is essential to prevent data breaches, encryption alone is not enough to secure the data. Even with encryption enabled, the fact remains that users and applications need access to the data in plaintext. This is where most attackers focus their attention in order to gain unauthorized access to data. Some such attack vectors are:
- Insider attacks:
An authorized user exfiltrates information for profit.
- Social engineering:
An attacker fools an authorized user into installing malware or otherwise revealing sensitive information (such as passwords).
- Inadequate authentication controls:
Shared or inadequately protected credentials are relatively easy to steal, and shared credentials make it impossible to attribute an action to the responsible person. This is particularly common with shared database passwords.
- Application breaches:
Attackers can exploit vulnerabilities (for example, SQL injection attacks) in applications to get unauthorized access to data.
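To make the SQL injection case concrete, the Python sketch below contrasts a query built by string concatenation with a parameterized query, using an in-memory SQLite database. The table, its contents, and the function names are hypothetical. Note that encryption at rest makes no difference here, since the application reads the data in plaintext on the attacker's behalf.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_unsafe(conn: sqlite3.Connection, name: str) -> list:
    """Vulnerable: user input is pasted directly into the SQL text."""
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(conn: sqlite3.Connection, name: str) -> list:
    """Safer: the driver passes the input as a bound parameter, not as SQL."""
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "nobody' OR '1'='1"         # attacker-controlled input
leaked = find_unsafe(conn, payload)   # the WHERE clause becomes always true
blocked = find_safe(conn, payload)    # treated as a literal name; no match
```

With the concatenated query, the payload returns every row in the table, while the parameterized version returns nothing for the same input.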
This is why additional measures are needed to improve the overall security posture. Below, we list the most effective measures for securing the data infrastructure and highlight the challenges that prevent organizations from fully adopting these measures: