Data Privacy vs. Security
Data security and data privacy may seem similar, but they are distinct practices that work in concert to protect the data we care about. Data privacy is about the proper use and protection of the controls that govern what data is shared with whom. Data security, meanwhile, is the fundamental protection of that data and all associated metadata from external entities. For example, if you choose to share photos with your best friend on a new app, and by doing so you unknowingly allow the company to resell those photos, that would be a breach of privacy. If, on the other hand, you share photos with your best friend and the app suffers an incident in which an unauthorized party gains access to all of your details, that would be a breach of data security.
What is Data Privacy?
Wikipedia defines data privacy as “relationship between the collection and dissemination of data, technology, the public expectation of privacy, and the legal and political issues surrounding them.” Data privacy encompasses the measures taken to protect or shield data while still being able to use it. Privacy focuses on the who, what, where, and how of sharing our data: it asks for secrecy, or freedom, from those with whom we do not want to share. Privacy in the real world and on the internet is built on trust that a party will respect your expectations around the sharing of your data.
Privacy Regulations
Privacy today is often associated with a number of regulatory acronyms, including GDPR, CCPA, and LGPD. Each of these is a governmental regulation that lays out specific privacy rules businesses must follow when they handle the data of residents of the jurisdiction that enacted it. The General Data Protection Regulation (GDPR) protects individuals in the European Union. The California Consumer Privacy Act (CCPA) protects residents of the state of California, and the Lei Geral de Proteção de Dados (LGPD) protects individuals in Brazil. Each is intended to clarify and strengthen privacy protections for individuals.
Techniques for increasing privacy
There are multiple techniques an organization can use to reduce the risk of data processing. One class of techniques involves removing identifying data from the data to be processed. These techniques include tokenization, data masking, and pseudonymization. Another widely recommended practice is privacy by design. This approach can use one of the above techniques, or it can simply mean collecting only the minimal amount of data necessary to process requests and deliver the service.
Pseudonymization
GDPR specifically suggests that data processors implement pseudonymization, which it defines as “the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information.” There are many techniques for pseudonymizing your data, including tokenization and data masking, among others.
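One way to picture this is a keyed hash: the identifier is replaced by a value that can only be linked back to a person by someone who holds a separately stored key. The sketch below is an illustration only, not a method prescribed by GDPR. The key value, the record fields, and the pseudonymize function name are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical secret. In practice it would be stored separately from the
# pseudonymized data, since it is the "additional information" needed to
# link a pseudonym back to a person.
SECRET_KEY = b"example-key-kept-apart-from-the-data"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Illustrative record: the direct identifier is replaced, other fields are kept.
record = {"email": "alice@example.com", "plan": "premium"}
pseudonymized = {**record, "email": pseudonymize(record["email"])}
```

Because the same input and key always produce the same pseudonym, records belonging to one person can still be joined for analysis without exposing the email address itself.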
Tokenization
According to Wikipedia, tokenization “is the process of substituting a sensitive data element with a non-sensitive equivalent, referred to as a token, that has no extrinsic or exploitable meaning or value.” Tokenization first became popular among entities subject to PCI DSS controls. It allowed those entities to replace payment card numbers with tokens, thereby reducing the risk associated with storing the account information. Tokenization solutions for privacy include TokenEx, among others.
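The idea can be sketched with a small in-memory vault that hands out random tokens and keeps the mapping back to the real value. This is a toy illustration under stated assumptions, not how any particular vendor implements tokenization: the TokenVault class, the "tok_" prefix, and the sample card number are hypothetical, and a real vault would be a hardened, access-controlled service.

```python
import secrets


class TokenVault:
    """Toy in-memory vault that maps random tokens to sensitive values."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return the existing token for a value, or issue a new random one."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it carries no information about the value.
        token = "tok_" + secrets.token_hex(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; only the vault can do this."""
        return self._token_to_value[token]


vault = TokenVault()
card_number = "4111111111111111"  # well-known test card number, not real data
token = vault.tokenize(card_number)
```

Downstream systems would store and pass around only the token, so a leak of their data exposes nothing usable without access to the vault.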
Data Masking
Data masking is the process of scrambling data into a non-reversible format. Whereas a token points back to the original data, masking simply replaces the data with scrambled values. Because the underlying data cannot be retrieved, masking suits a narrower set of uses. It is most often used for development and test environments, which require similar but obfuscated data. There are multiple options for masking, including static masking, which masks data in advance of use; dynamic masking, which masks data in real time; and data redaction, which masks unstructured data. Many major data vendors, including Informatica and Oracle, offer built-in masking features.
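A minimal sketch of static masking, assuming two hypothetical helpers (mask_email and mask_card) and made-up sample values. The masked output keeps the shape of the original data, which is what test environments usually need, but the hidden characters cannot be recovered from it.

```python
def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


def mask_card(card_number: str) -> str:
    """Hide every digit except the last four, ignoring spaces and dashes."""
    digits = [c for c in card_number if c.isdigit()]
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


masked_email = mask_email("alice@example.com")    # "a****@example.com"
masked_card = mask_card("4111 1111 1111 1111")    # "************1111"
```

Unlike the token in a tokenization scheme, there is no lookup table here: once the data is masked, the original value is gone from that copy.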
What is Data Security?
Wikipedia defines data security as “protecting digital data…from destructive forces and from the unwanted actions of unauthorized users, such as a cyberattack or a data breach.” Data security is primarily concerned with keeping information safe from external threats. It focuses on the policies and procedures needed to keep sensitive data from reaching the wrong people. Data security is a prerequisite for data privacy, and it tends to encompass more than just the information that is intended to be shared.
Data security and information security in general rely on three major principles:
- Confidentiality
- Integrity
- Availability
These three principles are referred to as the CIA triad, and they define the overarching goals of information security teams when protecting the assets they are entrusted with.
Confidentiality refers to keeping data secret from anyone who is not authorized to see it. Integrity refers to ensuring that data has not been altered and can therefore be trusted. Availability refers to users having reliable, uninterrupted access to the data. Taken together, these three form the cornerstone of data security.
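As one small illustration of the integrity principle, a cryptographic digest recorded when data is written can later show whether the data has changed. The sketch below assumes hypothetical function names (fingerprint, is_unaltered) and a made-up payload, and it only works if the recorded digest itself is stored somewhere an attacker cannot modify.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected_digest: str) -> bool:
    """Check the data against a previously recorded digest."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(fingerprint(data), expected_digest)


original = b"amount=100;to=alice"
recorded_digest = fingerprint(original)

untouched_ok = is_unaltered(original, recorded_digest)
tampered_ok = is_unaltered(b"amount=900;to=alice", recorded_digest)
```

Here untouched_ok is True and tampered_ok is False, since changing even one character of the payload produces a different digest.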
Techniques for increasing security
Many frameworks and methodologies have provided guidelines for increasing security over time. For many years the two most popular were the castle and moat security model and the onion model. Both advocated for a defense-in-depth approach, but today these models are harder to apply given the prevalence of cloud-native environments.
For most organizations today, good data security means security that’s built into their daily operations. If security best practices, reviews, and monitoring are not built into an organization’s processes, then every product and organizational change represents a potential security risk. For that reason, three frameworks have emerged to ensure security is part of organizations’ everyday operations: Security as Code, Shift Left Security, and Security Automation. All three offer similar yet distinct approaches to operationalizing security, with a focus on automation.
Shift Left Security
Shift Left Security is rooted in the principles of shift-left testing, where the goal is to implement security steps and procedures closer to the beginning phases of a project. The original shift-left testing paradigm focused on the cost overruns that stemmed from changes made to projects as they progressed through a traditional waterfall approach. From Barry Boehm’s Software Engineering Economics in 1981 through DORA’s 2019 State of DevOps research program, the evidence has shown the benefits of making changes earlier in the pipeline. Shifting left means that teams can spend less time fixing security issues and more time delivering secure systems. By shifting left, responsibility for security is no longer confined to a single team, but instead becomes part of everyone’s mandate, resulting in high-performing teams delivering more secure applications, infrastructure, and operations.
Security as Code
Security as Code is a multi-part methodology for ensuring that all of your security and policy procedures are set out in code. This approach looks at each stage of the software delivery pipeline, from code commit to production release and monitoring. At each stage it converts manual processes into automated ones, or adds new checks and gates that automatically scan and monitor for vulnerabilities, enhance testing, and expand on user and data access policies. At the code commit level, tools from GitHub, Snyk, Veracode, Synopsys, and others can automatically monitor for vulnerabilities in your code and libraries. At the application level, vulnerability scanning can range from container scanning to instance scanning to dynamic scanning of your application directly. There are innumerable vendors here that can help, from StackRox to Tenable to StackHawk and OWASP ZAP. Finally, at the Policy as Code level, Styra’s Open Policy Agent and HashiCorp Sentinel are the major players. Taken together, this approach focuses not only on automation but, most importantly, on codifying your security decisions for visibility and inclusiveness. Security as Code meets developers where they are, encouraging teams to find and fix problems early in the pipeline.
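To make the Policy as Code idea concrete, here is a minimal sketch in plain Python. It is a stand-in for illustration only and does not use Open Policy Agent, Sentinel, or their policy languages. The roles, dataset names, and the is_allowed function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    role: str
    dataset: str
    action: str


# Hypothetical policy kept in version control, so every change to who can
# do what is reviewed and visible like any other code change.
POLICY = {
    ("analyst", "sales_reports"): {"read"},
    ("support", "customer_pii"): {"read"},
    ("admin", "customer_pii"): {"read", "write"},
}


def is_allowed(request: AccessRequest) -> bool:
    """Allow only actions explicitly granted; everything else is denied."""
    allowed_actions = POLICY.get((request.role, request.dataset), set())
    return request.action in allowed_actions


support_read = is_allowed(AccessRequest("support", "customer_pii", "read"))
analyst_read_pii = is_allowed(AccessRequest("analyst", "customer_pii", "read"))
```

In this sketch support_read is True and analyst_read_pii is False. The default-deny lookup is a design choice: a role or dataset missing from the policy gets no access rather than accidental access.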
Security Automation
Security Automation focuses directly on removing the manual tasks that, until now, teams have had to perform in order to find and fix problems. Security automation is generally mentioned in the same breath as Security Orchestration, Automation, and Response (SOAR). SOAR focuses on streamlining incident and security responses that traditionally have been carried out as a series of manual steps. Following SOAR principles, teams automate these mundane and rote tasks, adding systems that find and fix problems automatically. This frees responders to handle higher-value tasks. Security automation can increase productivity, reduce burnout, and empower teams to find advanced threats sooner. With an advanced Security Automation toolset, teams are no longer just responding to known issues, but can instead spend their time hunting for unknown threats that may exist in their environment.
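A rote task of this kind might be reviewing login logs for repeated failures and deciding which addresses to block. The sketch below shows how such a step could be automated. The event format, the threshold of five, and the ips_to_block function are assumptions for illustration and do not reflect any specific SOAR product.

```python
from collections import Counter

# Hypothetical threshold: this many failed logins from one address
# triggers an automatic block instead of a manual review.
FAILED_LOGIN_THRESHOLD = 5


def ips_to_block(events, threshold=FAILED_LOGIN_THRESHOLD):
    """Return the sorted source IPs whose failed-login count meets the threshold."""
    failures = Counter(
        event["source_ip"] for event in events if event["type"] == "login_failed"
    )
    return sorted(ip for ip, count in failures.items() if count >= threshold)


# Illustrative events using documentation-reserved IP ranges.
events = [{"type": "login_failed", "source_ip": "203.0.113.7"}] * 6 + [
    {"type": "login_failed", "source_ip": "198.51.100.2"},
    {"type": "login_ok", "source_ip": "198.51.100.2"},
]

blocked = ips_to_block(events)
```

With these sample events, blocked contains only "203.0.113.7". A real playbook would go on to call a firewall or identity provider and record the action, leaving responders to look into the cases the rule cannot decide.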
Address data privacy concerns with good data security
To provide personalized products and services, organizations today rely on data, much of it private. This means that customers and stakeholders are trusting each organization to keep their private data private. When a product team offers new services to its customers, that team can only be confident in using its stakeholders’ private data if it has robust data security protection in place.
Good data security rests on the foundation we described above: security systems and practices that are built into daily operations. The good news is that a robust security environment doesn’t need to slow your team’s access to the data they need. With the right data security tools, teams can provide highly differentiated, well-instrumented access to data across the organization’s many repositories. Team members and applications that need access to private data can get that access reliably, so they can deliver personalized services. Less privileged applications and users can be given access to the non-private data they need for reporting and management.
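Differentiated access of this kind can be pictured as returning a different view of the same record depending on the caller's privileges. The following is a rough sketch only, with hypothetical field names and a hypothetical view_for function. A production system would enforce this at the data layer rather than in application code.

```python
# Hypothetical set of fields treated as private for this example.
PRIVATE_FIELDS = {"email", "date_of_birth"}


def view_for(record: dict, can_see_private: bool) -> dict:
    """Return the full record for privileged callers, a reduced copy otherwise."""
    if can_see_private:
        return dict(record)
    return {key: value for key, value in record.items() if key not in PRIVATE_FIELDS}


customer = {
    "customer_id": 42,
    "email": "alice@example.com",
    "date_of_birth": "1990-01-01",
    "plan": "premium",
}

personalization_view = view_for(customer, can_see_private=True)
reporting_view = view_for(customer, can_see_private=False)
```

The reporting view keeps only customer_id and plan, which is enough for aggregate reporting, while the personalization service still receives the full record.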
Keeping private data private while building personalized services is not as daunting as it sounds. Check out Cyral’s Zero Trust for the Data Cloud white paper for our wide-ranging introduction.