
Cyral’s Data Masking for Data Security Governance and Privacy

In the rapidly evolving landscape of data management and security, enterprises face a complex challenge: ensuring comprehensive Data Security Governance (DSG) and protecting sensitive information such as Personally Identifiable Information (PII) and Protected Health Information (PHI), while still empowering data teams to run the analyses that help secure a competitive edge in the market.

In this blog post, we delve into the concept of Data Masking and explore how Cyral’s Data Masking solution addresses key use cases related to DSG and Privacy.

What is Data Masking?

Data Masking is the process of hiding sensitive information in a dataset by de-identifying it, while preserving the data’s usability and value for analytical purposes. Masking protects sensitive data by giving users altered but structurally equivalent values and maintaining the data coherence that business processes depend on. Here is an example of masking in action.

Original Data

Name    SSN           Age
Nancy   134-67-0762   53
Frank   988-23-4573   40

Masked Data

Name    SSN           Age
Dan     111-22-3333   22
Rob     222-44-8888   67

Why is it important?

Data Masking is an important security tool for meeting compliance requirements such as the Payment Card Industry Data Security Standard (PCI DSS) and the Health Insurance Portability and Accountability Act (HIPAA), and privacy legislation such as Brazil’s Lei Geral de Proteção de Dados Pessoais (LGPD, General Personal Data Protection Law), the EU’s General Data Protection Regulation (GDPR), and China’s Personal Information Protection Law (PIPL). The main objective of these regulations is to protect personal data from unauthorized access or misuse.

Typically, sensitive and non-sensitive data coexist in the same dataset. A strict blanket blocking policy cuts off access entirely, making the dataset worthless for analytics and decision making. On the flip side, having no policy at all exposes sensitive information. Data Masking helps strike a balance between the accessibility and the confidentiality of sensitive information.

How does Cyral’s solution work?

The foundation of Cyral’s solution is identifying sensitive data. As explained in this blog post, Cyral provides an easy-to-use Data Discovery solution for identifying and labeling sensitive information. Cyral also provides granular controls over who can access which datasets. Security policies can be written against the labeled data, either as code or in the User Interface. Data Masking is one of the enforcement actions available in a security policy; it protects sensitive information while it is being accessed by specific individuals or groups. Cyral’s solution performs data masking at the time of data access, an approach known as Dynamic Data Masking (DDM), rather than ahead of data access.

The diagram below illustrates Cyral’s Data Masking process. Based on the policy definition and privileges defined by the Data Manager, Cyral rewrites the end-user’s original query to apply the specified data masking function, then sends the new query to the database. The data in the database is not altered.

The organization’s data manager also gains visibility through metrics and activity logs that show what sensitive information was accessed, when, and how often, and when the results were masked.

For example, suppose the end-user runs "SELECT * FROM doctors", where the doctor_name column in the doctors table is labeled as sensitive in Cyral and a policy applies the Null Mask to it. Cyral rewrites the query to a form similar to the following and then sends it to the database.

"SELECT public.doctors.id AS id, cyral.CyralMask(\"public\".\"doctors\".\"doctor_name\", '{\"maskFunction\":\"null_mask\",\"args\":[]}')::varchar AS doctor_name, public.doctors.clinic_name AS clinic_name, public.doctors.address1 AS address1, public.doctors.city AS city, public.doctors.state_province AS state_province, public.doctors.postal_code AS postal_code, public.doctors.country AS country FROM doctors"

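The rewrite step above can be sketched in a few lines of Python. This is a minimal illustration, not Cyral’s implementation: the function rewrite_select and its parameters are hypothetical, the column list is shortened, and the only detail taken from the example is the shape of the cyral.CyralMask call and its JSON argument.

```python
import json


def rewrite_select(schema, table, columns, masked, cast="varchar"):
    """Expand SELECT * into an explicit column list, wrapping masked columns.

    `masked` maps a column name to the mask function name to apply.
    Hypothetical helper for illustration only.
    """
    parts = []
    for col in columns:
        if col in masked:
            # Compact JSON, matching the argument shape in the example query.
            spec = json.dumps(
                {"maskFunction": masked[col], "args": []},
                separators=(",", ":"),
            )
            target = f'"{schema}"."{table}"."{col}"'
            parts.append(f"cyral.CyralMask({target}, '{spec}')::{cast} AS {col}")
        else:
            # Unlabeled columns are passed through unchanged.
            parts.append(f"{schema}.{table}.{col} AS {col}")
    return f"SELECT {', '.join(parts)} FROM {table}"


query = rewrite_select(
    "public",
    "doctors",
    ["id", "doctor_name", "clinic_name"],
    {"doctor_name": "null_mask"},
)
print(query)
```

Only the labeled column is wrapped in the masking call. Every other column is selected as-is, which is why the stored data never needs to change.
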
Masking techniques

Cyral lets an organization establish policies with the following out-of-the-box masking techniques, which save customers time and improve their security posture.

  1. Null Mask: All values in the sensitive column are returned as null. This effectively removes any identifying information from the data.
  2. Preserve Mask: All values in the sensitive column are returned as semi-randomized strings that preserve hyphens, dots, and other punctuation while replacing letters and digits with random letters and digits. For example, MyEmail123@cyral.com is replaced with ZaFxbcd517@dzbxq.pqd
  3. Constant Mask: All values in the sensitive column are substituted with a constant value defined in the policy. For example, 134-11-2334 will be replaced with 111-11-1111
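
As a rough sketch of what these three techniques do to a single value, the Python functions below mimic each one. They are illustrative stand-ins rather than Cyral’s own functions; the names are hypothetical, and preserving letter case in preserve_mask is an assumption inferred from the email example above.

```python
import random
import string


def null_mask(value):
    """Null Mask: every value is returned as null (None in Python)."""
    return None


def constant_mask(value, constant):
    """Constant Mask: every value is replaced by the policy's constant."""
    return constant


def preserve_mask(value, rng=random):
    """Preserve Mask: randomize letters and digits, keep punctuation in place."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            # Assumption: upper and lower case are preserved per position.
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # hyphens, dots, @ and other punctuation survive
    return "".join(out)


print(null_mask("Nancy"))
print(constant_mask("134-11-2334", "111-11-1111"))
print(preserve_mask("MyEmail123@cyral.com"))
```

The preserve mask output differs on every run, but its length and the positions of "@" and "." always match the input, so format-dependent downstream code keeps working.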

Cyral’s Data Masking capability is flexible and extensible, catering to organizations that require a customized masking technique. This is achieved by installing Custom User-Defined Functions (UDFs) on the database and referencing the Custom UDF in the Cyral policy definition. This is a differentiating capability of Cyral’s solution compared with other DDM offerings in the market.
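
To make the custom UDF idea concrete, here is a small sketch that uses Python’s built-in sqlite3 module as a stand-in database. The function last4_mask and the patients table are hypothetical, and registering a function on a SQLite connection is only an analogy for installing a UDF on a production database; it does not show Cyral’s policy syntax.

```python
import sqlite3


def last4_mask(ssn):
    """Hypothetical custom mask: hide all but the last four characters."""
    if ssn is None:
        return None
    return "***-**-" + ssn[-4:]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, ssn TEXT)")
conn.execute("INSERT INTO patients VALUES ('Nancy', '134-67-0762')")

# Register the Python function as a one-argument SQL function.
conn.create_function("last4_mask", 1, last4_mask)

# The masked query calls the UDF at read time.
masked = conn.execute("SELECT name, last4_mask(ssn) FROM patients").fetchall()
# The stored rows are untouched, as with dynamic masking in general.
stored = conn.execute("SELECT name, ssn FROM patients").fetchall()

print(masked)
print(stored)
```

Because the function runs inside the query, the masking logic lives next to the data and any rewritten query can call it by name.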

How do enterprises benefit from Cyral’s Data Masking?

Cyral’s Data Masking capability addresses a wide array of data governance challenges faced by modern enterprises. Many of these challenges are associated with sharing complete tables or databases. In most situations, sensitive and non-sensitive information coexist, meaning that only a small subset of the data, rather than all of it, requires restricted or obfuscated access. Most security solutions only allow access to a table or database to be restricted completely, rather than enabling granular control over sensitive data. Cyral provides precisely this granular control, allowing data managers to strike a balance between data accessibility and protection.

Similarly, when individuals at a partner organization need data access, Cyral’s Data Masking helps provide appropriate access without compromising sensitive information. This is crucial for collaborative initiatives that must maintain control over data exposure. Within the organization, when different departments and roles require varying levels of data access, Cyral’s Data Masking ensures that only authorized personnel can view specific data, safeguarding data privacy. Cyral’s Data Masking solution also complements any synthetic data used by testing and development teams.
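
The idea of role-dependent access can be sketched as a lookup from data labels to masking rules. Everything here is hypothetical (the POLICY and LABELS structures, the role names, and the apply_policy helper), and the sketch masks rows after they are fetched purely for readability, whereas Cyral applies masking by rewriting the query.

```python
# Hypothetical policy: for each data label, the roles that see clear data
# and the mask applied to everyone else.
POLICY = {
    "SSN": {"unmasked_roles": {"compliance"}, "mask": lambda v: "111-11-1111"},
    "NAME": {"unmasked_roles": {"compliance", "support"}, "mask": lambda v: None},
}

# Column -> label mapping; unlabeled columns pass through unmasked.
LABELS = {"name": "NAME", "ssn": "SSN"}


def apply_policy(row, role):
    """Return a copy of `row` with labeled columns masked for `role`."""
    result = {}
    for column, value in row.items():
        rule = POLICY.get(LABELS.get(column))
        if rule is None or role in rule["unmasked_roles"]:
            result[column] = value
        else:
            result[column] = rule["mask"](value)
    return result


row = {"name": "Nancy", "ssn": "134-67-0762", "age": 53}
for role in ("analyst", "support", "compliance"):
    print(role, apply_policy(row, role))
```

The same row yields three different views: the analyst sees only the age, support also sees the name, and compliance sees everything.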

Benefits specific to AI/ML

Cyral’s Data Masking extends its benefits to the realm of AI/ML, providing a robust solution for organizations engaged in data-driven model development. By seamlessly integrating data masking into machine learning pipelines, organizations can ensure that sensitive data remains hidden during model training and evaluation. This is particularly valuable in scenarios like healthcare diagnostics, financial fraud detection, and retail analytics, where accurate insights are crucial but privacy must be maintained. By safeguarding sensitive data within machine learning processes, data masking promotes ethical AI practices, enhances data privacy, and fosters compliance with regulations, building a strong foundation for responsible data-driven innovation.

Final Thoughts

As businesses navigate the complex landscape of data governance and security, Data Masking is emerging as a vital tool. With Cyral’s approach to Data Masking, which intercepts and rewrites queries to mask sensitive data dynamically, organizations can strike a balance between data accessibility and protection. This technology not only helps businesses adhere to regulations but also facilitates collaboration and secure data analysis. In an era where data privacy is paramount, Cyral’s Data Masking stands out as a beacon of effective and responsible data management.

Finally, machine learning’s potential to transform industries is undeniable, but the responsible and ethical use of data must be a priority. Data Masking is emerging as a fundamental solution to the challenge of sensitive data leakage during machine learning processes. By preserving data utility while protecting sensitive information, organizations can confidently harness the power of machine learning without compromising privacy, trust, or compliance. As the intersection of data-driven insights and ethical practices continues to evolve, Data Masking stands as a crucial guardian of privacy in the age of AI.
