Version: v4.14

Global policy rules

The rules block of a policy

Rules specify who can interact with which data, and what actions they can take on that data. Inside the rules block:

  • Every rule except your default rule has an identities specification that specifies the people, applications, or groups this rule applies to.

  • Every rule consists of a set of contexted rules, one for each type of access: reads, updates, and/or deletes. A contexted rule specifies the policy enforcement actions that ensure users can only see and operate on data as allowed by your policy.

    Each contexted rule applies only in the context of its specified operation type. For example, the reads rule applies only when someone tries to retrieve data. The rules block does not need to include all three operation types; operations you omit are disallowed.

  • A rule may optionally contain a hosts specification that limits access to only those users connecting from a certain network location.

Unless you create a default rule, users and groups only have the rights you explicitly grant them.

The default rule

A default rule is an optional rule without an identity specification (identities field). It applies to any user whose username or group affiliation failed to match any other rule. Without a default rule, the policy only allows those actions explicitly granted in the identities-based rules.

The following default rule from the sample policy specifies that any person who failed to match the other rules is allowed to read only 1 row of EMAIL at a time. Updates and deletes are disallowed for such users, since the default rule contains no updates or deletes permissions.

reads:
  - data: [EMAIL]
    rows: 1
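
To show how a default rule sits alongside an identity-based rule, here is a minimal sketch of a rules block. It assumes the default rule is written as an ordinary list entry that simply has no identities field; the analyst group and the row counts are illustrative values, not a recommendation.

```yaml
rules:
  # Identity-based rule: applies only to members of the analyst group.
  - identities:
      groups: [analyst]
    reads:
      - data: any
        rows: 10
  # Default rule: no identities field, so it applies to anyone
  # who did not match the rule above.
  - reads:
      - data: [EMAIL]
        rows: 1
    # No updates or deletes entries here, so those operations are disallowed.
```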

The identities specification in a rule

Cyral policy rules can determine what a user can do in a repository, based on the authenticated user's identity. For each rule, you specify the set of identities (people, applications, and groups) to which the rule applies. If you omit the identity specification, this rule becomes the default rule.

  • users ([string]): one or more individual users, identified by the user account they use to sign in:

    • for SSO users registered with email, this string will be the user's email address, like nancy.drew@hhiu.us.
    • for SSO users registered with a username, this string will be the SSO username, like nancydrew.
    • for native database accounts, this string is the database username (like what's shown in the examples on this page: [bob, sara])
  • services ([string]): applications

    • for users going through Looker, use the service name looker
    • for custom services use the application name provided in the connection URL when connecting to the database
  • groups ([string]): one or more groups, identified by SSO group name. Group names are defined in your enterprise SSO service, such as GSuite or Okta. For a policy rule to match, the group name listed here must match the group name of the access rule that granted the user access.

For example, the following identity specification indicates that the rule will apply to users bob and sara, any users going through the service looker, and any users belonging to the user group analyst.

identities:
  users: [bob, sara]
  services: [looker]
  groups: [analyst]
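
As a further sketch, an identity specification for SSO users would list email addresses or SSO usernames rather than database usernames. The email address below is the example address used on this page; the group name data-engineering is a hypothetical placeholder for a group defined in your SSO service.

```yaml
identities:
  # SSO users registered with email are matched by their email address.
  users: [nancy.drew@hhiu.us]
  # Hypothetical SSO group name; replace it with a group from your identity provider.
  groups: [data-engineering]
```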

In a policy, a limit of one rule per user or group

Within a given policy, make sure you only create one rule per user or group. In other words, no two rules in a single policy can contain the same user/group/service. In our example, this means that the user bob can only appear in one rule for a given policy.

Specifically, the following limits apply in order to prevent conflicts within a policy:

  • Each person must have only one rule that specifically applies to that person by username

  • Each group must have only one rule that specifically applies to that group by name

  • A person may have both a rule applied to them by username, and one or more rules that apply to them based on group affiliation. In this case, the rule that applies to them by username takes precedence.

    • Looking at the sample policy, we can see that one rule applies to the user bob and another applies to the user group analyst. If bob happens to be a member of the group analyst, then when bob attempts to perform a data operation, Cyral applies the rule specified for the user bob and ignores the rule specified for the group analyst. In overlap cases like this, Cyral enforces a single rule with the following precedence: user > group > service.
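
The overlap case described above can be sketched as follows. The row limits are illustrative values chosen only to make the two rules easy to tell apart.

```yaml
rules:
  # Applies to bob by username. If bob is also a member of analyst,
  # this rule is the one enforced (user > group > service).
  - identities:
      users: [bob]
    reads:
      - data: [EMAIL]
        rows: 5
  # Applies to members of analyst who have no user-specific rule.
  - identities:
      groups: [analyst]
    reads:
      - data: any
        rows: 10
```

With these two rules, a read by bob is limited to 5 rows of EMAIL, even though the analyst rule would otherwise allow 10 rows from any labeled location.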

The hosts specification in a rule

The hosts specification is optional. It lists the host addresses that are allowed to connect to the data locations governed by this rule. If you do not include a hosts block, Cyral does not enforce limits based on the connecting client's host address.

To specify a hosts block, provide addresses as a comma-separated list of IP addresses and network blocks in CIDR notation. When a user tries to perform a data operation while connected from any host other than those you list here, the rule blocks the action.

For example, the hosts specification shown below ensures that data locations in this rule can be accessed only while connected from a host at 192.0.2.22 or one of the hosts in the 203.0.113.16/28 block.

hosts: [192.0.2.22, 203.0.113.16/28]
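
For context, a hosts specification might sit inside a complete rule as sketched below. The placement of hosts at the rule level, next to identities, is an assumption based on the description of the rules block on this page; the group name and row limit are illustrative.

```yaml
rules:
  - identities:
      groups: [analyst]
    # Assumed placement: hosts is a field of the rule, alongside identities.
    # 203.0.113.16/28 covers the addresses 203.0.113.16 through 203.0.113.31.
    hosts: [192.0.2.22, 203.0.113.16/28]
    reads:
      - data: any
        rows: 10
```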

Contexted rules

A contexted rule is where you specify the enforcement actions that will ensure users can only see and operate on data as allowed by your policy.

Enforcement actions in contexted rules

Cyral policy enforcement actions can limit or change what data a user sees in response to a query request, and they set limits on what data users can update or delete.

The available enforcement actions in a contexted rule are described in the sections that follow:

  • Blocking blocks access to a table or location.
  • Row limiting limits how many rows are returned per query.
  • Rate limiting limits how many rows a user can read, update, or delete per hour.
  • Dataset rewriting filters the set of rows returned.
  • Masking hides or replaces specific field values in results.
  • Additional checks let you add conditions that must be satisfied in order for results to be returned.

Below, we explain Cyral policy enforcement actions and how they interact.

Where do I specify the enforcement action?

For most types of enforcement actions, you'll add the enforcement action in a contexted rule in your policy.

info

What if I don't have a policy for the repository?

Basic access control to a repository does not require a policy. Once you've set up SSO authentication for a repository, only the allowed SSO users and groups can connect to that repository.

Likewise, preconfigured alerts don't require a policy. Cyral notifies you when suspicious activity is detected on a repository.

Data scope

Inside a contexted rule, the data block lists the data labels or tags of the data locations protected by this rule:

  • Specify locations using data labels you've established in your Data Map.
  • Specify a value of any to grant access to all the data locations protected by the current policy.

For example, the following rule from the sample policy specifies that individuals belonging to the user group analyst can read 10 rows at a time from any of the tracked data locations (labels EMAIL, CCN, and SSN). They can also write 1 row at a time to the locations EMAIL and CCN, and they can delete 1 row at a time from any of the tracked locations.

identities:
  groups: [analyst]
reads:
  - data: any
    rows: 10
updates:
  - data: [EMAIL, CCN]
    rows: 1
    severity: medium
deletes:
  - data: any
    rows: 1
    severity: medium

Optional fields in a contexted rule

Users can also specify the following optional fields in a contexted rule:

  • additionalChecks (string): constraints on the data access specified in Rego. See Additional checks.
  • severity (string): severity level that's recorded when someone triggers this rule. This is an informational value that will be written to the query log. Settings: (low | medium | high). If not specified, the severity is considered to be low.

Example with optional fields

For example, the following rule from the sample policy specifies that individuals belonging to the user group analyst can read 10 rows at a time from any of the data locations covered by this policy (EMAIL, CCN, and SSN). They can write 1 row at a time to the locations EMAIL and CCN. Finally, they can delete 1 row at a time from any of the data locations covered by this policy, provided they are using the psql application to do it.

rules:
  - identities:
      groups: [analyst]
    reads:
      - data: any
        rows: 10
    updates:
      - data: [EMAIL, CCN]
        rows: 1
        severity: medium
    deletes:
      - data: any
        rows: 1
        severity: medium
        additionalChecks: |
          is_valid_request {
            client.applicationName == "psql"
          }

Blocking access

The block on violation enforcement action stops the user from receiving the database results for a query, if that query violates your policy.

A Cyral policy is inherently a blocking policy, meaning you do not need to include a keyword in your policy to block access to a data location. Instead, create a rule whose identities scope covers the users or groups to be blocked, set the data scope to include the data labels or tags of the data locations to be protected, and do not add further instructions to the rule.

caution

You must enable blocking on each repository where you wish to block access to data locations.

When does the blocking occur?

Depending on the type of request, Cyral may block the request before it's submitted to the repository, or it may block the response from the repository.

  • For read operations such as SELECTs:
    • Cyral blocks the request if the query refers to a forbidden data label.
    • Cyral blocks the response if the result set would contain more rows than allowed for the referenced data labels.
  • For UPDATE and DELETE attempts, Cyral blocks the request if it refers to a forbidden data label.

Blocking example

In the following example, all CCN data is blocked for all the users in the level-1-support user group.

rules:
  - identities:
      groups: [level-1-support]
    reads:
      - data: [CCN]

In other words, no rule declaration in a contexted rule is needed to block access to a data location. Instead, blocking is what happens when there's no rule granting a user access to a labeled data location.

Row limiting

Use the rows keyword in a contexted rule to limit the number of rows or documents a user can retrieve in a single query statement.

Specify a value of any to allow an unlimited number of records to be accessed/affected in a single statement.

For example, to ensure that a member of level-2-support can retrieve at most 10 rows per query, create a rule like:

rules:
  - identities:
      groups: [level-2-support]
    reads:
      - data: any
        rows: 10

Rate limiting

Use the rateLimit keyword in a contexted rule to limit the rate at which a user can read, update, and/or delete data. The limit is expressed in the number of rows or documents per hour.

To set this up, add a rateLimit contexted rule for any operation you wish to limit (in the reads, updates, and/or deletes sections of the rule).

For example, to set a limit of 20 records per user per hour from the CCN data location, you would add:

rules:
  - reads:
      - data: [CCN]
        rows: any
        rateLimit: 20
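
Since rateLimit can appear in the updates and deletes sections as well as in reads, a rule that throttles write operations might look like the following sketch. The group name and the numeric limits are illustrative assumptions.

```yaml
rules:
  - identities:
      groups: [analyst]
    updates:
      - data: [EMAIL]
        rows: 1
        # Rows that may be updated per hour in EMAIL locations.
        rateLimit: 50
    deletes:
      - data: [EMAIL]
        rows: 1
        # Deletes are throttled more tightly than updates in this sketch.
        rateLimit: 5
```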

Dataset rewriting

The dataset rewriting enforcement action lets you specify a filter on the set of rows that the user is allowed to access.

This action rewrites table expressions in the user query, replacing them with a substitute query that you've specified in the policy. Rewriting is typically used to filter the set of rows the user can see. The most common use case is to specify a query of the form SELECT * FROM table WHERE ..., where the WHERE clause is a filter that admits only the rows the user is allowed to access. However, you also have the option to supply a more complex replacement query.

caution

You must enable dataset rewrites on each repository where you wish to perform rewrites.

tip

See policy evaluation to understand how dataset rewriting interacts with other actions like blocking, rate limiting, and masking data.

Procedure

  1. Turn on dataset rewrites for this repository.
  2. Specify how the dataset will be rewritten by adding the datasetRewrites field in your contexted rule. The datasetRewrites field contains an array of objects with the following structure:
    • repo (string): the name of the repository that the rewrite applies to
    • dataset (string): the dataset or data location that should be rewritten. This name is case insensitive. For example, if you specify a table name, orders, it will also match a table called Orders in your database.
      • For most database types, this is a fully qualified table name in the form <schema>.<table>
      • For Snowflake, this is a fully qualified table name in the form <database>.<schema>.<table>
    • parameters ([string]): the set of parameters used in the substitution request. These are references to fields in the activity log, as described in the Additional checks section.
    • substitution (string): the request used to substitute references to the dataset

Example

For example, the following contexted rule specifies a rewrite that is triggered in the event a request reads EMAIL data. As a result, in this case, any references to the fully qualified table myDb.finance.customers will be replaced with the subquery SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:, where :identity.endUser: would be replaced with the value in the identity.endUser field in the activity log.

reads:
  - data: [EMAIL]
    rows: 10
    datasetRewrites:
      - repo: claims
        dataset: myDb.finance.customers
        parameters: [identity.endUser]
        substitution: "SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:"

As a more specific example, suppose an individual makes the following query, which tries to read more than the 10-row limit specified above. Suppose also that the individual has accessed the repository using SSO authentication and is identified as the end user nancy.drew@hhiu.us.

SELECT * FROM myDb.finance.customers;

Given the dataset rewrite specification in the example above, the Cyral sidecar would rewrite the query such that the receiving database sees the following query.

SELECT * FROM (SELECT * FROM myDb.finance.customers WHERE email='nancy.drew@hhiu.us');

note

Currently, parameter substitutions take place even within string literals. For example, the substitution "SELECT * FROM myDb.finance.customers WHERE greeting = 'Hello, :identity.endUser:'" contains the string literal 'Hello, :identity.endUser:'. During dataset rewriting, the sidecar will substitute :identity.endUser: with whatever value is in the identity.endUser field in the activity log associated with the data access.

Masking data

The data masking enforcement action hides or replaces specific field values in each row returned, rather than filtering the set of rows returned.

To mask the contents of a data location, use one of the mask keywords in your contexted rule in the format, <mask_type>(<data_label>, <mask_argument>) where mask_type is one of:

  • mask to replace the field's contents with a semi-randomized string;
  • constant_mask to replace the field's contents with a value you provide; or
  • null_mask to replace the field's contents with a null value.

For example, to mask both the EMAIL and CCN fields for all members of level-3-support you would add a rule like:

data:
  - EMAIL
  - CCN
rules:
  - identities:
      groups: [level-3-support]
    reads:
      - data:
          - mask(EMAIL)
          - constant_mask(CCN, "***")

caution

You must enable masking and install helper functions in each repository where you wish to perform masking. See Mask data for instructions.
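
The third mask type, null_mask, could be used in the same way. The sketch below assumes that null_mask takes only a data label and no mask argument, and it borrows the SSN label from the sample policy; neither detail is confirmed by the masking example above.

```yaml
rules:
  - identities:
      groups: [level-3-support]
    reads:
      - data:
          # Assumed form: null_mask with a label only, returning NULL for SSN fields.
          - null_mask(SSN)
          # Replace CCN fields with a fixed string, as in the example above.
          - constant_mask(CCN, "***")
```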

Additional checks

Beyond specifying which and how much data can be accessed in the data and rows fields, you can impose more sophisticated constraints by adding the additionalChecks field to a contexted rule.

The additionalChecks field contains a rule you'll write in the Rego language. The checks you specify in this field will be evaluated each time the contexted rule applies to an access request. Specify each check in the form of a Rego rule named is_valid_request, which needs to evaluate to true for the access attempt to be considered an allowed request. Otherwise the request will be considered a policy violation.

Each rule can evaluate attributes of the access request that are made available in the activity log. This information is exposed in the context of the Rego rule through the following variables that represent top-level fields in the activity log:

  • identity: information about the entity performing the observed data access
  • client: information about the client application from which the data is accessed
  • repo: information about the repository being accessed
  • request: information about the request itself
  • tags: values provided in the request comment via the pass-through CyralTags

Attributes nested inside these top-level fields can be accessed using dot notation (e.g. identity.endUser, client.applicationName, repo.type, and so on).

As an example, the following additional check denotes that whatever access the check is specified for is only valid if the access is through a psql client. A more sophisticated example is provided in the Examples section at the end of this document (Example 6: Only allow users to see data pertaining to themselves).

additionalChecks: |
  is_valid_request {
    client.applicationName == "psql"
  }

In the above example, we use the | operator, which denotes a multiline string in YAML. See this page for more information on specifying multiline strings in YAML.
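
A check can combine several attributes. In Rego, all expressions in a rule body must hold for the rule to evaluate to true, so the following sketch accepts a request only when both conditions are met. The end-user value is a placeholder taken from the example address used on this page.

```yaml
additionalChecks: |
  # Rego comment: both expressions below must be true for the request to be valid.
  is_valid_request {
    client.applicationName == "psql"
    identity.endUser == "nancy.drew@hhiu.us"
  }
```

Because the | block preserves line breaks, the Rego rule reaches the policy engine exactly as written, including the comment line.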

note

The Rego language defines a Rego module as comprising a Package declaration, a set of Import statements for declaring data dependencies, and a set of Rules. In this context, users need only specify Rules, omitting Package declaration and Import statements.