Version: v2.x

Rules

The rules block of a policy

Rules specify who can interact with which data, and what actions they can take on that data. Inside the rules block:

  • Every rule except your default rule has an identities specification that names the people, applications, or groups the rule applies to.
  • Every rule consists of a set of contexted rules, one for each type of access: reads, updates, and/or deletes. Each contexted rule applies only in the context of its specified operation type. For example, the reads rule applies only when someone tries to retrieve data. A rule does not need to include all three operation types; actions you omit are disallowed.
  • A rule may optionally contain a hosts specification that limits access to only those users connecting from a certain network location.

Unless you create a default rule, users and groups only have the rights you explicitly grant them.

The default rule

A default rule is an optional rule without an identity specification (identities field). It applies to any user whose username or group affiliation failed to match any other rule. Without a default rule, the policy only allows those actions explicitly granted in the identities-based rules.

The following default rule from the sample policy specifies that any person who failed to match the other rules is allowed to read only 1 row of EMAIL at a time. Updates and deletes are disallowed for such users, since the default rule contains no updates or deletes permissions.

reads:
  - data: [EMAIL]
    rows: 1

The identities specification in a rule

For each rule, you can specify the set of identities (people, applications, and groups) to which the rule applies. If you omit the identity specification, this rule becomes the default rule.

  • users ([string]): individual users

  • services ([string]): applications

    • for users going through Looker, use the service name looker
    • for custom services use the application name provided in the connection URL when connecting to the database
  • groups ([string]): user groups defined in your enterprise SSO service, such as GSuite or Okta

For example, the following identity specification indicates that the rule will apply to users bob and sara, any users going through the service looker, and any users belonging to the user group analyst.

identities:
  users: [bob, sara]
  services: [looker]
  groups: [analyst]

In a policy, a limit of one rule per user or group

Within a given policy, make sure you only create one rule per user or group. In other words, no two rules in a single policy can contain the same user/group/service. In our example, this means that the user bob can only appear in one rule for a given policy.

Specifically, the following limits apply in order to prevent conflicts within a policy:

  • Each person must have only one rule that specifically applies to that person by username

  • Each group must have only one rule that specifically applies to that group by name

  • A person may have both a rule applied to them by username, and one or more rules that apply to them based on group affiliation. In this case, the rule that applies to them by username takes precedence.

    • Looking at the sample policy, we can see that one rule applies to the user bob and another applies to the user group analyst. If bob happens to be a member of the group analyst, then when bob attempts to perform a data operation, we apply the rule specified for the user bob and ignore the rule specified for the group analyst. In overlap cases like this, Cyral enforces a single rule with the following precedence: user > group > service.
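The precedence described above can be sketched in Python as follows. This is only an illustrative model, not Cyral's implementation: the select_rule function, the dictionary layout of the rules, and the name key are hypothetical. Picking the first matching group rule when a person belongs to several groups is also an assumption that the text does not settle.

```python
def select_rule(rules, user, groups, service):
    """Pick the single rule to enforce: user > group > service, else the default rule.

    Returns None when nothing matches and the policy has no default rule.
    """
    matches = {"user": None, "group": None, "service": None}
    default_rule = None
    for rule in rules:
        identities = rule.get("identities")
        if identities is None:
            # A rule without an identities block is the default rule.
            default_rule = rule
            continue
        if matches["user"] is None and user in identities.get("users", []):
            matches["user"] = rule
        # Assumption: the first matching group rule in policy order wins.
        if matches["group"] is None and set(groups) & set(identities.get("groups", [])):
            matches["group"] = rule
        if matches["service"] is None and service in identities.get("services", []):
            matches["service"] = rule
    for kind in ("user", "group", "service"):
        if matches[kind] is not None:
            return matches[kind]
    return default_rule


# Hypothetical policy: "name" exists only to make the result readable.
rules = [
    {"name": "bob-rule", "identities": {"users": ["bob"]}},
    {"name": "analyst-rule", "identities": {"groups": ["analyst"]}},
    {"name": "default-rule"},
]

# bob is also an analyst, but the rule that names him directly wins.
chosen = select_rule(rules, user="bob", groups=["analyst"], service="psql")
print(chosen["name"])
```

A person who matches only through group membership gets the group rule, and a person who matches nothing falls through to the default rule, if the policy defines one.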

The hosts specification in a rule

The hosts specification is optional. It lists the host addresses that are allowed to connect to the data locations governed by this rule. If you do not include a hosts block, Cyral does not enforce limits based on the connecting client's host address.

To specify a hosts block, provide addresses as a comma-separated list of IP addresses and network blocks in CIDR notation. When a user tries to perform a data operation while connected from any host other than those you list here, the rule blocks the action.

For example, the hosts specification shown below ensures that data locations in this rule can be accessed only while connected from a host at 192.0.2.22 or one of the hosts in the 203.0.113.16/28 block.

hosts: [192.0.2.22, 203.0.113.16/28]
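To make the matching behavior concrete, here is a minimal Python sketch using the standard ipaddress module. The host_allowed function is hypothetical and only models the description above; it assumes that a missing hosts block is represented as None.

```python
import ipaddress


def host_allowed(client_ip, hosts):
    """Return True if client_ip matches any address or CIDR block in hosts."""
    if hosts is None:
        # No hosts block: no limit based on the client's host address.
        return True
    addr = ipaddress.ip_address(client_ip)
    # A bare address such as 192.0.2.22 is parsed as a single-host network.
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in hosts)


hosts = ["192.0.2.22", "203.0.113.16/28"]
print(host_allowed("203.0.113.20", hosts))  # inside the /28 block (.16 to .31)
print(host_allowed("203.0.113.32", hosts))  # just outside the /28 block
```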

Contexted rules

Each contexted rule comprises these fields describing the allowed access for a given access type:

  • data ([string]): the data locations protected by this rule.
    • Specify locations using LABELs you've established in your data map.
    • Specify a value of any to grant access to all the data locations protected by the current policy.
  • rows (int): the number of records (for example, rows or documents) that can be accessed/affected in a single statement.
    • Specify a value of any to allow an unlimited number of records to be accessed/affected in a single statement.
  • other optional fields, like additional checks and request rewriting.

For example, the following rule from the sample policy specifies that individuals belonging to the user group analyst can read 10 rows at a time from any of the tracked data locations (EMAIL, CCN, and SSN). They can also write 1 row at a time to the locations EMAIL and CCN, and they can delete 1 row at a time from any of the tracked locations.

identities:
  groups: [analyst]
reads:
  - data: any
    rows: 10
updates:
  - data: [EMAIL, CCN]
    rows: 1
    severity: medium
deletes:
  - data: any
    rows: 1
    severity: medium
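As a rough illustration of how the data and rows fields combine, the following Python sketch checks a request against a rule written as a dictionary. The is_allowed function and the dictionary layout are hypothetical. The sketch assumes that a request is permitted when one contexted entry covers every label it touches and the row count is within that entry's limit; the real evaluation may differ.

```python
def is_allowed(rule, operation, labels, row_count):
    """Check a request against the contexted rules for one operation type.

    operation is "reads", "updates", or "deletes"; labels are the data
    locations the statement touches; row_count is the number of records
    it would access or affect.
    """
    entries = rule.get(operation)
    if not entries:
        # Operation types omitted from the rule are disallowed.
        return False
    for entry in entries:
        data_ok = entry["data"] == "any" or set(labels) <= set(entry["data"])
        rows_ok = entry["rows"] == "any" or row_count <= entry["rows"]
        if data_ok and rows_ok:
            return True
    return False


# The analyst rule from the example above, written as a Python dictionary.
analyst_rule = {
    "reads": [{"data": "any", "rows": 10}],
    "updates": [{"data": ["EMAIL", "CCN"], "rows": 1}],
    "deletes": [{"data": "any", "rows": 1}],
}

print(is_allowed(analyst_rule, "reads", ["SSN"], 10))   # within the read limit
print(is_allowed(analyst_rule, "updates", ["SSN"], 1))  # SSN is not updatable
```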

Optional fields in a contexted rule

Users can also specify the following optional fields in a contexted rule:

  • additionalChecks (string): constraints on the data access specified in Rego. See Additional checks.
  • datasetRewrites ([object]): defines how requests should be rewritten in the case of policy violations. See Request rewriting.
  • severity (string): severity level that's recorded when someone violates this rule. This is an informational value. Settings: (low | medium | high). If not specified, the severity is considered to be low.

Example with optional fields

For example, the following rule from the sample policy specifies that individuals belonging to the user group analyst can read 10 rows at a time from any of the data locations covered by this policy (EMAIL, CCN, and SSN). They can write 1 row at a time to the locations EMAIL and CCN. Finally, they can delete 1 row at a time from any of the data locations covered by this policy, provided they are using the psql application to do it.

identities:
  groups: [analyst]
reads:
  - data: any
    rows: 10
updates:
  - data: [EMAIL, CCN]
    rows: 1
    severity: medium
deletes:
  - data: any
    rows: 1
    severity: medium
    additionalChecks: |
      is_valid_request {
        client.applicationName == "psql"
      }

Additional checks

Beyond specifying which and how much data can be accessed in the data and rows fields, you can impose more sophisticated constraints by adding the additionalChecks field to a contexted rule.

The additionalChecks field contains a rule you'll write in the Rego language. The checks you specify in this field will be evaluated each time the contexted rule applies to an access request. Specify each check in the form of a Rego rule named is_valid_request, which needs to evaluate to true for the access attempt to be considered an allowed request. Otherwise the request will be considered a policy violation.

Each rule can evaluate attributes of the access request that are made available in the activity log. This information is exposed in the context of the Rego rule through the following variables that represent top-level fields in the activity log:

  • identity: information about the entity performing the observed data access
  • client: information about the client application from which the data is accessed
  • repo: information about the repository being accessed
  • request: information about the request itself
  • tags: values provided in the request comment via the pass-through CyralTags

Attributes nested inside these top-level fields can be accessed using dot notation (e.g. identity.endUser, client.applicationName, repo.type, and so on).

As an example, the following additional check denotes that whatever access the check is specified for is only valid if the access is through a psql client. A more sophisticated example is provided in the Examples section at the end of this document (Example 6: Only allow users to see data pertaining to themselves).

additionalChecks: |
  is_valid_request {
    client.applicationName == "psql"
  }

In the above example, we use the | indicator, which introduces a multiline (literal block) string in YAML. See this page for more information on specifying multiline strings in YAML.

note

The Rego language defines a Rego module as comprising a Package declaration, a set of Import statements for declaring data dependencies, and a set of Rules. In this context, users need only specify Rules, omitting Package declaration and Import statements.

Request rewriting

You can specify how a read request should be rewritten when that request would otherwise violate your policy. This allows you to place constraints on what the data user can retrieve.

Specify this by adding the datasetRewrites field in your contexted rule. The datasetRewrites field contains an array of objects with the following structure:

  • repo (string): the name of the repository that the rewrite applies to
  • dataset (string): the dataset that should be rewritten
    • in the case of Snowflake, this denotes a fully qualified table name in the form <database>.<schema>.<table>
  • parameters ([string]): the set of parameters used in the substitution request. These are references to fields in the activity log, as described in the Additional checks section above
  • substitution (string): the request used to substitute references to the dataset

For example, the following contexted rule specifies a rewrite that is triggered when a request that reads EMAIL data would otherwise produce a policy violation. In that case, any reference to the fully qualified table myDb.finance.customers is replaced with the subquery SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:, where :identity.endUser: is replaced with the value of the identity.endUser field in the activity log.

reads:
  - data: [EMAIL]
    rows: 10
    datasetRewrites:
      - repo: claims
        dataset: myDb.finance.customers
        parameters: [identity.endUser]
        substitution: "SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:"

As a more specific example, suppose an individual makes the following query, which would cause a policy violation because it reads more than the 10-row limit specified above. Suppose also that the individual has accessed the repository using SSO authentication, and is identified as the end user nancy.drew@hhiu.us.

SELECT * FROM myDb.finance.customers;

Given the dataset rewrite specification in the example above, the Cyral sidecar would rewrite the query such that the receiving database sees the following query.

SELECT * FROM (SELECT * FROM myDb.finance.customers WHERE email='nancy.drew@hhiu.us');

note

Currently, parameter substitutions take place even within string literals. For example, the substitution "SELECT * FROM myDb.finance.customers WHERE greeting = 'Hello, :identity.endUser:'" contains the string literal 'Hello, :identity.endUser:'. During request rewriting, the sidecar will substitute :identity.endUser: with whatever value is in the identity.endUser field in the activity log associated with the data access.
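The rewrite walkthrough in this section can be approximated with plain text substitution, as in the Python sketch below. This is a simplified model and not the sidecar's actual logic: the rewrite_query, lookup, and quote_literal helpers are hypothetical, and rendering each parameter value as a single-quoted SQL literal is an assumption based on the nancy.drew@hhiu.us example. Like the behavior described in the note, this naive approach does not distinguish placeholders inside string literals from those outside them.

```python
import re


def lookup(record, dotted_path):
    """Resolve a dotted reference such as identity.endUser in a nested dict."""
    value = record
    for key in dotted_path.split("."):
        value = value[key]
    return value


def quote_literal(value):
    """Render value as a single-quoted SQL string literal, doubling embedded quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def rewrite_query(query, dataset, substitution, parameters, activity_log):
    """Replace references to dataset in query with the filled-in substitution."""
    subquery = substitution
    for param in parameters:
        placeholder = ":" + param + ":"
        subquery = subquery.replace(placeholder, quote_literal(lookup(activity_log, param)))
    # Match the dataset name only when it is not part of a longer identifier.
    pattern = re.compile(r"(?<![\w.])" + re.escape(dataset) + r"(?![\w.])")
    # A function replacement avoids backslash processing in the subquery text.
    return pattern.sub(lambda match: "(" + subquery + ")", query)


# Hypothetical activity-log fragment for the SSO user in the example.
activity_log = {"identity": {"endUser": "nancy.drew@hhiu.us"}}

rewritten = rewrite_query(
    query="SELECT * FROM myDb.finance.customers;",
    dataset="myDb.finance.customers",
    substitution="SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:",
    parameters=["identity.endUser"],
    activity_log=activity_log,
)
print(rewritten)
```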