Skip to content

Commit

Permalink
Add docs for OPA access control
Browse files Browse the repository at this point in the history
  • Loading branch information
mosabua committed Dec 30, 2023
1 parent b6eb50c commit d437f63
Show file tree
Hide file tree
Showing 3 changed files with 269 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docs/src/main/sphinx/security.md
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,8 @@ security/group-file
security/built-in-system-access-control
security/file-system-access-control
security/opa-access-control
```

## Security inside the cluster
Expand Down
265 changes: 265 additions & 0 deletions docs/src/main/sphinx/security/opa-access-control.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,265 @@
# Open Policy Agent access control

Use [Open Policy Agent (OPA)](https://www.openpolicyagent.org/) as authorization
engine for fine-grained access control to catalogs, schemas, tables, and more in
Trino. Policies are defined in OPA, and Trino checks access control privileges
in OPA.

## Requirements

* A Open Policy Agent deployment
* Network connectivity from the Trino cluster to the OPA server

With the requirements fulfilled, you can proceed to set up Trino and OPA with
your desired access control configuration.

## Trino configuration

To use only OPA for access control, create the file `etc/access-control.properties`
with the following minimal configuration:

```properties
access-control.name=opa
opa.policy.uri=https://your-opa-endpoint/v1/data/allow
```

To combine OPA access control with file-based or other access control systems,
configure multiple access control configuration file paths in
`etc/config.properties`:

```properties
access-control.config-files=etc/trino/file-based.properties,etc/trino/opa.properties
```

Ensure to configure each access-control system in the specified files.

The following table lists all configuration properties for the OPA access control:

:::{list-table} OPA access control configuration properties
:widths: 40, 60
:header-rows: 1

* - Name
- Description
* - `opa.policy.uri`
- The **required** URI for the OPA endpoint, for example,
`https://opa.example.com/v1/data/allow`.
* - `opa.policy.batched-uri`
- The **optional** URI for activating batch mode for certain authorization
queries where batching is applicable, for example
`https://opa.example.com/v1/data/batch`. Find more details in
[](opa-batch-mode).
* - `opa.log-requests`
- Configure if request details, including URI, headers and the entire body, are
logged prior to sending them to OPA. Defaults to `false`.
* - `opa.log-responses`
- Configure if OPA response details, including URI, status code, headers and
the entire body, are logged. Defaults to `false`.
* - `opa.allow-permissioning-operations`
- Configure if permissioning operations are allowed. Find more details in
[](opa-permissioning-operations). Defaults to `false`.
* - `opa.http-client.*`
- Optional HTTP client configurations for the connection from Trino to OPA,
for example `opa.http-client.http-proxy` for configuring the HTTP proxy.
Find more details in [](/admin/properties-http-client).
:::

### Logging

When request or response logging is enabled, details are logged at the `DEBUG`
level under the `io.trino.plugin.opa.OpaHttpClient` logger. The Trino logging
configuration must be updated to include this class, to ensure log entries are
created.

Note that enabling these options produces very large amounts of log data.

(opa-permissioning-operations)=
### Permissioning operations

The following operations are allowed or denied based on this setting, no request is sent
to OPA. The following operations are controlled by the
`opa.allow-permissioning-operations` setting. If set to `true`, these operations
are allowed. If set to `false`, they are denied. In both cases, no request is
sent to OPA.

- `GrantSchemaPrivilege`
- `DenySchemaPrivilege`
- `RevokeSchemaPrivilege`
- `GrantTablePrivilege`
- `DenyTablePrivilege`
- `RevokeTablePrivilege`
- `CreateRole`
- `DropRole`
- `GrantRoles`
- `RevokeRoles`

The setting defaults to `false` due to the complexity and potential unexpected
consequences of having SQL-style grants and roles together with OPA.

Additionally, users are always allowed to show information about roles (`SHOW
ROLES`), regardless of this setting. The following operations are _always_
allowed:

- `ShowRoles`
- `ShowCurrentRoles`
- `ShowRoleGrants`

## OPA configuration

The OPA access control in Trino contacts OPA for each query and issues an
authorization request. OPA must return a response containing a boolean `allow`
field, which determines whether the operation is permitted or not.

Policies in OPA are defined with the purpose built policy language Rego. Find
more information in the [detailed
documentation](https://www.openpolicyagent.org/docs/latest/policy-language/).
After the initial installation and configuration in Trino, these policies are
the main configuration aspect for your access control setup.

A query from the OPA access control in Trino to OPA contains a `context` and an
`action` as its top level fields.

The `context` object contains all other contextual information about the query:

- `identity`: The identity of the user performing the operation, containing the
following two fields:
- `user`: username
- `groups`: list of groups this user belongs to
- `softwareStack`: Information about the software stack issuing the request to
OPA. The following information is included:
- `trinoVersion`: Version of Trino used

The `action` object contains information about what action is performed on what
resources. The following fields are provided:

- `operation`: the performed operation, for example `SelectFromColumns`.
- `resource`: information about the accessed objects
- `targetResource`: information about any newly created object, if applicable
- `grantee`: grantee of a grant operation.

Fields that are not applicable for a specific operation are set to null.
Examples are an empty `targetResource` if not modifying a table or schema or
catalog is modified, or an empty `grantee` if not granting permissions is set.
Any null field is omitted altogether from the `action` object.

### Example requests to OPA

Accessing a table results in a query similar to the following example:

```json
{
"context": {
"identity": {
"user": "foo",
"groups": ["some-group"]
},
"softwareStack": {
"trinoVersion": "434"
}
},
"action": {
"operation": "SelectFromColumns",
"resource": {
"table": {
"catalogName": "example_catalog",
"schemaName": "example_schema",
"tableName": "example_table",
"columns": [
"column1",
"column2",
"column3"
]
}
}
}
}
```

The `targetResource` is used in cases where a new resource, distinct from the one in
`resource` is created. For example, when renaming a table.

```json
{
"context": {
"identity": {
"user": "foo",
"groups": ["some-group"]
},
"softwareStack": {
"trinoVersion": "434"
}
},
"action": {
"operation": "RenameTable",
"resource": {
"table": {
"catalogName": "example_catalog",
"schemaName": "example_schema",
"tableName": "example_table"
}
},
"targetResource": {
"table": {
"catalogName": "example_catalog",
"schemaName": "example_schema",
"tableName": "new_table_name"
}
}
}
}
```

(opa-batch-mode)=
## Batch mode

A very powerful feature provided by OPA is its ability to respond to
authorization queries with more complex answers than a `true` or `false` boolean
value.

Many features in Trino require filtering to determine to which resources a user
is granted access. These resources are catalogs, schema, queries, views, and
others objects.

If `opa.policy.batched-uri` is not configured, Trino sends one request to OPA
for each object, and then creates a filtered list of accessible objects.
response was returned.

Configuring `opa.policy.batched-uri` allows Trino to send a request to
the batch endpoint, with a list of resources in one request using the
under `action.filterResources` node.

All other fields in the request are identical to the non-batch endpoint.

An OPA policy supporting batch operations should return a list containing the
_indices_ of the items for which authorization is granted. Returning a `null`
value or an empty list is equivalent and denies any access.

You can batching support for policies that't originally do not support it:

```text
package foo
# ... rest of the policy ...
# this assumes the non-batch response field is called "allow"
batch contains i {
some i
raw_resource := input.action.filterResources[i]
allow with input.action.resource as raw_resource
}
# Corner case: filtering columns is done with a single table item, and many columns inside
# We cannot use our normal logic in other parts of the policy as they are based on sets
# and we need to retain order
batch contains i {
some i
input.action.operation == "FilterColumns"
count(input.action.filterResources) == 1
raw_resource := input.action.filterResources[0]
count(raw_resource["table"]["columns"]) > 0
new_resources := [
object.union(raw_resource, {"table": {"column": column_name}})
| column_name := raw_resource["table"]["columns"][_]
]
allow with input.action.resource as new_resources[i]
}
```
2 changes: 2 additions & 0 deletions docs/src/main/sphinx/security/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,8 @@ To implement access control, use:
- {doc}`File-based system access control <file-system-access-control>`, where
you configure JSON files that specify fine-grained user access restrictions at
the catalog, schema, or table level.
- [](opa-access-control), where you use Open Policy Agent to make access control
decisions on a fined-grained level.

In addition, Trino {doc}`provides an API </develop/system-access-control>` that
allows you to create a custom access control method, or to extend an existing
Expand Down

0 comments on commit d437f63

Please sign in to comment.