diff --git a/docs/src/main/sphinx/security.md b/docs/src/main/sphinx/security.md index 5b1ecf62fb2354..05e3dc8db1dfc5 100644 --- a/docs/src/main/sphinx/security.md +++ b/docs/src/main/sphinx/security.md @@ -51,6 +51,8 @@ security/group-file security/built-in-system-access-control security/file-system-access-control +security/opa-access-control + ``` ## Security inside the cluster diff --git a/docs/src/main/sphinx/security/opa-access-control.md b/docs/src/main/sphinx/security/opa-access-control.md new file mode 100644 index 00000000000000..0e6fcc075f6629 --- /dev/null +++ b/docs/src/main/sphinx/security/opa-access-control.md @@ -0,0 +1,264 @@ +# Open Policy Agent access control + +Use [Open Policy Agent (OPA)](https://www.openpolicyagent.org/) as the authorization +engine for fine-grained access control to catalogs, schemas, tables, and more in +Trino. Policies are defined in OPA, and Trino checks access control privileges +in OPA. + +## Trino configuration + +To use only OPA access control, use the default file +`etc/access-control.properties` with the following minimal content: + +```properties +access-control.name=opa +opa.policy.uri=https://your-opa-endpoint/v1/data/allow +``` + +To combine OPA access control with file-based or other access control systems, configure multiple access control configuration file paths in `etc/config.properties`: + +```properties +access-control.config-files=etc/trino/file-based.properties,etc/trino/opa.properties +``` + +Ensure that each access control system is configured in the specified files. + +The following table details all configuration properties for the OPA access control: + +:::{list-table} +:widths: 20, 80 +:header-rows: 1 + +* - Name + - Description +* - `opa.policy.uri` + - The **required** URI for the OPA endpoint, for example, + `https://opa.example.com/v1/data/allow`. 
+* - `opa.policy.batched-uri` + - The **optional** URI for activating batch mode for certain authorization + queries where batching is applicable, for example + `https://opa.example.com/v1/data/batch`. Find more details in + [](opa-batch-mode). +* - `opa.log-requests` + - Configure if request details, including URI, headers and the entire body, are + logged prior to sending them to OPA. Defaults to `false`. +* - `opa.log-responses` + - Configure if OPA response details, including URI, status code, headers and + the entire body, are logged. Defaults to `false`. +* - `opa.allow-permissioning-operations` + - Configure if permissioning operations are allowed. Find more details in + [](opa-permissioning-operations). Defaults to `false`. +* - `opa.http-client.*` + - Optional HTTP client configurations for the connection from Trino to OPA, + for example `opa.http-client.http-proxy` for configuring the HTTP proxy. + Find more details in [](/admin/properties-http-client). +::: + +### Logging + +When request or response logging is enabled, details are logged at the `DEBUG` +level under the `io.trino.plugin.opa.OpaHttpClient` logger. The Trino logging +configuration must be updated to include this class, to ensure log entries are +created. + +Note that enabling these options produces very large amounts of log data. + +(opa-permissioning-operations)= +### Permissioning operations + +The following operations are controlled by the +`opa.allow-permissioning-operations` setting. If set to `true`, these +operations are allowed. If set to `false`, they are denied. In both cases, +the decision is based on this setting alone, and no request is sent to +OPA. 
+ +- `GrantSchemaPrivilege` +- `DenySchemaPrivilege` +- `RevokeSchemaPrivilege` +- `GrantTablePrivilege` +- `DenyTablePrivilege` +- `RevokeTablePrivilege` +- `CreateRole` +- `DropRole` +- `GrantRoles` +- `RevokeRoles` + +The setting defaults to `false` due to the complexity and potential unexpected +consequences of having SQL-style grants and roles together with OPA. + +Additionally, users are always allowed to show information about roles (`SHOW +ROLES`), regardless of this setting. The following operations are _always_ +allowed: + +- `ShowRoles` +- `ShowCurrentRoles` +- `ShowRoleGrants` + +## OPA configuration + +The OPA access control in Trino contacts OPA for each query and issues an +authorization request. OPA must return a response containing a boolean `allow` +field, which determines whether the operation is permitted or not. + +Policies in OPA are defined with the purpose-built policy language Rego. Find +more information in the [detailed +documentation](https://www.openpolicyagent.org/docs/latest/policy-language/). + +A query from the OPA access control in Trino to OPA contains a `context` and an +`action` as its top-level fields. + +The `context` object contains all contextual information about the query: + +- `identity`: The identity of the user performing the operation, containing the + following two fields: + - `user`: username + - `groups`: list of groups this user belongs to +- `softwareStack`: Information about the software stack issuing the request to + OPA. The following information is included: + - `trinoVersion`: Version of Trino used + +The `action` object contains information about what action is performed on what +resources. The following fields are provided: + +- `operation`: the performed operation, for example `SelectFromColumns`. +- `resource`: information about the accessed objects +- `targetResource`: information about any newly created object, if applicable +- `grantee`: grantee of a grant operation. 
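The request layout listed above can be sketched in a few lines of Python, which may help when preparing sample inputs for testing a policy. This is only an illustration under stated assumptions: `build_opa_request` and its parameters are hypothetical names, not part of Trino or OPA, and the sketch covers only the `context` and `action` fields described in this document.

```python
import json


def build_opa_request(user, groups, trino_version, operation,
                      resource=None, target_resource=None, grantee=None):
    """Assemble a request body with the `context` and `action` fields.

    Hypothetical helper for illustration only; it is not part of Trino.
    """
    action = {
        "operation": operation,
        "resource": resource,
        "targetResource": target_resource,
        "grantee": grantee,
    }
    return {
        "context": {
            "identity": {"user": user, "groups": list(groups)},
            "softwareStack": {"trinoVersion": trino_version},
        },
        # Fields that do not apply to the operation are left out of `action`
        # entirely, rather than being sent as null.
        "action": {key: value for key, value in action.items() if value is not None},
    }


if __name__ == "__main__":
    request = build_opa_request(
        user="foo",
        groups=["some-group"],
        trino_version="434",
        operation="SelectFromColumns",
        resource={"table": {"catalogName": "example_catalog",
                            "schemaName": "example_schema",
                            "tableName": "example_table",
                            "columns": ["column1", "column2", "column3"]}},
    )
    print(json.dumps(request, indent=2))
```

Building the `action` object from a fixed set of keys and filtering out empty ones keeps the sample inputs consistent with the field list, so a policy under test sees the same shape for every operation.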
+ +Fields that are not applicable for a specific operation are set to null. +Examples are a null `targetResource` if no table, schema, or catalog is +modified, or a null `grantee` if no permissions are granted. +Any null field is omitted altogether from the `action` object. + +### Example requests to OPA + +Accessing a table results in a query similar to the following example: + +```json +{ + "context": { + "identity": { + "user": "foo", + "groups": ["some-group"] + }, + "softwareStack": { + "trinoVersion": "434" + } + }, + "action": { + "operation": "SelectFromColumns", + "resource": { + "table": { + "catalogName": "example_catalog", + "schemaName": "example_schema", + "tableName": "example_table", + "columns": [ + "column1", + "column2", + "column3" + ] + } + } + } +} +``` + +The `targetResource` is used in cases where a new resource, distinct from the one in +`resource`, is created, for example when renaming a table: + +```json +{ + "context": { + "identity": { + "user": "foo", + "groups": ["some-group"] + }, + "softwareStack": { + "trinoVersion": "434" + } + }, + "action": { + "operation": "RenameTable", + "resource": { + "table": { + "catalogName": "example_catalog", + "schemaName": "example_schema", + "tableName": "example_table" + } + }, + "targetResource": { + "table": { + "catalogName": "example_catalog", + "schemaName": "example_schema", + "tableName": "new_table_name" + } + } + } +} +``` + +(opa-batch-mode)= +## Batch mode + +A very powerful feature provided by OPA is its ability to respond to +authorization queries with more complex answers than a `true` or `false` boolean +value. + +Many features in Trino require _filtering_ to determine, given a +list of resources (for example tables, queries, or views), which of those a user +is entitled to see or interact with. 
+ +If `opa.policy.batched-uri` is _not_ configured, the plugin sends one +request to OPA _per item_ being filtered, then uses the responses from OPA to +construct a filtered list containing only those items for which a `true` +response was returned. + +Configuring `opa.policy.batched-uri` allows the plugin to send a request to +that _batch_ endpoint instead, with a **list** of the resources being filtered +under `action.filterResources` (as opposed to `action.resource`). + +> The other fields in the request are identical to the non-batch endpoint. + +An OPA policy supporting batch operations should return a (potentially empty) +list containing the _indices_ of the items for which authorization is granted +(if any). Returning a `null` value instead of a list is equivalent to returning +an empty list. + +> We may want to reconsider the choice of using _indices_ in the response as +> opposed to returning a list containing copies of elements from the +> `filterResources` field in the request for which access should be granted. +> Indices were chosen over copying elements as it made validation in the plugin +> easier, and from the few examples we tried, it also made certain policies a +> bit simpler. Any feedback is appreciated! + +A side effect of returning indices is that batching support can be added +quite easily to policies that did not originally have it. Consider the following +Rego: + +```text +package foo + +# ... rest of the policy ... 
+# this assumes the non-batch response field is called "allow" +batch contains i { + some i + raw_resource := input.action.filterResources[i] + allow with input.action.resource as raw_resource +} + +# Corner case: filtering columns is done with a single table item, and many columns inside +# We cannot use our normal logic in other parts of the policy as they are based on sets +# and we need to retain order +batch contains i { + some i + input.action.operation == "FilterColumns" + count(input.action.filterResources) == 1 + raw_resource := input.action.filterResources[0] + count(raw_resource["table"]["columns"]) > 0 + new_resources := [ + object.union(raw_resource, {"table": {"column": column_name}}) + | column_name := raw_resource["table"]["columns"][_] + ] + allow with input.action.resource as new_resources[i] +} +``` diff --git a/docs/src/main/sphinx/security/overview.md b/docs/src/main/sphinx/security/overview.md index 1d82479a81bb8d..f9d723935ddfee 100644 --- a/docs/src/main/sphinx/security/overview.md +++ b/docs/src/main/sphinx/security/overview.md @@ -119,6 +119,8 @@ To implement access control, use: - {doc}`File-based system access control `, where you configure JSON files that specify fine-grained user access restrictions at the catalog, schema, or table level. +- [](opa-access-control), where you use Open Policy Agent to make access control + decisions on a fine-grained level. In addition, Trino {doc}`provides an API ` that allows you to create a custom access control method, or to extend an existing