Showing 3 changed files with 268 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,264 @@ | ||
# Open Policy Agent access control | ||
|
||
Use [Open Policy Agent (OPA)](https://www.openpolicyagent.org/) as the authorization | ||
engine for fine-grained access control to catalogs, schemas, tables, and more in | ||
Trino. Policies are defined in OPA, and Trino checks access control privileges | ||
in OPA. | ||
|
||
## Trino configuration | ||
|
||
To use only OPA access control, use the default file | ||
`etc/access-control.properties` with the following minimal content: | ||
|
||
```properties | ||
access-control.name=opa | ||
opa.policy.uri=https://your-opa-endpoint/v1/data/allow | ||
``` | ||
|
||
To combine OPA access control with file-based or other access control systems, configure multiple access control configuration file paths in `etc/config.properties`: | ||
|
||
```properties | ||
access-control.config-files=etc/trino/file-based.properties,etc/trino/opa.properties | ||
``` | ||
|
||
Make sure to configure each access control system in its specified file. | ||
|
||
The following table details all configuration properties for the OPA access control: | ||
|
||
:::{list-table} | ||
:widths: 20, 80 | ||
:header-rows: 1 | ||
|
||
* - Name | ||
- Description | ||
* - `opa.policy.uri` | ||
- The **required** URI for the OPA endpoint, for example, | ||
`https://opa.example.com/v1/data/allow`. | ||
* - `opa.policy.batched-uri` | ||
- The **optional** URI for activating batch mode for certain authorization | ||
queries where batching is applicable, for example | ||
`https://opa.example.com/v1/data/batch`. Find more details in | ||
[](opa-batch-mode). | ||
* - `opa.log-requests` | ||
- Configure if request details, including URI, headers and the entire body, are | ||
logged prior to sending them to OPA. Defaults to `false`. | ||
* - `opa.log-responses` | ||
- Configure if OPA response details, including URI, status code, headers and | ||
the entire body, are logged. Defaults to `false`. | ||
* - `opa.allow-permissioning-operations` | ||
- Configure if permissioning operations are allowed. Find more details in | ||
[](opa-permissioning-operations). Defaults to `false`. | ||
* - `opa.http-client.*` | ||
- Optional HTTP client configurations for the connection from Trino to OPA, | ||
for example `opa.http-client.http-proxy` for configuring the HTTP proxy. | ||
Find more details in [](/admin/properties-http-client). | ||
::: | ||
|
||
### Logging | ||
|
||
When request or response logging is enabled, details are logged at the `DEBUG` | ||
level under the `io.trino.plugin.opa.OpaHttpClient` logger. The Trino logging | ||
configuration must be updated to include this class, to ensure log entries are | ||
created. | ||
|
||
Note that enabling these options produces very large amounts of log data. | ||
|
||
(opa-permissioning-operations)= | ||
### Permissioning operations | ||
|
||
The following operations are controlled by the | ||
`opa.allow-permissioning-operations` setting. If set to `true`, these operations | ||
are allowed. If set to `false`, they are denied. In both cases, no request is | ||
sent to OPA. | ||
|
||
- `GrantSchemaPrivilege` | ||
- `DenySchemaPrivilege` | ||
- `RevokeSchemaPrivilege` | ||
- `GrantTablePrivilege` | ||
- `DenyTablePrivilege` | ||
- `RevokeTablePrivilege` | ||
- `CreateRole` | ||
- `DropRole` | ||
- `GrantRoles` | ||
- `RevokeRoles` | ||
|
||
The setting defaults to `false` due to the complexity and potential unexpected | ||
consequences of having SQL-style grants and roles together with OPA. | ||
|
||
Additionally, users are always allowed to show information about roles (`SHOW | ||
ROLES`), regardless of this setting. The following operations are _always_ | ||
allowed: | ||
|
||
- `ShowRoles` | ||
- `ShowCurrentRoles` | ||
- `ShowRoleGrants` | ||
|
||
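The rules in this section can be summarized as a small decision function. The following Python sketch is illustrative only and is not the plugin's implementation: `local_decision` is a hypothetical helper, and it assumes that the always-allowed operations are also decided without contacting OPA.

```python
from typing import Optional

# Operation names copied from the two lists in this section.
PERMISSIONING_OPERATIONS = frozenset({
    "GrantSchemaPrivilege", "DenySchemaPrivilege", "RevokeSchemaPrivilege",
    "GrantTablePrivilege", "DenyTablePrivilege", "RevokeTablePrivilege",
    "CreateRole", "DropRole", "GrantRoles", "RevokeRoles",
})
ALWAYS_ALLOWED_OPERATIONS = frozenset({
    "ShowRoles", "ShowCurrentRoles", "ShowRoleGrants",
})


def local_decision(operation: str,
                   allow_permissioning_operations: bool = False) -> Optional[bool]:
    """Hypothetical helper: return a decision made without a request to OPA,
    or None when the operation has to be authorized by OPA."""
    if operation in ALWAYS_ALLOWED_OPERATIONS:
        # Allowed regardless of opa.allow-permissioning-operations.
        return True
    if operation in PERMISSIONING_OPERATIONS:
        # Allowed or denied purely based on the configured setting.
        return allow_permissioning_operations
    # Everything else is decided by the OPA policy.
    return None
```

With the default setting, `local_decision("CreateRole")` denies the operation, while `local_decision("SelectFromColumns")` returns `None` to signal that OPA must be asked.
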
## OPA configuration | ||
|
||
The OPA access control in Trino contacts OPA for each query and issues an | ||
authorization request. OPA must return a response containing a boolean `allow` | ||
field, which determines whether the operation is permitted or not. | ||
|
||
Policies in OPA are defined with the purpose-built policy language Rego. Find | ||
more information in the [detailed | ||
documentation](https://www.openpolicyagent.org/docs/latest/policy-language/). | ||
|
||
A request from the OPA access control in Trino to OPA contains a `context` and an | ||
`action` as its top-level fields. | ||
|
||
The `context` object contains all other contextual information about the query: | ||
|
||
- `identity`: The identity of the user performing the operation, containing the | ||
following two fields: | ||
- `user`: username | ||
- `groups`: list of groups this user belongs to | ||
- `softwareStack`: Information about the software stack issuing the request to | ||
OPA. The following information is included: | ||
- `trinoVersion`: Version of Trino used | ||
|
||
The `action` object contains information about what action is performed on what | ||
resources. The following fields are provided: | ||
|
||
- `operation`: the performed operation, for example `SelectFromColumns`. | ||
- `resource`: information about the accessed objects | ||
- `targetResource`: information about any newly created object, if applicable | ||
- `grantee`: grantee of a grant operation. | ||
|
||
Fields that are not applicable for a specific operation are null. Examples are | ||
`targetResource` when no table, schema, or catalog is modified, or `grantee` | ||
when no permissions are granted. | ||
Any null field is omitted altogether from the `action` object. | ||
|
||
### Example requests to OPA | ||
|
||
Accessing a table results in a query similar to the following example: | ||
|
||
```json | ||
{ | ||
"context": { | ||
"identity": { | ||
"user": "foo", | ||
"groups": ["some-group"] | ||
}, | ||
"softwareStack": { | ||
"trinoVersion": "434" | ||
} | ||
}, | ||
"action": { | ||
"operation": "SelectFromColumns", | ||
"resource": { | ||
"table": { | ||
"catalogName": "example_catalog", | ||
"schemaName": "example_schema", | ||
"tableName": "example_table", | ||
"columns": [ | ||
"column1", | ||
"column2", | ||
"column3" | ||
] | ||
} | ||
} | ||
} | ||
} | ||
``` | ||
|
||
The `targetResource` is used in cases where a new resource, distinct from the one in | ||
`resource`, is created, for example when renaming a table: | ||
|
||
```json | ||
{ | ||
"context": { | ||
"identity": { | ||
"user": "foo", | ||
"groups": ["some-group"] | ||
}, | ||
"softwareStack": { | ||
"trinoVersion": "434" | ||
} | ||
}, | ||
"action": { | ||
"operation": "RenameTable", | ||
"resource": { | ||
"table": { | ||
"catalogName": "example_catalog", | ||
"schemaName": "example_schema", | ||
"tableName": "example_table" | ||
} | ||
}, | ||
"targetResource": { | ||
"table": { | ||
"catalogName": "example_catalog", | ||
"schemaName": "example_schema", | ||
"tableName": "new_table_name" | ||
} | ||
} | ||
} | ||
} | ||
``` | ||
|
||
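The request shape shown in the examples above can be reproduced with a few lines of Python. This is a minimal sketch for experimenting with policies, not the plugin's code: `build_request_body` and its parameters are hypothetical names, and the only behavior it models is the documented omission of null fields from the `action` object.

```python
import json
from typing import Any, Dict, Optional, Sequence


def build_request_body(user: str,
                       groups: Sequence[str],
                       trino_version: str,
                       operation: str,
                       resource: Optional[Dict[str, Any]] = None,
                       target_resource: Optional[Dict[str, Any]] = None,
                       grantee: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical helper that assembles the `context` and `action` fields."""
    action = {
        "operation": operation,
        "resource": resource,
        "targetResource": target_resource,
        "grantee": grantee,
    }
    return {
        "context": {
            "identity": {"user": user, "groups": list(groups)},
            "softwareStack": {"trinoVersion": trino_version},
        },
        # Fields that do not apply to the operation are dropped, not sent as null.
        "action": {key: value for key, value in action.items() if value is not None},
    }


def table(name: str) -> Dict[str, Any]:
    """Build a table resource in the example catalog and schema."""
    return {"table": {"catalogName": "example_catalog",
                      "schemaName": "example_schema",
                      "tableName": name}}


rename_request = build_request_body(
    "foo", ["some-group"], "434", "RenameTable",
    resource=table("example_table"),
    target_resource=table("new_table_name"),
)
print(json.dumps(rename_request, indent=2))
```

The printed document has the same fields as the `RenameTable` example. Leaving out `target_resource` produces an `action` object without a `targetResource` key at all.
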
(opa-batch-mode)= | ||
## Batch mode | ||
|
||
A very powerful feature provided by OPA is its ability to respond to | ||
authorization queries with more complex answers than a `true` or `false` boolean | ||
value. | ||
|
||
Many features in Trino require _filtering_ to determine, given a list of | ||
resources such as tables, queries, or views, which of those a user is | ||
entitled to see or interact with. | ||
|
||
If `opa.policy.batched-uri` is _not_ configured, the plugin sends one | ||
request to OPA _per item_ being filtered, then uses the responses from OPA to | ||
construct a filtered list containing only those items for which a `true` | ||
response was returned. | ||
|
||
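The per-item behavior can be pictured as follows. This Python sketch is a simplified model rather than the plugin's implementation: `filter_one_by_one` is a hypothetical name, and `is_allowed` stands in for a single authorization request to OPA.

```python
from typing import Any, Callable, Dict, List, Sequence

Resource = Dict[str, Any]


def filter_one_by_one(resources: Sequence[Resource],
                      is_allowed: Callable[[Resource], bool]) -> List[Resource]:
    """Hypothetical helper: call `is_allowed` once per item (one OPA request each)
    and keep only the items for which the answer was true."""
    return [resource for resource in resources if is_allowed(resource)]


# Stand-in for OPA: counts requests and allows every table except "secret".
request_count = 0


def fake_opa(resource: Resource) -> bool:
    global request_count
    request_count += 1
    return resource["table"]["tableName"] != "secret"


tables = [{"table": {"tableName": name}} for name in ("orders", "secret", "customers")]
visible = filter_one_by_one(tables, fake_opa)
```

Filtering three tables issues three requests in this model, which is the overhead that batch mode is meant to avoid.
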
Configuring `opa.policy.batched-uri` allows the plugin to send a request to | ||
that _batch_ endpoint instead, with a **list** of the resources being filtered | ||
under `action.filterResources` (as opposed to `action.resource`). | ||
|
||
> The other fields in the request are identical to the non-batch endpoint. | ||
|
||
An OPA policy supporting batch operations should return a (potentially empty) | ||
list containing the _indices_ of the items for which authorization is granted | ||
(if any). Returning a `null` value instead of a list is equivalent to returning | ||
an empty list. | ||
|
||
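The index-based contract can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the plugin's code: `build_batch_action` and `apply_batch_result` are hypothetical names, the `FilterTables` operation name in the usage is chosen for illustration, and rejecting out-of-range indices with an error is an assumption about validation that the text does not spell out.

```python
from typing import Any, Dict, List, Optional, Sequence

Resource = Dict[str, Any]


def build_batch_action(operation: str, resources: Sequence[Resource]) -> Dict[str, Any]:
    """Hypothetical helper: the batch request lists the resources under
    `filterResources` instead of a single `resource`."""
    return {"operation": operation, "filterResources": list(resources)}


def apply_batch_result(resources: Sequence[Resource],
                       indices: Optional[Sequence[int]]) -> List[Resource]:
    """Hypothetical helper: keep the resources whose indices OPA returned."""
    if indices is None:
        # A null response is equivalent to an empty list: nothing is granted.
        return []
    allowed = []
    for index in indices:
        # Assumption: an index outside the request list is treated as an error.
        if not 0 <= index < len(resources):
            raise ValueError(f"index {index} is outside the filtered list")
        allowed.append(resources[index])
    return allowed


tables = [{"table": {"tableName": name}} for name in ("orders", "secret", "customers")]
action = build_batch_action("FilterTables", tables)  # illustrative operation name
granted = apply_batch_result(action["filterResources"], [0, 2])
```
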
> We may want to reconsider the choice of using _indices_ in the response as | ||
> opposed to returning a list containing copies of elements from the | ||
> `filterResources` field in the request for which access should be granted. | ||
> Indices were chosen over copying elements as it made validation in the plugin | ||
> easier, and from the few examples we tried, it also made certain policies a | ||
> bit simpler. Any feedback is appreciated! | ||
|
||
A useful side effect of this design is that batching support is easy to add to | ||
policies that did not originally have it. Consider the following Rego | ||
policy: | ||
|
||
```text | ||
package foo

# ... rest of the policy ...

# this assumes the non-batch response field is called "allow"
batch contains i {
    some i
    raw_resource := input.action.filterResources[i]
    allow with input.action.resource as raw_resource
}

# Corner case: filtering columns is done with a single table item, and many columns inside
# We cannot use our normal logic in other parts of the policy as they are based on sets
# and we need to retain order
batch contains i {
    some i
    input.action.operation == "FilterColumns"
    count(input.action.filterResources) == 1
    raw_resource := input.action.filterResources[0]
    count(raw_resource["table"]["columns"]) > 0
    new_resources := [
        object.union(raw_resource, {"table": {"column": column_name}})
        | column_name := raw_resource["table"]["columns"][_]
    ]
    allow with input.action.resource as new_resources[i]
}
``` |
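
The column corner case in the second rule may be easier to follow outside of Rego. The Python sketch below mirrors what the comprehension does: it expands the single table item into one resource per column, in order, and collects the indices that pass the check. `expand_columns`, `allowed_column_indices`, and `is_allowed` are hypothetical names standing in for the policy's comprehension, the `batch` rule, and the non-batch `allow` rule.

```python
import copy
from typing import Any, Callable, Dict, List

Resource = Dict[str, Any]


def expand_columns(raw_resource: Resource) -> List[Resource]:
    """One resource per column, preserving column order. Like the recursive
    merge in the policy, each copy keeps the original fields and gains a
    `column` entry inside `table`."""
    expanded = []
    for column_name in raw_resource["table"]["columns"]:
        resource = copy.deepcopy(raw_resource)
        resource["table"]["column"] = column_name
        expanded.append(resource)
    return expanded


def allowed_column_indices(raw_resource: Resource,
                           is_allowed: Callable[[Resource], bool]) -> List[int]:
    """Indices of the columns for which the per-resource check succeeds."""
    return [index
            for index, resource in enumerate(expand_columns(raw_resource))
            if is_allowed(resource)]


table_item = {"table": {"catalogName": "example_catalog",
                        "schemaName": "example_schema",
                        "tableName": "example_table",
                        "columns": ["column1", "column2", "column3"]}}

# Stand-in for the non-batch rule: hide "column2".
indices = allowed_column_indices(
    table_item, lambda resource: resource["table"]["column"] != "column2")
```

Because the indices refer to positions in the `columns` list, the order of the expanded resources has to match the request, which is why the policy builds an array instead of a set.
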