Add row-level security filter in query #17564

cecemei · 2024-12-13T02:21:52Z

This PR adds the ability to attach row filters to a query, restricting row-level data access for given users.

Description

A query follows these steps: initialize -> authorize -> execute. In the authorize step, permissions are checked for all the resources the query requires. Before this PR, the authorize step only returned allow or deny for a table, and granting access to a table meant the user could see all data in that table. After this PR, the authorize step can return allow along with restrictions (i.e. a row filter that must be applied to the table), which restrict the user's data access at the row level. For example, customers can only see rows relevant to their own company.
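To make the "allow along with restrictions" outcome concrete, here is a minimal sketch. It is not Druid code: SketchAuthResult, SketchAuthorizer, and the company column are hypothetical names, and a java.util.function.Predicate over a row map stands in for a DimFilter.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for an authorization outcome: allow/deny plus per-table row filters.
final class SketchAuthResult
{
  final boolean allowed;
  // Table name -> optional row filter; Optional.empty() means "allowed, no restriction".
  final Map<String, Optional<Predicate<Map<String, String>>>> rowFilters;

  SketchAuthResult(boolean allowed, Map<String, Optional<Predicate<Map<String, String>>>> rowFilters)
  {
    this.allowed = allowed;
    this.rowFilters = rowFilters;
  }
}

final class SketchAuthorizer
{
  // Hypothetical rule: any user may read the table, but only rows of their own company.
  static SketchAuthResult authorize(String userCompany, String table)
  {
    Map<String, Optional<Predicate<Map<String, String>>>> filters = new HashMap<>();
    filters.put(table, Optional.of(row -> userCompany.equals(row.get("company"))));
    return new SketchAuthResult(true, filters);
  }

  // Applies the table's row filter (if any) to the rows read from that table.
  static List<Map<String, String>> read(SketchAuthResult auth, String table, List<Map<String, String>> rows)
  {
    if (!auth.allowed) {
      throw new IllegalStateException("Unauthorized");
    }
    Predicate<Map<String, String>> filter =
        auth.rowFilters.getOrDefault(table, Optional.empty()).orElse(row -> true);
    return rows.stream().filter(filter).collect(Collectors.toList());
  }
}
```

In this sketch a user from one company who reads a table holding rows for two companies gets back only their own rows, even though table-level access was granted.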

authorizeAllResourceActions now returns an AuthorizationResult instead of an Access. This class also replaces the DruidPlanner.AuthResult class. The main difference between AuthorizationResult and Access is that the former contains a map from table name to DimFilter. It can also hold the ResourceAction Iterables that DruidPlanner cares about.

In the authorize step, QueryLifecycle enforces the filters on tables in the datasource tree by transforming each TableDataSource into a RestrictedDataSource. In the execute step, the filters are applied through RestrictedSegment and RestrictedCursorFactory.
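The tree transformation can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual DataSource hierarchy: all Sketch* types are hypothetical, a String stands in for a DimFilter, and only table and join nodes are modeled.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified datasource tree; a String stands in for a DimFilter.
interface SketchDataSource
{
  // Default: datasources that contain no tables are returned unchanged.
  default SketchDataSource mapWithRestriction(Map<String, Optional<String>> rowFilters)
  {
    return this;
  }
}

final class SketchTable implements SketchDataSource
{
  final String name;

  SketchTable(String name)
  {
    this.name = name;
  }

  @Override
  public SketchDataSource mapWithRestriction(Map<String, Optional<String>> rowFilters)
  {
    // A table with a filter is wrapped; a table without one is left as-is.
    Optional<String> filter = rowFilters.getOrDefault(name, Optional.empty());
    return filter.<SketchDataSource>map(f -> new SketchRestricted(this, f)).orElse(this);
  }
}

// Leaf produced by the mapping; it inherits the default, so it is never wrapped twice.
final class SketchRestricted implements SketchDataSource
{
  final SketchTable base;
  final String filter;

  SketchRestricted(SketchTable base, String filter)
  {
    this.base = base;
    this.filter = filter;
  }
}

final class SketchJoin implements SketchDataSource
{
  final SketchDataSource left;
  final SketchDataSource right;

  SketchJoin(SketchDataSource left, SketchDataSource right)
  {
    this.left = left;
    this.right = right;
  }

  @Override
  public SketchDataSource mapWithRestriction(Map<String, Optional<String>> rowFilters)
  {
    // Recurse so that every table in the tree is visited.
    return new SketchJoin(left.mapWithRestriction(rowFilters), right.mapWithRestriction(rowFilters));
  }
}
```

Mapping a join of two tables with a filter registered for only the left table yields a join whose left side is a SketchRestricted and whose right side is the original SketchTable.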

Key changed/added classes in this PR

A new class AuthorizationResult. This class should be used for all authorization calls, while the Access class is still used in the Authorizer interface. It has a static variable ALLOW_ALL, which should be used for all internal calls. getPermissionErrorMessage(boolean policyFilterNotPermitted) is called to get a failure message, replacing access.toString(), access.toMessage(), and access.getMessage(). The class contains:
- deny/allow results for checking permissions on a list of resource actions.
- a failure message if authorization fails. This is null when access is allowed; otherwise it is the error message of the first resource action that failed authorization (there might be more failures, but we stop at the first).
- a map from table name to row-level policy filter.
Access. Added an Optional<DimFilter> rowFilter field, which represents a restriction returned from the authorizer. Also updated the constructor.
AbstractStatement. Replace DruidPlanner.AuthResult with AuthorizationResult.
AuthConfig. Added the flag enableStrictPolicyCheck. When enabled, it checks that every table has a restriction in place, meaning the table has an entry in the restrictions map (the entry may be Optional.empty()).
AuthorizationUtils. It now consolidates all restrictions for authorizing resource actions into a restrictions map, which is included in AuthorizationResult. Also updated javadoc for all public methods.
A new class RestrictedDataSource, which wraps a TableDataSource with a DimFilter. A null filter means no restriction is applied.
A new class RestrictedSegment, which represents a segment with a filter.
A new class RestrictedCursorFactory, which is created by RestrictedSegment.asCursorFactory and enforces the DimFilter on the Cursor.
DataSource interface. Added a subtype restrict and a default method mapWithRestriction.
TableDataSource, added the impl of mapWithRestriction.
JoinDataSource can accept RestrictedDataSource as left-hand side datasource.
Query interface, added a default method withPolicyRestrictions.
SegmentMetadataQuery, added the impl of withDataSource.
QueryLifecycle, replaced baseQuery with baseQuery.withPolicyRestrictions when the authorization result is not ALLOW_ALL (ALLOW_ALL is used for calls from internal services).
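As a rough illustration of how a wrapping cursor factory can enforce a filter at execution time, the sketch below ANDs a fixed policy filter onto whatever filter the query supplies. This is an assumption about the general technique, not a description of the actual RestrictedCursorFactory: the Sketch* types are hypothetical, rows are plain strings, and a Predicate stands in for a DimFilter.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for a cursor factory: returns the rows matching a query filter.
interface SketchCursorFactory
{
  List<String> makeCursor(Predicate<String> queryFilter);
}

// Unrestricted factory backed by an in-memory list of rows.
final class SketchListCursorFactory implements SketchCursorFactory
{
  private final List<String> rows;

  SketchListCursorFactory(List<String> rows)
  {
    this.rows = rows;
  }

  @Override
  public List<String> makeCursor(Predicate<String> queryFilter)
  {
    return rows.stream().filter(queryFilter).collect(Collectors.toList());
  }
}

// Wraps a delegate and ANDs a fixed policy filter onto whatever the query asks for.
final class SketchRestrictedCursorFactory implements SketchCursorFactory
{
  private final SketchCursorFactory delegate;
  private final Predicate<String> policyFilter;

  SketchRestrictedCursorFactory(SketchCursorFactory delegate, Predicate<String> policyFilter)
  {
    this.delegate = delegate;
    this.policyFilter = policyFilter;
  }

  @Override
  public List<String> makeCursor(Predicate<String> queryFilter)
  {
    // The query cannot widen its view: the policy filter is always part of the conjunction.
    return delegate.makeCursor(policyFilter.and(queryFilter));
  }
}
```

Because the policy filter is applied inside the wrapper, a query engine that only ever sees the wrapped factory has no way to read rows outside the policy.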

Caveats

The restrictions don't apply to VIEWs.
UnionDataSource doesn't work with RestrictedDataSource; we plan to fix that later.
The restrictions don't apply to DartQueryMaker and MSQTaskQueryMaker; for now they throw an error if there are any policy restrictions.

This PR has:

been self-reviewed.
- using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
[] added documentation for new or modified features or behaviors.
a release note entry in the PR description.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added or updated version, license, or notice information in licenses.yaml
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
added integration tests.
[] been tested in a test Druid cluster.

processing/src/test/java/org/apache/druid/query/RestrictedDataSourceTest.java

+public class RestrictedDataSourceTest
+{
+  @Rule
+  public ExpectedException expectedException = ExpectedException.none();


…it as the default interface for checking permission

…field now

indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java

          resourceAction,
          authorizerMapper
      );
-      if (!authResult.isAllowed()) {
+
+      authResult.getPermissionErrorMessage(true).ifPresent(error -> {
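The hunk above is truncated, but it replaces a boolean isAllowed() check with an optional error message. A minimal sketch of that pattern follows. The class name, fields, and message text are hypothetical, and the meaning given to policyFilterNotPermitted (treat any policy filter as an error on endpoints that cannot apply one) is an assumption based on the parameter name, not confirmed by the diff.

```java
import java.util.Optional;

// Hypothetical result type; only the shape of getPermissionErrorMessage is taken from the PR.
final class SketchPermissionResult
{
  private final boolean allowed;
  private final String failureMessage; // null when allowed
  private final boolean hasPolicyFilter;

  SketchPermissionResult(boolean allowed, String failureMessage, boolean hasPolicyFilter)
  {
    this.allowed = allowed;
    this.failureMessage = failureMessage;
    this.hasPolicyFilter = hasPolicyFilter;
  }

  // Empty when access is fine; otherwise the message the caller should surface.
  Optional<String> getPermissionErrorMessage(boolean policyFilterNotPermitted)
  {
    if (!allowed) {
      return Optional.of(failureMessage);
    }
    // Assumed semantics: an endpoint that cannot apply policy filters must reject them.
    if (policyFilterNotPermitted && hasPolicyFilter) {
      return Optional.of("This endpoint does not support policy filters");
    }
    return Optional.empty();
  }
}
```

A caller can then write result.getPermissionErrorMessage(true).ifPresent(error -> { ... }) and throw its own forbidden exception inside the lambda.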


gianm

This review includes some high level design comments.

I haven't checked the tests yet (will do so in a follow up review). The things I'd be looking for in the tests are:

we should have tests for the resources, including negative tests for resources (such as the Dart resource) that don't support policies yet. The negative tests should verify that we get an error like "this endpoint doesn't support policies".
we should also have tests for the lower level pieces like QueryLifecycle, the DataSource mapping, and RestrictedSegment.

server/src/main/java/org/apache/druid/server/security/AuthorizationResult.java

gianm · 2024-12-17T23:21:55Z

server/src/main/java/org/apache/druid/server/QueryLifecycle.java

   * @return authorization result
   */
-  public Access authorize(HttpServletRequest req)
+  public AuthorizationResult authorize(HttpServletRequest req)


Because AuthorizationResult includes policies, this method signature makes it ambiguous as to whether the caller should apply policies.

In this case, the QueryLifecycle itself applies the policies, and the caller therefore doesn't need to. We should make that clear somehow. Javadoc could do it, or perhaps returning something other than AuthorizationResult.

Updated the javadoc here to indicate query/datasource is transformed here

gianm · 2024-12-18T16:41:12Z

server/src/main/java/org/apache/druid/server/security/AuthorizationUtils.java

 import java.util.Set;

 /**
 * Static utility functions for performing authorization checks.
 */
 public class AuthorizationUtils
 {
+  static final ImmutableSet<String> RESTRICTION_APPLICABLE_RESOURCE_TYPES = ImmutableSet.of(
+      ResourceType.DATASOURCE,
+      ResourceType.VIEW


IMO, we should do datasource-only for now, since VIEW would be another bundle of stuff to think about: views are resolved in the SQL planner, so the restrictions would need to be applied in a different place.

This does bring up a question about the model though. If a user has restricted access to a DATASOURCE, should those restrictions apply when the datasource is accessed through a SQL view? My stance is "yes" and I think the way we're doing it will achieve that. Please include some tests for this just to be sure it works as expected.

actually that's opposite of the status quo, see the restrictedView calcite tests. the view is created on forbiddenDatasource which users don't have access to, but they can query the restrictedView.

gianm · 2024-12-18T16:54:03Z

server/src/main/java/org/apache/druid/server/security/Access.java


  private final boolean allowed;
  private final String message;
+  // A row-level policy filter on top of table-level read access. It should be empty if there are no policy restrictions
+  // or if access is requested for an action other than reading the table.
+  private final Optional<DimFilter> rowFilter;


I feel like it would be good to keep the value as some more general class, like Policies rather than Optional<DimFilter>, so when we want to add other kinds of policies (such as column restrictions, possibly) they can fit right in without further changes to the Access object.

Added a Policy class, plz review

server/src/main/java/org/apache/druid/server/security/Access.java

processing/src/main/java/org/apache/druid/query/RestrictedDataSource.java

gianm · 2024-12-18T18:41:08Z

processing/src/main/java/org/apache/druid/query/DataSource.java

+   * @param rowFilters a mapping of table names to row filters, every table in the datasource tree must have an entry
+   * @return the updated datasource, with restrictions applied in the datasource tree
+   */
+  default DataSource mapWithRestriction(Map<String, Optional<DimFilter>> rowFilters, boolean enableStrictPolicyCheck)


We should design the strict check a bit differently. With the current design, a single literal TRUE filter would pass, but we don't want that. I think it would be better for this to not take enableStrictPolicyCheck, but instead for the strict check to happen in QueryLifecycle after the query is mapped. That would enable the check to be even stricter: it should really check not just that there is a filter, but also that the filter is actually doing something. To allow for the druid_internal or admin case, we can bypass the strict check if the user has permission for STATE READ (a broad administrative permission).

To allow for the druid_internal or admin case, we can bypass the strict check if the user has permission for STATE READ (a broad administrative permission).

Actually upon further reflection this seems too complex. We don't want to have to consider both policies and STATE permissions. Instead, let's introduce a Policy that is of type admin. It doesn't apply any restrictions, but it's something an authorizer can return to signify that the user is OK to query unrestricted.

Btw, the strict check in QueryLifecycle would need to happen even if the authorizer returns ALLOW. (Strict check should fail in this case)

yes, the strict check should happen. btw, i added this flag because some view tests are failing, and i assume they won't be easy to fix.

in theory, if there are no views, we could default to the strict check; it just wants to see that each table has an entry in policyMap (which could be Optional.empty() if the authorizer says there's no policy).

Ah, I see. The strict check should be even stricter: there should be a mode that requires all authorization results to have some non-empty set of policies. The idea with that check is it's a defense against the authorizer being mis-configured in such a way that policies aren't being reported properly.

Ideally if we have a PolicyConfig class similar to AuthConfig, or put it in a policy context, maybe could be more flexible.
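The strict mode discussed in this thread could look roughly like the sketch below. It assumes the design proposed above: every table must carry a non-empty policy, and an admin policy marks a deliberately unrestricted user. All names are hypothetical, a String stands in for a DimFilter, and the "does nothing" test is reduced to a comparison with the literal TRUE.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical policy: either a row filter or an explicit admin marker meaning unrestricted.
final class SketchPolicy
{
  enum Type { ROW_FILTER, ADMIN }

  final Type type;
  final String rowFilter; // null for ADMIN; a String stands in for a DimFilter

  private SketchPolicy(Type type, String rowFilter)
  {
    this.type = type;
    this.rowFilter = rowFilter;
  }

  static SketchPolicy admin()
  {
    return new SketchPolicy(Type.ADMIN, null);
  }

  static SketchPolicy ofRowFilter(String filter)
  {
    return new SketchPolicy(Type.ROW_FILTER, filter);
  }
}

final class SketchStrictCheck
{
  // Throws unless every table has a policy that is either ADMIN or a non-trivial row filter.
  static void validate(Set<String> tables, Map<String, Optional<SketchPolicy>> policies)
  {
    for (String table : tables) {
      SketchPolicy policy = policies.getOrDefault(table, Optional.empty()).orElse(null);
      if (policy == null) {
        throw new IllegalStateException("No policy reported for table [" + table + "]");
      }
      if (policy.type == SketchPolicy.Type.ROW_FILTER && "TRUE".equalsIgnoreCase(policy.rowFilter.trim())) {
        throw new IllegalStateException("Policy for table [" + table + "] does not restrict anything");
      }
    }
  }
}
```

Running the check after the query has been mapped, rather than inside the datasource mapping, keeps the DataSource interface free of configuration flags and guards against an authorizer that silently reports no policies.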

gianm · 2024-12-18T18:42:36Z

processing/src/main/java/org/apache/druid/query/DataSource.java

+   * Returns an updated datasource based on the policy restrictions on tables. If this datasource contains no table, no
+   * changes should occur.
+   *
+   * @param rowFilters a mapping of table names to row filters, every table in the datasource tree must have an entry


This should take Map<String, Policies> instead, so other types of policies can be applied in the future without changing the DataSource interface.

Takes Map<String, Optional<Policy>> now, since a policy is optional for Druid tables: people can configure any (or all) tables to be policy restricted. It's a single policy, since the policy is returned from the authorizer and is a merged result of policy rules or policy templates (which could be in a serialized format and support CRUD and so on).

gianm · 2024-12-18T19:00:35Z

processing/src/main/java/org/apache/druid/segment/RestrictedSegment.java

+
+import javax.annotation.Nullable;
+
+public class RestrictedSegment extends WrappedSegmentReference


Please include javadoc describing the purpose of this class, and how its defensive mechanisms work.

The way they should work is something like this: if you call asCursorFactory or as(CursorFactory.class) (plus perhaps some other small list), restrictions are handled automatically. But if you call asQueryableIndex, or as for something outside that small list, the query gets the unrestricted internal object and needs to call some method on the RestrictedSegment confirming that it applied the restrictions on its own.

needs to call some method on the RestrictedSegment confirming that it applied the restrictions on its own

To make this robust, we should actually have a 1-1 relationship between as calls and "i know what i'm doing" calls. If a query engine calls as three different times it should call "i know what i'm doing" three times.
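One way to picture the one-to-one pairing is a counter of unacknowledged hand-outs. The sketch below is only an illustration of the idea in this comment: the class and method names are hypothetical, and checking the counter at the end of the query is an added assumption, not something the review specifies.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical guard illustrating the one-to-one pairing; not the actual RestrictedSegment.
final class SketchGuardedSegment
{
  private final Object unrestrictedIndex;
  // Number of unrestricted hand-outs that have not yet been acknowledged.
  private final AtomicInteger outstanding = new AtomicInteger();

  SketchGuardedSegment(Object unrestrictedIndex)
  {
    this.unrestrictedIndex = unrestrictedIndex;
  }

  // Escape hatch: returns the unrestricted object and records that an acknowledgment is owed.
  Object asUnrestricted()
  {
    outstanding.incrementAndGet();
    return unrestrictedIndex;
  }

  // The caller confirms it applied the restrictions itself; exactly one call per asUnrestricted().
  void acknowledgeRestrictionsApplied()
  {
    if (outstanding.getAndUpdate(n -> n > 0 ? n - 1 : n) == 0) {
      throw new IllegalStateException("Acknowledgment without a matching asUnrestricted() call");
    }
  }

  // Verifies at the end of the query that every hand-out was acknowledged.
  void verifyAllAcknowledged()
  {
    int owed = outstanding.get();
    if (owed != 0) {
      throw new IllegalStateException(owed + " unrestricted access(es) were never acknowledged");
    }
  }
}
```

With this shape, an engine that calls the escape hatch three times but acknowledges only twice fails verification, which is the robustness property asked for above.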

…AuthorizationResult class, dart sql, msq sql, fix bug, added restricted data source to calcite test data

extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQSelectTest.java

@@ -798,6 +799,20 @@

  }

+  @MethodSource("data")
+  @ParameterizedTest(name = "{index}:with context {0}")
+  public void testSelectRestricted(String contextName, Map<String, Object> context)


…rityLevel enum in Policy class, updated a bunch of tests

…internal won't be restricted

github-actions bot added Area - Batch Ingestion Area - Querying Area - Segment Format and Ser/De Area - Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Dec 13, 2024

cecemei force-pushed the security2 branch from 85bb1f7 to 652a95c Compare December 13, 2024 04:34

add row-level security filter

3eb0b66

cecemei force-pushed the security2 branch from 652a95c to 3eb0b66 Compare December 13, 2024 19:44

github-advanced-security bot found potential problems Dec 13, 2024


cecemei marked this pull request as ready for review December 13, 2024 20:45

clintropolis added the Design Review label Dec 13, 2024

Add getPermissionErrorMessage method to AuthorizationResult, and use …

dd251c2

…it as the default interface for checking permission

cecemei force-pushed the security2 branch from 925b01f to dd251c2 Compare December 18, 2024 03:30

cecemei added 4 commits December 17, 2024 20:14

policy change

b19728b

minor change, Optional.stream is not supported java8

aae0bb1

Merge remote-tracking branch 'apache/master' into security3

c7dc930

update test case for JoinDataSource since it has a new JoinAlgorithm …

af51747

…field now

cecemei force-pushed the security2 branch from b19728b to af51747 Compare December 18, 2024 06:24

github-advanced-security bot found potential problems Dec 18, 2024


cecemei added 2 commits December 18, 2024 00:11

fix a bug in AuthorizationResult

92077ed

fix style and bug

a470236

gianm reviewed Dec 18, 2024


cecemei added 2 commits December 18, 2024 14:03

Policy change

0664dfc

Added a Policy class to wrap the filter, updated some Javadoc

e7ebcf8

cecemei force-pushed the security2 branch from a470236 to b19728b Compare December 19, 2024 02:31

cecemei added 4 commits December 18, 2024 18:38

Merge branch 'master' into security2

cfe8e73

fix test

5199354

remove VIEW from restricted applicable resource types, add tests for …

390caac

…AuthorizationResult class, dart sql, msq sql, fix bug, added restricted data source to calcite test data

Merge branch 'master' into security2

6bdc39d

github-advanced-security bot found potential problems Dec 19, 2024


cecemei added 5 commits December 19, 2024 21:05

add Permission enum in AuthorizationResult class, add TablePolicySecu…

e81a25d

…rityLevel enum in Policy class, updated a bunch of tests

reverted changes in SegmentMetadataQuery, since now query from druid-…

ec6e3b5

…internal won't be restricted

revert change in test as well

f1aa791

clean up a bit

a1b3b06

updated the AuthorizationResult class a bit, along with some tests

17313d9
