Merge pull request #514 from umccr/fix/filemanager-perf-queries
refactor: filemanager indexes
mmalenic authored Aug 25, 2024
2 parents 8bbd692 + 3911207 commit 4144d4c
Showing 10 changed files with 661 additions and 222 deletions.
1 change: 1 addition & 0 deletions lib/workload/stateless/stacks/filemanager/Cargo.lock

Some generated files are not rendered by default.

@@ -0,0 +1,4 @@
+-- A gin index on attributes supported the `@?` operator and jsonpath queries.
+create index attributes_index on s3_object using gin (attributes jsonb_path_ops);
+-- An index on keys helps querying by prefix.
+create index key_index on s3_object (key text_pattern_ops);
18 changes: 10 additions & 8 deletions lib/workload/stateless/stacks/filemanager/docs/API_GUIDE.md
@@ -104,26 +104,28 @@ curl --get -H "Authorization: Bearer $TOKEN" --data-urlencode "attributes[portal
### Wilcard matching

-The API supports using wildcards to match multiple characters in a value for most field. Use `%` to match multiple characters
-and `_` to match one character. These queries get converted to postgres `like` queries under the hood. For example, query
-on a key prefix:
+The API supports using wildcards to match multiple characters in a value for most field. Use `*` to match multiple characters
+and `?` to match one character. Use a backslash character to match a literal `*` or `?` in the query. Another backslash can be used
+to escape itself. No other escape characters are supported.
+
+These get converted to postgres `like` queries under the hood. For example, query on a key prefix:

```sh
-curl --get -H "Authorization: Bearer $TOKEN" --data-urlencode "key=temp\_data%" \
+curl --get -H "Authorization: Bearer $TOKEN" --data-urlencode "key=temp_data*" \
"https://file.dev.umccr.org/api/v1/s3" | jq
```

Case-insensitive wildcard matching, which gets converted to a postgres `ilike` statement, is supported by using `caseSensitive`:

```sh
-curl --get -H "Authorization: Bearer $TOKEN" --data-urlencode "key=temp\_data%" \
+curl --get -H "Authorization: Bearer $TOKEN" --data-urlencode "key=temp_data*" \
"https://file.dev.umccr.org/api/v1/s3?caseSensitive=false" | jq
```

-Wildcard matching is also supported on attributes:
+Wildcard matching is also supported on attributes, which get converted to jsonpath `like_regex` queries:

```sh
-curl --get -H "Authorization: Bearer $TOKEN" --data-urlencode "attributes[portalRunId]=20240521%" \
+curl --get -H "Authorization: Bearer $TOKEN" --data-urlencode "attributes[portalRunId]=20240521*" \
"https://file.dev.umccr.org/api/v1/s3" | jq
```
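
The wildcard rules in this hunk (`*` for many characters, `?` for one, backslash as the only escape) can be illustrated with a small conversion to a postgres `like` pattern. This is a minimal sketch and not the filemanager's actual implementation: the function name `wildcard_to_like` is hypothetical, and two behaviours are assumptions that the guide does not spell out. Literal `%` and `_` in the input are escaped so that they do not act as `like` wildcards, and a backslash that does not precede `*`, `?` or `\` is kept as a literal backslash.

```rust
/// Convert an API wildcard pattern into a postgres `like` pattern.
/// Hypothetical helper for illustration only.
fn wildcard_to_like(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        match c {
            // `*` and `?` are the API wildcards.
            '*' => out.push('%'),
            '?' => out.push('_'),
            // A backslash escapes `*`, `?` or another backslash.
            '\\' => match chars.peek().copied() {
                Some(next @ ('*' | '?')) => {
                    chars.next();
                    // `*` and `?` have no special meaning in `like`.
                    out.push(next);
                }
                Some('\\') => {
                    chars.next();
                    // A literal backslash must itself be escaped in `like`.
                    out.push_str("\\\\");
                }
                // Assumption: any other backslash is kept as a literal backslash.
                _ => out.push_str("\\\\"),
            },
            // Assumption: `%` and `_` in the input are meant literally,
            // so they are escaped to stop `like` treating them as wildcards.
            '%' | '_' => {
                out.push('\\');
                out.push(c);
            }
            _ => out.push(c),
        }
    }

    out
}
```

Under these assumptions, `wildcard_to_like("temp_data*")` yields `temp\_data%`, which is the pattern that callers had to write by hand with the earlier `%` and `_` syntax.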

@@ -147,7 +149,7 @@ Or, update attributes for multiple records with the same key prefix:
```sh
curl -X PATCH -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
--data '[ { "op": "add", "path": "/portalRunId", "value": "portalRunIdValue" } ]' \
-"https://file.dev.umccr.org/api/v1/s3?key=%25202405212aecb782%25" | jq
+"https://file.dev.umccr.org/api/v1/s3?key=*202405212aecb782*" | jq
```

## Count objects
@@ -72,6 +72,7 @@ aws_lambda_events = "0.15"

[dev-dependencies]
lazy_static = "1"
+percent-encoding = "2"

aws-smithy-runtime-api = "1"
aws-smithy-mocks-experimental = "0.2"