Merge pull request #762 from umccr/fix/filemanager-mulitple-queries
feat: filemanager multiple same-key queries
mmalenic authored Dec 9, 2024
2 parents 44532c3 + 3cfcf5f commit 976909a
Showing 26 changed files with 1,590 additions and 1,738 deletions.
2,445 changes: 986 additions & 1,459 deletions lib/workload/stateless/stacks/filemanager/Cargo.lock

Large diffs are not rendered by default.

22 changes: 17 additions & 5 deletions lib/workload/stateless/stacks/filemanager/docs/API_GUIDE.md
Original file line number Diff line number Diff line change
@@ -141,14 +141,26 @@ curl --get -H "Authorization: Bearer $TOKEN" --data-urlencode "attributes[portal

## Multiple keys

The API supports querying using multiple keys with the same name. This represents an `or` condition in the SQL query, where
The API supports querying using multiple keys with the same name. This represents an `or` condition in the SQL query by default, where
records are fetched if any of the keys match. For example, the following finds records where the bucket is either `bucket1`
or `bucket2`:

```sh
curl -H "Authorization: Bearer $TOKEN" "https://file.dev.umccr.org/api/v1/s3?bucket[]=bucket1&bucket[]=bucket2" | jq
```

To be more explicit, pass in `or` as a keyword when querying. For example, the following is equivalent:

```sh
curl -H "Authorization: Bearer $TOKEN" "https://file.dev.umccr.org/api/v1/s3?bucket[or][]=bucket1&bucket[or][]=bucket2" | jq
```

To express an `and` condition in the SQL query instead, use the `and` keyword:

```sh
curl -H "Authorization: Bearer $TOKEN" "https://file.dev.umccr.org/api/v1/s3?bucket[and][]=bucket1&bucket[and][]=bucket2" | jq
```
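
As a rough illustration of the three forms above, the following Rust sketch builds the query-string fragment for a repeated key. The `Join` enum and the `multi_key_query` helper are hypothetical names introduced for this example and are not part of the filemanager codebase. The sketch also assumes that keys and values need no percent-encoding.

```rust
/// How the values of one repeated key are combined.
/// Hypothetical type for illustration only.
#[derive(Clone, Copy)]
enum Join {
    /// `key[]=value`: no keyword, treated as `or` by default.
    Implicit,
    /// `key[or][]=value`: explicit `or`.
    Or,
    /// `key[and][]=value`: explicit `and`.
    And,
}

/// Build the query-string fragment for one key with several values.
/// Assumes the key and values are already safe to place in a URL.
fn multi_key_query(key: &str, join: Join, values: &[&str]) -> String {
    // The trailing `[]` is always present, including after `[or]` and `[and]`.
    let prefix = match join {
        Join::Implicit => format!("{}[]", key),
        Join::Or => format!("{}[or][]", key),
        Join::And => format!("{}[and][]", key),
    };

    values
        .iter()
        .map(|value| format!("{}={}", prefix, value))
        .collect::<Vec<_>>()
        .join("&")
}

fn main() {
    let base = "https://file.dev.umccr.org/api/v1/s3";

    // Print one URL per form, mirroring the curl examples above.
    for join in [Join::Implicit, Join::Or, Join::And] {
        let query = multi_key_query("bucket", join, &["bucket1", "bucket2"]);
        println!("{}?{}", base, query);
    }
}
```

Each printed URL can be passed to `curl` in place of the hand-written ones. Real values that contain reserved characters would need percent-encoding first, which this sketch leaves out.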

Multiple keys are also supported on attributes. For example, the following finds records where the `portalRunId` is
either `20240521aecb782` or `20240521aecb783`:

@@ -159,9 +171,10 @@ curl --get -H "Authorization: Bearer $TOKEN" \
"https://file.dev.umccr.org/api/v1/s3" | jq
```

Note that the extra `[]` is required in the query parameters to specify multiple keys with the same name. Specifying
multiple of the same key without `[]` results in an error. It is also an error to specify some keys with `[]` and some
without for keys with the same name.
Note that the extra `[]` is required in the query parameters to specify multiple keys with the same name. It is also
required to place the extra `[]` when explicitly specifying `or` or `and` conditions. Specifying multiple of the same
key without `[]` results in an error. It is also an error to specify some keys with `[]` and some without for keys with
the same name.

## Updating records

@@ -226,7 +239,6 @@ curl -H "Authorization: Bearer $TOKEN" "https://file.dev.umccr.org/api/v1/s3/pre
There are some missing features in the query API which are planned, namely:

* There is no way to compare values with `>`, `>=`, `<`, `<=`.
* There is no way to express `and` or `or` conditions in the API (except for multiple keys representing `or` conditions).

There are also some features missing for attribute linking. For example, there is no way
to capture matching wildcard groups which can later be used in the JSON patch body.
Original file line number Diff line number Diff line change
@@ -13,6 +13,5 @@ tracing = { version = "0.1" }
axum = "0.7"

lambda_http = "0.13"
lambda_runtime = "0.13"

filemanager = { path = "../filemanager" }
Original file line number Diff line number Diff line change
@@ -12,6 +12,6 @@ axum = "0.7"
dotenvy = "0.15"
http = "1"
clap = { version = "4", features = ["derive", "env"] }
sea-orm = { version = "1.1.0-rc.1", default-features = false, features = ["sqlx-postgres", "runtime-tokio-rustls"] }
sea-orm = { version = "1.1.2", default-features = false, features = ["sqlx-postgres", "runtime-tokio-rustls"] }

filemanager = { path = "../filemanager", features = ["migrate"] }
Original file line number Diff line number Diff line change
@@ -7,14 +7,13 @@ authors.workspace = true
rust-version.workspace = true

[dependencies]
thiserror = "1"
thiserror = "2"
clap_builder = "4"
clap = "4"
dotenvy = "0.15"
sea-orm-cli = { version = "1.1.0-rc.1", default-features = false, features = ["cli", "codegen", "runtime-tokio-rustls"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread", "process"] }
miette = { version = "7", features = ["fancy"] }
serde = { version = "1", features = ["derive"] }
quote = "1"
syn = { version = "2", features = ["full", "extra-traits", "parsing", "visit-mut"] }
prettyplease = "0.2"
Original file line number Diff line number Diff line change
@@ -10,31 +10,44 @@ use crate::Result;
use heck::AsPascalCase;
use prettyplease::unparse;
use quote::format_ident;
use std::collections::HashMap;
use std::fs::{read_dir, read_to_string, write};
use std::path::Path;
use syn::visit_mut::VisitMut;
use syn::{parse_file, parse_quote, Ident, ItemStruct};
use syn::{parse_file, parse_quote, Ident, ItemStruct, Type};
use tokio::process::Command;

/// OpenAPI definition generator implementing `VisitMut`.
#[derive(Debug)]
pub struct GenerateOpenAPI<'a> {
model_ident: &'a Ident,
override_types: &'a HashMap<Type, Type>,
name: &'a str,
}

impl<'a> VisitMut for GenerateOpenAPI<'a> {
impl VisitMut for GenerateOpenAPI<'_> {
fn visit_item_struct_mut(&mut self, i: &mut ItemStruct) {
if &i.ident == self.model_ident {
let path_ident: Ident = format_ident!("{}", self.name);
i.attrs.push(parse_quote! { #[schema(as = #path_ident)] });
}

i.fields.iter_mut().for_each(|field| {
if self.override_types.contains_key(&field.ty) {
field.ty = self.override_types[&field.ty].clone();
}
})
}
}

/// Generate OpenAPI utoipa definitions on top of the sea-orm entities.
pub async fn generate_openapi(out_dir: &Path) -> Result<()> {
let model_ident: Ident = parse_quote! { Model };
let override_types: HashMap<Type, Type> = HashMap::from_iter(vec![(
parse_quote! { Option<DateTimeWithTimeZone> },
parse_quote! { Option<chrono::DateTime<chrono::FixedOffset>> },
)]);

for path in read_dir(out_dir)? {
let path = path?.path();

@@ -54,6 +67,7 @@ pub async fn generate_openapi(out_dir: &Path) -> Result<()> {

GenerateOpenAPI {
model_ident: &model_ident,
override_types: &override_types,
name,
}
.visit_file_mut(&mut tokens);
Original file line number Diff line number Diff line change
@@ -9,7 +9,6 @@ rust-version.workspace = true

[dependencies]
tokio = { version = "1", features = ["macros"] }
tracing = { version = "0.1" }

aws_lambda_events = "0.15"
lambda_runtime = "0.13"
Original file line number Diff line number Diff line change
@@ -9,7 +9,6 @@ rust-version.workspace = true
[dependencies]
serde = { version = "1", features = ["derive"] }
tokio = { version = "1", features = ["macros"] }
tracing = { version = "0.1" }

lambda_runtime = "0.13"

Original file line number Diff line number Diff line change
@@ -7,9 +7,6 @@ authors.workspace = true
rust-version.workspace = true

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"

tokio = { version = "1", features = ["macros"] }
tracing = { version = "0.1" }

@@ -18,7 +15,3 @@ aws-sdk-cloudformation = "1"
lambda_runtime = "0.13"

filemanager = { path = "../filemanager", features = ["migrate"] }

[dev-dependencies]

serde_json = "1.0"
28 changes: 12 additions & 16 deletions lib/workload/stateless/stacks/filemanager/filemanager/Cargo.toml
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ tracing-subscriber = { version = "0.3", default-features = false, features = ["f

# Database
sqlx = { version = "0.8", default-features = false, features = ["postgres", "runtime-tokio", "tls-rustls", "chrono", "uuid", "macros"] }
sea-orm = { version = "1.1.0-rc.1", default-features = false, features = [
sea-orm = { version = "1.1", default-features = false, features = [
"sqlx-postgres",
"runtime-tokio-rustls",
"macros",
@@ -39,36 +39,36 @@ strum = { version = "0.26", features = ["derive"] }
# Query server
axum = "0.7"
axum-extra = "0.9"
utoipa = { version = "4", features = ["axum_extras", "chrono", "uuid", "url"] }
utoipa-swagger-ui = { version = "7", features = ["axum", "debug-embed", "url"] }
tower = "0.4"
tower-http = { version = "0.5", features = ["trace", "cors"] }
utoipa = { version = "5", features = ["axum_extras", "chrono", "uuid", "url"] }
utoipa-swagger-ui = { version = "8", features = ["axum", "debug-embed", "url"] }
tower = { version = "0.5", features = ["util"] }
tower-http = { version = "0.6", features = ["trace", "cors"] }
serde_qs = { version = "0.13", features = ["axum"] }
json-patch = "2"
json-patch = "3"

# General
chrono = { version = "0.4", features = ["serde"] }
thiserror = "1"
thiserror = "2"
uuid = { version = "1", features = ["v7"] }
mockall = "0.13"
mockall_double = "0.3"
itertools = "0.13"
url = { version = "2", features = ["serde"] }
bytes = "1.6"
envy = "0.4"
rand = "0.8"
parse-size = "1"
humantime = "2"
percent-encoding = "2"

# Inventory
csv = "1"
flate2 = "1"
md5 = "0.7"
hex = "0.4"
parquet = { version = "52", features = ["async"] }
arrow = { version = "52", features = ["chrono-tz"] }
arrow-json = "52"
orc-rust = "0.3"
parquet = { version = "53", features = ["async"] }
arrow = { version = "53", features = ["chrono-tz"] }
arrow-json = "53"
orc-rust = "0.5"

# AWS
aws-sdk-sqs = "1"
@@ -81,7 +81,6 @@ aws_lambda_events = "0.15"

[dev-dependencies]
lazy_static = "1"
percent-encoding = "2"

aws-smithy-runtime-api = "1"
aws-smithy-mocks-experimental = "0.2"
@@ -91,7 +90,4 @@ aws-sdk-s3 = { version = "1", features = ["test-util"] }
filemanager = { path = ".", features = ["migrate"] }

[build-dependencies]
filemanager-build = { path = "../filemanager-build" }
miette = { version = "7", features = ["fancy"] }
tokio = { version = "1", features = ["macros"] }
dotenvy = "0.15"
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
//! `SeaORM` Entity, @generated by sea-orm-codegen 1.1.0-rc.1
//! `SeaORM` Entity, @generated by sea-orm-codegen 1.1.2
pub mod prelude;
pub mod s3_object;
pub mod sea_orm_active_enums;
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
//! `SeaORM` Entity, @generated by sea-orm-codegen 1.1.0-rc.1
//! `SeaORM` Entity, @generated by sea-orm-codegen 1.1.2
pub use super::s3_object::Entity as S3Object;
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
//! `SeaORM` Entity, @generated by sea-orm-codegen 1.1.0-rc.1
//! `SeaORM` Entity, @generated by sea-orm-codegen 1.1.2
use super::sea_orm_active_enums::EventType;
use super::sea_orm_active_enums::StorageClass;
use sea_orm::entity::prelude::*;
@@ -19,11 +19,11 @@ pub struct Model {
pub key: String,
#[sea_orm(column_type = "Text")]
pub version_id: String,
pub event_time: Option<DateTimeWithTimeZone>,
pub event_time: Option<chrono::DateTime<chrono::FixedOffset>>,
pub size: Option<i64>,
#[sea_orm(column_type = "Text", nullable)]
pub sha256: Option<String>,
pub last_modified_date: Option<DateTimeWithTimeZone>,
pub last_modified_date: Option<chrono::DateTime<chrono::FixedOffset>>,
#[sea_orm(column_type = "Text", nullable)]
pub e_tag: Option<String>,
pub storage_class: Option<StorageClass>,
@@ -33,7 +33,7 @@ pub struct Model {
pub number_duplicate_events: i64,
#[sea_orm(column_type = "JsonBinary", nullable)]
pub attributes: Option<Json>,
pub deleted_date: Option<DateTimeWithTimeZone>,
pub deleted_date: Option<chrono::DateTime<chrono::FixedOffset>>,
#[sea_orm(column_type = "Text", nullable)]
pub deleted_sequencer: Option<String>,
pub number_reordered: i64,
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
//! `SeaORM` Entity, @generated by sea-orm-codegen 1.1.0-rc.1
//! `SeaORM` Entity, @generated by sea-orm-codegen 1.1.2
use sea_orm::entity::prelude::*;
use serde::{Deserialize, Serialize};
#[derive(
Original file line number Diff line number Diff line change
@@ -303,7 +303,7 @@ impl<'a> Collecter<'a> {

// Get the attributes from the old record to update the new record with.
let filter = S3ObjectsFilter {
ingest_id: vec![ingest_id],
ingest_id: vec![ingest_id].into(),
..Default::default()
};
let moved_object = ListQueryBuilder::new(database_client.connection_ref())
@@ -362,7 +362,7 @@ impl From<BuildError> for Error {
}

#[async_trait]
impl<'a> Collect for Collecter<'a> {
impl Collect for Collecter<'_> {
async fn collect(mut self) -> Result<EventSource> {
let (client, database_client, events, config) = self.into_inner();
