Readme changes
williamputraintan committed Sep 13, 2024
1 parent 57265ff commit 3488466
Showing 2 changed files with 11 additions and 10 deletions.
12 changes: 7 additions & 5 deletions lib/workload/stateless/stacks/metadata-manager/README.md
@@ -43,8 +43,10 @@ This is the current (WIP) schema that reflects the current implementation.

To modify the diagram, open the `docs/schema.drawio.svg` with [diagrams.net](https://app.diagrams.net/?src=about).

`orcabus_id` is the unique identifier for each record in the database. It is generated by the application where the first 3 characters are the model prefix followed by [ULID](https://pypi.org/project/ulid-py/) separated by a dot (.).
`orcabus_id` is the unique identifier for each record in the database. It is generated by the application where the
first 3 characters are the model prefix followed by [ULID](https://pypi.org/project/ulid-py/) separated by a dot (.).
The prefix is as follows:

- Library model are `lib`
- Specimen model are `spc`
- Subject model are `sbj`
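As a rough illustration of the `orcabus_id` format described above (a 3-character model prefix, a dot, then a ULID), here is a minimal sketch. It is not the application's code: the ULID is built with the standard library as a stand-in for the linked ULID package, and the names `new_ulid`, `new_orcabus_id`, and `MODEL_PREFIXES` are hypothetical.

```python
import secrets
import time

# Crockford base32 alphabet used by the ULID text encoding.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Prefixes as listed in the README hunk above (hypothetical mapping name).
MODEL_PREFIXES = {"library": "lib", "specimen": "spc", "subject": "sbj"}


def new_ulid() -> str:
    """Build a 26-character ULID: 48-bit millisecond timestamp + 80 random bits."""
    timestamp_ms = int(time.time() * 1000) & ((1 << 48) - 1)
    value = (timestamp_ms << 80) | secrets.randbits(80)
    chars = []
    for _ in range(26):  # 26 characters of 5 bits each
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def new_orcabus_id(model: str) -> str:
    """Join the model prefix and a fresh ULID with a dot."""
    return f"{MODEL_PREFIXES[model]}.{new_ulid()}"


if __name__ == "__main__":
    print(new_orcabus_id("library"))
```

Because the timestamp occupies the high bits, IDs generated later sort after earlier ones within the same prefix, which is the usual reason for choosing ULIDs over random UUIDs.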
@@ -72,13 +74,13 @@ from the Google tracking sheet and mapping it to its respective model as follows
| ProjectOwner | `Library` | project_owner |
| ProjectName | `Library` | project_name |


Some important notes of the sync:

1. The sync will only run from the current year.
2. The tracking sheet is the single source of truth, any deletion/update on any record (including the record that has
been
loaded) will also apply to the existing data.
2. The tracking sheet is the single source of truth for the current year. Any deletion or update to existing records
will be applied based on their internal IDs (`library_id`, `specimen_id`, and `subject_id`). For the library
model, the deletion will only occur based on the current year's prefix. For example, syncing the 2024 tracking
sheet will only query libraries with `library_id` starting with `L24` to determine whether to delete it.
3. `LibraryId` is treated as a unique value in the tracking sheet, so for any duplicated value (including from other
tabs) it will only recognize the last appearance.
4. In cases where multiple records share the same unique identifier (such as SampleId), only the data from the most
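The year-scoped deletion rule for the library model in note 2 can be sketched as follows. This only illustrates the rule as worded in the README, not the sync implementation; the function names and the sample `library_id` values are made up for the example.

```python
from typing import Iterable, List


def library_year_prefix(year: int) -> str:
    """Return the library_id prefix for a tracking-sheet year, e.g. 2024 -> 'L24'."""
    return f"L{year % 100:02d}"


def deletion_candidates(
    existing_ids: Iterable[str], sheet_ids: Iterable[str], year: int
) -> List[str]:
    """List existing library IDs for the given year that no longer appear in the sheet.

    IDs from other years are never considered, even if they are missing
    from the sheet being synced.
    """
    prefix = library_year_prefix(year)
    in_sheet = set(sheet_ids)
    return [
        lib_id
        for lib_id in existing_ids
        if lib_id.startswith(prefix) and lib_id not in in_sheet
    ]


if __name__ == "__main__":
    existing = ["L2400001", "L2400002", "L2300001"]  # hypothetical IDs
    sheet_2024 = ["L2400001"]
    print(deletion_candidates(existing, sheet_2024, 2024))
```

In this sketch `L2300001` is left alone when syncing the 2024 sheet because it falls outside the `L24` prefix.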
@@ -40,22 +40,21 @@ To query in a local terminal
gsheet_sync_lambda_arn=$(aws ssm get-parameter --name '/orcabus/metadata-manager/sync-gsheet-lambda-arn' --with-decryption | jq -r .Parameter.Value)
```

The lambda handler will accept an array of years from which sheet to run from the GSheet workbook. If no year is specified, it will run the current year.
The lambda handler will accept a single year from which sheet to run from the GSheet workbook. If no year is specified, it will run the current year.

```json
{
"year":["2024"]
"year": "2024"
}
```

Note that if you specify more than one year at a single invoke (e.g. `["2020", "2021"]`), there are high chances that lambda
would timeout and the sync is not completed properly.
Invoking lambda cmd:

```sh
aws lambda invoke \
--function-name $gsheet_sync_lambda_arn \
--invocation-type Event \
--payload '{ "year": ["2024"] }' \
--payload '{ "year": "2024" }' \
--cli-binary-format raw-in-base64-out \
res.json
```
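For callers who prefer Python over the AWS CLI, the same asynchronous invocation might look like the sketch below. It assumes `boto3` is installed and AWS credentials are configured. `build_payload` and `invoke_sync` are hypothetical helper names, and sending an empty payload when no year is given is an assumption about how "no year specified" is expressed.

```python
import json
from typing import Optional

# SSM parameter name taken from the README snippet above.
SSM_PARAM = "/orcabus/metadata-manager/sync-gsheet-lambda-arn"


def build_payload(year: Optional[str] = None) -> str:
    """Serialise the event: a single year string, or an empty object if omitted."""
    return json.dumps({"year": year} if year is not None else {})


def invoke_sync(year: Optional[str] = None) -> None:
    """Look up the lambda ARN in SSM and invoke it asynchronously."""
    import boto3  # imported lazily so build_payload works without boto3

    ssm = boto3.client("ssm")
    arn = ssm.get_parameter(Name=SSM_PARAM, WithDecryption=True)["Parameter"]["Value"]
    boto3.client("lambda").invoke(
        FunctionName=arn,
        InvocationType="Event",  # fire-and-forget, as in the CLI example
        Payload=build_payload(year).encode("utf-8"),
    )


if __name__ == "__main__":
    print(build_payload("2024"))
```

Calling `invoke_sync("2024")` mirrors the CLI command above. With `InvocationType="Event"` the call returns as soon as the event is queued, so the sync result has to be checked in the lambda's logs.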
