diff --git a/lib/workload/stateless/stacks/metadata-manager/README.md b/lib/workload/stateless/stacks/metadata-manager/README.md
index 2143035ea..68f56b99b 100644
--- a/lib/workload/stateless/stacks/metadata-manager/README.md
+++ b/lib/workload/stateless/stacks/metadata-manager/README.md
@@ -37,19 +37,27 @@ to the URL: `.../library?libraryId=LIB001`
## Schema
-This is the current (WIP) schema that reflects the current implementation.
+This is the work-in-progress (WIP) schema, which reflects the current implementation.
+The schema is based on the draft
+[draw.io diagram in Google Drive](https://app.diagrams.net/#G10ryWSXORMo7Qj7ghvj37LHYqmMm4hXW-#%7B%22pageId%22%3A%22vfe626awnvWGlhOGvxTV%22%7D).
![schema](docs/schema.drawio.svg)
To modify the diagram, open the `docs/schema.drawio.svg` with [diagrams.net](https://app.diagrams.net/?src=about).
-`orcabus_id` is the unique identifier for each record in the database. It is generated by the application where the
-first 3 characters are the model prefix followed by [ULID](https://pypi.org/project/ulid-py/) separated by a dot (.).
-The prefix is as follows:
+The `orcabus_id` serves as the unique identifier for each record in the database. It is generated by the application
+using the [ULID](https://pypi.org/project/ulid-py/) library. When a record is accessed via the API, the `orcabus_id`
+is presented with a prefix consisting of three characters followed by a dot (.). The specific prefix varies depending
+on the model of the record.
-- Library model are `lib`
-- Specimen model are `spc`
-- Subject model are `sbj`
+| Model | Prefix |
+|------------|--------|
+| Subject | `sbj.` |
+| Sample | `smp.` |
+| Library | `lib.` |
+| Individual | `idv.` |
+| Contact | `ctc.` |
+| Project | `prj.` |
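
The prefix rule above can be sketched in a few lines of Python. This is a minimal illustration and not the application's actual code: the helper names `add_prefix` and `strip_prefix` and the sample ULID string are hypothetical, and only the prefix values come from the table.

```python
# Prefix values copied from the table above; everything else is illustrative.
ORCABUS_PREFIXES = {
    "Subject": "sbj.",
    "Sample": "smp.",
    "Library": "lib.",
    "Individual": "idv.",
    "Contact": "ctc.",
    "Project": "prj.",
}


def add_prefix(model: str, ulid: str) -> str:
    """Return the ID in its prefixed, API-facing form for the given model."""
    return f"{ORCABUS_PREFIXES[model]}{ulid}"


def strip_prefix(orcabus_id: str) -> str:
    """Drop a known model prefix if present; otherwise return the ID unchanged."""
    for prefix in ORCABUS_PREFIXES.values():
        if orcabus_id.startswith(prefix):
            return orcabus_id[len(prefix):]
    return orcabus_id


# Hypothetical ULID, used only for demonstration.
example_ulid = "01J5M2JFE1JPYV62RYQEG99CP5"
prefixed = add_prefix("Library", example_ulid)
```

Because a ULID never contains a dot, stripping a prefix this way cannot accidentally truncate an unprefixed ID.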
## How things work
@@ -59,22 +67,22 @@ In the near future, we might introduce different ways to load data into the appl
loading data
from the Google tracking sheet and mapping it to its respective model as follows.
-| Sheet Header | Table | Field Name |
-|-------------------|------------|----------------------|
-| SubjectID | `Subject` | lab_subject_id |
-| ExternalSubjectID | `Subject` | subject_id |
-| SampleID | `Specimen` | sample_id |
-| ExternalSampleID | `Specimen` | external_specimen_id |
-| Source | `Specimen` | source |
-| LibraryID | `Library` | library_id |
-| Phenotype | `Library` | phenotype |
-| Workflow | `Library` | workflow |
-| Quality | `Library` | quality |
-| Type | `Library` | type |
-| Coverage (X) | `Library` | coverage |
-| Assay | `Library` | assay |
-| ProjectOwner | `Library` | project_owner |
-| ProjectName | `Library` | project_name |
+| Sheet Header | Table | Field Name |
+|-------------------|--------------|--------------------|
+| SubjectID | `Individual` | individual_id |
+| ExternalSubjectID | `Subject` | subject_id |
+| SampleID | `Sample` | sample_id |
+| ExternalSampleID | `Sample` | external_sample_id |
+| Source | `Sample` | source |
+| LibraryID | `Library` | library_id |
+| Phenotype | `Library` | phenotype |
+| Workflow | `Library` | workflow |
+| Quality | `Library` | quality |
+| Type | `Library` | type |
+| Coverage (X) | `Library` | coverage |
+| Assay | `Library` | assay |
+| ProjectName | `Project` | project_id |
+| ProjectOwner | `Contact` | contact_id |
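
To make the mapping above concrete, here is a hedged sketch of how one sheet row could be split into per-model field dictionaries. The lookup table mirrors the mapping above, while `SHEET_TO_MODEL`, `split_row_by_model`, and the sample row values are assumptions made for illustration. The real loader may be structured differently.

```python
# (table, field) pairs taken from the mapping table above.
SHEET_TO_MODEL = {
    "SubjectID": ("Individual", "individual_id"),
    "ExternalSubjectID": ("Subject", "subject_id"),
    "SampleID": ("Sample", "sample_id"),
    "ExternalSampleID": ("Sample", "external_sample_id"),
    "Source": ("Sample", "source"),
    "LibraryID": ("Library", "library_id"),
    "Phenotype": ("Library", "phenotype"),
    "Workflow": ("Library", "workflow"),
    "Quality": ("Library", "quality"),
    "Type": ("Library", "type"),
    "Coverage (X)": ("Library", "coverage"),
    "Assay": ("Library", "assay"),
    "ProjectName": ("Project", "project_id"),
    "ProjectOwner": ("Contact", "contact_id"),
}


def split_row_by_model(row: dict) -> dict:
    """Group one sheet row's values by target table, renaming headers to field names."""
    grouped: dict = {}
    for header, value in row.items():
        if header not in SHEET_TO_MODEL:
            continue  # ignore sheet columns that have no mapping
        table, field = SHEET_TO_MODEL[header]
        grouped.setdefault(table, {})[field] = value
    return grouped


# Hypothetical row; the values are made up for demonstration.
sample_row = {
    "LibraryID": "L2400001",
    "SampleID": "PRJ240001",
    "Source": "blood",
    "ProjectName": "ExampleProject",
    "Comments": "not mapped",
}
grouped = split_row_by_model(sample_row)
```

Keeping the mapping in a single dictionary means a renamed sheet header only needs to be changed in one place.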
Some important notes of the sync:
diff --git a/lib/workload/stateless/stacks/metadata-manager/docs/schema.drawio.svg b/lib/workload/stateless/stacks/metadata-manager/docs/schema.drawio.svg
index 0984839ec..5ca6d206e 100644
--- a/lib/workload/stateless/stacks/metadata-manager/docs/schema.drawio.svg
+++ b/lib/workload/stateless/stacks/metadata-manager/docs/schema.drawio.svg
@@ -1,4 +1,4 @@
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/lib/workload/stateless/stacks/metadata-manager/proc/service/tracking_sheet_srv.py b/lib/workload/stateless/stacks/metadata-manager/proc/service/tracking_sheet_srv.py
index 8913c4e74..aad78f4f8 100644
--- a/lib/workload/stateless/stacks/metadata-manager/proc/service/tracking_sheet_srv.py
+++ b/lib/workload/stateless/stacks/metadata-manager/proc/service/tracking_sheet_srv.py
@@ -76,6 +76,7 @@ def persist_lab_metadata(df: pd.DataFrame, sheet_year: str):
# The data frame is to be the source of truth for the particular year
# So we need to remove db records which are not in the data frame
# Only doing this for library records and (dangling) sample/subject may be removed on a separate process
+ # Note: existing many-to-many relationships are not removed, even when the current df has changed
# For the library_id we need to craft the library_id prefix to match the year
# E.g. for year 2024, the library_id prefix is 'L24', following the Lab tracking sheet convention