Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fi-2084 Load resources from JSONs instead of initdb.sql #129

Merged
merged 15 commits into from
Dec 29, 2023

Conversation

dehall
Copy link
Contributor

@dehall dehall commented Dec 20, 2023

Summary

This change addresses a major pain point in updating the data on the reference server by removing initdb.sql and loading resources from JSON files instead.
This change is based heavily on Gravity-SDOHCC/gravity-sdoh-ehr-server#7 and Steve's prior work in #93

New behavior

Ultimately from an end-user perspective this should have 0 impact. The changes should only be apparent to developers, when we want to add/change the data.
Instead of reading DB state from an sql file, this now reads and loads resources from JSON files. Files should only be loaded when the server is empty, but this can be overriden with an env var.

IMPORTANT: Changes to HAPI's ID settings mean that the DB must be cleared out before this is run. You can't use an existing DB with this code, otherwise you will see very strange behavior (ex GET Patient/85 might return an error along the lines of "Resource Patient/42 is not known") If that's a concern I can look into ways to detect when the server is in that invalid state and have it fail to start with a useful error.

Code changes

  • Resource JSONs are migrated from ./us-core-resources to ./src/main/resources which allows them to be read in the same way even if the code is run as a standalone java app or as a WAR file. The files were already transaction Bundles with PUTs for the Patient resources, so no changes were made. I added a file for the Group resource since it was missing
    • EDIT - there were a couple changes I needed to make after the rebase, so for posterity:
    • For Patient 85, I changed the references on the MedicationDispense so they pointed to resources in that Bundle (and changed the code & display so it was the same medication as the referenced MedicationRequest. ...not sure that consistency really matters but it felt strange to leave it)
    • For 355, Observation.interpretation is 0..*
  • New code in MitreJpaServer actually loads the resources at startup. Note the loadResources function takes a Path, so we could theoretically allow for configuring the location to read from, but I didn't feel like that was necessary here.
  • Because our fixed Patient ids are numeric (85 and 355), in order to PUT them into a clean DB we need some changes to HAPI configuration.
  • No longer using initdb.sql means the docker configuration can be simplified - we no longer need a custom docker image, we can just use the postgres base image that our image was based off.
  • I took this opportunity to disable running tests when building the docker image since they are very slow (especially on multi-arch builds and they fail half the time anyway)

Testing guidance

  • The only 3 resources that have fixed IDs are Patient/85, Patient/355, and Group/1a. Are there any presets anywhere that might reference other fixed IDs?
  • Are there other resources needed for the server that I'm missing?
  • Be sure to delete any existing DB docker volume before running this in docker (docker volume rm inferno-reference-server_fhir-pgdata), or delete any local DB you may have

@dehall
Copy link
Contributor Author

dehall commented Dec 20, 2023

Looks like I may need to add something that prevents it from loading the resources when running tests

@dehall dehall force-pushed the fi-2084-load-resources-from-json branch from d9d5abb to 4446e11 Compare December 22, 2023 13:49
@arscan
Copy link
Contributor

arscan commented Dec 22, 2023

Since we are cleaning things up a bit, would it make sense to rename the transaction files, which are now UUIDs, to something more descriptive like uscore_bundle_patient_85.json, just to make it a little more readable? And the group to something like uscore_group_1a.json?

Copy link
Contributor

@Jammjammjamm Jammjammjamm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the resources live in src, docker has to redo the java build whenever the data changes. It would be really great if they could get added to the image after the java build so that the docker rebuilds are fast when no actual source is changing.

Also, I now see some errors that aren't present on main when running the g10 tests with US Core 6 against this branch.

Screenshot 2023-12-22 at 10 42 47 AM

Screenshot 2023-12-22 at 10 42 30 AM

@dehall
Copy link
Contributor Author

dehall commented Dec 22, 2023

Updated a couple more fields in the resources to match the live DB versions, so those new errors should be resolved. I notice there are still some errors in the g10 tests but I get the same errors running against the prod instance so I assume they are ok? (DiagnosticReport MS references and Observation Pregnancy Intent profile validation)
image

image

Still trying to think of the best place to put the files so they don't require a full docker rebuild. Under src/main/resources is nice because it's easy to access them when running the app in different ways but maybe there are other options that maintain that.

@dehall
Copy link
Contributor Author

dehall commented Dec 22, 2023

Update: migrated the files back out of src/main/resources and back into ./resources. Running the server via mvn jetty:run or in an IDE will point to these local files, and for running in docker they will get copied to the appropriate location as the last step in the build. (I can also change this to use a volume, which I had it as originally, but this is probably easier for our deployment strategy)
I also made this ./resources path configurable via hapi.properties but that was mostly just so the tests don't load them.

@dehall dehall requested a review from Jammjammjamm December 22, 2023 21:31
Copy link
Contributor

@Jammjammjamm Jammjammjamm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something is wrong with your local g10 setup, as this and prod both pass for me.

README.md Outdated
@@ -21,7 +21,7 @@ Note that sometimes on the initial start up, the database initialization might c
You can delete the server's data by stopping the containers with `docker-compose down` and then running `docker volume rm inferno-reference-server_fhir-pgdata` to remove the existing volume. Note that the default data will be reloaded when starting the containers.


The database will be initially populated by the default initdb.sql script. To update the default initial data with the data in the current db container, run `docker-compose exec db pg_dump -U postgres postgres > initdb.sql`
The database will be initially populated with the resources in `src/main/resources/fhir_resources`. If the server contains any `Patient` resources this process will be skipped, but you can force loading the files in this folder by setting the `FORCE_LOAD_RESOURCES` environment variable to `true`. Note that if the default files are still present this will result in duplicate data being populated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below need to be updated since they're no longer in src/main/resources.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Note that if the default files are still present this will result in duplicate data being populated." What are the "default files"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"default files" meant the original 3 files in the ./resources folder, the 2 patient bundles and the Group. I wasn't sure how much anybody besides us actually uses and configures this server, so the instruction or even that whole "FORCE_LOAD_RESOURCES" feature may be unnecessary. I can think of other wording

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that means "additional" resources can be add into "src/main/resources/fhir_resources". Is that correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's a good point. I should be more clear about that too. How's this?

The database will be initially populated with the resources in ./resources/ the next time the server starts. This folder by default contains 3 files, but you can add additional files in the form of transaction Bundles or individual resources, or you can remove the original files to start with an empty server.
If the server contains any Patient resources the initial loading process will be skipped, but you can force loading the files in this folder by setting the FORCE_LOAD_RESOURCES environment variable to true. Note that if the original files are re-loaded in this way, this will result in duplicate data being populated.

@yunwwang
Copy link
Contributor

Tested with US Core 3.1.0, 4.0.0, 5.0.1, and 6.1.0. Test results are as expected. Wait for tx.fhir.org to be back online for US Core 7 testing. Will raise separate ticket(s) for US Core 7 issues. Approved

@dehall dehall merged commit 77fb41a into main Dec 29, 2023
1 check passed
@dehall dehall deleted the fi-2084-load-resources-from-json branch December 29, 2023 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants