How to clean up and reload test data from a CCW pipeline load
It may be useful to test pipeline updates in an ephemeral environment to ensure everything is functioning properly.
To test a full pipeline load, you'll need to find a data set that includes beneficiary updates. Most updates in the test environment don't include them.
- Check the database to see when a beneficiary load was last run:
select max(last_updated) from ccw.beneficiaries
- Check the `bfd-test-etl-577373831711` bucket for a data set that matches the date (the date on the object prefix may be slightly different, but it should be close). Note that the data must not be more than 60 days old. If it is, you'll need to change the object prefix to use a more recent date and update the `timestamp` field on the manifest to match.
- Remove the data in the database that was created from the load. This can be done by running this SQL script, which removes any beneficiaries and associated claims that were added or updated during the load. You will need to supply the S3 bucket prefix that you located earlier.
- Copy the files from their current location into the `Incoming` folder that's associated with your ephemeral environment. For example, if your environment is `test-1000`, and the data load you want to use has the prefix `2024-01-01T00:00:00Z`, then you would copy the files from `bfd-test-etl-577373831711/Synthetic/Done/2024-01-01T00:00:00Z` to `bfd-1000-test-etl{timestamp}/Synthetic/Incoming/2024-01-01T00:00:00Z`.
- The pipeline should pick up and ingest the files. You can repeat this as many times as you'd like by re-running the SQL script to clear out the data and then either re-copying the files or restarting the pipeline.
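
The age check and the copy step above can be sketched in Python. This is a minimal illustration, not the project's own tooling: the helper names (`is_too_old`, `incoming_key`, `copy_data_set`) are hypothetical, the prefix is assumed to always have the form `2024-01-01T00:00:00Z`, and the copy function assumes `boto3` is installed and AWS credentials are already configured. The destination bucket name is passed in as a parameter because it contains an environment-specific suffix.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=60)  # limit stated in the steps above


def parse_prefix(prefix: str) -> datetime:
    """Parse a data set prefix such as 2024-01-01T00:00:00Z as a UTC time."""
    parsed = datetime.strptime(prefix, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_too_old(prefix: str, now: datetime) -> bool:
    """Return True when the data set is more than 60 days older than `now`."""
    return now - parse_prefix(prefix) > MAX_AGE


def incoming_key(done_key: str) -> str:
    """Map Synthetic/Done/<prefix>/<file> to Synthetic/Incoming/<prefix>/<file>."""
    head = "Synthetic/Done/"
    if not done_key.startswith(head):
        raise ValueError(f"not under {head}: {done_key}")
    return "Synthetic/Incoming/" + done_key[len(head):]


def copy_data_set(src_bucket: str, dst_bucket: str, prefix: str) -> list[str]:
    """Copy every object of one data set into the Incoming folder of dst_bucket."""
    import boto3  # imported lazily so the helpers above work without boto3

    s3 = boto3.client("s3")
    copied = []
    paginator = s3.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=src_bucket, Prefix=f"Synthetic/Done/{prefix}/")
    for page in pages:
        for obj in page.get("Contents", []):
            dst_key = incoming_key(obj["Key"])
            s3.copy_object(
                Bucket=dst_bucket,
                Key=dst_key,
                CopySource={"Bucket": src_bucket, "Key": obj["Key"]},
            )
            copied.append(dst_key)
    return copied
```

A typical use would be to call `is_too_old("2024-01-01T00:00:00Z", datetime.now(timezone.utc))` first, and only call `copy_data_set` with your environment's bucket name once the prefix (and the manifest `timestamp`, if you had to change it) is recent enough.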