-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Develop #90
Merged
Merged
Develop #90
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
In addition, scrub hardcoded credentials. Not importing everything from the EC2 instance, since there are files with PII and other helper scripts. We can bring those back in one at a time. Co-authored-by: Ariana <[email protected]>
Create a reproducible and portable development environment
Add pull request template and contributing guidelines
Store processed UUIDs with SQLAlchemy
Also, log more when downloading Orthanc studies and reuse already-downloaded ZIP files.
Sources cbtn-all CSV from s3, changed inputs to affected functions to the pandas df instead of the local file path. Update to use boto3 with default profile
Update the source of cbtn-all CSV to be from s3
* Changed deprecated "fw import" command to "fw ingest" * Removed all references to JPGs (no longer needed) * Added copy of NIfTIs-to-check to S3
Update study description modification in structure_nifti_files in order to handle edge case.
The Flywheel CLI is now available in Nixpkgs, so we don't need to grab it via a special input.
Update to use fw ingest command for Flywheel upload
I made the following changes to support the eventual deployment of the ETL on AWS: - Update `download_unpack_copy` to expect a single UUID as input rather than a list - Add `--raw` option to the `check` command that prints a newline- delimited list of UUIDs to process - Separate the validate, upload-to-Flywheel, and backup NIfTI steps into subcommands - Rename `run_pipeline` command → `run` - Update `run` to expect a list of UUIDs as input rather than generating the list on its own Also, I removed the `delete_local` command because the working directories on EC2 will be ephemeral, and it seems simple enough to run `rm -r $your_program_directory` on your local machine.
…tionality Split up the functionality of run_pipeline
To produce a container image using Nix, I had to re-package the Python project in a way that satisfied Nix. I found myself falling down a rabbit hole, re-packaging the Python project using Poetry and trying to use poetry2nix to turn the project into a Nix derivation. We use a lot of Python dependencies that depend on C libraries (like `pyparsing`, `python-magic`, `pillow`, etc.), and I had to patch each one to fit the Nix way of making shared libraries available to packages. This research was OK because we intentionally baked in some time for exploring Nix, but I began to feel like my wheels were spinning, and I wouldn’t arrive at a workable solution in time for our early May deadline. I decided to pivot to a more traditional approach using Docker and Docker Compose to manage the development environment and produce the container image. This approach feels a lot more straightforward. As a result, I can deliver faster, and we better minimize the divergence between development and production. However, we lose the reproducible builds and native performance of Nix. Also, we now have to deal with some remote debugger trickery with PyCharm. I made the following changes to support this switch: - Replace `flake.nix` with a `Dockerfile` for building a container image containing all of the CLI’s dependencies (e.g., the AWS CLI, dcm2niix, and the Flywheel CLI). - Package the Python project using `setuptools` to wrap the application in an `image-deid-etl` executable. - Add service definitions for `docker-compose.yml` that mimic how we’d run the container on AWS. - Wrap the `upload2fw` step in a fake user home directory so that the Flywheel CLI can consume credentials from the environment. - Install the `psycopg2` and PostgreSQL client libraries and update the database schema to support PostgreSQL.
Switch to a container-based development environment
…flywheel metadata after upload
- Bootstrap a Terraform project that consumes the VPC created by CHOP IS and creates an RDS instance and Bastion host. - Utilize our Terraform container image and the `infra` wrapper script to enable consistent and portable Terraform execution.
Handle broken DICOMS due to JPEG2000 compression
…sion-label Keep sessions acquired on same day separate by appending study-time to Flywheel session label
Catch sessions with no valid NIfTIs (even if has valid DICOMs)
add functionality for dataset versioning
…andling Add catch for invalid input UUIDs
Create continuous_integration_manual.yml
Update custom_flywheel.py
Add Step Functions workflow that wraps the ETL
in generating session label for a study & moving nifti files to output folder
remove dependency on 'accession number' field
📌 Update postgresql from 14.2 -> 14.11
📌 Update rollbar/github-deploy-action 2.1.1 -> 2.1.2
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.