feature: minimal full translation run #101
base: main
Submission and processing are running fine. Some adjustments were needed in the s3/silo modules to run on bare metal, i.e. without Docker. Next, the read objects will get the metadata they deserve.
Modify the README install section and the WIP run instructions in the README for running without Docker.
Just ran SILO. Note that the schema is wrong: it has to match the database config. Add a check for that.
Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (2)
tests/test_database_config_validation.py:17
- Typo: 'Validateds' should be 'Validates'.
"""Validateds that the schema of the database config file matches the ReadMetadata model
tests/test_database_config_validation.py:19
- Typo: 'nameing' should be 'naming'.
config file matches the ReadMetadata model at least in the nameing of the fields
@Taepper requested review just FYI. This PR generates a full NDJSON with nucleotides and amino acids with accurate indel handling. I am implementing a bit of validation for the format of the database schema here. The core workflow is still in a script, but will migrate into the package in the next PRs.
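For reference, a minimal sketch of what such a field-name check could look like. This is not the code in this PR: the `ReadMetadata` dataclass below is a hypothetical stand-in for the real model, the field names are invented, and the `schema.metadata` layout of the database config is an assumption. The config is passed in as an already-parsed dict so the sketch has no YAML dependency.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ReadMetadata:
    """Hypothetical stand-in for the project's ReadMetadata model."""

    read_id: str
    sample_id: str
    sampling_date: Optional[str] = None


def schema_field_mismatch(config: dict, model: type) -> tuple[set[str], set[str]]:
    """Compare field names only (not types) between config and model.

    Assumes the parsed database config lists its fields under
    config["schema"]["metadata"] as dicts with a "name" key.
    Returns (missing_in_config, unknown_in_config).
    """
    config_names = {entry["name"] for entry in config["schema"]["metadata"]}
    model_names = {f.name for f in fields(model)}
    return model_names - config_names, config_names - model_names


# Example config, already parsed into a dict (contents are made up).
example_config = {
    "schema": {
        "metadata": [
            {"name": "read_id", "type": "string"},
            {"name": "sample_id", "type": "string"},
            {"name": "location", "type": "string"},
        ]
    }
}

missing, unknown = schema_field_mismatch(example_config, ReadMetadata)
if missing or unknown:
    print(f"Schema mismatch: missing={sorted(missing)}, unknown={sorted(unknown)}")
```

Returning both differences instead of a plain boolean makes the failure message in a test point directly at the offending field names.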
Co-authored-by: Copilot <[email protected]>
Integrates the complete translation and insertion handling into the workflow, based on DIAMOND.
This PR is still for small data, i.e. thousands of reads. Beyond that, the current architecture will fail because it runs out of memory.
For the first time, the current state generates nucleotides and amino acids with proper insertion handling for both.