-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let the insdc_ingest_user submit data with missing required fields. #2281
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
@@ -513,7 +515,11 @@ def process_single( | |||
) | |||
output_metadata[output_field] = processing_result.datum | |||
# TODO(#2249): Do not throw an error if the submitter is insdc_ingest_user. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could have also deleted the TODO. Seems to be done now ;)
if ( | ||
null_per_backend(processing_result.datum) | ||
and spec.required | ||
and unprocessed.inputMetadata["submitter"] != "insdc_ingest_user" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering whether it's actually a good idea to hardcode a username here? What if someone else installs an own Loculus instance with its own Keycloak, where that ingest user has a different name?
Wouldn't it be better to configure something like a "user that may submit broken data" list per config/Helm chart to the pipeline?
resolves #2249
preview URL: https://insdc-ingest-user-allow-e.loculus.org/
Summary
Now that #2268 is merged we can add a check if the submitter is 'insdc_ingest_user' and not throw a required metadata field missing error if data from INSDC is missing required fields.
Screenshot and Tests
First test - check if the CCHF reference sequence is now in the database: NC_005301.3
Check that the number of sequences seen is comparable to the state of loculus prior to enforcement of field requirement, can be seen in the screenshot I took for this PR: fix(config): Add back required field to prepro #2243. We have 2 more sequences on CCHF and indeed there were 2 new releases on 2024-07-06. Also 18 new Mpox seqences from the 2024-07-09 and 1 new on the 2024-07-07. So the numbers match.