Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let the insdc_ingest_user submit data with missing required fields. #2281

Merged
merged 1 commit into from
Jul 10, 2024

Conversation

anna-parker
Copy link
Contributor

@anna-parker anna-parker commented Jul 10, 2024

resolves #2249

preview URL: https://insdc-ingest-user-allow-e.loculus.org/

Summary

Now that #2268 is merged we can add a check if the submitter is 'insdc_ingest_user' and not throw a required metadata field missing error if data from INSDC is missing required fields.

Screenshot and Tests

  1. First test - check if the CCHF reference sequence is now in the database: NC_005301.3
    image

  2. Check that the number of sequences seen is comparable to the state of loculus prior to enforcement of field requirement, can be seen in the screenshot I took for this PR: fix(config): Add back required field to prepro #2243. We have 2 more sequences on CCHF and indeed there were 2 new releases on 2024-07-06. Also 18 new Mpox seqences from the 2024-07-09 and 1 new on the 2024-07-07. So the numbers match.

@anna-parker anna-parker added the preview Triggers a deployment to argocd label Jul 10, 2024
@anna-parker anna-parker marked this pull request as ready for review July 10, 2024 12:06
Copy link
Contributor

@corneliusroemer corneliusroemer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@anna-parker anna-parker merged commit a02a8fc into main Jul 10, 2024
7 checks passed
@anna-parker anna-parker deleted the insdc_ingest_user_allow_empty branch July 10, 2024 13:50
@@ -513,7 +515,11 @@ def process_single(
)
output_metadata[output_field] = processing_result.datum
# TODO(#2249): Do not throw an error if the submitter is insdc_ingest_user.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could have also deleted the TODO. Seems to be done now ;)

if (
null_per_backend(processing_result.datum)
and spec.required
and unprocessed.inputMetadata["submitter"] != "insdc_ingest_user"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering whether it's actually a good idea to hardcode a username here? What if someone else installs an own Loculus instance with its own Keycloak, where that ingest user has a different name?

Wouldn't it be better to configure something like a "user that may submit broken data" list per config/Helm chart to the pipeline?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Triggers a deployment to argocd
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Let insdc_ingest_user submit sequences with missing required fields
3 participants