-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error reporting #20
Error reporting #20
Conversation
Co-authored-by: Ben Capodanno <[email protected]>
Co-authored-by: Ben Capodanno <[email protected]>
…variant Currently, the only case where this will occur is when the variant falls outside of the region of the target sequence that aligns to the genomic reference sequence. Post-map protein variants are still required to output the corresponding pre-map protein variant, because we don't expect a protein variant to fall outside the region of the target sequence that aligns to the reference transcript, although this assumption should be evaluated as a separate issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the work on this Sally, going to be great to get this up soon. Most of these are pretty minor and focused around the added TODOs or surfaced errors. Flow/logic/design all seemed good to me.
@@ -205,6 +205,7 @@ def _get_best_hit(output: QueryResult, urn: str, chromosome: str | None) -> Hit: | |||
else: | |||
if list(output): | |||
hit_chrs = [h.id for h in output] | |||
# TODO should this be an error rather than a warning? it seems like a problem if we can't find a hit on the expected chromosome |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My sense is that this is fine as a warning since the chromosome isn't explicitly provided by the user, and is instead inferred via score set metadata.
sequence_id = f"ga4gh:{mapped_score.post_mapped.location.sequenceReference.refgetAccession}" | ||
accession = get_chromosome_identifier_from_vrs_id(sequence_id) | ||
if accession is None: | ||
raise ValueError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a message to these raised ValueErrors? Even if we don't expect them to occur it's easier to read what went wrong while looking at the flow.
accession = accession[7:] | ||
else: | ||
if tx_results is None: | ||
raise ValueError # impossible by definition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a message to these raised ValueErrors? Even if we don't expect them to occur it's easier to read what went wrong while looking at the flow.
Same as above
error_message=mapped_score.error_message | ||
if mapped_score.error_message | ||
else None, # TODO might not need if statement here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the way the model is defined, I agree you shouldn't need the if statement.
Same as above.
""" # noqa: S608 | ||
result = await uta.execute_query(query) | ||
except Exception as e: | ||
raise DataLookupError from e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we surface a message here?
""" # noqa: S608 | ||
result = await uta.execute_query(query) | ||
except Exception as e: | ||
raise DataLookupError from e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we surface a message here?
Same as above
click.echo(msg) | ||
_logger.warning(msg) | ||
return None | ||
# TODO address this hardcoding, and if we keep it, this should be a score set mapping error message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this appropriately handled now? If so, we can remove the commented block.
relation: Literal[ | ||
"SO:is_homologous_to" | ||
] = "SO:is_homologous_to" # TODO this should probably be None if pre_mapped is false? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me. Does pre-mapped being false imply that people should just ignore this? Is it a lot of effort to change?
if ( | ||
row.hgvs_nt in {"_wt", "_sy", "="} | ||
or "fs" | ||
in row.hgvs_nt # TODO I think this line can be taken out, I don't think fs nomenclature can be used for nt anyway |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were there any score sets you encountered where this was a problem? I am not well positioned to answer, but figured I'd comment in case we should discuss with someone.
Since ScoresetMapping is returned as an API response, the typical way of setting default values does not work.
FastAPI uses json serialization by default, which flattens the Allele sequence location field. To avoid this, use pydantic's serializer with python mode instead, and return the response directly as a JSON response.
3f8d6d1
to
bd70256
Compare
bd70256
to
cc8f3ad
Compare
Make pre-map and post-map objects optional in final JSON output. Report per-variant and per-score-set errors. Add versioning.