Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error reporting #20

Merged
merged 17 commits into from
Sep 13, 2024
Merged

Error reporting #20

merged 17 commits into from
Sep 13, 2024

Conversation

sallybg
Copy link
Collaborator

@sallybg sallybg commented Aug 22, 2024

Make pre-map and post-map objects optional in final JSON output. Report per-variant and per-score-set errors. Add versioning.

sallybg and others added 13 commits July 5, 2024 11:30
…variant

Currently, the only case where this will occur is when the variant falls
outside of the region of the target sequence that aligns to the genomic
reference sequence. Post-map protein variants are still required to
output the corresponding pre-map protein variant, because we don't
expect a protein variant to fall outside the region of the target
sequence that aligns to the reference transcript, although
this assumption should be evaluated as a separate issue.
@sallybg sallybg marked this pull request as ready for review August 26, 2024 21:44
@sallybg sallybg requested review from bencap and jstone-dev August 27, 2024 17:22
Copy link
Collaborator

@bencap bencap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work on this Sally, going to be great to get this up soon. Most of these are pretty minor and focused around the added TODOs or surfaced errors. Flow/logic/design all seemed good to me.

src/dcd_mapping/__init__.py Show resolved Hide resolved
@@ -205,6 +205,7 @@ def _get_best_hit(output: QueryResult, urn: str, chromosome: str | None) -> Hit:
else:
if list(output):
hit_chrs = [h.id for h in output]
# TODO should this be an error rather than a warning? it seems like a problem if we can't find a hit on the expected chromosome
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My sense is that this is fine as a warning since the chromosome isn't explicitly provided by the user, and is instead inferred via score set metadata.

sequence_id = f"ga4gh:{mapped_score.post_mapped.location.sequenceReference.refgetAccession}"
accession = get_chromosome_identifier_from_vrs_id(sequence_id)
if accession is None:
raise ValueError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a message to these raised ValueErrors? Even if we don't expect them to occur it's easier to read what went wrong while looking at the flow.

accession = accession[7:]
else:
if tx_results is None:
raise ValueError # impossible by definition
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a message to these raised ValueErrors? Even if we don't expect them to occur it's easier to read what went wrong while looking at the flow.

Same as above

Comment on lines +348 to +350
error_message=mapped_score.error_message
if mapped_score.error_message
else None, # TODO might not need if statement here
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the way the model is defined, I agree you shouldn't need the if statement.

Same as above.

""" # noqa: S608
result = await uta.execute_query(query)
except Exception as e:
raise DataLookupError from e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we surface a message here?

""" # noqa: S608
result = await uta.execute_query(query)
except Exception as e:
raise DataLookupError from e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we surface a message here?

Same as above

click.echo(msg)
_logger.warning(msg)
return None
# TODO address this hardcoding, and if we keep it, this should be a score set mapping error message
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this appropriately handled now? If so, we can remove the commented block.

Comment on lines +171 to +173
relation: Literal[
"SO:is_homologous_to"
] = "SO:is_homologous_to" # TODO this should probably be None if pre_mapped is false?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me. Does pre-mapped being false imply that people should just ignore this? Is it a lot of effort to change?

if (
row.hgvs_nt in {"_wt", "_sy", "="}
or "fs"
in row.hgvs_nt # TODO I think this line can be taken out, I don't think fs nomenclature can be used for nt anyway
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were there any score sets you encountered where this was a problem? I am not well positioned to answer, but figured I'd comment in case we should discuss with someone.

sallybg and others added 2 commits September 4, 2024 15:41
Since ScoresetMapping is returned as an API response, the typical
way of setting default values does not work.
@bencap bencap added the type: feature New feature label Sep 13, 2024
FastAPI uses json serialization by default, which flattens the Allele
sequence location field. To avoid this, use pydantic's serializer with python mode
instead, and return the response directly as a JSON response.
@bencap bencap mentioned this pull request Sep 13, 2024
@bencap bencap merged commit 511dfa7 into dev Sep 13, 2024
6 checks passed
@bencap bencap deleted the error-reporting branch September 13, 2024 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: feature New feature
Projects
None yet
2 participants