Update examples to VRS 2.0 #151

Draft
wants to merge 6 commits into base: main

Conversation

jarbesfeld
Contributor

No description provided.

@jarbesfeld jarbesfeld added the priority:medium Medium priority label Jul 8, 2024
@jsstevenson jsstevenson self-requested a review July 8, 2024 18:36
"type": "LocationDescriptor",
"location": {
"type": "SequenceLocation",
"sequence_id": "refseq:NP_005148.2",
Member

can we retain this identifier as a mapping, or something of that nature?

Contributor Author

I think so? I was following the convention in element_genomic_end and element_genomic_start, where I removed the identifier.

@jsstevenson
Member

Note that the gene descriptor in the ABL1 functional domain example should be updated to a GKS Gene as well.

"gene_descriptor": {
"type": "GeneDescriptor",
"gene": {
"type": "Gene",
"id": "normalize.gene:BCR",
"gene_id": "hgnc:1014",
Member

This field does not exist. I think we need to use mappings here

Member

Yes, this should be a Coding object as used in mappings.

@ahwagner ahwagner self-assigned this Jul 9, 2024
Member

@ahwagner ahwagner left a comment

I updated the BCR::ABL1 example. I recommend trying to replicate this on the TPM3::NTRK1 example to check understanding.

"id": "normalize.gene:BCR",
"gene_id": "hgnc:1014",
"label": "BCR"
"structure": {
Member

The structure field should be a VRS Adjacency (for transcript junction descriptions) or a VRS DerivativeMolecule (under development; use for full fusion transcript representations). The former is much more common, as illustrated in this BCR::ABL1 example.

"structure": {
"type": "Adjacency",
"adjoinedSequences": [{
"type": "SequenceLocation",
Member

The SequenceLocation objects in adjoinedSequences for the adjacency are based on genomic coordinates corresponding to the fusion junction; for categorical fusions and many reported assayed fusions, these are typically at (or close to) transcript boundaries as aligned to a chromosome.
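A minimal Python sketch of the shape described above, using only keys quoted in this thread (`type`, `adjoinedSequences`, `start`, `end`). The helper name `build_adjacency` is hypothetical, the `sequenceReference` for each location is deliberately left out, and placing the 3' partner on `start` is an assumption made by symmetry with the 5' partner. The value 130854065 is borrowed from another fragment in this thread purely for illustration.

```python
from typing import Any


def build_adjacency(five_prime_end: int, three_prime_start: int) -> dict[str, Any]:
    """Build an Adjacency-shaped dict for a junction on the positive strand.

    Hypothetical helper: only keys quoted in the review thread are emitted,
    and sequenceReference is omitted on purpose.
    """
    return {
        "type": "Adjacency",
        "adjoinedSequences": [
            # 5' partner: only `end` is set (sequence to the left is kept).
            {"type": "SequenceLocation", "end": five_prime_end},
            # 3' partner: only `start` is set (assumed by symmetry).
            {"type": "SequenceLocation", "start": three_prime_start},
        ],
    }


# 23290413 is the BCR coordinate quoted in the diff; 130854065 is illustrative.
adjacency = build_adjacency(five_prime_end=23290413, three_prime_start=130854065)
```

Each adjoined location carries a single boundary rather than a full interval, which is what distinguishes a junction description from a full transcript segment.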

"end": 23290413,
"extensions": [
{
"name": "NM_004327.4:e._14",
Member

This extension to the genomic coordinate shows the representation of the transcript segment boundary in terms of the VICC exon representation.

"end": {
"type": "Number",
"value": 23253981
"end": 23290413,
Member

Note that here we use end (appropriately) but the 5' partner expressed in genomic coordinates should use start if the transcribed sequence is on the negative strand. Using end represents a boundary that includes sequence to the left (here, the sequence left of the aligned genomic coordinates at position 23290413, corresponding to the BCR exon 14).

"type": "Number",
"value": 130854065
}
"end": 3234
Member

Here, we again use end, as we always expect to see in the transcript sequence representation for the 5' partner. For transcripts aligning to the negative strand of a chromosome, this would remain end, even though the chromosome representation (on line 13) would use start.
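The start/end rule from these two comments can be sketched in Python as follows. The function names are hypothetical, and the thread only states the 5' partner cases; the 3' partner branches are an assumption made by mirroring them.

```python
def genomic_boundary_field(partner: str, strand: str) -> str:
    """Pick the SequenceLocation key for a junction in genomic coordinates.

    Stated in the thread: 5' partner uses `end` on the positive strand and
    `start` on the negative strand. The 3' cases are assumed mirror images.
    """
    if partner not in ("5'", "3'"):
        raise ValueError(f"unknown partner: {partner!r}")
    if strand not in ("+", "-"):
        raise ValueError(f"unknown strand: {strand!r}")
    is_five_prime = partner == "5'"
    is_positive = strand == "+"
    return "end" if is_five_prime == is_positive else "start"


def transcript_boundary_field(partner: str) -> str:
    """Pick the key in transcript coordinates, which ignore genomic strand.

    Stated in the thread: 5' partner always uses `end`. The 3' case is assumed.
    """
    if partner not in ("5'", "3'"):
        raise ValueError(f"unknown partner: {partner!r}")
    return "end" if partner == "5'" else "start"
```

For example, `genomic_boundary_field("5'", "-")` returns `"start"`, while `transcript_boundary_field("5'")` returns `"end"` regardless of strand.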

"r_frame_preserved": true,
"critical_functional_domains": [
},
"readingFramePreserved": true,
Member

using camelCase, similar to usage in GKS specs
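A rough Python sketch of migrating example keys to camelCase. The only rename confirmed in this diff is `r_frame_preserved` to `readingFramePreserved`; every other key is converted mechanically, which is an assumption and may not match the final field names. The names `RENAMES`, `to_camel`, and `camelize` are hypothetical.

```python
from typing import Any

# Renames that are not a plain case conversion. Only this one is confirmed
# by the diff in this thread.
RENAMES = {"r_frame_preserved": "readingFramePreserved"}


def to_camel(key: str) -> str:
    """Convert one snake_case key to camelCase, honoring explicit renames."""
    if key in RENAMES:
        return RENAMES[key]
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camelize(obj: Any) -> Any:
    """Recursively convert dict keys; values other than dicts/lists are kept."""
    if isinstance(obj, dict):
        return {to_camel(k): camelize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [camelize(item) for item in obj]
    return obj


converted = camelize({"r_frame_preserved": True, "critical_functional_domains": []})
```

The explicit rename table is there because a mechanical conversion of `r_frame_preserved` would produce `rFramePreserved`, not the name used in the updated example.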

Comment on lines +47 to +49
"code": "hgnc:1014",
"system": "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/",
"label": "BCR"
Member

This is a GKS Coding object.
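As a sketch, the old `gene_id` value could be turned into the three-key Coding shape quoted above with a small Python helper. The helper name `gene_id_to_coding` and the `HGNC_SYSTEM` constant are hypothetical; the key names and the system URL are copied from the diff lines under review, and nothing here claims to cover the full Coding schema.

```python
# System URL copied verbatim from the diff lines under review.
HGNC_SYSTEM = "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/"


def gene_id_to_coding(gene_id: str, label: str) -> dict[str, str]:
    """Wrap an HGNC CURIE in a Coding-shaped dict (code, system, label)."""
    if not gene_id.startswith("hgnc:"):
        raise ValueError(f"expected an hgnc: CURIE, got {gene_id!r}")
    return {"code": gene_id, "system": HGNC_SYSTEM, "label": label}


coding = gene_id_to_coding("hgnc:1014", "BCR")
```

Where this dict is attached (for example, under a mappings entry) is left out, since the thread does not show the enclosing structure.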

This PR is stale because it has been open 3 day(s) with no activity. Please review this PR.

@github-actions github-actions bot added the stale label Jul 17, 2024
@korikuzma korikuzma removed the stale label Jul 17, 2024
This PR is stale because it has been open 3 day(s) with no activity. Please review this PR.

@github-actions github-actions bot added the stale label Jul 22, 2024
"type": "GeneDescriptor",
"label": "ABL1",
"gene_id": "hgnc:76"
"gene": {
Contributor

mostly noting this to help myself as I am working on the updates to fusor - this should be associatedGene

@jsstevenson jsstevenson removed the stale label Jul 23, 2024
This PR is stale because it has been open 3 day(s) with no activity. Please review this PR.

@github-actions github-actions bot added the stale label Jul 30, 2024
@korikuzma
Member

@jarbesfeld @katiestahl @jsstevenson I think this issue still needs to be addressed. I'm also looking at the nomenclature tests, and I think the TranscriptSegmentElement objects are incorrect: each should use only start or end, not both.
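One way to catch this in tests is a small Python check that a location-like dict sets exactly one boundary. This is a sketch under the assumption that the boundary lives under plain `start` and `end` keys; the function name `has_single_boundary` and the sample values are hypothetical.

```python
def has_single_boundary(location: dict) -> bool:
    """Return True when exactly one of `start` / `end` is set (not None)."""
    has_start = location.get("start") is not None
    has_end = location.get("end") is not None
    return has_start != has_end


# Illustrative values only; they are not taken from the real test fixtures.
only_end = {"type": "SequenceLocation", "end": 200}
both_set = {"type": "SequenceLocation", "start": 100, "end": 200}
neither = {"type": "SequenceLocation"}
```

Here `has_single_boundary(only_end)` is true, while `both_set` and `neither` both fail the check.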

@jsstevenson
Member

Yeah, there are a few things here. We also need to tackle the big adjacency change that Alex was originally proposing.

@github-actions github-actions bot removed the stale label Aug 22, 2024
@korikuzma
Member

@jsstevenson IIRC we're holding off on adjacency for a bit. I re-opened #172 and made a PR to update the examples

This PR is stale because it has been open 3 day(s) with no activity. Please review this PR.

5 participants