Update examples to VRS 2.0 #151
base: main
Changes from all commits
8c115da
dbe8349
2b77021
5656a6d
3521a4d
5fb4383
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,111 +1,131 @@ | ||
{ | ||
"type": "CategoricalFusion", | ||
"structural_elements": [ | ||
{ | ||
"type": "TranscriptSegmentElement", | ||
"transcript": "refseq:NM_004327.3", | ||
"gene_descriptor": { | ||
"type": "GeneDescriptor", | ||
"id": "normalize.gene:BCR", | ||
"gene_id": "hgnc:1014", | ||
"label": "BCR" | ||
"structure": { | ||
"type": "Adjacency", | ||
"adjoinedSequences": [{ | ||
"type": "SequenceLocation", | ||
Review comment: The adjoinedSequences SequenceLocation objects for the adjacency are based on genomic coordinates corresponding to the fusion junction; for categorical fusions and many reported assayed fusions, these are typically at (or close to) transcript boundaries as aligned to a chromosome. |
||
"sequenceReference": { | ||
"id": "GRCh38:chr22", | ||
"type": "SequenceReference", | ||
"refgetAccession": "SQ.7B7SHsmchAR0dFcDCuSFjJAo7tX87krQ", | ||
"residueAlphabet": "na" | ||
}, | ||
"element_genomic_end": { | ||
"id": "fusor.location_descriptor:NC_000022.11", | ||
"type": "LocationDescriptor", | ||
"label": "NC_000022.11", | ||
"location": { | ||
"type": "SequenceLocation", | ||
"sequence_id": "refseq:NC_000022.11", | ||
"interval": { | ||
"type": "SequenceInterval", | ||
"start": { | ||
"type": "Number", | ||
"value": 23253980 | ||
}, | ||
"end": { | ||
"type": "Number", | ||
"value": 23253981 | ||
"end": 23290413, | ||
Review comment: Note that here we use |
||
"extensions": [ | ||
{ | ||
"name": "NM_004327.4:e._14", | ||
Review comment: This extension to the genomic coordinate shows the transcript segment boundary in terms of the VICC exon representation. |
||
"description": "VICC exon representation of the aligned transcript boundary.", | ||
"value": { | ||
"exon_end": 14, | ||
"exon_end_offset": 0, | ||
"sequenceReference":{ | ||
"type": "SequenceReference", | ||
"id": "NM_004327.4", | ||
"refgetAccession": "SQ.kpytJsXw3BwLC3oBSjHQS1kwxs4WO3I3", | ||
"residueAlphabet": "na" | ||
} | ||
} | ||
} | ||
}, | ||
"exon_end": 2, | ||
"exon_end_offset": 182 | ||
}, | ||
{ | ||
"type": "LinkerSequenceElement", | ||
"linker_sequence": { | ||
"id": "sequence:ACTAAAGCG", | ||
"type": "SequenceDescriptor", | ||
"sequence": "ACTAAAGCG", | ||
"residue_type": "SO:0000348" | ||
} | ||
}, | ||
{ | ||
"type": "TranscriptSegmentElement", | ||
"transcript": "refseq:NM_005157.5", | ||
"exon_start": 2, | ||
"exon_start_offset": -173, | ||
"gene_descriptor": { | ||
"id": "normalize.gene:ABL1", | ||
"type": "GeneDescriptor", | ||
"label": "ABL1", | ||
"gene_id": "hgnc:76" | ||
}, | ||
"element_genomic_start": { | ||
"id": "fusor.location_descriptor:NC_000009.12", | ||
"type": "LocationDescriptor", | ||
"label": "NC_000009.12", | ||
"location": { | ||
"type": "SequenceLocation", | ||
"sequence_id": "refseq:NC_000009.12", | ||
"interval": { | ||
"type": "SequenceInterval", | ||
"start": { | ||
"type": "Number", | ||
"value": 130854064 | ||
}, | ||
{ | ||
"name": "NM_004327.4:c._2782", | ||
"description": "Transcript SequenceLocation of the aligned transcript boundary.", | ||
"value": { | ||
"type": "SequenceLocation", | ||
"sequenceReference": { | ||
"id": "NM_004327.4", | ||
"type": "SequenceReference", | ||
"refgetAccession": "SQ.kpytJsXw3BwLC3oBSjHQS1kwxs4WO3I3", | ||
"residueAlphabet": "na" | ||
}, | ||
"end": { | ||
"type": "Number", | ||
"value": 130854065 | ||
} | ||
"end": 3234 | ||
Review comment: Here, we again use |
||
} | ||
}, | ||
{ | ||
"name": "gene", | ||
"description": "The gene concept (BCR) associated with this fusion partner.", | ||
"value": { | ||
"code": "hgnc:1014", | ||
"system": "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/", | ||
"label": "BCR" | ||
Comment on lines +47 to +49: This is a GKS Coding object. |
||
} | ||
} | ||
} | ||
]}, | ||
{ | ||
"type": "SequenceLocation", | ||
"sequenceReference": { | ||
"id": "GRCh38:chr9", | ||
"type": "SequenceReference", | ||
"refgetAccession": "SQ.KEO-4XBcm1cxeo_DIQ8_ofqGUkp4iZhI", | ||
"residueAlphabet": "na" | ||
}, | ||
"start": 130854064, | ||
"extensions": [ | ||
{ | ||
"name": "NM_005157.6:e.2_", | ||
"description": "VICC exon representation of the aligned transcript boundary.", | ||
"value": { | ||
"exon_start": 2, | ||
"exon_start_offset": 0, | ||
"sequenceReference":{ | ||
"id": "NM_005157.6", | ||
"type": "SequenceReference", | ||
"refgetAccession": "SQ.w8Qg3x-PQ2akJrJQeGEN-_eBUMo1H1CL", | ||
"residueAlphabet": "na" | ||
} | ||
} | ||
}, | ||
{ | ||
"name": "NM_005157.6:c.80_", | ||
"description": "Transcript SequenceLocation of the aligned transcript boundary.", | ||
"value": { | ||
"type": "SequenceLocation", | ||
"sequenceReference": { | ||
"id": "NM_005157.6", | ||
"type": "SequenceReference", | ||
"refgetAccession": "SQ.w8Qg3x-PQ2akJrJQeGEN-_eBUMo1H1CL", | ||
"residueAlphabet": "na" | ||
}, | ||
"end": 273 | ||
} | ||
}, | ||
{ | ||
"name": "gene", | ||
"description": "The gene concept (ABL1) associated with this fusion partner.", | ||
"value": { | ||
"code": "hgnc:76", | ||
"system": "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/", | ||
"label": "ABL1" | ||
} | ||
} | ||
] | ||
}], | ||
"linker": { | ||
"type": "LiteralSequenceExpression", | ||
"sequence": "CCCGTC" | ||
} | ||
], | ||
"r_frame_preserved": true, | ||
"critical_functional_domains": [ | ||
}, | ||
"readingFramePreserved": true, | ||
Review comment: Using camelCase, similar to usage in the GKS specs. |
||
"criticalFunctionalDomains": [ | ||
{ | ||
"type": "FunctionalDomain", | ||
"status": "preserved", | ||
"associated_gene": { | ||
"id": "normalize.gene:hgnc%3A76", | ||
"type": "GeneDescriptor", | ||
"label": "ABL1", | ||
"gene_id": "hgnc:76" | ||
"gene": { | ||
Review comment: Mostly noting this to help myself as I am working on the updates to fusor - this should be |
||
"code": "hgnc:76", | ||
"system": "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/", | ||
"label": "ABL1" | ||
}, | ||
"_id": "interpro:IPR000980", | ||
"id": "interpro:IPR000980", | ||
"label": "SH2 domain", | ||
"sequence_location": { | ||
"id": "fusor.location_descriptor:NP_005148.2", | ||
"type": "LocationDescriptor", | ||
"location": { | ||
"type": "SequenceLocation", | ||
"sequence_id": "refseq:NP_005148.2", | ||
Review comment: Can we retain this identifier as a mapping, or something of that nature? Reply: I think so? I was more following the convention in |
||
"interval": { | ||
"type": "SequenceInterval", | ||
"start": { | ||
"type": "Number", | ||
"value": 127 | ||
}, | ||
"end": { | ||
"type": "Number", | ||
"value": 202 | ||
} | ||
} | ||
} | ||
"sequenceLocation": { | ||
"type": "SequenceLocation", | ||
"sequenceReference": { | ||
"id": "GRCh38:chr22", | ||
"type": "SequenceReference", | ||
"refgetAccession": "SQ.7B7SHsmchAR0dFcDCuSFjJAo7tX87krQ", | ||
"residueAlphabet": "na" | ||
}, | ||
"start": 127, | ||
"end": 202 | ||
} | ||
} | ||
] | ||
|
Review comment: The structure field should be a VRS Adjacency (for transcript junction descriptions) or a VRS DerivativeMolecule (under development; use for full fusion transcript representations). The former is much more common, as illustrated in this BCR::ABL1 example.
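
To make the review comment above concrete, here is a minimal Python sketch that assembles the Adjacency-style structure from this BCR::ABL1 example as plain dictionaries and runs a small sanity check on it. The helper names `make_location` and `check_structure` are hypothetical and are not part of fusor or any VRS library. The rule that an Adjacency carries exactly two adjoined sequences is an assumption taken from this example, not a check the PR itself implements. The refget accessions, coordinates, and linker sequence are copied from the diff.

```python
from typing import Any, Dict, List, Optional

# Structure types named in the review comment above.
ALLOWED_STRUCTURE_TYPES = {"Adjacency", "DerivativeMolecule"}


def make_location(
    refget_accession: str,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a SequenceLocation dict that carries only
    the junction-side coordinate (start or end), as in the example diff."""
    if start is None and end is None:
        raise ValueError("a junction location needs a start or an end")
    location: Dict[str, Any] = {
        "type": "SequenceLocation",
        "sequenceReference": {
            "type": "SequenceReference",
            "refgetAccession": refget_accession,
            "residueAlphabet": "na",
        },
    }
    if start is not None:
        location["start"] = start
    if end is not None:
        location["end"] = end
    return location


def check_structure(structure: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems found in a fusion
    `structure` value (an empty list means no problems were found)."""
    problems: List[str] = []
    structure_type = structure.get("type")
    if structure_type not in ALLOWED_STRUCTURE_TYPES:
        problems.append(f"unexpected structure type: {structure_type!r}")
    if structure_type == "Adjacency":
        # Assumption: an Adjacency joins exactly two sequences.
        count = len(structure.get("adjoinedSequences", []))
        if count != 2:
            problems.append(f"expected 2 adjoinedSequences, found {count}")
    return problems


# Values copied from the BCR::ABL1 example in this diff.
structure = {
    "type": "Adjacency",
    "adjoinedSequences": [
        make_location("SQ.7B7SHsmchAR0dFcDCuSFjJAo7tX87krQ", end=23290413),
        make_location("SQ.KEO-4XBcm1cxeo_DIQ8_ofqGUkp4iZhI", start=130854064),
    ],
    "linker": {"type": "LiteralSequenceExpression", "sequence": "CCCGTC"},
}

print(check_structure(structure))
```

Only the junction-side coordinate is set on each location (the `end` of the 5' partner and the `start` of the 3' partner), which mirrors how the example represents the breakpoint.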