feat!: update to vrs 2.0 models #166

Merged Aug 2, 2024 (89 commits)
Changes from 12 commits
Commits
442083c
build!: remove vrsatile
katiestahl Jul 17, 2024
4a538bb
wip: remove gene descriptor
katiestahl Jul 17, 2024
c0e8626
wip: remove gene descriptor
katiestahl Jul 17, 2024
24e1a4c
progress updating models and adding back gene element wrapper
katiestahl Jul 17, 2024
3e52ff2
adding back gene element
katiestahl Jul 17, 2024
12ee931
Revert "progress updating models and adding back gene element wrapper"
katiestahl Jul 17, 2024
6120473
Revert "adding back gene element"
katiestahl Jul 17, 2024
6573780
converting descriptors
katiestahl Jul 18, 2024
44e0574
remove todo
katiestahl Jul 18, 2024
7796732
wip: adding back GeneElement wrapper, updating to camelCase, removing…
katiestahl Jul 18, 2024
c67d588
updating models
katiestahl Jul 18, 2024
6ad8e33
fix: gene element type
katiestahl Jul 18, 2024
c1e8fad
wip: update constructors with updated param names from models
katiestahl Jul 18, 2024
42d1224
Merge branch 'main' into issue-95-take2
katiestahl Jul 18, 2024
7eabc25
update constructors from model changes
katiestahl Jul 18, 2024
6cd7cfa
Merge branch 'issue-95-take2' of https://github.com/cancervariants/fu…
katiestahl Jul 18, 2024
1e18144
minor fixes
katiestahl Jul 18, 2024
76ef031
fix: updating variable casing
katiestahl Jul 18, 2024
6f740e9
updating docstring
katiestahl Jul 18, 2024
c928b35
fix: variable casing and error messages
katiestahl Jul 18, 2024
a2d2e10
revert featureId back to string
katiestahl Jul 18, 2024
fe85297
Update src/fusor/models.py
katiestahl Jul 19, 2024
eb9da54
Update pyproject.toml
katiestahl Jul 19, 2024
bbae4bc
Update pyproject.toml
katiestahl Jul 19, 2024
7634327
Update src/fusor/models.py
katiestahl Jul 19, 2024
a593c3b
fixes from pr comments
katiestahl Jul 19, 2024
1c3959b
fixes from pr comments
katiestahl Jul 19, 2024
c8c90a7
Merge branch 'issue-95-take2' of https://github.com/cancervariants/fu…
katiestahl Jul 19, 2024
838e2a2
wip: updating test examples with new models
katiestahl Jul 19, 2024
f5f5689
adding back unreachable else because ruff will complain otherwise
katiestahl Jul 19, 2024
c3136ce
fix: update example models with placeholders for sequence location an…
katiestahl Jul 19, 2024
686b000
fix: casing for data to/from cool-seq-tool
katiestahl Jul 22, 2024
1ce7f23
Update src/fusor/fusor.py
katiestahl Jul 22, 2024
485035c
Update src/fusor/fusor.py
katiestahl Jul 22, 2024
9f1ee60
fix: minimal gene response when creating gene
katiestahl Jul 22, 2024
5bbc77c
fix: naming
katiestahl Jul 22, 2024
5a8bea2
Update src/fusor/models.py
katiestahl Jul 22, 2024
ca97171
Update src/fusor/models.py
katiestahl Jul 22, 2024
92fe98f
Update src/fusor/models.py
katiestahl Jul 22, 2024
43f347f
Update src/fusor/models.py
katiestahl Jul 22, 2024
dba14ca
Update src/fusor/fusor.py
katiestahl Jul 22, 2024
a3e67d1
Update src/fusor/models.py
katiestahl Jul 22, 2024
74db34c
updating constructor for SequenceLocation and adding SequenceReference
katiestahl Jul 22, 2024
9cd280c
Merge branch 'issue-95-take2' of https://github.com/cancervariants/fu…
katiestahl Jul 22, 2024
cce6159
removing comment
katiestahl Jul 22, 2024
58b0899
wip: start updates to nomenclature using new models
katiestahl Jul 22, 2024
634ad7f
wip: progress on sequence location constructor
katiestahl Jul 22, 2024
7334ac8
fix: tests and add sequence location id
katiestahl Jul 22, 2024
9d08adf
wip: update test examples
katiestahl Jul 22, 2024
fc649b5
removing incorrect test cases- adding placeholders for now
katiestahl Jul 22, 2024
1e0c7df
fix constructing sequence location
katiestahl Jul 22, 2024
33e932b
fix: casing for sequencelocation
katiestahl Jul 22, 2024
091e23a
updating sequence locations examples
katiestahl Jul 22, 2024
1fbee7c
updating tests
katiestahl Jul 22, 2024
c873d6a
updating tests and adding option to fetch gene id from alternate field
katiestahl Jul 22, 2024
ac72ecd
fix: json schema examples
katiestahl Jul 22, 2024
3a9c20a
wip: updating fusor tests
katiestahl Jul 22, 2024
ab3de62
update nomenclature to use new models
katiestahl Jul 22, 2024
6bf5f01
remove completed todo
katiestahl Jul 22, 2024
b952c9c
wip: updating fusor tests
katiestahl Jul 23, 2024
16def6b
Update locations for mane transcript segment fixture/tests
jarbesfeld Jul 23, 2024
a624de5
wip: update tests with new models
katiestahl Jul 23, 2024
620dc1a
Merge branch 'issue-95-take2' of https://github.com/cancervariants/fu…
katiestahl Jul 24, 2024
6a6211b
update tests and examples with new models
katiestahl Jul 24, 2024
3fd9615
update tests and examples with new models
katiestahl Jul 24, 2024
9af8c72
update tests and examples with new models
katiestahl Jul 24, 2024
871c3d9
update tests and examples with new models
katiestahl Jul 24, 2024
01ef722
refactor: moving around logic to make more readable
katiestahl Jul 24, 2024
b811599
model updates
katiestahl Jul 24, 2024
812a21b
Merge branch 'main' into issue-95-take2
katiestahl Jul 24, 2024
9c083b9
model updates
katiestahl Jul 24, 2024
db4d6cb
updating models and tests
katiestahl Jul 24, 2024
6c3b663
fix name
katiestahl Jul 24, 2024
c78ee2a
pin gene normalizer version where CURIE is still defined
katiestahl Jul 25, 2024
fb05793
updating json schema examples for models, removing label from sequen…
katiestahl Jul 25, 2024
2572534
remove sequencelocation label
katiestahl Jul 25, 2024
62745b8
fix ruff errors
katiestahl Jul 25, 2024
8732135
pinning pydantic version to stop validation error in tests
katiestahl Jul 25, 2024
acab073
test: updating test to fail with unexpected sequence id provided
katiestahl Jul 25, 2024
a0a4529
Update test fixtures for correct use of start and end"
jarbesfeld Jul 25, 2024
f1b82a0
update comment
katiestahl Jul 25, 2024
e9a70b1
Revert "Update test fixtures for correct use of start and end""
katiestahl Jul 25, 2024
dd3f7c1
Update src/fusor/models.py
katiestahl Jul 29, 2024
774f559
Update src/fusor/fusor.py
katiestahl Jul 29, 2024
8fd8d3f
Update src/fusor/fusor.py
katiestahl Jul 29, 2024
c664b77
fix: example data
katiestahl Jul 29, 2024
f00ddb2
Merge branch 'issue-95-take2' of https://github.com/cancervariants/fu…
katiestahl Jul 29, 2024
2880e18
update fusion constructor to accept the body of a valid fusion (same …
katiestahl Jul 30, 2024
b9bff00
Revert "update fusion constructor to accept the body of a valid fusio…
katiestahl Jul 30, 2024
5 changes: 2 additions & 3 deletions pyproject.toml
@@ -26,10 +26,9 @@ description = "Computable object representation and validation for gene fusions"
license = {file = "LICENSE"}
dependencies = [
"pydantic == 2.*",
"ga4gh.vrsatile.pydantic ~=0.2.0",
"ga4gh.vrs ~=0.8.1",
"ga4gh.vrs ~=2.0.0a8",
katiestahl marked this conversation as resolved.
"biocommons.seqrepo",
"gene-normalizer ~=0.1.40-dev1",
"gene-normalizer ~=0.4.0",
katiestahl marked this conversation as resolved.
"cool-seq-tool ~=0.5.0",
]
dynamic=["version"]
213 changes: 117 additions & 96 deletions src/fusor/examples/bcr_abl1.json
@@ -1,112 +1,133 @@
{
"type": "CategoricalFusion",
"structural_elements": [
{
"type": "TranscriptSegmentElement",
"transcript": "refseq:NM_004327.3",
"gene_descriptor": {
"type": "GeneDescriptor",
"id": "normalize.gene:BCR",
"gene_id": "hgnc:1014",
"label": "BCR"
},
"element_genomic_end": {
"id": "fusor.location_descriptor:NC_000022.11",
"type": "LocationDescriptor",
"label": "NC_000022.11",
"location": {
"type": "SequenceLocation",
"sequence_id": "refseq:NC_000022.11",
"interval": {
"type": "SequenceInterval",
"start": {
"type": "Number",
"value": 23253980
},
"end": {
"type": "Number",
"value": 23253981
"structure": {
Member

We should make sure to revert any changes in /examples/ and have them handled separately in the @jarbesfeld PR.

Contributor Author

Will do! :D

Contributor Author

I ended up changing these in this PR after all: the nomenclature tests depend on them, and I wanted to confirm those tests still pass after the model updates.

"type": "Adjacency",
"adjoinedSequences": [
{
"type": "SequenceLocation",
"sequenceReference": {
"id": "GRCh38:chr22",
"type": "SequenceReference",
"refgetAccession": "SQ.7B7SHsmchAR0dFcDCuSFjJAo7tX87krQ",
"residueAlphabet": "na"
},
"end": 23290413,
"extensions": [
{
"name": "NM_004327.4:e._14",
"description": "VICC exon representation of the aligned transcript boundary.",
"value": {
"exon_end": 14,
"exon_end_offset": 0,
"sequenceReference":{
"type": "SequenceReference",
"id": "NM_004327.4",
"refgetAccession": "SQ.kpytJsXw3BwLC3oBSjHQS1kwxs4WO3I3",
"residueAlphabet": "na"
}
}
}
},
"exon_end": 2,
"exon_end_offset": 182
},
{
"type": "LinkerSequenceElement",
"linker_sequence": {
"id": "sequence:ACTAAAGCG",
"type": "SequenceDescriptor",
"sequence": "ACTAAAGCG",
"residue_type": "SO:0000348"
}
},
{
"type": "TranscriptSegmentElement",
"transcript": "refseq:NM_005157.5",
"exon_start": 2,
"exon_start_offset": -173,
"gene_descriptor": {
"id": "normalize.gene:ABL1",
"type": "GeneDescriptor",
"label": "ABL1",
"gene_id": "hgnc:76"
},
"element_genomic_start": {
"id": "fusor.location_descriptor:NC_000009.12",
"type": "LocationDescriptor",
"label": "NC_000009.12",
"location": {
"type": "SequenceLocation",
"sequence_id": "refseq:NC_000009.12",
"interval": {
"type": "SequenceInterval",
"start": {
"type": "Number",
"value": 130854064
},
{
"name": "NM_004327.4:c._2782",
"description": "Transcript SequenceLocation of the aligned transcript boundary.",
"value": {
"type": "SequenceLocation",
"sequenceReference": {
"id": "NM_004327.4",
"type": "SequenceReference",
"refgetAccession": "SQ.kpytJsXw3BwLC3oBSjHQS1kwxs4WO3I3",
"residueAlphabet": "na"
},
"end": {
"type": "Number",
"value": 130854065
}
"end": 3234
}
},
{
"name": "gene",
"description": "The gene concept (BCR) associated with this fusion partner.",
"value": {
"code": "hgnc:1014",
"system": "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/",
"label": "BCR"
}
}
}
]},
{
"type": "SequenceLocation",
"sequenceReference": {
"id": "GRCh38:chr9",
"type": "SequenceReference",
"refgetAccession": "SQ.KEO-4XBcm1cxeo_DIQ8_ofqGUkp4iZhI",
"residueAlphabet": "na"
},
"start": 130854064,
Contributor

I think this should just be "start": 130854063

"extensions": [
{
"name": "NM_005157.6:e.2_",
"description": "VICC exon representation of the aligned transcript boundary.",
"value": {
"exon_start": 2,
"exon_start_offset": 0,
"sequenceReference":{
"id": "NM_005157.6",
"type": "SequenceReference",
"refgetAccession": "SQ.w8Qg3x-PQ2akJrJQeGEN-_eBUMo1H1CL",
"residueAlphabet": "na"
}
}
},
{
"name": "NM_005157.6:c.80_",
"description": "Transcript SequenceLocation of the aligned transcript boundary.",
"value": {
"type": "SequenceLocation",
"sequenceReference": {
"id": "NM_005157.6",
"type": "SequenceReference",
"refgetAccession": "SQ.w8Qg3x-PQ2akJrJQeGEN-_eBUMo1H1CL",
"residueAlphabet": "na"
},
"end": 273
}
},
{
"name": "gene",
"description": "The gene concept (ABL1) associated with this fusion partner.",
"value": {
"code": "hgnc:76",
"system": "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/",
"label": "ABL1"
}
}
]
}],
"linker": {
"type": "LiteralSequenceExpression",
"sequence": "CCCGTC"
}
],
"r_frame_preserved": true,
"critical_functional_domains": [
},
"readingFramePreserved": true,
"criticalFunctionalDomains": [
{
"type": "FunctionalDomain",
"status": "preserved",
"associated_gene": {
"id": "normalize.gene:hgnc%3A76",
"type": "GeneDescriptor",
"label": "ABL1",
"gene_id": "hgnc:76"
"gene": {
"code": "hgnc:76",
"system": "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/",
"label": "ABL1"
},
"_id": "interpro:IPR000980",
"id": "interpro:IPR000980",
"label": "SH2 domain",
"sequence_location": {
"id": "fusor.location_descriptor:NP_005148.2",
"type": "LocationDescriptor",
"location": {
"type": "SequenceLocation",
"sequence_id": "refseq:NP_005148.2",
"interval": {
"type": "SequenceInterval",
"start": {
"type": "Number",
"value": 127
},
"end": {
"type": "Number",
"value": 202
}
}
}
"sequenceLocation": {
"type": "SequenceLocation",
"sequenceReference": {
"id": "GRCh38:chr22",
"type": "SequenceReference",
"refgetAccession": "SQ.7B7SHsmchAR0dFcDCuSFjJAo7tX87krQ",
"residueAlphabet": "na"
},
"start": 127,
"end": 202
}
}
]
}
}
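
For readers comparing the two layouts in the example diff above, here is a minimal sketch of how the old nested records map onto the new flat ones. It is not code from this PR: `convert_location` and `convert_gene` are hypothetical helpers written only for illustration, the field names are taken from the old and new JSON shown above, and the refget accession must be supplied by the caller because it cannot be derived from the old record. The sketch carries the old `sequence_id` over unchanged, whereas the new examples use ids such as "GRCh38:chr22".

```python
from typing import Any

# Assumption: "system" URL copied from the new example JSON in this diff.
HGNC_SYSTEM = "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/"


def convert_location(
    old: dict[str, Any], refget_accession: str, residue_alphabet: str = "na"
) -> dict[str, Any]:
    """Reshape an old-style SequenceLocation dict into the new flat layout.

    Hypothetical helper, not part of fusor. The old record nests coordinates
    as interval -> start/end -> value; the new one stores plain integers and
    replaces sequence_id with a SequenceReference object.
    """
    interval = old["interval"]
    return {
        "type": "SequenceLocation",
        "sequenceReference": {
            "id": old["sequence_id"],  # carried over as-is (assumption)
            "type": "SequenceReference",
            "refgetAccession": refget_accession,  # supplied by the caller
            "residueAlphabet": residue_alphabet,
        },
        "start": interval["start"]["value"],
        "end": interval["end"]["value"],
    }


def convert_gene(old: dict[str, Any]) -> dict[str, Any]:
    """Reshape an old GeneDescriptor dict into the code/system/label layout."""
    return {"code": old["gene_id"], "system": HGNC_SYSTEM, "label": old["label"]}


# Old-style records, copied from the removed lines of bcr_abl1.json.
old_location = {
    "type": "SequenceLocation",
    "sequence_id": "refseq:NC_000022.11",
    "interval": {
        "type": "SequenceInterval",
        "start": {"type": "Number", "value": 23253980},
        "end": {"type": "Number", "value": 23253981},
    },
}
old_gene = {
    "type": "GeneDescriptor",
    "id": "normalize.gene:BCR",
    "gene_id": "hgnc:1014",
    "label": "BCR",
}

# Accession copied from the chr22 SequenceReference in the new example.
new_location = convert_location(old_location, "SQ.7B7SHsmchAR0dFcDCuSFjJAo7tX87krQ")
new_gene = convert_gene(old_gene)
```

The sketch only reshapes fields. It does not adjust coordinates, so questions like the review comment about the correct "start" value still need to be settled separately.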