Is there a way to get the CDS structure for a given transcript? #750

Open
wlymanambry opened this issue Nov 20, 2024 · 0 comments

HGVS provides the following transcript information:

Example transcript: NM_001126049.1

"exon_structure": "89618918,89623194",
"relative structure": "[[1, 4277]]",
"tl_start_site": 951,
"tl_stop_site": 1487

I also want the corresponding CDS structure, just as we have the exon structure, something like:
"cds_structure": "89621708,89622244",

We have been computing this as (first exon start position + tl_start_site) and (last exon end position + tl_stop_site), or the reverse when the transcript is on the reverse strand, as in this case: 89623194 - 951 = 89622243, plus 1 gives 89622244, so our CDS structure is: "cds_structure": "89621708,89622244",
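The reverse-strand arithmetic above can be written out as a small sketch. The function name `naive_reverse_strand_cds` is hypothetical and not part of the hgvs library; it only reproduces the single-exon calculation described in this issue, so it has the same limitation with partially aligning exons.

```python
def naive_reverse_strand_cds(exon_start, exon_end, tl_start_site, tl_stop_site):
    """Reproduce the issue's reverse-strand calculation for a single exon.

    Hypothetical helper, not an hgvs API. It assumes the exon aligns to the
    genome end to end, which is exactly the assumption that fails for
    partially aligning exons. exon_start is accepted only to mirror the
    "exon_structure" pair; the reverse-strand case counts back from exon_end.
    """
    cds_start = exon_end - tl_stop_site + 1
    cds_end = exon_end - tl_start_site + 1
    return cds_start, cds_end


# Values quoted in this issue for NM_001126049.1
print(naive_reverse_strand_cds(89618918, 89623194, 951, 1487))
# (89621708, 89622244)
```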

The problem with this approach is that it doesn't account for partially aligning exons. Is there a straightforward way to get the CDS structure from the library that takes this into account? One out-of-the-box idea I had was to look up the genomic positions of c.1 and of c.*1 minus one base. That would at least get the start and stop right, but I'm not sure about all of the positions in between.
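One way to make the calculation exon-aware is to intersect the CDS with each exon's own transcript range and map each overlap to the genome separately, rather than counting from the first or last exon boundary. The sketch below is plain Python and does not use the hgvs API. `ExonAlignment` and `cds_segments` are hypothetical names, and the coordinates in the usage example are invented. It assumes 0-based, end-exclusive coordinates and gapless exon alignments; an exon aligned with insertions or deletions would need a CIGAR-aware mapping, so the sketch raises an error in that case rather than guessing.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExonAlignment:
    """One exon's aligned range (hypothetical structure; 0-based, end-exclusive)."""
    tx_start: int  # start on the transcript
    tx_end: int    # end on the transcript
    g_start: int   # start on the genome (always the lower coordinate)
    g_end: int     # end on the genome


def cds_segments(exons: List[ExonAlignment], cds_start: int, cds_end: int,
                 strand: int) -> List[Tuple[int, int]]:
    """Return genomic (start, end) pairs covering the CDS, one per overlapping exon.

    cds_start and cds_end are transcript coordinates; strand is 1 or -1.
    """
    segments = []
    for ex in exons:
        if ex.tx_end - ex.tx_start != ex.g_end - ex.g_start:
            raise ValueError("gapped exon alignment; a CIGAR-aware mapping is needed")
        # Clip the CDS to the part of the transcript this exon actually covers.
        lo = max(cds_start, ex.tx_start)
        hi = min(cds_end, ex.tx_end)
        if lo >= hi:
            continue  # exon lies entirely in a UTR
        if strand == 1:
            seg = (ex.g_start + (lo - ex.tx_start), ex.g_start + (hi - ex.tx_start))
        else:
            # On the reverse strand, transcript offsets count down from g_end.
            seg = (ex.g_end - (hi - ex.tx_start), ex.g_end - (lo - ex.tx_start))
        segments.append(seg)
    return sorted(segments)


# Invented forward-strand example: the first 5 transcript bases are unaligned,
# so the first exon's transcript range starts at 5 instead of 0.
exons = [ExonAlignment(5, 100, 1000, 1095), ExonAlignment(100, 250, 2000, 2150)]
print(cds_segments(exons, cds_start=40, cds_end=180, strand=1))
# [(1035, 1095), (2000, 2080)]
```

In this invented case, adding the CDS offset to the first exon's genomic start would give 1040 rather than 1035, which is the kind of discrepancy described above for partially aligning exons.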

Thanks!
