Duplicates in fusion-annoFuse.tsv.gz file #558 #19

Open
jharenza opened this issue Nov 11, 2024 · 1 comment

@jharenza

What data file(s) does this issue pertain to?

fusion-annoFuse.tsv.gz

What release are you using?

v13-v15

Put your question or report your issue here.

Per the comment below, some fusions are duplicated in the fusion-annoFuse.tsv.gz OPC release file. The only difference between the duplicate rows seems to be Gene1A_anno. I only started using this file to derive putative-oncogenic.tsv as of v13, so I am not sure whether this has been happening since the beginning. I am also not sure of the extent of this (how many fusions it affects) or whether it is happening at the patient level or somehow upon merge. I suspect it is occurring at the patient level, since I do not think there is any analysis prior to merge. The arriba file for the patient below has only one entry for this fusion. Can someone investigate the cause of the duplicate gene annotation rows?

@jharenza, not sure what to do about this. In the fusions file, there are repeated entries that differ slightly. For example:
BS_0HW7W7SD NSD3--TRIP12 8:38347497 2:229785855 NA NA NSD3 NA TRIP12 NA CosmicCensus, Oncogene NA NA NA NA ARRIBA 1 FALSE PT_C73C5BBZ [INTERCHROMOSOMAL[chr8--chr2]], translocation Genic in-frame
BS_0HW7W7SD NSD3--TRIP12 8:38347497 2:229785855 NA NA NSD3 NA TRIP12 NA TumorSuppressorGene NA NA NA NA ARRIBA 1 FALSE PT_C73C5BBZ [INTERCHROMOSOMAL[chr8--chr2]], translocation Genic in-frame
Exact same call from same caller, but it seems Gene1A_anno is somehow different...the cbio validation script reports 810 instances of this.

Originally posted by @migbro in https://github.com/d3b-center/bixu-tracker/issues/2248#issuecomment-1985959937

Duplicate of https://github.com/d3b-center/bixu-tracker/issues/2325
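
To get a sense of how many fusions are affected, a minimal sketch along these lines could flag rows that are identical in every column except Gene1A_anno. This is only an illustration, not the cbio validation script: the helper names are hypothetical, the column names other than Gene1A_anno are placeholders (the real file has many more columns), and the last sample row is made up to show a non-duplicated fusion.

```python
import csv
import gzip
import io
from collections import defaultdict


def find_annotation_duplicates(rows, anno_col="Gene1A_anno"):
    """Return groups of rows that match on every column except `anno_col`.

    Maps each key (the other columns' values) to the list of annotation
    values seen, keeping only keys that occur more than once.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(value for col, value in row.items() if col != anno_col)
        groups[key].append(row[anno_col])
    return {key: annos for key, annos in groups.items() if len(annos) > 1}


def read_tsv_gz(path):
    """Read a gzipped TSV into a list of dicts keyed by the header row."""
    with gzip.open(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Tiny in-memory sample. Column names other than Gene1A_anno are hypothetical,
# and the last row is made up to show a non-duplicated fusion.
SAMPLE_TSV = "\n".join(
    "\t".join(fields)
    for fields in [
        ("Sample", "FusionName", "Gene1A_anno", "Caller"),
        ("BS_0HW7W7SD", "NSD3--TRIP12", "CosmicCensus, Oncogene", "ARRIBA"),
        ("BS_0HW7W7SD", "NSD3--TRIP12", "TumorSuppressorGene", "ARRIBA"),
        ("BS_EXAMPLE", "GENEA--GENEB", "NA", "ARRIBA"),
    ]
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
dupes = find_annotation_duplicates(rows)
for key, annos in dupes.items():
    print(key, annos)
```

On the release file this would be called as `find_annotation_duplicates(read_tsv_gz("fusion-annoFuse.tsv.gz"))`. Whether the resulting count matches the 810 instances reported above depends on how the validation script counts them.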

@jharenza
Author

jira ticket: https://d3b.atlassian.net/browse/BIXU-2325

This is fixed in https://github.com/d3b-center/annoFuseData/tree/v1.0.0. We will need to update the Docker image and rerun the annoFuse annotation for the OPC.
