Train-data-creator doesn't take multiple SYMBOL (ID's) into account #54

Open
SietsmaRJ opened this issue Feb 14, 2023 · 0 comments
Labels
bug Something isn't working

Describe the bug

A single "GENEINFO" entry can contain multiple SYMBOL:ID pairs, as the VCF header describes:

##INFO=<ID=GENEINFO,Number=1,Type=String,Description="Gene(s) for the variant reported as gene symbol:gene id. The gene symbol and id are delimited by a colon (:) and each pair is delimited by a vertical bar (|)">

Example:

1 1474871 1295591 G C . . ALLELEID=1285386;CLNDISDB=MedGen:CN517202;CLNDN=not_provided;CLNHGVS=NC_000001.10:g.1474871G>C;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Benign;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=TMEM240:339453|LOC121967044:121967044;MC=SO:0001627|intron_variant;ORIGIN=1

In this case, "LOC121967044" is discarded, which could lead to mapping problems in process-vep.
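As a minimal sketch of how every pair could be retained, the snippet below splits a GENEINFO value on the vertical bar and then on the colon, following the header description above. The function name `parse_geneinfo` is hypothetical and not part of train-data-creator, and the INFO string is a shortened version of the example record.

```python
def parse_geneinfo(info: str) -> list[tuple[str, str]]:
    """Return every (symbol, gene_id) pair found in the GENEINFO key of a VCF INFO string.

    Hypothetical helper for illustration; not the train-data-creator implementation.
    """
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "GENEINFO":
            pairs = []
            # Pairs are delimited by "|", symbol and id by ":".
            for pair in value.split("|"):
                symbol, _, gene_id = pair.partition(":")
                pairs.append((symbol, gene_id))
            return pairs
    # No GENEINFO key present in this INFO string.
    return []


# Shortened INFO field taken from the example record above.
info = (
    "ALLELEID=1285386;CLNSIG=Benign;"
    "GENEINFO=TMEM240:339453|LOC121967044:121967044;"
    "MC=SO:0001627|intron_variant;ORIGIN=1"
)
print(parse_geneinfo(info))
# [('TMEM240', '339453'), ('LOC121967044', '121967044')]
```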

System information

  • OS: Not applicable
  • Version: 5.0.0.dev0
  • Python version: Not applicable
  • Shell: Not applicable

How to Reproduce

Steps to reproduce the behavior:

  1. Run train-data-creator with a VCF containing a variant whose GENEINFO field holds multiple SYMBOL:ID pairs.
  2. Run VEP.
  3. Convert VEP output VCF to TSV.
  4. Run process-vep and see that only 1 of the entries has been mapped.

Expected behavior

Currently, process-vep maps one-to-one from the initially supplied SYMBOL to the VEP output SYMBOL. This needs to be changed so that it maps back one-to-many: each VEP output SYMBOL (the "one") should be checked against all SYMBOLs stored in the "ID" column (the "many").
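A rough sketch of the one-to-many check, assuming the original symbols are already available as a list and each VEP output row is a plain dict with a "SYMBOL" key. The function name `keep_matching_rows`, the row layout, and the "Consequence" values are assumptions for illustration and do not reflect the actual process-vep code.

```python
def keep_matching_rows(vep_rows: list[dict], id_symbols: list[str]) -> list[dict]:
    """Keep VEP rows whose SYMBOL is any of the originally supplied symbols.

    Hypothetical helper: a membership test replaces the one-to-one equality check.
    """
    wanted = set(id_symbols)
    return [row for row in vep_rows if row["SYMBOL"] in wanted]


# Symbols from the example record's GENEINFO field.
id_symbols = ["TMEM240", "LOC121967044"]

# Made-up VEP output rows for the same variant (illustrative values only).
vep_rows = [
    {"SYMBOL": "TMEM240", "Consequence": "intron_variant"},
    {"SYMBOL": "LOC121967044", "Consequence": "intron_variant"},
    {"SYMBOL": "UNRELATED_GENE", "Consequence": "upstream_gene_variant"},
]

kept = keep_matching_rows(vep_rows, id_symbols)
print([row["SYMBOL"] for row in kept])
# ['TMEM240', 'LOC121967044']
```

With a one-to-one comparison against only the first symbol, the "LOC121967044" row would be dropped; the membership test keeps both.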

Logs

If available, the generated logging information and/or error message (can also be attached as a file if very large).

Screenshots

If applicable, add screenshots to help explain your problem.

Additional context

#51 (comment)

SietsmaRJ added the bug label Feb 14, 2023