Escaped doublequotes in INFO descriptions result in invalid VCF file #1661

Open
bartcharbon opened this issue Mar 13, 2023 · 2 comments

@bartcharbon

bartcharbon commented Mar 13, 2023

Edit 14/03: verified that this also occurs in version 3.0.4

Description of the issue:

When I add a header line whose description contains escaped double quotes, the escaping backslash sometimes goes missing, resulting in an invalid VCF file.

Your environment:

  • version of htsjdk: 1.24.1 and 3.0.4
  • version of java: OpenJDK 17.0.1
  • which OS: Windows and CentOS

Steps to reproduce

VCFHeader newHeader = annotator.annotateHeader(vcfFileReader.getFileHeader());

// Add a FORMAT line whose description starts and ends with a double quote
newHeader.addMetaDataLine(new VCFFormatHeaderLine("TEST", VCFHeaderLineCount.A, VCFHeaderLineType.String, "\"TEST\""));

writer.writeHeader(newHeader);
//... write variants

Expected behaviour

A VCF file is written with the FORMAT header line:
##FORMAT=<ID=TEST,Number=A,Type=String,Description="\"TEST\"">

Actual behaviour

A VCF file is written with the FORMAT header line:
##FORMAT=<ID=TEST,Number=A,Type=String,Description=""TEST\"">

The backslash escaping the first double quote is missing.

@bartcharbon
Author

Addition: this seems to happen only for escaped quotes at the very start of the description.

@cmnbroad
Collaborator

Thanks for the bug report. Looks like the internal representation is correct ("""TEST""), but it gets serialized as ""TEST\"" by VCFHeaderLine.escapeQuotes.
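
One plausible way to get exactly this output is an escaping regex that only matches a quote when some non-backslash character comes before it: a quote at index 0 has no preceding character, so it is skipped. The sketch below assumes that shape. It is not copied from htsjdk, and the class and method names are hypothetical. The second method uses a negative lookbehind, which also matches at the start of the string.

```java
// Hypothetical illustration; not the htsjdk implementation.
final class EscapeQuotesSketch {

    // Assumed buggy shape: the quote must be preceded by a non-backslash character,
    // so a quote at the very start of the value is never matched.
    static String escapeRequiringPrecedingChar(final String value) {
        return value.replaceAll("([^\\\\])\"", "$1\\\\\"");
    }

    // Alternative: escape every quote that is not directly preceded by a backslash.
    // The lookbehind consumes no character, so it also succeeds at index 0.
    static String escapeWithLookbehind(final String value) {
        return value.replaceAll("(?<!\\\\)\"", "\\\\\"");
    }

    public static void main(final String[] args) {
        final String description = "\"TEST\""; // in-memory value: "TEST"

        // prints: Description=""TEST\""
        System.out.println("Description=\"" + escapeRequiringPrecedingChar(description) + "\"");

        // prints: Description="\"TEST\""
        System.out.println("Description=\"" + escapeWithLookbehind(description) + "\"");
    }
}
```

The lookbehind version leaves already-escaped quotes untouched, but it cannot tell an escaped quote apart from a quote that follows an escaped backslash, so it is only a starting point for a real fix.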
