END and POS do not agree #4447

Open
asylvz opened this issue Nov 18, 2024 · 0 comments

1. What were you trying to do?
I'm trying to add the 1000 Genomes variants to the GRCh38 reference genome. I merged the per-chromosome VCFs from the link below into a single VCF with bcftools, then ran:

${vg} construct -f -S -a -r ${ref} -v ${vars} > merged_graph.vg

ref= https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz

vars= https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20220422_3202_phased_SNV_INDEL_SV/

2. What did you want to happen?
Generate a .vg file with the 1000 Genomes variants embedded (SNPs, indels and SVs).

3. What actually happened?
It seems to work, but it generates many warnings for INS, INV and INS:ME variants, such as:

Warning: insertion END and POS do not agree (complex insertions not canonicalizeable) [canonicalize] chr10	34746632	HGSV_146800	C	<INS>	0	.	AC=79;AF=0.012336;CM=61.367;AN=6404;AN_EAS=1170;AN_AMR=980;AN_EUR=1266;AN_AFR=1786;AN_SAS=1202;AN_EUR_unrel=1006;AN_EAS_unrel=1008;AN_AMR_unrel=694;AN_SAS_unrel=978;AN_AFR_unrel=1322;AF_EAS=0;AF_AMR=0.00306122;AF_EUR=0;AF_AFR=0.0425532;AF_SAS=0;AF_EUR_unrel=0;MAF_EUR_unrel=0;AF_EAS_unrel=0;MAF_EAS_unrel=0;AF_AMR_unrel=0.00288184;MAF_AMR_unrel=0.00288184;AF_SAS_unrel=0;MAF_SAS_unrel=0;AF_AFR_unrel=0.0408472;MAF_AFR_unrel=0.0408472;AC_EAS=0;AC_AMR=3;AC_EUR=0;AC_AFR=76;AC_SAS=0;AC_EUR_unrel=0;AC_EAS_unrel=0;AC_AMR_unrel=2;AC_SAS_unrel=0;AC_AFR_unrel=54;AC_Het_EAS=0;AC_Het_AMR=3;AC_Het_EUR=0;AC_Het_AFR=76;AC_Het_SAS=0;AC_Het_EUR_unrel=0;AC_Het_EAS_unrel=0;AC_Het_AMR_unrel=2;AC_Het_SAS_unrel=0;AC_Het_AFR_unrel=54;AC_Het=79;AC_Hom_EAS=0;AC_Hom_AMR=0;AC_Hom_EUR=0;AC_Hom_AFR=0;AC_Hom_SAS=0;AC_Hom_EUR_unrel=0;AC_Hom_EAS_unrel=0;AC_Hom_AMR_unrel=0;AC_Hom_SAS_unrel=0;AC_Hom_AFR_unrel=0;AC_Hom=0;HWE_EAS=1;ExcHet_EAS=1;HWE_AMR=1;ExcHet_AMR=0.996936;HWE_EUR=1;ExcHet_EUR=1;HWE_AFR=0.399721;ExcHet_AFR=0.188865;HWE_SAS=1;ExcHet_SAS=1;HWE=1;ExcHet=0.614413;END=34746652;SVTYPE=INS;SVLEN=313;CHR2=chr10;ALGORITHMS=manta;SOURCE=gatksv;EVIDENCE=SR;SPAN=313
END: 34746652  POS: 34746632
Warning: insertion END and POS do not agree (complex insertions not canonicalizeable) [canonicalize] chr10	35049936	HGSV_146825	T	<INS:ME:SVA>	0	.	AC=8;AF=0.00124922;CM=61.5853;AN=6404;AN_EAS=1170;AN_AMR=980;AN_EUR=1266;AN_AFR=1786;AN_SAS=1202;AN_EUR_unrel=1006;AN_EAS_unrel=1008;AN_AMR_unrel=694;AN_SAS_unrel=978;AN_AFR_unrel=1322;AF_EAS=0;AF_AMR=0.00306122;AF_EUR=0;AF_AFR=0.00279955;AF_SAS=0;AF_EUR_unrel=0;MAF_EUR_unrel=0;AF_EAS_unrel=0;MAF_EAS_unrel=0;AF_AMR_unrel=0.00432277;MAF_AMR_unrel=0.00432277;AF_SAS_unrel=0;MAF_SAS_unrel=0;AF_AFR_unrel=0.00226929;MAF_AFR_unrel=0.00226929;AC_EAS=0;AC_AMR=3;AC_EUR=0;AC_AFR=5;AC_SAS=0;AC_EUR_unrel=0;AC_EAS_unrel=0;AC_AMR_unrel=3;AC_SAS_unrel=0;AC_AFR_unrel=3;AC_Het_EAS=0;AC_Het_AMR=3;AC_Het_EUR=0;AC_Het_AFR=5;AC_Het_SAS=0;AC_Het_EUR_unrel=0;AC_Het_EAS_unrel=0;AC_Het_AMR_unrel=3;AC_Het_SAS_unrel=0;AC_Het_AFR_unrel=3;AC_Het=8;AC_Hom_EAS=0;AC_Hom_AMR=0;AC_Hom_EUR=0;AC_Hom_AFR=0;AC_Hom_SAS=0;AC_Hom_EUR_unrel=0;AC_Hom_EAS_unrel=0;AC_Hom_AMR_unrel=0;AC_Hom_SAS_unrel=0;AC_Hom_AFR_unrel=0;AC_Hom=0;HWE_EAS=1;ExcHet_EAS=1;HWE_AMR=1;ExcHet_AMR=0.996936;HWE_EUR=1;ExcHet_EUR=1;HWE_AFR=1;ExcHet_AFR=0.994402;HWE_SAS=1;ExcHet_SAS=1;HWE=1;ExcHet=0.995632;END=35049987;SVTYPE=INS;SVLEN=873;CHR2=chr10;ALGORITHMS=melt;SOURCE=gatksv;EVIDENCE=SR;SPAN=873
Warning: inversion SVLEN specifies nonzero length change (complex inversions not canonicalizeable) [canonicalize] chr10	77265130	HGSV_150485	C	<INV>	0	.	AC=5;AF=0.000780762;CM=100.464;AN=6404;AN_EAS=1170;AN_AMR=980;AN_EUR=1266;AN_AFR=1786;AN_SAS=1202;AN_EUR_unrel=1006;AN_EAS_unrel=1008;AN_AMR_unrel=694;AN_SAS_unrel=978;AN_AFR_unrel=1322;AF_EAS=0;AF_AMR=0.00204082;AF_EUR=0.000789889;AF_AFR=0.00111982;AF_SAS=0;AF_EUR_unrel=0.000994036;MAF_EUR_unrel=0.000994036;AF_EAS_unrel=0;MAF_EAS_unrel=0;AF_AMR_unrel=0.00144092;MAF_AMR_unrel=0.00144092;AF_SAS_unrel=0;MAF_SAS_unrel=0;AF_AFR_unrel=0.00075643;MAF_AFR_unrel=0.00075643;AC_EAS=0;AC_AMR=2;AC_EUR=1;AC_AFR=2;AC_SAS=0;AC_EUR_unrel=1;AC_EAS_unrel=0;AC_AMR_unrel=1;AC_SAS_unrel=0;AC_AFR_unrel=1;AC_Het_EAS=0;AC_Het_AMR=2;AC_Het_EUR=1;AC_Het_AFR=2;AC_Het_SAS=0;AC_Het_EUR_unrel=1;AC_Het_EAS_unrel=0;AC_Het_AMR_unrel=1;AC_Het_SAS_unrel=0;AC_Het_AFR_unrel=1;AC_Het=5;AC_Hom_EAS=0;AC_Hom_AMR=0;AC_Hom_EUR=0;AC_Hom_AFR=0;AC_Hom_SAS=0;AC_Hom_EUR_unrel=0;AC_Hom_EAS_unrel=0;AC_Hom_AMR_unrel=0;AC_Hom_SAS_unrel=0;AC_Hom_AFR_unrel=0;AC_Hom=0;HWE_EAS=1;ExcHet_EAS=1;HWE_AMR=1;ExcHet_AMR=0.998979;HWE_EUR=1;ExcHet_EUR=1;HWE_AFR=1;ExcHet_AFR=0.99944;HWE_SAS=1;ExcHet_SAS=1;HWE=1;ExcHet=0.998439;END=77266093;SVTYPE=INV;SVLEN=963;SOURCE=svtools;SPAN=963
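
To gauge how many records are affected, here is a minimal Python sketch that flags VCF records matching the two conditions named in the warnings above. The conditions are inferred only from the warning text (an insertion whose END differs from POS, and an inversion with a nonzero SVLEN); this is not vg's actual canonicalization code, and the helper names `parse_info` and `classify` are hypothetical. The example records keep the real coordinates from the warnings, but their INFO columns are trimmed to the relevant keys.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def classify(record_line):
    """Label a tab-separated VCF record if it matches one of the two
    warning conditions (as inferred from the warning text), else None."""
    cols = record_line.rstrip("\n").split("\t")
    pos = int(cols[1])
    info = parse_info(cols[7])
    svtype = info.get("SVTYPE")
    # Assumed condition 1: insertion whose END is not equal to POS.
    if svtype == "INS" and "END" in info and int(info["END"]) != pos:
        return "INS: END != POS"
    # Assumed condition 2: inversion that declares a nonzero SVLEN.
    if svtype == "INV" and "SVLEN" in info:
        if int(str(info["SVLEN"]).split(",")[0]) != 0:
            return "INV: SVLEN != 0"
    return None


# Records from the warnings above, with INFO trimmed to the relevant keys.
ins = "chr10\t34746632\tHGSV_146800\tC\t<INS>\t0\t.\tEND=34746652;SVTYPE=INS;SVLEN=313"
inv = "chr10\t77265130\tHGSV_150485\tC\t<INV>\t0\t.\tEND=77266093;SVTYPE=INV;SVLEN=963"

print(classify(ins))
print(classify(inv))
```

Applying `classify` to every non-header line of the merged VCF and tallying the labels would show how many SV records fall under each condition.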

5. What data and command can the vg dev team use to make the problem happen?

${vg} construct -f -S -a -r ${ref} -v ${vars} > merged_graph.vg

6. What does running vg version say?

vg 1.61.0