diff --git a/doc/bcftools.1 b/doc/bcftools.1
index fcbbcf2f..480ee9c3 100644
--- a/doc/bcftools.1
+++ b/doc/bcftools.1
@@ -1,13 +1,13 @@
'\" t
.\" Title: bcftools
.\" Author: [see the "AUTHOR(S)" section]
-.\" Generator: Asciidoctor 2.0.16.dev
-.\" Date: 2024-09-12
+.\" Generator: Asciidoctor 2.0.15.dev
+.\" Date: 2024-12-16
.\" Manual: \ \&
.\" Source: \ \&
.\" Language: English
.\"
-.TH "BCFTOOLS" "1" "2024-09-12" "\ \&" "\ \&"
+.TH "BCFTOOLS" "1" "2024-12-16" "\ \&" "\ \&"
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.ss \n[.ss] 0
@@ -51,7 +51,7 @@ standard input (stdin) and outputs to the standard output (stdout). Several
commands can thus be combined with Unix pipes.
.SS "VERSION"
.sp
-This manual page was last updated \fB2024\-09\-12\fP and refers to bcftools git version \fB1.21\fP.
+This manual page was last updated \fB2024\-12\-16 09:31 GMT\fP and refers to bcftools git version \fB1.21\-58\-g6559a12a+\fP.
.SS "BCF1"
.sp
The obsolete BCF1 format output by versions of samtools <= 0.1.19 is \fBnot\fP
@@ -422,7 +422,6 @@ abbreviation of "\fB\-c\fP \fIindels\fP\~ \fB\-c\fP \fIsnps\fP"
\fIid\fP
.RS 4
only records with identical ID column are compatible.
-Supported by \fBbcftools merge\fP only.
.RE
.RE
.sp
@@ -596,7 +595,7 @@ Such a file can be easily created from a VCF using:
.if n .RS 4
.nf
.fam C
- bcftools query \-f\*(Aq%CHROM\(rst%POS\(rst%REF,%ALT\(rsn\*(Aq file.vcf | bgzip \-c > als.tsv.gz && tabix \-s1 \-b2 \-e2 als.tsv.gz
+ bcftools query \-f\(aq%CHROM\(rst%POS\(rst%REF,%ALT\(rsn\(aq file.vcf | bgzip \-c > als.tsv.gz && tabix \-s1 \-b2 \-e2 als.tsv.gz
.fam
.fi
.if n .RE
@@ -745,7 +744,7 @@ See also the \fB\-l, \-\-merge\-logic\fP option.
^INFO/TAG .. transfer all INFO annotations except "TAG"
TAG .. add or overwrite existing target value if source is not "." and skip otherwise
- +TAG .. add or overwrite existing target value only it is "."
+ +TAG .. add or overwrite existing target value only if it is "."
.TAG .. add or overwrite existing target value even if source is "."
.+TAG .. add new but never overwrite existing tag, regardless of its value; can transfer "." if target does not exist
\-TAG .. overwrite existing value, never add new if target does not exist
@@ -805,7 +804,7 @@ one can use
.if n .RS 4
.nf
.fam C
- bcftools annotate \-\-set\-id +\*(Aq%CHROM\(rs_%POS\(rs_%REF\(rs_%FIRST_ALT\*(Aq file.vcf
+ bcftools annotate \-\-set\-id +\(aq%CHROM\(rs_%POS\(rs_%REF\(rs_%FIRST_ALT\(aq file.vcf
.fam
.fi
.if n .RE
@@ -825,13 +824,13 @@ file dynamically for each record:
.if n .RS 4
.nf
.fam C
- # The field \*(AqSTR\*(Aq from the \-a file is required to match INFO/TAG in VCF. In the first example
+ # The field \(aqSTR\(aq from the \-a file is required to match INFO/TAG in VCF. In the first example
# the alleles REF,ALT must match, in the second example they are ignored. The option \-k is required
# to output also records that are not annotated. The third example shows the same concept with
# a numerical expression.
- bcftools annotate \-a annots.tsv.gz \-c CHROM,POS,REF,ALT,SCORE,~STR \-i\*(AqTAG={STR}\*(Aq \-k input.vcf
- bcftools annotate \-a annots.tsv.gz \-c CHROM,POS,\-,\-,SCORE,~STR \-i\*(AqTAG={STR}\*(Aq \-k input.vcf
- bcftools annotate \-a annots.tsv.gz \-c CHROM,POS,\-,\-,SCORE,~INT \-i\*(AqTAG>{INT}\*(Aq \-k input.vcf
+ bcftools annotate \-a annots.tsv.gz \-c CHROM,POS,REF,ALT,SCORE,~STR \-i\(aqTAG={STR}\(aq \-k input.vcf
+ bcftools annotate \-a annots.tsv.gz \-c CHROM,POS,\-,\-,SCORE,~STR \-i\(aqTAG={STR}\(aq \-k input.vcf
+ bcftools annotate \-a annots.tsv.gz \-c CHROM,POS,\-,\-,SCORE,~INT \-i\(aqTAG>{INT}\(aq \-k input.vcf
.fam
.fi
.if n .RE
@@ -862,7 +861,7 @@ This is an experimental feature.
annotate sites which are present ("+") or absent ("\-") in the \fB\-a\fP file with a new INFO/TAG flag
.RE
.sp
-\fB\-\-min\-overlap\fP \fIANN\fP:\*(AqVCF\*(Aq
+\fB\-\-min\-overlap\fP \fIANN\fP:\(aqVCF\(aq
.RS 4
minimum overlap required as a fraction of the variant in the annotation \fB\-a\fP file (\fIANN\fP), in the
target VCF file (\fI:VCF\fP), or both for reciprocal overlap (\fIANN:VCF\fP).
@@ -886,7 +885,7 @@ see \fBCommon Options\fP
see \fBCommon Options\fP
.RE
.sp
-\fB\-\-pair\-logic\fP \fIsnps\fP|\fIindels\fP|\fIboth\fP|\fIall\fP|\fIsome\fP|\fIexact\fP
+\fB\-\-pair\-logic\fP \fIsnps\fP|\fIindels\fP|\fIboth\fP|\fIall\fP|\fIsome\fP|\fIexact\fP|\fIid\fP
.RS 4
Controls how to match records from the annotation file to the target VCF.
Effective only when \fB\-a\fP is a VCF or BCF. The option replaces the former
@@ -1139,10 +1138,10 @@ workflow looks like this:
.nf
.fam C
# Extract AN,AC values from an existing VCF, such 1000Genomes
- bcftools query \-f\*(Aq%CHROM\(rst%POS\(rst%REF\(rst%ALT\(rst%AN\(rst%AC\(rsn\*(Aq 1000Genomes.bcf | bgzip \-c > AFs.tab.gz
+ bcftools query \-f\(aq%CHROM\(rst%POS\(rst%REF\(rst%ALT\(rst%AN\(rst%AC\(rsn\(aq 1000Genomes.bcf | bgzip \-c > AFs.tab.gz
# If the tags AN,AC are not already present, use the +fill\-tags plugin
- bcftools +fill\-tags 1000Genomes.bcf | bcftools query \-f\*(Aq%CHROM\(rst%POS\(rst%REF\(rst%ALT\(rst%AN\(rst%AC\(rsn\*(Aq | bgzip \-c > AFs.tab.gz
+ bcftools +fill\-tags 1000Genomes.bcf | bcftools query \-f\(aq%CHROM\(rst%POS\(rst%REF\(rst%ALT\(rst%AN\(rst%AC\(rsn\(aq | bgzip \-c > AFs.tab.gz
tabix \-s1 \-b2 \-e2 AFs.tab.gz
# Create a VCF header description, here we name the tags REF_AN,REF_AC
@@ -2154,7 +2153,7 @@ An example of a minimal working GFF file:
.fam C
# The program looks for "CDS", "exon", "three_prime_UTR" and "five_prime_UTR" lines,
# looks up their parent transcript (determined from the "Parent=transcript:" attribute),
- # the gene (determined from the transcript\*(Aqs "Parent=gene:" attribute), and the biotype
+ # the gene (determined from the transcript\(aqs "Parent=gene:" attribute), and the biotype
# (the most interesting is "protein_coding").
#
# Empty and commented lines are skipped, the following GFF columns are required
@@ -2339,7 +2338,7 @@ one of "tbi" or "csi" depending on output file format.
# %TBCSQ{0} .. print the first haplotype only
# %TBCSQ{1} .. print the second haplotype only
# %TBCSQ{*} .. print a list of unique consequences present in either haplotype
- bcftools query \-f\*(Aq[%CHROM\(rst%POS\(rst%SAMPLE\(rst%TBCSQ\(rsn]\*(Aq out.bcf
+ bcftools query \-f\(aq[%CHROM\(rst%POS\(rst%SAMPLE\(rst%TBCSQ\(rsn]\(aq out.bcf
.fam
.fi
.if n .RE
@@ -2418,7 +2417,7 @@ exclude sites for which \fIEXPRESSION\fP is true. For valid expressions see
\fBEXPRESSIONS\fP.
.RE
.sp
-\fB\-g, \-\-SnpGap\fP \fIINT\fP[:\*(Aqindel\*(Aq,\fImnp\fP,\fIbnd\fP,\fIother\fP,\fIoverlap\fP]
+\fB\-g, \-\-SnpGap\fP \fIINT\fP[:\(aqindel\(aq,\fImnp\fP,\fIbnd\fP,\fIother\fP,\fIoverlap\fP]
.RS 4
filter SNPs within \fIINT\fP base pairs of an indel or other other variant type. The following example
demonstrates the logic of \fB\-\-SnpGap\fP \fI3\fP applied on a deletion and
@@ -2584,7 +2583,7 @@ in\-memory sorting and DIR is the temporary directory for external sorting. This
Stop after first record to estimate required time.
.RE
.sp
-\fB\-e, \-\-exclude\fP [\fIqry\fP|\fIgt\fP]:\*(AqEXPRESSION\*(Aq
+\fB\-e, \-\-exclude\fP [\fIqry\fP|\fIgt\fP]:\(aqEXPRESSION\(aq
.RS 4
Exclude sites from query file (\fIqry:\fP) or genotype file (\fIgt:\fP) for which \fIEXPRESSION\fP is true.
For valid expressions see \fBEXPRESSIONS\fP.
@@ -2626,7 +2625,7 @@ VCF/BCF file with reference genotypes to compare against
Homozygous genotypes only, useful with low coverage data (requires \fB\-g, \-\-genotypes\fP)
.RE
.sp
-\fB\-i, \-\-include\fP [\fIqry\fP|\fIgt\fP]:\*(AqEXPRESSION\*(Aq
+\fB\-i, \-\-include\fP [\fIqry\fP|\fIgt\fP]:\(aqEXPRESSION\(aq
.RS 4
Include sites from query file (\fIqry:\fP) or genotype file (\fIgt:\fP) for which \fIEXPRESSION\fP is true.
For valid expressions see \fBEXPRESSIONS\fP.
@@ -2674,7 +2673,7 @@ from the query file, the second from the genotypes file when \fB\-g\fP is given
Restrict to comma\-separated list of regions, see \fBCommon Options\fP
.RE
.sp
-*\-R, \-\-regions\-file\*(Aq \fIFILE\fP
+*\-R, \-\-regions\-file\(aq \fIFILE\fP
.RS 4
Restrict to regions listed in a file, see \fBCommon Options\fP
.RE
@@ -2684,11 +2683,11 @@ Restrict to regions listed in a file, see \fBCommon Options\fP
see \fBCommon Options\fP
.RE
.sp
-\fB\-s, \-\-samples\fP [\fIqry\fP|\fIgt\fP]:\*(AqLIST\*(Aq:
+\fB\-s, \-\-samples\fP [\fIqry\fP|\fIgt\fP]:\(aqLIST\(aq:
List of query samples or \fB\-g\fP samples. If neither \fB\-s\fP nor \fB\-S\fP are given, all possible sample
pair combinations are compared
.sp
-\fB\-S, \-\-samples\-file\fP [\fIqry\fP|\fIgt\fP]:\*(AqFILE\*(Aq
+\fB\-S, \-\-samples\-file\fP [\fIqry\fP|\fIgt\fP]:\(aqFILE\(aq
File with the query or \fB\-g\fP samples to compare. If neither \fB\-s\fP nor \fB\-S\fP are given, all possible sample
pair combinations are compared
.sp
@@ -2837,7 +2836,7 @@ on the options, the program can output records from one (or more) files
which have (or do not have) corresponding records with the same position
in the other files.
.sp
-\fB\-c, \-\-collapse\fP \fIsnps\fP|\fIindels\fP|\fIboth\fP|\fIall\fP|\fIsome\fP|\fInone\fP
+\fB\-c, \-\-collapse\fP \fIsnps\fP|\fIindels\fP|\fIboth\fP|\fIall\fP|\fIsome\fP|\fInone\fP|\fIid\fP
.RS 4
see \fBCommon Options\fP
.RE
@@ -2956,7 +2955,7 @@ the files after filters have been applied
.if n .RS 4
.nf
.fam C
- bcftools isec \-e\*(AqMAF<0.01\*(Aq \-i\*(AqdbSNP=1\*(Aq \-e\- A.vcf.gz B.vcf.gz C.vcf.gz \-n +2 \-p dir
+ bcftools isec \-e\(aqMAF<0.01\(aq \-i\(aqdbSNP=1\(aq \-e\- A.vcf.gz B.vcf.gz C.vcf.gz \-n +2 \-p dir
.fam
.fi
.if n .RE
@@ -3105,7 +3104,7 @@ if two asterisks \fI**\fP are appended, the unobserved allele will be removed al
\-m both,* .. same as above but remove <*> (or
-This manual page was last updated 2024-09-12 and refers to bcftools git version 1.21.
+This manual page was last updated 2024-12-16 09:31 GMT and refers to bcftools git version 1.21-58-g6559a12a+.
only records with identical ID column are compatible.
-Supported by bcftools merge only.
only records with identical ID column are compatible.
see Common Options
Controls how to match records from the annotation file to the target VCF.
Effective only when -a is a VCF or BCF. The option replaces the former
@@ -2530,7 +2529,7 @@ see Common Options
DESCRIPTION
VERSION
Common Options
bcftools annotate [OPTIONS] FILE
^INFO/TAG .. transfer all INFO annotations except "TAG"
TAG .. add or overwrite existing target value if source is not "." and skip otherwise
- +TAG .. add or overwrite existing target value only it is "."
+ +TAG .. add or overwrite existing target value only if it is "."
.TAG .. add or overwrite existing target value even if source is "."
.+TAG .. add new but never overwrite existing tag, regardless of its value; can transfer "." if target does not exist
-TAG .. overwrite existing value, never add new if target does not exist
@@ -674,7 +673,7 @@ bcftools annotate [OPTIONS] FILE
bcftools isec [OPTIONS] A.vcf.gz B.vcf.gz
-
Examples:
bcftools mpileup -Ou -f ref.fa aln.bam | \
bcftools call -Ou -mv | \
- bcftools filter -s LowQual -e '%QUAL<20 || DP>100' > var.flt.vcf
+ bcftools filter -s LowQual -e 'QUAL<20 || DP>100' > var.flt.vcf
bcftools norm [OPTIONS] file.vcf.gz
cannot be stressed enough, that s will NOT fix strand issues in
your VCF, do NOT use it for that purpose!!! (Instead see
http://samtools.github.io/bcftools/howtos/plugin.af-dist.html and
-<http://samtools.github.io/bcftools/howtos/plugin.fixref.html>.)
If a record is present in multiple files, output only the first instance.
-Alias for -d none, deprecated.
+Alias for -d exact, deprecated.
collect AF deviation stats and GT probability distribution given AF and assuming HWE
assess site noisiness (allelic frequency score) from a large number of unaffected parental samples
+count the frequency of the length of REF, ALT and REF+ALT
@@ -4244,6 +4247,7 @@
variables calculated on the fly if not present: number of alternate alleles; number of samples; count of alternate alleles; minor allele count (similar to
-AC but is always smaller than 0.5); frequency of alternate alleles (AF=AC/AN);
+AC but always picks the allele with frequency smaller than 0.5); frequency of alternate alleles (AF=AC/AN);
frequency of minor alleles (MAF=MAC/AN); number of alleles in called genotypes; number of samples with missing genotype; fraction of samples with missing genotype; indel length (deletions negative, insertions positive, balanced substitutions zero)
@@ -5542,7 +5546,7 @@
-bcftools view -i '%ID!="." & MAF[0]<0.01'
+bcftools view -i 'ID!="." & MAF[0]<0.01'