Usage
Paula Ruiz Rodriguez edited this page Oct 9, 2024 · 1 revision
get_mnv [OPTIONS] --vcf <VCF_FILE> --fasta <FASTA_FILE> --genes <GENES_FILE>
- -v, --vcf <VCF_FILE>: VCF file containing the SNVs. (Required)
- -b, --bam <BAM_FILE>: BAM file with aligned reads. (Optional)
- -f, --fasta <FASTA_FILE>: FASTA file with the reference sequence. (Required)
- -g, --genes <GENES_FILE>: File containing gene information. (Required)
- -q, --quality: Minimum Phred quality score (default: 20). (Optional)
get_mnv \
--vcf variants.vcf \
--bam reads.bam \
--fasta reference.fasta \
--genes genes.txt \
--quality 30
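When the tool is called from a pipeline, the same invocation can be assembled programmatically. The following is a minimal Python sketch that uses only the options listed above; the helper name `build_get_mnv_command` is hypothetical, and it assumes `get_mnv` is available on the `PATH`.

```python
import shlex


def build_get_mnv_command(vcf, fasta, genes, bam=None, quality=None):
    """Assemble a get_mnv argument list from the documented options.

    Hypothetical helper: it only arranges the flags shown in the usage
    line and does not validate that the files exist.
    """
    cmd = ["get_mnv", "--vcf", vcf, "--fasta", fasta, "--genes", genes]
    if bam is not None:
        # --bam is optional, so it is only added when a path is given.
        cmd += ["--bam", bam]
    if quality is not None:
        # When --quality is omitted, the tool's documented default (20) applies.
        cmd += ["--quality", str(quality)]
    return cmd


cmd = build_get_mnv_command(
    "variants.vcf", "reference.fasta", "genes.txt", bam="reads.bam", quality=30
)
print(shlex.join(cmd))

# To actually run it (assumes get_mnv is installed and on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.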
- VCF File: Should contain the identified SNVs.
- BAM File: (Optional) Genomic reads aligned to the reference sequence.
- FASTA File: Reference genomic sequence.
- Gene File: A tab-delimited text file with one gene per line and four columns (GeneName, GeneStart, GeneEnd, Strand), for example:
Rv0007_Rv0007 9914 10828 +
ileT_Rvnt01 10887 10960 +
alaT_Rvnt02 11112 11184 +
Rv0008c_Rv0008c 11874 12311 -
ppiA_Rv0009 12468 13016 +
Rv0010c_Rv0010c 13133 13558 -
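Before running the tool, it can help to check that a gene file matches the four-column layout described above. The sketch below is one possible validator, not part of get_mnv itself; the names `Gene`, `parse_gene_line` and `read_genes` are hypothetical, and the checks that the strand is `+` or `-` and that the start does not exceed the end are assumptions drawn from the example rows.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int
    end: int
    strand: str


def parse_gene_line(line, line_no=0):
    """Parse one GeneName<TAB>GeneStart<TAB>GeneEnd<TAB>Strand line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        raise ValueError(
            f"line {line_no}: expected 4 tab-separated fields, got {len(fields)}"
        )
    name, start, end, strand = fields
    if strand not in ("+", "-"):
        # Assumption: only '+' and '-' are valid, as in the example rows.
        raise ValueError(f"line {line_no}: unexpected strand {strand!r}")
    start_i, end_i = int(start), int(end)  # raises ValueError if not numeric
    if start_i > end_i:
        # Assumption: start <= end on both strands, as in the example rows.
        raise ValueError(f"line {line_no}: start {start_i} is after end {end_i}")
    return Gene(name, start_i, end_i, strand)


def read_genes(lines):
    """Parse every non-blank line, numbering lines from 1 for error messages."""
    return [
        parse_gene_line(line, i)
        for i, line in enumerate(lines, start=1)
        if line.strip()
    ]


sample = [
    "Rv0007_Rv0007\t9914\t10828\t+\n",
    "Rv0008c_Rv0008c\t11874\t12311\t-\n",
]
genes = read_genes(sample)
for gene in genes:
    print(gene)
```

For a real file, `read_genes(open("genes.txt"))` would apply the same checks line by line.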
The program generates a TSV file named <vcf_filename>.MNV.tsv containing the following information:
- Gene: Name of the gene.
- Positions: Positions of the variants.
- Base Changes: Nucleotide base changes.
- AA Changes: Resulting amino acid changes.
- SNP AA Changes: Amino acid changes if considering individual SNVs.
- Variant Type: Type of variant (SNP, MNV, or SNP/MNV).
- Change Type: Type of change at the protein level (Synonymous, Non-synonymous, Stop gained).
- SNP Reads: (Only if a BAM file is provided) Count of reads supporting each SNP.
- MNV Reads: (Only if a BAM file is provided) Count of reads supporting the MNV.
Example:
| Gene | Positions | Base Changes | AA Changes | SNP AA Changes | Variant Type | Change Type | SNP Reads | MNV Reads |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Rv0095c_Rv0095c | 104838 | T | Asp126Glu | Asp126Glu | SNP | Non-synonymous | 0 | 16 |
| Rv0095c_Rv0095c | 104941,104942 | T,G | Gly92Gln | Gly92Glu; Gly92Arg | MNV | Non-synonymous | 0,0 | 25 |
| esxL_Rv1198 | 1341044 | C | His13His | His13His | SNP | Synonymous | 0 | 41 |
| esxL_Rv1198 | 1341083 | G | Ala26Ala | Ala26Ala | SNP | Synonymous | 0 | 12 |
| esxL_Rv1198 | 1341102,1341103 | T,C | Arg33Ser | Arg33Cys; Arg33Pro | MNV | Non-synonymous | 0,0 | 11 |
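The output can be post-processed with any TSV reader. As a rough sketch, the Python snippet below keeps only the MNV rows and splits the comma-separated positions into integers. It assumes the file is tab-delimited and that the header names match the example above exactly; the function name `read_mnv_rows` is hypothetical, and the inline sample reuses two rows from the example rather than real tool output.

```python
import csv
import io

# Two rows copied from the example above, joined with tabs.
sample_tsv = (
    "Gene\tPositions\tBase Changes\tAA Changes\tSNP AA Changes\t"
    "Variant Type\tChange Type\tSNP Reads\tMNV Reads\n"
    "Rv0095c_Rv0095c\t104838\tT\tAsp126Glu\tAsp126Glu\tSNP\t"
    "Non-synonymous\t0\t16\n"
    "Rv0095c_Rv0095c\t104941,104942\tT,G\tGly92Gln\tGly92Glu; Gly92Arg\tMNV\t"
    "Non-synonymous\t0,0\t25\n"
)


def read_mnv_rows(handle):
    """Yield rows whose Variant Type is exactly 'MNV'.

    Positions are converted from a comma-separated string to a list of ints;
    all other columns are left as strings.
    """
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["Variant Type"] == "MNV":
            row["Positions"] = [int(p) for p in row["Positions"].split(",")]
            yield row


mnvs = list(read_mnv_rows(io.StringIO(sample_tsv)))
for row in mnvs:
    print(row["Gene"], row["Positions"], row["AA Changes"])
```

To read a real result, replace the `io.StringIO` object with an open file handle on the generated `.MNV.tsv` file. Rows typed `SNP/MNV` are skipped by this filter because it matches `MNV` exactly; loosen the comparison if those rows are needed.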
PathoGenOmics