dsComputeGCCoverage
dsComputeGCCoverage calculates the GC content along a binned genome and outputs it as a bedGraph. Multiple genomes can be passed to this tool as fasta files (not necessarily indexed), and the bin size is provided by the user.
This tool is memory efficient: sequences are pulled bin by bin from the fasta files, so the entire fasta is never loaded into RAM. It is therefore unnecessary to split genome fasta files per chromosome to use this tool.
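The bin-by-bin idea can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the function name `gc_bins` is hypothetical, the sketch streams the fasta line by line rather than using an index, and it assumes that every character in a bin (including `N`) counts towards the denominator and that the last bin of a sequence may be shorter than the window.

```python
import io
from typing import Iterator, TextIO, Tuple


def gc_bins(fasta: TextIO, window: int = 1000) -> Iterator[Tuple[str, int, int, float]]:
    """Yield (chrom, start, end, gc_fraction) for consecutive bins of a fasta stream.

    Hypothetical helper: only the bases of the current bin are kept in memory.
    """
    chrom = None
    start = 0   # 0-based start coordinate of the current bin
    buf = ""    # bases collected so far for the current bin

    def gc_fraction(seq: str) -> float:
        # Assumption: all characters (including N) count in the denominator.
        seq = seq.upper()
        return (seq.count("G") + seq.count("C")) / len(seq)

    for line in fasta:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Flush the last, possibly shorter, bin of the previous sequence.
            if chrom is not None and buf:
                yield chrom, start, start + len(buf), gc_fraction(buf)
            parts = line[1:].split()
            chrom = parts[0] if parts else ""
            start, buf = 0, ""
            continue
        buf += line
        # Emit every complete bin contained in the buffer.
        while len(buf) >= window:
            yield chrom, start, start + window, gc_fraction(buf[:window])
            start += window
            buf = buf[window:]
    if chrom is not None and buf:
        yield chrom, start, start + len(buf), gc_fraction(buf)


# Small in-memory demonstration; a real call would pass an open fasta file.
demo = io.StringIO(">Chr1\nGGCC\nATAT\nGC\n")
for chrom, start, end, gc in gc_bins(demo, window=4):
    print(f"{chrom}\t{start}\t{end}\t{gc:.3f}")
```

Because the function is a generator, memory use depends on the window size and the fasta line length, not on the genome size.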
Command | Description |
---|---|
--input -i | Fasta files from which you want the GC content to be calculated. |
--windowSize -w | Size of the window used to binify the genome and calculate the GC content. Default: 1000. |
--output -o | bedGraph file(s) output prefix name(s) ('.bedGraph' is automatically added at the end of the given prefix, one bedGraph per input file). |
dsComputeGCCoverage -i data/genome.fa data/genome_mitochondria.fa data/genome_chloroplast.fa -w 100 -o results/genome results/mitochondria results/chloroplast
This command outputs three bedGraph files, one per input fasta, whose first lines resemble the following example:
Chr1 0 100 0.542
Chr1 100 200 0.657
Chr1 200 300 0.478
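As a rough illustration of how such output could be consumed downstream, the sketch below computes a length-weighted mean GC fraction per chromosome from bedGraph lines. The function name `mean_gc_per_chrom` is hypothetical and not part of deepStats; the sketch assumes four whitespace-separated columns (chromosome, start, end, GC fraction) and skips `track`, `browser`, and comment lines.

```python
from collections import defaultdict
from typing import Dict, Iterable


def mean_gc_per_chrom(lines: Iterable[str]) -> Dict[str, float]:
    """Length-weighted mean GC fraction per chromosome (hypothetical helper)."""
    gc_bases = defaultdict(float)  # estimated number of G/C bases per chromosome
    total = defaultdict(int)       # number of bases covered per chromosome
    for line in lines:
        fields = line.split()
        # Skip blank, header, and comment lines.
        if len(fields) < 4 or fields[0] in ("track", "browser") or line.startswith("#"):
            continue
        chrom = fields[0]
        start, end, value = int(fields[1]), int(fields[2]), float(fields[3])
        length = end - start
        gc_bases[chrom] += value * length
        total[chrom] += length
    return {c: gc_bases[c] / total[c] for c in total if total[c] > 0}


# The three example lines shown above; a real call would pass an open file.
example = [
    "Chr1 0 100 0.542",
    "Chr1 100 200 0.657",
    "Chr1 200 300 0.478",
]
print(mean_gc_per_chrom(example))
```

Weighting by bin length matters when the last bin of a sequence is shorter than the window size.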
Gautier RICHARD. (2019, August 6). gtrichard/deepStats: deepStats 0.3.1 (Version 0.3.1). Zenodo. http://doi.org/10.5281/zenodo.3361799