Commit
Update examples + check if window was already processed. (#36)
* update readmes
* update version number
* remove unused packages
* change default viloca mode
* b2w and inference: check if window was processed already before adding process to process list
* reference to manuscript
* add viloca conda package
1 parent b3498b3, commit bb554a7
Showing 10 changed files with 73 additions and 255 deletions.
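The commit message mentions checking whether a window was already processed before adding a process to the process list, but the corresponding code change is not visible in the diffs shown here. The following is only a minimal sketch of that idea, not the code from this commit. It assumes, hypothetically, that each finished window leaves a marker file in an output directory; the function name `pending_windows`, the `.done` suffix, and the window names are all invented for illustration.

```python
import tempfile
from pathlib import Path


def pending_windows(window_names, out_dir, suffix=".done"):
    """Return the windows that have no result marker in out_dir yet.

    Hypothetical helper: the marker-file convention is an assumption,
    not VILOCA's actual bookkeeping.
    """
    out_dir = Path(out_dir)
    return [
        name
        for name in window_names
        if not (out_dir / f"{name}{suffix}").exists()
    ]


def demo():
    """Mark one window as finished and list what is still left to schedule."""
    with tempfile.TemporaryDirectory() as tmp:
        # Pretend the first window was processed in an earlier run.
        (Path(tmp) / "w-HXB2-2469-2669.done").touch()
        windows = ["w-HXB2-2469-2669", "w-HXB2-2536-2736"]
        return pending_windows(windows, tmp)
```

Only the windows returned by such a filter would then be handed to the worker processes, so an interrupted run could be resumed without redoing finished windows.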
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,9 @@ | ||
[tool.poetry] | ||
name = "VILOCA" | ||
version = "0.1.0" | ||
description = "SHOrt Reads Assembly into Haplotypes" | ||
version = "1.0.0" | ||
description = "VIral LOcal haplotype reconstruction and mutation CAlling for short and long read data" | ||
license = "GPL-3.0-only" | ||
authors = ["Benjamin Langer <[email protected]>, Lara Fuhrmann <[email protected]>"] | ||
authors = ["Ivan Topolsky", "Benjamin Langer <[email protected]>, Lara Fuhrmann <[email protected]>"] | ||
build = "build.py" | ||
packages = [ | ||
{ include = "viloca" } | ||
|
@@ -18,9 +18,7 @@ biopython = "^1.79" | |
numpy = "^1.21.4" | ||
pysam = "^0.18.0" | ||
pybind11 = "^2.9.0" | ||
PyYAML = "^6.0" | ||
scipy = "^1.7.3" | ||
bio = "^1.3.3" | ||
pandas = "^1.3.5" | ||
|
||
[tool.poetry.dev-dependencies] | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
#!/bin/bash | ||
|
||
viloca run -a 0.1 -w 201 -x 100000 -p 0.9 -c 0 \ | ||
viloca run -a 0.1 -w 201 --mode shorah -x 100000 -p 0.9 -c 0 \ | ||
-r HXB2:2469-3713 -R 42 -f test_ref.fasta -b test_aln.cram --out_format csv "$@" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,9 @@ | ||
### Test to check fil.cpp implementation accounting for long deletions | ||
|
||
Test files `SNV.txt` and `SNVs_0.010000.txt` are obtained by running `shorah shutgun`, e.g: | ||
Test files `SNV.txt` and `SNVs_0.010000.txt` are obtained by running `viloca run`, e.g: | ||
|
||
``` | ||
shorah shotgun -a 0.1 -w 42 -x 100000 -p 0.9 -c 0 -r REF:42-272 -R 42 -b test_aln.cram -f ref.fasta | ||
viloca run -a 0.1 -w 42 -x 100000 -p 0.9 -c 0 -r REF:42-272 -R 42 -b test_aln.cram -f ref.fasta | ||
``` | ||
|
||
The test script `test_long_deletions.py` uses `pysam` and `NumPy`, which can be installed using pip or conda. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,27 +1,18 @@ | ||
### Sample files to test `shorah shotgun` | ||
### Sample files to test `VILOCA` | ||
|
||
Use files in this directory to test shorah in shotgun mode. The reads data have been generated with V-pipe's benchmarking framework (simulated with parameters: ```seq_tech~illumina__seq_mode~shotgun__seq_mode_param~nan__read_length~90__genome_size~90__coverage~100__haplos~5@5@10@5@10@[email protected]```) | ||
|
||
The reads are from one single amplicon of length 90, meaning the reference is of length 90 and each read is of length 90bps. | ||
|
||
To run ShoRAH's original Gibbs sampler use the following command: | ||
``` | ||
poetry run shorah shotgun -f reference.fasta -b reads.shotgun.bam -w 90 --sampler shorah | ||
``` | ||
or | ||
``` | ||
poetry run shorah shotgun -f reference.fasta -b reads.shotgun.bam -z scheme.insert.bed --sampler shorah | ||
``` | ||
|
||
To use the new inference method using the sequencing quality scores use: | ||
``` | ||
poetry run shorah shotgun -f reference.fasta -b reads.shotgun.bam -w 90 --sampler use_quality_scores --alpha 0.0001 --n_max_haplotypes 100 --n_mfa_starts 1 --conv_thres 0.0001 | ||
viloca run -f reference.fasta -b reads.shotgun.bam -w 90 --mode use_quality_scores | ||
``` | ||
To use the new inference method learning the sequencing error parameter: | ||
To use the model that is learning the sequencing error parameter: | ||
``` | ||
poetry run shorah shotgun -f reference.fasta -b reads.shotgun.bam -w 90 --sampler -learn_error_params --alpha 0.0001 --n_max_haplotypes 100 --n_mfa_starts 1 --conv_thres 0.0001 | ||
viloca run -f reference.fasta -b reads.shotgun.bam -w 90 --mode -learn_error_params | ||
``` | ||
|
||
In the new inference method reads are filtered (and weighted respectively) such that only a set of unique reads are processed. This mode can be switch off by setting the parameter `--non-unique_modus`, e.g.: | ||
To run VILOCA with the insert file run: | ||
``` | ||
viloca run -f reference.fasta -b reads.shotgun.bam -w 90 --mode use_quality_scores -z scheme.insert.bed | ||
``` | ||
poetry run shorah shotgun -f reference.fasta -b reads.shotgun.bam -w 90 --sampler -learn_error_params --alpha 0.0001 --n_max_haplotypes 100 --n_mfa_starts 1 --conv_thres 0.0001 --non-unique_modus |