Releases · Gabaldonlab/jloh

12 Apr 14:14

MatteoSchiavinato

v0.20.0

f4d2195

v0.20.0

Log of changes:

Renamed --hybrid to --assign-blocks in jloh extract to make the parameter's function easier to understand
Fixed issue in stats when having zero homozygous SNPs, which would assign everything to "homo"
Fixed issue in stats which would return "nan" for quantiles when having no homozygous SNP
Fixed issue in stats which was returning 0 SNPs as a threshold; the minimum is now set to 1
Minor changes in the stderr readout of jloh extract, increased readability
Changed how pybedtools handles the tmp folder. The folder name now includes an alphanumeric string, to avoid overwriting tmp files when two jloh instances work in the same tmpdir (shell issue).
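The tmp folder change in the last entry can be illustrated with a minimal sketch. This is not the code used in jloh: the function name `make_unique_tmpdir`, the `tmp_` prefix and the suffix length are assumptions made for illustration only.

```python
import os
import secrets
import string
import tempfile


def make_unique_tmpdir(base_dir=None, length=12):
    """Create <base_dir>/tmp_<random alphanumeric string> and return its path.

    Hypothetical helper: the name, prefix and suffix length are illustrative
    and are not taken from the jloh source code.
    """
    if base_dir is None:
        base_dir = tempfile.gettempdir()
    alphabet = string.ascii_letters + string.digits
    while True:
        suffix = "".join(secrets.choice(alphabet) for _ in range(length))
        path = os.path.join(base_dir, "tmp_" + suffix)
        try:
            os.makedirs(path)  # raises FileExistsError if the path is taken
            return path
        except FileExistsError:
            continue  # very unlikely collision: draw a new suffix
```

Two processes calling this helper with the same base directory get two different folders, so neither overwrites the other's files. The returned path could then be handed to pybedtools, for example with `pybedtools.set_tempdir(path)`.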


15 Mar 14:31

MatteoSchiavinato

v0.19.0

a3d59cf

v0.19.0

Log of changes:

Updated g2g module. Now it presents output as BED file, and filters it by minimum length of g2g block.
Changed the algorithm of the sim module so that the simulation more closely resembles a real genome. The genome is broken into haplotypes of variable size, sampled from a distribution centered on a mean that the user can set with --mean-haplotype-size. Each haplotype is assigned a divergence rate which, like the haplotype size, is sampled from a distribution centered on the value passed to --divergence. Then, based on the fraction declared with --loh, some of these haplotypes are assigned divergence = 0, and the algorithm adjusts the divergence rate of the non-LOH haplotypes to bring the overall average back to --divergence. Finally, random mutations are introduced in each haplotype according to its assigned divergence rate.
Added GPL3.0 licence
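The revised sim algorithm described in this release can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual jloh sim code: the changelog does not name the distributions, so a Gaussian with a standard deviation of a quarter of the mean is assumed here, the LOH fraction is applied to the number of haplotypes rather than to bases, and the average is taken as length-weighted. The final step (introducing the mutations) is left out.

```python
import random


def simulate_haplotype_divergence(genome_length, mean_haplotype_size,
                                  divergence, loh_fraction, seed=0):
    """Return a list of (start, end, divergence) tuples covering the genome.

    Hypothetical sketch: distribution shapes and the rescaling rule are
    assumptions, not the documented behaviour of jloh sim.
    """
    rng = random.Random(seed)

    # 1. Break the genome into haplotypes of variable size.
    haplotypes = []
    start = 0
    while start < genome_length:
        size = max(1, int(rng.gauss(mean_haplotype_size,
                                    mean_haplotype_size / 4)))
        end = min(start + size, genome_length)
        haplotypes.append([start, end, 0.0])
        start = end

    # 2. Assign each haplotype a divergence sampled around the target value.
    for hap in haplotypes:
        hap[2] = max(0.0, rng.gauss(divergence, divergence / 4))

    # 3. Turn a fraction of the haplotypes into LOH (divergence = 0).
    n_loh = int(round(loh_fraction * len(haplotypes)))
    for i in rng.sample(range(len(haplotypes)), n_loh):
        haplotypes[i][2] = 0.0

    # 4. Rescale the non-LOH haplotypes so that the length-weighted
    #    average divergence returns to the target value.
    weighted = sum((e - s) * d for s, e, d in haplotypes)
    if weighted > 0:
        scale = divergence * genome_length / weighted
        for hap in haplotypes:
            hap[2] *= scale  # LOH haplotypes stay at 0

    return [tuple(hap) for hap in haplotypes]
```

With this rescaling, the non-LOH haplotypes end up more divergent than the target on average, which compensates for the LOH haplotypes carrying no divergence at all.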


23 Nov 10:56

MatteoSchiavinato

v0.18.0

97807eb

v0.18.0

Log of changes:

Now using --min-snps-kbp instead of a simple --min-snps. This is because a small block with 10 SNPs is much more SNP-dense than a 10 kbp block with only 10 SNPs. This applies to jloh stats and jloh extract.
fixed help section of jloh stats which still said "density"
removed the --snp-distance parameter. It is now inferred from the values of --min-snps-kbp. This reduces the parameters for the user, making it a more comfortable tool to use.
added the --skip-trimming parameter to the nextflow workflow
made jloh stats multithreaded
changed some syntax in jloh stats to make it compatible with python from version 3.6 instead of from 3.9
Fixed the usage of jloh extract in the workflows, according to the modifications listed above
Added a step where the bam file is split into separate files by chromosome, which made the code run ~10x faster
Changed default suggested --min-snps-kbp parameter setting from 5% quantile to 50% quantile in the jloh stats module, following our findings reported in the manuscript
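The reasoning behind --min-snps-kbp in the first entry of this release can be made concrete with a small calculation. The function names below are hypothetical and the way jloh applies the threshold internally is an assumption; only the SNPs-per-kbp arithmetic is taken from the entry.

```python
def snps_per_kbp(n_snps, block_length_bp):
    """SNP density of a block, expressed in SNPs per 1000 bp."""
    if block_length_bp <= 0:
        raise ValueError("block length must be positive")
    return n_snps / (block_length_bp / 1000)


def passes_min_snps_kbp(n_snps, block_length_bp, min_snps_kbp):
    """True if the block is at least as SNP-dense as the threshold.

    Hypothetical helper: the comparison used by jloh itself may differ.
    """
    return snps_per_kbp(n_snps, block_length_bp) >= min_snps_kbp


# Ten SNPs mean very different things depending on block size:
small_block = snps_per_kbp(10, 500)     # 20.0 SNPs/kbp
large_block = snps_per_kbp(10, 10000)   # 1.0 SNPs/kbp
```

A raw count threshold of 10 SNPs would treat both blocks the same, whereas a density threshold of, say, 5 SNPs/kbp keeps the first and rejects the second.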


17 Aug 08:37

MatteoSchiavinato

v0.17.0

6b47ad2

v0.17.0

Log of changes:

Added the intersect module to perform intersections/removals with the output files of two runs (similar to bedtools)
Added the chimeric module to find genes harboring chimeras between two different haplotypes
Added the junctions module to calculate statistics on neighboring blocks from different origins (REF, ALT, or simply from two different calls)
Fixed a small bug in the default mode, that was not producing the heterozygous BED file.
Modified the density module to produce more values. Specifically, besides the mean SNP density, it calculates a distribution and extracts the quantiles. It does the same for SNP distances. These values are useful as thresholds in jloh extract, so the module also provides an estimate of which parameter values would fit best. The module has been renamed to jloh stats.
Modified the --min-snps and --snp-distance parameters so that two values have to be passed to each, one for heterozygous and one for homozygous SNPs. These values can be estimated with jloh stats, or passed by the user.
All heterozygous blocks within --min-length - 1 from each other are now merged before the generation of REF and ALT LOH blocks.
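The merging of heterozygous blocks in the last entry can be sketched as a standard interval merge. This is an illustration, not the jloh implementation: the function name is hypothetical, blocks are assumed to be 0-based half-open `(chrom, start, end)` tuples, and "within --min-length - 1" is read here as a gap of at most `min_length - 1` bp.

```python
def merge_close_blocks(blocks, min_length):
    """Merge same-chromosome blocks separated by at most min_length - 1 bp.

    Hypothetical sketch: whether jloh treats the limit as inclusive is an
    assumption.
    """
    max_gap = min_length - 1
    merged = []
    for chrom, start, end in sorted(blocks):
        previous = merged[-1] if merged else None
        if (previous is not None and previous[0] == chrom
                and start - previous[2] <= max_gap):
            # Close enough to the previous block: extend it.
            previous[2] = max(previous[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(block) for block in merged]
```

For example, with `min_length=100`, the blocks `("chr1", 0, 100)` and `("chr1", 150, 300)` are 50 bp apart and become a single block `("chr1", 0, 300)`, while a block starting at position 2000 stays separate.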


06 May 14:31

MatteoSchiavinato

v0.16.2

8d24517

v0.16.2

Log of changes:

Fixed bug in printing the *tsv file that was generating two lines per record on the B subgenome in a few cases
Fixed small bugs in the run_with_real_data.nf workflow script
Fixed small bug in jloh sim


04 Apr 13:56

MatteoSchiavinato

v0.16.0

b670bdf

v0.16.0

Log of changes:

Updated nextflow workflow with --hybrid mode
Fixed jloh sim in how it finds regions of relevance, reducing false positives in jloh extract
Fixed bug in default mode that was not using the VCF files properly
Re-introduced writing of BED file with heterozygous regions that are discarded in the first step of jloh extract.
New parameter in jloh sim: --loh-mean-length, which controls the average length of any introduced LOH block. It defaults to 5000 (previously it was hardcoded as 1000 bp)
Adjusted parameters in the --default/--sensitive/--relaxed modes of jloh g2g
Fixed bug in jloh extract at the stage of the LOH candidates that saw intervals starting from position "-1".
Adjusted output of jloh extract so that the tsv file has blocks in 1-based coordinates while the bed file has blocks in 0-based half-open coordinates.
Fixed bug in jloh sim that was not introducing variants when divergence was < 0.05
Added output file in jloh sim: the non-divergent file, containing regions where no variation has been introduced.
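The two coordinate systems mentioned in this release (1-based in the tsv file, 0-based half-open in the bed file) differ only in the start position. A minimal sketch of the conversion follows; the function names are hypothetical and not part of jloh.

```python
def bed_to_one_based(start, end):
    """Convert a 0-based half-open interval to a 1-based closed one."""
    if start < 0 or end <= start:
        raise ValueError("invalid BED interval")
    return start + 1, end


def one_based_to_bed(start, end):
    """Convert a 1-based closed interval to a 0-based half-open one."""
    if start < 1 or end < start:
        raise ValueError("invalid 1-based interval")
    return start - 1, end
```

The first 100 bases of a chromosome are `(0, 100)` in BED and `(1, 100)` in 1-based coordinates. Applying the 1-based to BED conversion to an interval that is already 0-based gives a start of -1, which is why the second function rejects a start below 1.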


16 Mar 12:56

MatteoSchiavinato

v0.15.0

dec3f06

v0.15.0

Log of changes:

jloh g2g now creates two separate BED files with regions, one per parental subgenome. This fixes an issue arising when they have chromosomes with the same names.
The same is now done in jloh extract. Streams of LOH blocks are kept separate, so that blocks are not confused when the two parents have chromosomes with the same names.
In jloh extract, the "candidates" file is tsv now, not bed.
Fixed an issue in jloh extract and jloh g2g when removing the temporary folder.
Rewrote the jloh sim script entirely to make the code more readable. It now works with parallel threads too.
g2g now has --sensitive and --relaxed mapping parameters, which define how nucmer matches are found.


14 Mar 16:43

MatteoSchiavinato

v0.14.0

56c13e2

v0.14.0

Log of changes:

New module: JLOH sim. This module creates a copy of a genome with some divergence and LOH introduced. It is based on a script that is part of the redundans tool: fasta2diverged.py. This script has been updated to account for more functions and to work with hybrid genomes as well (i.e. producing two copies of a genome, with some mutations each, some of which are homozygous).
JLOH g2g now uses --est-divergence instead of --min-identity, so the user can specify a value of divergence between subgenomes. The value goes from 0 to 1, where e.g. 0.01 is 1% divergence. This value is used to limit which mapping results to keep (1-min-identity = divergence)
JLOH g2g now produces a BED file with regions to KEEP, not to discard. See next comment.
JLOH extract now uses a --regions BED file instead of a --mask BED file. This file is still produced by JLOH g2g. This allows for more control with BEDtools intersect.
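The relation between --est-divergence and the former --min-identity stated in this release (1 - min-identity = divergence) can be written out as a short sketch. The function names are hypothetical, and both values are assumed to be fractions between 0 and 1; how jloh g2g applies the cutoff to the mapping results is not specified here.

```python
def min_identity_from_divergence(est_divergence):
    """Minimum identity implied by an estimated divergence (both 0 to 1)."""
    if not 0 <= est_divergence <= 1:
        raise ValueError("divergence must be between 0 and 1")
    return 1 - est_divergence


def keep_alignment(identity, est_divergence):
    """True if an alignment is at least as identical as the cutoff.

    Hypothetical filter: identity is assumed to be a fraction, so a
    percentage reported by an aligner would need dividing by 100 first.
    """
    return identity >= min_identity_from_divergence(est_divergence)
```

With `est_divergence=0.01` (1% divergence), the implied minimum identity is 0.99, so an alignment at 0.995 identity is kept and one at 0.95 is dropped.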


11 Mar 16:47

MatteoSchiavinato

v0.13.0

20cbfac

v0.13.0

Log of changes:

New parameter of JLOH extract: --mask, which allows you to pass a BED file containing regions that you don't want to include in the final list of blocks.
New module: JLOH g2g, which allows you to map one genome onto the other and produce the --mask file.


10 Mar 16:35

MatteoSchiavinato

v0.12.3

1fd16ac

v0.12.3

Log of changes:

Fixed treatment of temporary files. A folder is now created within --output-dir and removed at the end of the process
