Releases: Gabaldonlab/jloh

v0.20.0

12 Apr 14:14

Log of changes:

  • Renamed --hybrid to --assign-blocks in jloh extract to make the parameter's function easier to understand
  • Fixed issue in stats when having zero homozygous SNPs, which would assign everything to "homo"
  • Fixed issue in stats which would return "nan" for quantiles when having no homozygous SNPs
  • Fixed issue in stats which was returning 0 SNPs as a threshold; the minimum is now set to 1
  • Minor changes to the stderr readout of jloh extract to improve readability
  • Changed the behavior of pybedtools with respect to the tmp folder. The folder name now includes an alphanumeric string, to avoid overwriting tmp files when two jloh instances work in the same tmpdir (shell issue).
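
The tmp folder change can be illustrated with a minimal sketch. This is not the code used in jloh: the function name `make_unique_tmpdir`, the `tmp_jloh_` prefix, and the suffix length are assumptions made for illustration. It only shows how a random alphanumeric suffix keeps two processes that share a tmpdir from colliding.

```python
import os
import secrets
import string


def make_unique_tmpdir(base_dir, prefix="tmp_jloh_", n_chars=12):
    """Create and return a tmp folder whose name ends in a random alphanumeric string.

    Hypothetical helper: the name, prefix, and suffix length are illustrative only.
    """
    alphabet = string.ascii_letters + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(n_chars))
    path = os.path.join(base_dir, prefix + suffix)
    # exist_ok=False makes a (very unlikely) name collision fail loudly
    # instead of silently sharing a folder with another instance.
    os.makedirs(path, exist_ok=False)
    return path


# A tool that relies on pybedtools could then point it at this folder,
# e.g. with pybedtools.set_tempdir(path), so that intermediate files of
# two concurrent runs never end up in the same directory.
```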

v0.19.0

15 Mar 14:31

Log of changes:

  • Updated the g2g module. It now writes its output as a BED file and filters it by minimum length of g2g block.
  • Changed the algorithm of the sim module so that the simulation more closely resembles a real genome. The genome is broken into haplotypes of variable size, sampled from a distribution centered on a mean that the user can pass as a parameter (--mean-haplotype-size). Each haplotype is assigned a divergence rate which, like the haplotype size, is variable and sampled from a distribution centered on the value declared in --divergence. Then, based on the fraction declared in --loh, some of these haplotypes are assigned divergence = 0, and the algorithm adjusts the divergence rate of the non-LOH haplotypes to bring the total average back to --divergence. Finally, random mutations are introduced in each haplotype based on its assigned divergence rate.
  • Added GPL3.0 licence
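
The steps of the new sim algorithm can be sketched as follows. This is an illustrative outline, not the jloh implementation. Several details are assumptions: the distributions are taken to be normal with a standard deviation of 25% of the mean, the --loh fraction is applied to the number of haplotypes, and the "total average" is computed weighted by haplotype length. The function names are hypothetical.

```python
import random


def plan_haplotypes(genome_length, mean_haplotype_size, divergence, loh_fraction, seed=0):
    """Return a list of (start, end, divergence) haplotypes covering the genome."""
    rng = random.Random(seed)

    # 1. Break the genome into haplotypes of variable size (assumed normal).
    blocks = []
    start = 0
    while start < genome_length:
        size = max(1, int(rng.gauss(mean_haplotype_size, 0.25 * mean_haplotype_size)))
        end = min(start + size, genome_length)
        blocks.append([start, end, 0.0])
        start = end

    # 2. Assign each haplotype a divergence sampled around the target value.
    for block in blocks:
        block[2] = max(0.0, rng.gauss(divergence, 0.25 * divergence))

    # 3. Turn a fraction of the haplotypes into LOH blocks (divergence = 0).
    n_loh = int(round(loh_fraction * len(blocks)))
    for i in rng.sample(range(len(blocks)), n_loh):
        blocks[i][2] = 0.0

    # 4. Rescale the remaining haplotypes so that the length-weighted
    #    average divergence goes back to the target.
    weighted = sum((e - s) * d for s, e, d in blocks)
    if weighted > 0:
        scale = divergence * genome_length / weighted
        for block in blocks:
            block[2] *= scale

    return [tuple(block) for block in blocks]


def mutate(seq, rate, rng):
    """5. Substitute each base with probability `rate` (substitutions only)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)
```

Rescaling in step 4 keeps the LOH haplotypes at exactly zero, since multiplying zero by the scale factor leaves it unchanged.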

v0.18.0

23 Nov 10:56

Log of changes:

  • Now using --min-snps-kbp instead of a simple --min-snps. This is because a small block with 10 SNPs is much more SNP-dense than a 10 kbp block with only 10 SNPs. This applies to jloh stats and jloh extract.
  • fixed help section of jloh stats which still said "density"
  • removed the --snp-distance parameter. It is now inferred from the values of --min-snps-kbp. This reduces the number of parameters the user has to set, making the tool more comfortable to use.
  • added the --skip-trimming parameter to the nextflow workflow
  • made jloh stats multithreaded
  • changed some syntax in jloh stats to make it compatible with python 3.6 and later, instead of 3.9 and later
  • Fixed the usage of jloh extract in the workflows, according to the modifications listed above
  • Added a step where the BAM file is split into separate files by chromosome, which made the code run ~10x faster
  • Changed default suggested --min-snps-kbp parameter setting from 5% quantile to 50% quantile in the jloh stats module, following our findings reported in the manuscript
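
The reasoning behind --min-snps-kbp and the quantile-based suggestion can be shown with a small worked example. This is a sketch under assumptions, not the jloh stats code: the function names are hypothetical, the 500 bp block size is an example value, and the quantile uses plain linear interpolation.

```python
def snps_per_kbp(n_snps, block_length_bp):
    """SNP density of a block, expressed as SNPs per kbp."""
    return n_snps / (block_length_bp / 1000)


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# The same 10 SNPs give very different densities depending on block length.
small_block = snps_per_kbp(10, 500)     # 500 bp block
large_block = snps_per_kbp(10, 10_000)  # 10 kbp block

# A suggested threshold taken at the 50% quantile of observed densities,
# compared with the 5% quantile suggested before this release.
densities = [1.0, 2.0, 3.0, 4.0, 5.0]
suggested_now = quantile(densities, 0.50)
suggested_before = quantile(densities, 0.05)
```

A count-based threshold would treat both blocks identically, while the density-based one separates them by a factor of 20.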

v0.17.0

17 Aug 08:37
6b47ad2

Log of changes:

  • Added the intersect module to perform intersections/removals with the output files of two runs (similar to bedtools)
  • Added the chimeric module to find genes harboring chimeras between two different haplotypes
  • Added the junctions module to calculate statistics on neighboring blocks from different origins (REF, ALT, or simply from two different calls)
  • Fixed a small bug in the default mode that caused the heterozygous BED file not to be produced.
  • Modified the density module to produce more values. Specifically, besides the mean SNP density, it calculates a distribution and extracts the quantiles. It does the same for SNP distances. These values are useful as thresholds in jloh extract, so the module gives an estimate of which parameters would fit best. The module has been renamed to jloh stats.
  • Modified the --min-snps and --snp-distance parameters so that two values have to be passed, one for heterozygous and one for homozygous SNPs. These values can be estimated with jloh stats, or passed by the user.
  • All heterozygous blocks within --min-length - 1 from each other are now merged before the generation of REF and ALT LOH blocks.
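
The merging of nearby heterozygous blocks can be sketched as a standard interval merge. This is an illustration rather than the jloh code: the function name is hypothetical, and blocks are assumed to be (chromosome, start, end) tuples in BED-style 0-based half-open coordinates, with the distance between two blocks measured from the end of one to the start of the next.

```python
def merge_close_blocks(blocks, min_length):
    """Merge blocks on the same chromosome that lie within min_length - 1 of each other."""
    max_gap = min_length - 1
    merged = []
    for chrom, start, end in sorted(blocks):
        previous = merged[-1] if merged else None
        if previous and previous[0] == chrom and start - previous[2] <= max_gap:
            # Close enough to the previous block: extend it.
            previous[2] = max(previous[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(block) for block in merged]
```

With this rule, any gap left between two merged-out blocks is at least --min-length long, so no gap shorter than the minimum block length survives between heterozygous blocks.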

v0.16.2

06 May 14:31
8d24517

Log of changes:

  • Fixed bug in printing the *tsv file that was generating two lines per record on the B subgenome in a few cases
  • Fixed small bugs in the run_with_real_data.nf workflow script
  • Fixed small bug in jloh sim

v0.16.0

04 Apr 13:56
b670bdf

Log of changes:

  • Updated nextflow workflow with --hybrid mode
  • Fixed jloh sim in how it finds regions of relevance, reducing false positives in jloh extract
  • Fixed bug in default mode that was not using the VCF files properly
  • Re-introduced writing of BED file with heterozygous regions that are discarded in the first step of jloh extract.
  • New parameter in jloh sim: --loh-mean-length, which controls the average length of any introduced LOH block. It defaults to 5000 (previously it was hardcoded as 1000 bp)
  • Adjusted parameters in the --default/--sensitive/--relaxed modes of jloh g2g
  • Fixed bug in jloh extract at the LOH candidates stage, where some intervals started from position "-1".
  • Adjusted output of jloh extract so that the tsv file has blocks in 1-based coordinates while the bed file has blocks in 0-based half-open coordinates.
  • Fixed bug in jloh sim that was not introducing variants when divergence was < 0.05
  • Added output file in jloh sim: the non-divergent file, containing regions where no variation has been introduced.
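
The two coordinate conventions used by the tsv and BED outputs of jloh extract differ only in the start position. A minimal sketch of the conversion, with hypothetical function names that are not part of jloh:

```python
def bed_to_one_based(start, end):
    """Convert a 0-based half-open interval (BED) to 1-based inclusive."""
    if start < 0 or end <= start:
        raise ValueError(f"invalid BED interval: {start}-{end}")
    return start + 1, end


def one_based_to_bed(start, end):
    """Convert a 1-based inclusive interval to 0-based half-open (BED)."""
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based interval: {start}-{end}")
    return start - 1, end


def block_length(start, end, one_based):
    """Length in bp; the formula depends on the coordinate system."""
    return end - start + 1 if one_based else end - start
```

The first 100 bases of a chromosome are therefore written as 1-100 in the tsv file and as 0-100 in the BED file, and both describe a 100 bp block.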

v0.15.0

16 Mar 12:56

Log of changes:

  • jloh g2g now creates two separate BED files with regions, one per parental subgenome. This fixes an issue arising when they have chromosomes with the same names.
  • The same is now done for jloh extract as well. Streams of LOH blocks are kept separate so that they don't get confused when the two parents have chromosomes with the same names.
  • In jloh extract, the "candidates" file is tsv now, not bed.
  • Fixed an issue in jloh extract and jloh g2g when removing the temporary folder.
  • Rewrote the jloh sim script entirely to make the code more readable. It now works with parallel threads too.
  • g2g now has a --sensitive and a --relaxed mapping parameter, which define how nucmer matches are found.

v0.14.0

14 Mar 16:43
56c13e2

Log of changes:

  • New module: JLOH sim. This module creates a copy of a genome with some divergence and LOH introduced. It is based on a script that is part of the redundans tool: fasta2diverged.py. This script has been updated to account for more functions and to work with hybrid genomes as well (i.e. producing two copies of a genome, each with some mutations, some of which are homozygous).
  • JLOH g2g now uses --est-divergence instead of --min-identity, so the user can specify a value of divergence between subgenomes. The value goes from 0 to 1, where e.g. 0.01 is 1% divergence. This value is used to limit which mapping results to keep (divergence = 1 - min-identity).
  • JLOH g2g now produces a BED file with regions to KEEP, not to discard. See the next item.
  • JLOH extract now uses a --regions BED file instead of a --mask BED file. This file is still produced by JLOH g2g. This allows for more control with BEDtools intersect.
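
The relation between --est-divergence and the former --min-identity can be sketched as below. This is a hedged illustration and not the g2g implementation: the function names and the `identity` field are hypothetical, identities are assumed to be fractions between 0 and 1, and it is assumed that alignments at or above the derived identity are the ones kept.

```python
def min_identity_from_divergence(est_divergence):
    """Derive the identity threshold from divergence = 1 - min-identity."""
    if not 0.0 <= est_divergence <= 1.0:
        raise ValueError("est_divergence must be between 0 and 1")
    return 1.0 - est_divergence


def keep_alignments(alignments, est_divergence):
    """Keep alignments whose identity reaches the derived threshold (assumed direction)."""
    threshold = min_identity_from_divergence(est_divergence)
    return [aln for aln in alignments if aln["identity"] >= threshold]
```

For example, an estimated divergence of 0.01 corresponds to a minimum identity of 0.99, i.e. 99%.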

v0.13.0

11 Mar 16:47
20cbfac

Log of changes:

  • new parameter of JLOH extract: --mask, which allows you to pass a BED file containing regions that you don't want to include in the final list of blocks.
  • new module: JLOH g2g: allows you to map one genome onto the other and produce the --mask file.

v0.12.3

10 Mar 16:35
1fd16ac

Log of changes:

  • Fixed the handling of temporary files: a folder is now created within --output-dir and removed at the end of the process