Releases: bluenote-1577/sylph

v0.6.1

29 Apr 14:44

2024-04-09

  • Made -u estimation with short reads slightly more robust. See CHANGELOG.
  • v0.6.0 has a conda install issue. Hopefully v0.6.1 fixes...

latest

29 Apr 14:46

Commits

  • d0d44f9: Update README.md (Jim Shaw)
  • 62a32f1: Update README.md (Jim Shaw)
  • ca28402: Update README.md (Jim Shaw)
  • f57e486: Update README.md (Jim Shaw)
  • 8158e6c: v0.6.1 initial - refactored some code and added automatic diversity detection for estimating identity (bluenote-1577)
  • 5db2d41: Update README.md (Jim Shaw)
  • 14d3580: Update README.md (Jim Shaw)
  • cc39ba4: Update README.md (Jim Shaw)
  • a931df9: Update README.md (Jim Shaw)
  • 83409c2: Update README.md (Jim Shaw)
  • 02232c7: v0.6.1 fixed bug with -I. pushing now to try and fix conda... (bluenote-1577)
  • 41a4a50: Merge branch 'main' of https://github.com/bluenote-1577/sylph (bluenote-1577)
  • 2c75d89: README (bluenote-1577)

v0.6.0

06 Apr 21:27
2a01389

sylph v0.6.0 release: New output column, lazy raw paired fastq profiling: 2024-04-06

Major

  • A new column called kmers_reassigned is now in the profile output. This states how many k-mers are lost due to reassignment for that particular genome.
  • -1, -2 options are now available for sylph profile. You can now do sylph profile database.syldb -1 1.fq -2 2.fq ...
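
As a rough illustration of how the new kmers_reassigned column might be used downstream, the sketch below reads a tab-separated profile table and lists genomes that lost k-mers to reassignment. Only the column name kmers_reassigned comes from these notes. The tab-separated layout, the other column names (Genome_file, Eff_cov), the helper name, and all values are assumptions made up for the example.

```python
import csv
import io

# Hypothetical excerpt of a tab-separated profile table. Only the
# kmers_reassigned column name is taken from the release notes; the other
# column names and every value here are invented for illustration.
EXAMPLE_PROFILE = (
    "Genome_file\tEff_cov\tkmers_reassigned\n"
    "genomeA.fa\t12.5\t0\n"
    "genomeB.fa\t0.8\t37\n"
)


def genomes_with_reassignment(handle, min_reassigned=1):
    """Return (genome, count) pairs with kmers_reassigned >= min_reassigned."""
    reader = csv.DictReader(handle, delimiter="\t")
    hits = []
    for row in reader:
        count = int(row["kmers_reassigned"])
        if count >= min_reassigned:
            hits.append((row["Genome_file"], count))
    return hits


hits = genomes_with_reassignment(io.StringIO(EXAMPLE_PROFILE))
print(hits)
```

With a real profile file, the in-memory string would be replaced by an opened file handle, and the column names should be checked against the header actually written by your sylph version.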

v0.5.1

27 Dec 22:04
d3f035b

sylph v0.5.1 release: Memory improvement and bug fixes : Dec 27 2023

Major

  • Scalable cuckoo filters are now used for read deduplication for memory savings.
  • Deduplication algorithm improved. v0.5.0 worked poorly on highly (>15%) duplicated read sets.
  • Shorter reads can now be sketched: down to 32 bp, instead of 63 bp before.

v0.5.0

23 Dec 22:46
67bd032

sylph v0.5.0 release: Big improvements on real Illumina data : Dec 23 2023

Major

In previous versions, sylph underperformed on real Illumina datasets. See #5

This is because many real Illumina datasets contain a non-trivial number of duplicate reads, and duplicate reads violate the assumptions of sylph's statistical model.

For the single and paired sketching options, a new deduplication routine has been added. This will be described in version 2 of our preprint.

This increases sketching memory by 3-4x but greatly improves performance on real datasets with more than 1-2% duplication, especially for low-abundance genomes.

For paired-end Illumina reads with non-trivial (> 1%) duplication, sylph can now

  1. detect many more low-abundance species below 0.3x coverage
  2. give better coverage/abundance estimates for low-abundance species
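
To make the idea of read deduplication concrete, here is a minimal sketch that keeps only the first occurrence of each read pair. It is not sylph's routine: it uses an exact Python set as a stand-in (the v0.5.1 notes mention scalable cuckoo filters instead), and the function name and the choice to key on the full pair of sequences are assumptions for illustration.

```python
def dedup_pairs(pairs):
    """Keep the first occurrence of each (read1, read2) sequence pair.

    Illustrative stand-in only: an exact in-memory set is used here,
    which is an assumption and not how sylph stores seen reads.
    """
    seen = set()
    unique = []
    for r1, r2 in pairs:
        key = (r1, r2)
        if key in seen:
            continue  # exact duplicate of an earlier pair, so skip it
        seen.add(key)
        unique.append(key)
    return unique


# Toy input: the first pair appears three times.
pairs = [
    ("ACGT", "TTGA"),
    ("ACGT", "TTGA"),
    ("GGCA", "TTGA"),
    ("ACGT", "TTGA"),
]
unique = dedup_pairs(pairs)
dup_rate = 1 - len(unique) / len(pairs)
print(len(unique), f"{dup_rate:.0%}")
```

An exact set grows with the number of distinct reads, which is one plausible reason to prefer a compact approximate structure for large read sets.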

BREAKING

  • Sequence sketches (sylsp) have changed format. Sequences will need to be re-sketched.
  • The --read-length option has been removed; read length is now incorporated into the sketches by default. (suggested by @fplaza)

Other changes

  • New warning when -o is specified and only reads are sketched (#7)
  • You can now rename sylph samples by specifying a sample naming file with --sample-names or --lS (suggested by @jolespin)
  • Newline-delimited files are now available in profile and query (suggested by @jolespin)

v0.4.1

15 Nov 13:46

sylph v0.4.1 - getting ready for preprinting

MINOR

  • A few minor changes to help texts and options. Also fixed a versioning issue.

v0.4.0

05 Nov 11:31

sylph v0.4.0 release: major interface changes

BREAKING

  • Renamed sylph contain to sylph query.
  • Methods for sketching are drastically different now. E.g. we use -g genome1.fa genome2.fa for specifying genomes and -r read1.fa read2.fq for specifying reads when sketching.
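
For scripted pipelines, the new sketching interface can be wrapped in a small helper. The sketch below only assembles the argument list from the -g and -r options described above and prints it; it does not run sylph. The helper name is hypothetical, and no options beyond -g and -r are assumed. Later releases may have changed the interface again, so check sylph sketch --help for your version.

```python
import shlex


def sketch_command(genomes=(), reads=()):
    """Build an argv list for `sylph sketch` using only -g and -r.

    Hypothetical helper: it mirrors the v0.4.0 notes and assumes no
    other options.
    """
    argv = ["sylph", "sketch"]
    if genomes:
        argv += ["-g", *genomes]
    if reads:
        argv += ["-r", *reads]
    return argv


cmd = sketch_command(
    genomes=["genome1.fa", "genome2.fa"],
    reads=["read1.fa", "read2.fq"],
)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run as-is, which avoids shell quoting issues with unusual file names.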

Major

  • -u or --estimate-unknown options are now present for estimating unknown organisms in the sample.
  • When using -u, the associated options --read-seq-id and --read-len are available for calculating true coverages with sylph, i.e., coverages concordant with read mapping.

Minor

  • Coverage calculation is slightly different now.

v0.3.0

01 Oct 08:47
70ba452

sylph v0.3.0 release: first class support for pseudotax, now called "profile" - 2023-10-01

Continuing development of sylph taxonomic profiling.

BREAKING

  • The --pseudotax option from previous versions is now a separate command called profile.
  • Databases are enabled for profiling by default.
  • Changed file suffixes to syldb and sylsp.

Major

  • Default parameter changes. --min-spacing is set to 30 now.
  • Made profiling faster with some algorithmic tweaks.
  • Coverage is calculated slightly differently.
  • Many small software changes with respect to threading and outputs.

v0.2.0

18 Sep 15:30

sylph v0.2.0 release: pseudotax improved - 2023-09-19

BREAKING

  • Sylph's *.sylqueries are no longer compatible with older versions of sylph (< v0.2). Files will need to be resketched.

Major

  • Fixed a major bug for the --pseudotax option that required redesigning file formats. Please use --enable-pseudotax when using contain --pseudotax from now on.
  • --pseudotax option gives relative abundances now. We are gaining some confidence that this approach gives a rough, but surprisingly decent taxonomic classification.
  • Changed how Eff_cov is calculated. We just use the median coverage now, except when we apply coverage-adjustment.
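
As a rough illustration of the two ideas above, the sketch below takes a median over per-k-mer counts as a coverage estimate and then normalizes coverages into relative abundances. This is a simplified stand-in and not sylph's implementation: treating the inputs as per-k-mer multiplicities, the normalization to percentages, the function names, and all numbers are assumptions, and the coverage-adjustment case is not shown.

```python
from statistics import median


def median_coverage(kmer_counts):
    """Median of per-k-mer counts (assumed input) as a coverage estimate."""
    return median(kmer_counts)


def relative_abundances(coverages):
    """Normalize coverages to percentages that sum to 100 (assumed scheme)."""
    total = sum(coverages.values())
    if total == 0:
        return {name: 0.0 for name in coverages}
    return {name: 100.0 * cov / total for name, cov in coverages.items()}


# Invented counts; genomeA contains one outlier k-mer (40) that would
# inflate a mean but barely affects the median.
counts = {
    "genomeA": [3, 4, 4, 5, 40],
    "genomeB": [1, 1, 1, 2, 1],
}
coverages = {name: median_coverage(c) for name, c in counts.items()}
abundances = relative_abundances(coverages)
print(coverages, abundances)
```

The median is a natural choice when a few k-mers have inflated counts, for example from repeats or sequence shared with other genomes.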

Minor

  • Fixed command line ambiguity for sketching outputs. -s has been replaced with -d for sylph sketch.
  • Sylph now outputs results after processing each sample, instead of batching them.

v0.1.0

07 Sep 03:32
3dd44ba

First major release of sylph. See CHANGELOG.md for information.