Releases: labgem/PPanGGOLiN
PPanGGOLiN release 1.0.13
- Better handling of input files (will raise errors before doing anything if any name is duplicated or if any input file does not exist)
- Added the '--defrag' option to use the defragmentation pipeline in the 'workflow' command (rather than having to use each subcommand separately only to use this option)
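
The fail-fast input check described above can be sketched as follows. This is a minimal illustration, not PPanGGOLiN's actual code: the helper name `check_inputs` and the `(name, path)` pair layout are assumptions made for the example.

```python
from collections import Counter
from pathlib import Path


def check_inputs(entries):
    """Validate (name, path) pairs before any processing starts.

    Hypothetical helper: raises ValueError if a name appears more than once
    and FileNotFoundError if any path is not an existing file.
    """
    # Count how often each name occurs and collect the repeated ones.
    counts = Counter(name for name, _ in entries)
    duplicated = sorted(name for name, n in counts.items() if n > 1)
    if duplicated:
        raise ValueError(f"duplicated names: {', '.join(duplicated)}")

    # Report every missing file at once rather than stopping at the first.
    missing = [str(path) for _, path in entries if not Path(path).is_file()]
    if missing:
        raise FileNotFoundError(f"missing input files: {', '.join(missing)}")
```

Running every check before the first genome is read means a typo in the last line of a long input list is reported immediately rather than after hours of computation.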
Bug fixes:
- The --soft-core option of the 'evolution' and 'write' commands now takes effect when you change its value
- Cope with .gbk/.gbff files with:
  - duplicate 'locus_tag' fields within assemblies and between different assemblies (happens when genomes are downloaded from GenBank) (fix for #25)
  - no contig identifier in VERSION field (happens with Prokka annotated genomes)
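
The release note does not say how duplicate 'locus_tag' values are resolved. One possible approach, shown here only as a sketch, is to prefix each tag with its genome name and add a numeric suffix to repeats. The function name `make_unique_ids` and the `genome:tag#n` ID scheme are hypothetical.

```python
def make_unique_ids(genome_name, locus_tags):
    """Return one unique gene ID per locus_tag, in input order.

    Hypothetical scheme: IDs are prefixed with the genome name, so tags
    shared between assemblies stay distinct as long as genome names are
    unique. Repeats within one assembly get a '#n' suffix.
    """
    used = set()
    ids = []
    for tag in locus_tags:
        candidate = f"{genome_name}:{tag}"
        n = 1
        # Keep extending the candidate until it collides with nothing seen so far.
        while candidate in used:
            n += 1
            candidate = f"{genome_name}:{tag}#{n}"
        used.add(candidate)
        ids.append(candidate)
    return ids
```

For example, `make_unique_ids("g1", ["A", "A", "B"])` yields `["g1:A", "g1:A#2", "g1:B"]`.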
PPanGGOLiN release 1.0.1
PPanGGOLiN release 1.0.0
New features:
- Can choose the number of partitions in the 'workflow' subcommand
- Can customize identity and coverage thresholds in the 'cluster' subcommand
- Added 4 new possible outputs:
  - protein FASTA of the representative sequences of the gene families
  - nucleotide FASTA of the representative sequences of the gene families
  - nucleotide FASTA of all the CDS
  - a list of the gene family IDs and the gene IDs, in a format similar to the .tsv file format of MMseqs2
- Added unit tests for the different classes, thanks to @sletort
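
A family-to-gene list of the kind mentioned above could be loaded along these lines. This is a sketch under an assumption: that each line holds a gene family ID and a gene ID separated by a tab, in that order. The exact column layout of the file PPanGGOLiN writes is not stated in these notes, and `read_families` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict


def read_families(handle):
    """Group gene IDs by gene family ID from a two-column, tab-separated file.

    Assumed layout (not confirmed by the release notes): family ID in the
    first column, gene ID in the second.
    """
    families = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        families[row[0]].append(row[1])
    return dict(families)


# Usage with an in-memory example instead of a real output file.
example = io.StringIO("famA\tgene1\nfamA\tgene2\nfamB\tgene3\n")
families = read_families(example)
```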
Bug fixes:
- The Markov Random Field is no longer taken into account if its criterion reaches infinity (a high-dimensionality problem in statistics); PPanGGOLiN should crash less often on VERY fragmented datasets.
- .gbff/.gbk files are now read properly
- Improved compatibility for the .gexf files
Pre release version
PPanGGOLiN can annotate and build gene families by itself for an easier use, or use annotated genomes and formerly built gene families directly.
PPanGGOLiN can have more than 3 partitions and can estimate the optimal number of partitions.
PPanGGOLiN can run parts of its pipeline separately for better parameter tuning.
PPanGGOLiN can provide a number of output files that will illustrate or describe your pangenome.
PPanGGOLiN uses an HDF5 file to store all the information related to a pangenome, and can reuse or regenerate any of that data for further analysis.
PPanGGOLiN makes a better use of CPUs in a multithreaded run.
PPanGGOLiN can project the pangenome's partitions on a given protein set.
PPanGGOLiN is compatible with macOS.
And a lot of bug fixes (and maybe some new bugs).