Releases: EI-CoreBioinformatics/mikado
Version 2.2.0
Removed Cython from the requirements.txt file. This allows the tests to run correctly in a Conda environment (as Conda disallows installing Cython as part of a distributed package).
As a result of this change, the preferred installation procedure from source has to be slightly amended (see the sketch below):
- either build and install a wheel with `pip wheel -w dist . && pip install dist/Mikado*whl`
- or run `python setup.py bdist_wheel` after having explicitly installed Cython beforehand, e.g. with `pip install Cython`.
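For reference, the two routes look roughly as follows; this is a sketch, and the exact wheel filename will depend on the version being built:

```bash
# Route 1: build a wheel with pip (build dependencies are resolved by pip)
# and then install it
pip wheel -w dist .
pip install dist/Mikado*whl

# Route 2: install Cython explicitly, then build the wheel with setuptools
pip install Cython
python setup.py bdist_wheel
pip install dist/Mikado*whl
```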
Other changes:
- Fix #381: Mikado will now correctly guess the input file format, instead of relying on the file name extension or the user's settings. Sniffing is disabled, though, for files provided as a stream.
- Fix #382: Mikado can now accept generic BED12 files as input junctions, not just Portcullis junctions. This allows e.g. a user to provide a set of gene models in BED12 format as a source of valid junctions (see the sketch after this list).
- Fix #384: Mikado convert now deals properly with unsorted GTFs/GFFs.
- Fix #386: better handling of unsorted GFFs/GTFs in the stats utility.
- Fix #387: Mikado will now always use a static seed, rather than generating a new one per call, unless specifically instructed to do so. The old behaviour can still be replicated either by setting the `seed` parameter to `null` (i.e. `None`) in the configuration file, or by specifying `--random-seed` during the command invocation (see the sketch after this list).
- General increase in code unit-test coverage; in particular:
  - Slightly increased the unit-test coverage for the locus classes, e.g. properly covering the `as_dict` and `load_dict` methods. Minor bugfixes related to the introduction of these unit tests.
- `Mikado.parsers.to_gff` has been renamed to `Mikado.parsers.parser_factory`.
- The code related to the transcript padding has been moved to the submodule `Mikado.transcripts.pad`, rather than being part of the `Mikado.loci.locus` submodule.
- Mikado will error informatively if the scoring configuration file is malformed.
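A sketch of how a generic BED12 file could be passed as the source of reliable junctions (#382); the option names and file names below are illustrative and should be checked against `mikado serialise --help`:

```bash
# Hypothetical invocation: use a BED12 file of gene models as the set of
# trusted junctions, instead of a Portcullis junctions file.
mikado serialise --json-conf configuration.toml \
    --junctions gene_models.bed12 \
    --transcripts mikado_prepared.fasta
```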
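And a sketch of the new seed behaviour (#387); `--random-seed` is the switch described above, while the configuration file name is a placeholder:

```bash
# Default since 2.2.0: a static seed, so repeated runs are reproducible.
mikado pick --json-conf configuration.toml

# Old behaviour: draw a fresh random seed for this invocation
# (equivalently, set the `seed` parameter to null in the configuration file).
mikado pick --random-seed --json-conf configuration.toml
```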
Patch release
Hotfix release:
- IMPORTANT: Mikado now correctly uses the scores associated with a given source.
- IMPORTANT: Mikado was not forwarding the original source to transcripts derived from chimera splitting. This compounded the issue above.
- Corrected the root cause of the issues above, i.e. transcripts were not dumping and reloading all relevant fields. This is now implemented properly and tested with specific new routines.
- Corrected an issue that caused Mikado to erroneously calculate the metrics and scores of loci twice, thereby reporting some wrong values in the output files.
  - Affected metrics were e.g. `selected_cds_intron_fraction` and `combined_cds_intron_fraction`.
- Removed `quicksect` from the requirements.
v2.1.0: Issue 375 (#379)
Bugfix and speed improvement release.
- Fix a bug that prevented Mikado from reporting the correct metrics/scores in the output loci files. This bug only affected reporting, not the results themselves. See issue #376.
- Fix a bug in printing out the statistics for an annotation file with `mikado util stats` (issue #378).
- When serialising, Mikado will now by default drop and reload everything. The previous default behaviour resulted in hard-to-parse errors and is not what is usually desired anyway.
- Improved the performance of `mikado pick` in multiple ways (issue #375):
  - Only external metrics that are requested in the scoring file will now be printed out in the final metrics files. This reduces runtime in e.g. Minos. The new CLI switch `--report-all-external-metrics` (available in both `configure` and `pick`) can be used to revert to the old behaviour (see the sketch after this list).
  - The `external` table in the Mikado database is now indexed properly, increasing speed.
  - Results are batched and compressed before being sent through a queue (@ljyanesm).
  - @brentp enhanced the bcbio `intervaltree.pyx` into `quicksect`. This new version of the interval tree has been copied and adapted for Mikado.
  - Using SQLAlchemy bakeries for the SQLite queries, as well as LRU caches in various parts of Mikado.
  - Removed excessive copying in multiple parts of the program, especially regarding the configuration objects and during padding.
  - Using `operator.attrgetter` instead of a custom (and slower) recursive `getattr` function.
- Removed unsafe calls to `tempfile.mktemp` and the like, for increased security according to CodeQL.
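A sketch of the new switch in use; file names are placeholders, and the exact set of companion options should be checked against `mikado configure --help` and `mikado pick --help`:

```bash
# Restore the old behaviour and report every external metric,
# either when generating the configuration...
mikado configure --report-all-external-metrics \
    --list input_list.txt --reference genome.fasta configuration.toml

# ...or directly at pick time:
mikado pick --report-all-external-metrics --json-conf configuration.toml
```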
2.0.2
Bugfix release.
- Fix an infinite recursion bug when trying to recover lost transcripts.
- Fix a performance regression by passing the configuration to Excluded locus objects.
Marshmallow mate
- Fixed a bug that caused Mikado configure (but not daijin configure, or "mikado configure --daijin") to print out invalid configuration files.
- Restored the functionality of "--full": Mikado can now print out either partial (but still valid) or fully-fledged configuration files (see the sketch after this list).
- Also ported the scoring configuration to Marshmallow dataclasses. As a direct result, removed `jsonschema` from the dependencies.
- Configured bumpversion.
- Corrected a small bug in parsing EnsEMBL GFF3
- Addressed some deprecation warnings from marshmallow and numpy.
- Small bug fix in the CLIs of mikado/daijin configure.
- The default value of the seed is now 0 (i.e. undefined: a random one will be selected). Only integers are allowed as values.
- Small bugfixes/extensions in the test suite.
- Minor code reorganisation, without changes to the API.
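A sketch of the two configuration modes mentioned above; the input list, reference and output file names are placeholders, and the exact option set should be checked against `mikado configure --help`:

```bash
# Print a partial, but still valid, configuration file (default)
mikado configure --list input_list.txt --reference genome.fasta configuration.toml

# Print a fully-fledged configuration file with every option spelled out
mikado configure --full --list input_list.txt --reference genome.fasta configuration.toml
```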
Mikado version 2
Official second release of Mikado. All users are advised to update as soon as possible.
See https://github.com/EI-CoreBioinformatics/mikado/milestone/22?closed=1 for a non-comprehensive list of all the issues closed in
relation to this release.
Mikado 2, public release candidate 2
Minor amendments to 2.0rc1 - in order to get Mikado to install properly in BioConda.
Mikado 2, public release candidate 1
This version of Mikado is finally ready to go into Conda, DockerHub, PyPI and Singularity Hub.
Many thanks to @ljyanesm, whose work has made Mikado much more performant.
Most notable changes:
- Mikado serialise will now accept tabular BLAST files (with the extra columns `ppos` and `btop`); see the sketch after this list. Both XML and TSV loading have parts written in Cython. Thank you to @srividya22 for first asking about improvements in this area. #280
- Mikado prepare will now remove redundancies based on intron chains, rather than on perfect to-the-base identity. This should massively reduce the input data. The redundancy filter can be controlled per source: i.e., Mikado is able to keep all transcripts from certain input files (reference annotations, ab initio predictions, transcript assemblies, etc.) while removing any redundant transcript from others (e.g. long-read alignments). Thanks to @lijing28101. #270
- Mikado prepare will now try to split transcripts with very long introns, rather than discarding them outright.
- Mikado pick will now operate in stringent mode by default (i.e. it will only split transcripts when there is strong evidence of them being chimeras, as per the BLAST data).
- Mikado now uses TOML as default configuration language, as it is much more human-readable than either YAML or JSON (#239).
- Various bugfixes.
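For reference, a tabular BLAST file with the two extra columns can be produced along these lines; this is a sketch, with placeholder file and database names, and the exact column order expected by Mikado should be checked in its documentation:

```bash
# Standard BLAST outfmt 6 columns, plus ppos and btop at the end
blastx -query mikado_prepared.fasta -db proteins_db -max_target_seqs 5 \
    -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore ppos btop" \
    -out mikado_prepared.blast.tsv
```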
Version 2.0, release candidate 6
- #216: `mikado prepare` will now explicitly tell users to use the `mikado_prepared.fasta` file for the `serialise` step. Moreover, `mikado serialise` will informatively crash if users try to do something different (a common mistake seems to be to use a FASTA file derived directly from the input assemblies). See the sketch after this list.
- #220: fixed a bug in `mikado serialise`.
- #222: `daijin` will now make `prodigal` or `TransDecoder` use alternative genetic codes, upon request. IMPORTANT: `TransDecoder` does not support all of the known genetic codes listed by NCBI.
- #223: fixed the start-adjustment method in the ORF module.
- #226: `mikado compare`, `mikado util stats` and `mikado util grep` are now compatible with non-standard NCBI GFF3 files (having e.g. `pseudogene` features without any associated transcript but with associated exons, or `rRNA` transcript features without any parent gene).
- #227: `mikado compare` will now always consider valid transcripts, even if they are multiexonic yet missing a defined strand orientation.
- #229: `mikado pick` will now:
  - report the padding as INFO, not as WARNING
  - report on finishing the analysis of a chromosome, not the parsing
  - report the temporary analysis directory
  - provide `--max-intron-length` as a command line option
  - fixed a small bug in `mikado serialise`
  - fixed a bug in the ORF module that caused a crash when the sequence was not completely uppercase
- #230: fixed some bugs related to the `daijin` conda environments and to upstream changes in the `snakemake` code.
- Fixed a small bug in reference_gene.py and transcript.py, related to `sys.intern`.
- #232: fixed a typo in the help for `mikado serialise`.
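A sketch of the workflow point behind #216; the `--transcripts` option name and the configuration file name are assumptions to be checked against `mikado serialise --help`:

```bash
# mikado prepare writes mikado_prepared.fasta (and mikado_prepared.gtf);
# that prepared FASTA, not one derived from the raw input assemblies,
# is the file to pass to the serialise step.
mikado prepare --json-conf configuration.toml
mikado serialise --json-conf configuration.toml --transcripts mikado_prepared.fasta
```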
Version 2.0, release candidate 5
- Switched from `ujson` to `rapidjson` (actively maintained and just as performant).
- Fix #209: `daijin` has been debugged and is now properly tested. Also, when using `daijin mikado`, the number of XMLs will be equal to or greater than the number of requested threads.
- #177: `mikado serialise` is now completely parallelised. This allows for very significant speed-ups, especially when loading a large number of ORFs.
- Speed-ups for `mikado pick`: the GTF will now be parsed much more quickly, by avoiding the creation of a full GTF line object for each line during parsing (which was extremely slow).
- `daijin` can now optionally use `conda` environments, through the `conda` directive of `snakemake`.
- Speed-up in `mikado pick`: everything is now written to databases (#218). This allows for cleaner temporary directories and parsing of the partial outputs.
- `mikado pick` will now not print out the subloci file by default.
- Speed-up in `mikado pick`: now using a lightweight graph also for the splicing.
- Amend #134: the minimum CDS overlap is now 50%, not 75%.
- Fixed a bug for `mikado compare` in multiprocessing mode.
- Fixed a bug in `mikado configure`: the scoring file will not be embedded within the printed file (otherwise it would be impossible to change it dynamically).