All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Reconcile template with v5.3.0 and v5.3.1
- IGV output files are only output if --igv is used
- Error caused by numeric reference chromosome identifiers.
- Combined_refs.mmi is only published to the output directory when reference_mmi_file is not supplied.
- Combined reference MMI is now output as combined_refs.mmi to match the declared output_definition.json.
- Update to ezcharts 0.11.2
- Streamlined and simplified the report.
- The per-read stats TSV file is no longer created by default. Instead, several histogram TSV files of read / alignment statistics (read length and mean quality; alignment accuracy and coverage) are output. The original per-read TSV can still be created with the --per_read_stats parameter. It is now gzip-compressed.
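Because the per-read TSV is now gzip-compressed, downstream scripts that used to open it as plain text need to decompress it first. A minimal sketch using only the Python standard library follows; the file name and the column names in the usage note are hypothetical, since the changelog does not state them.

```python
import csv
import gzip


def read_per_read_stats(path):
    """Yield one dict per row from a gzip-compressed TSV with a header row.

    The path and the column names are assumptions for illustration;
    check the workflow's output directory for the actual file.
    """
    # "rt" opens the gzip stream in text mode so csv can parse it directly.
    with gzip.open(path, "rt", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")
```

For example, `for row in read_per_read_stats("per-read-stats.tsv.gz"): print(row)` would print each read's statistics as a dictionary keyed by the header names of that (hypothetically named) file.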
- IGV config JSON file to the outputs (in order to visualise the alignments and called variants).
- The Summary section of the report now only lists the first 7 sample names and reference files, instead of listing them all.
- Reduced the memory requested by some processes to avoid failing in WSL (since there is slightly less memory available in WSL than specified in .wslconfig).
- Some formatting changes to github issue template.
- Produce a MMI index file.
- --reference_mmi_file option to use a pre-generated MMI index file as reference.
- The limit of 20 for the --threads parameter.
- Fix regression in depth plots that concatenated the curves of the different samples, rather than displaying them as a multi-line plot
- BAM tags from uBAM inputs are now carried over to the resulting BAM files.
- Regression that caused failing on compressed references.
- The alignReads process requesting too little memory in some cases.
- Reduced the minimum memory requirement from 16 to 12 GB.
- The workflow failing due to commas in reference sequence names.
- How samples, reference files, and reference sequence names are listed in the summary section at the beginning of the report.
- Memory requirements for each process.
- Reworked docs to follow new layout.
- Mangled depth plots when there are multiple reference sequences.
- Report generation failing when there is only a single read or a small number of reads with near-identical mean quality for a sample or reference file.
- Report generation failing when a sample name begins with the name of another sample (e.g. 'sample_A' and 'sample_A_2').
- Default local executor CPU and RAM limits.
- Names of barcoded directories in the sample sheet now need to be of format barcodeXY.
- Workflow failing when using a large number of reference sequences.
- --ubam option. --bam can now be used for both BAM and uBAM files. The workflow will determine whether the files are aligned and, if they are not, align them against the provided reference.
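The changelog does not say how the workflow tells aligned from unaligned input. One common heuristic, shown here purely as an assumption and not as the workflow's actual method, is to check whether the SAM/BAM header declares any reference sequences (`@SQ` lines), since uBAM files typically have none. The function name `looks_unaligned` is hypothetical, and the header text is assumed to come from something like `samtools view -H`.

```python
def looks_unaligned(sam_header: str) -> bool:
    """Return True if the header text contains no @SQ (reference sequence) lines.

    This is a heuristic sketch: an empty @SQ set suggests, but does not
    prove, that the reads in the file are unaligned.
    """
    for line in sam_header.splitlines():
        # Header record types are the first tab-separated field of each line.
        if line.split("\t")[0] == "@SQ":
            return False
    return True
```

A header containing `@SQ\tSN:chr1\tLN:1000` would be treated as aligned, while one with only `@HD` and `@RG` lines would be treated as unaligned and sent for alignment.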
- Read length histogram only displaying a small number of bins when there are a few outlier reads a lot longer than the other reads.
- configure-jbrowse breaking on unescaped spaces
- x-axis limits for accuracy, mean read quality, and read alignment coverage histograms to be more dynamic.
- Workflow will no longer crash when running with --bam on an input directory containing more than one .bam file.
- Removed no longer used --concat_fastq parameter.
- Updated GitHub issue templates to force capture of more information.
- Example command to use demo data.
- Tooltips in depth plots not showing.
- Bumped minimum required Nextflow version to 22.10.8.
- Enum choices are enumerated in the --help output.
- Enum choices are enumerated as part of the error message when a user has selected an invalid choice.
- Workflow aborting on fastcat_or_mv process.
- Replaced --threads option with --mapping_threads and --sorting_threads, which control the number of threads used during the alignment process.
    - --mapping_threads controls the number of threads used by minimap2.
    - --sorting_threads controls the number of threads used to sort the aligned reads.
    - The total number of threads used by the alignment process is the sum of the two values.
- Other processes use a hard-coded number of threads ranging between 1 and 3.
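As a small worked example of the thread accounting described for --mapping_threads and --sorting_threads, the sketch below adds the two values to give the total requested by the alignment process. The function name and the example values are hypothetical; only the "sum of the two values" rule comes from the entry above.

```python
def alignment_threads(mapping_threads: int, sorting_threads: int) -> int:
    """Total threads used by the alignment process: mapping plus sorting."""
    if mapping_threads < 1 or sorting_threads < 1:
        raise ValueError("thread counts must be positive integers")
    return mapping_threads + sorting_threads
```

With, say, 8 mapping threads and 2 sorting threads, the alignment process would use 10 threads in total, which is the figure to compare against the CPUs available to the executor.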
- Parameters --minimap_args and --minimap_preset to expose additional minimap2 options to the user.
    - For RNA data sets, --minimap_preset can be set to 'rna' to automatically configure the workflow accordingly ('dna' is the default preset).
    - Advanced users can provide --minimap_args to pass additional overriding arguments to minimap2
- Configuration for running demo data in AWS
- Bug crashing the report when running on AWS without a --counts file.
- Now uses ONT Public License.
- Report now uses dropdown menus instead of tabs.
- Missing seqkit in getVersions process.
- -y flag from minimap2 command
- format to 'directory-path' for parameters fastq, bam, ubam, references
- missing header for 'Useful links' in docs
- description about references in schema (now only mentions an input directory)
- uses bamstats instead of mapula
- uses ezcharts for report
- legacy option 'demultiplex'
- sample_sheet format in schema to expect a file
- Updated description in manifest
- Harmonized line plot colours in report.
- Expanded explanation for coverage plots.
- Changed plot layout and margins to avoid overflowing plots
- Workflow will now output a JBrowse2 jbrowse.json configuration
- Output combined reference file to out_dir
- -profile conda is no longer supported, users should use -profile standard (Docker) or -profile singularity instead
- Removed option for specifying report suffix
- Restructured workflow parameter schema
- Input params and handling for bam and ubam formats
- Bumped base container to v0.2.0
- Fastqingress metadata map
- Set out_dir option type to ensure output is written to correct directory on Windows.
- Argument Parser for fastqingress.
- Coloring with fewer than 3 samples
- run id and barcode output correctly
- concat_fastq boolean parameter
- Better help text on cli
- Mosdepth 0 step
- Depth coverage steps parameters
- Cumulative coverage plotting incorrect numbers
- Cumulative coverage plot
- reference can be either a directory or single file.
- output one merged CSV vs one for each barcode.
- speed up a few steps including mosdepth and report creation.
- run_id in mapula output json.
- Only accept certain format files as references.
- reduce storage required for workspace.
- Handling for no alignments.
- Integration with EPI2ME Labs notebook environment.
- Error message if no references in directory provided.
- Singularity profile.
- Ping telemetry file.
- Calculate depth coverage graph steps based on length of reference.
- Sample name to sample id
- Option to add suffix to HTML report name.
- Unmapped QC statistics
- Depth coverage graph per reference
- Help message now uses JSON schema
- Updated fastqingress
- Correct conda profile environment file path
- Remove erroneous --prefix messages
- Increase default batch_size to 1000
- Increase default max local executor cpus to 8
- Retag of v0.0.4, updated sample reports
- Make prefix optional
- Barcode awareness support with --demultiplex flag (requires guppy_barcoder to be installed)
- Output naming via new required --prefix argument
- Standardised report name.
- Make docker executor default.
- Initial release
- Basic running of alignment workflow and reporting