This Nextflow workflow is designed to perform the following tasks:
- Generate a samplesheet for each sample using the provided input files.
- Check the quality of FastQ files using the Skewer tool.
- Publish the final CSV file containing the results in the specified output directory.
Its purpose is to give you a samplesheet.csv built from your *.fastq.gz files, ready to be assembled with the nf-core/bacass pipeline.
To run this workflow, you need to have the following installed:
- Nextflow
- Python3 (for running the Python scripts)
- Singularity (to run Skewer for checking the FastQ files)
This workflow requires the input FastQ files to be placed in the specified input directory. The input files should have a .fastq.gz
extension.
Override the input directory using the params.in parameter:
--in=.../path_to_your_data/...
The output directory for the final CSV file can be overridden using the params.out parameter:
--out=.../path_to_where_to_store_your_file
The final CSV file will be named result.csv
and stored in the specified output directory.
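These parameters are typically declared with defaults at the top of the script. A minimal sketch of what that might look like; the default paths shown here are assumptions, so check write_samplesheet.nf for the real values:

```nextflow
// Default parameters; override on the command line with --in / --out.
// Both paths are illustrative assumptions, not the script's actual defaults.
params.in  = "$baseDir/data"       // directory containing the *.fastq.gz files
params.out = "$baseDir/results"    // directory that will receive result.csv
```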
The workflow consists of the following main steps:
- write_samplesheet_p: Generates a samplesheet entry for each sample using the provided input files.
- check_fastq_files_with_skewer_p: Checks the quality of FastQ files using the Skewer tool.
- publish_csv_p: Publishes the final CSV file containing the results to the specified output directory.
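Based on the step names above, the workflow block presumably wires these processes together roughly as follows. This is a hedged sketch, not the actual script: the channel names, the file-pairing glob, and the process signatures are all assumptions.

```nextflow
// Sketch of how the three processes might be connected (DSL2).
// Channel names and the read-pair naming scheme are assumptions.
workflow {
    // Pair up R1/R2 reads per sample from the input directory
    reads_ch = Channel.fromFilePairs("${params.in}/*_R{1,2}*.fastq.gz")

    // Check each read pair with Skewer (run inside a Singularity container)
    checked_ch = check_fastq_files_with_skewer_p(reads_ch)

    // Write one samplesheet line per sample
    lines_ch = write_samplesheet_p(checked_ch)

    // Merge all lines into result.csv and publish it to params.out
    publish_csv_p(lines_ch.collect())
}
```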
To run the workflow, navigate to the directory containing the Nextflow script and execute the following command:
nextflow run write_samplesheet.nf
or, to run it on a Slurm cluster:
nextflow run write_samplesheet.nf -profile cluster
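The cluster profile would normally be defined in nextflow.config. A hedged sketch of what such a Slurm profile might contain; the exact settings (and any queue name) are assumptions:

```nextflow
// Hypothetical nextflow.config fragment enabling the 'cluster' profile.
profiles {
    cluster {
        process.executor = 'slurm'    // submit each task as a Slurm job
        singularity.enabled = true    // Skewer runs inside a container
        // process.queue = 'normal'   // assumed queue name, adjust as needed
    }
}
```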
This will start the workflow using the default input and output directories.
This workflow can be customized by modifying the params
values or by adding additional processes or steps to the existing workflow.
The write_samplesheet_p process is as inefficient as it gets.
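If the inefficiency is that a separate task is spawned per sample just to emit one line of CSV (an assumption about the script's structure), Nextflow's built-in collectFile operator could assemble and publish the file without any dedicated process. A self-contained sketch; the glob pattern and CSV columns are assumptions:

```nextflow
// Hypothetical more efficient variant: build each samplesheet line
// directly in the workflow and let collectFile write result.csv,
// removing the need for per-sample write tasks entirely.
Channel
    .fromFilePairs("${params.in}/*_R{1,2}*.fastq.gz")   // assumed naming scheme
    .map { sample_id, reads -> "${sample_id},${reads[0]},${reads[1]}\n" }
    .collectFile(name: 'result.csv', storeDir: params.out)
```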