Available as bioconda package? #1

Open
d4straub opened this issue Mar 21, 2023 · 1 comment
Comments

@d4straub

Hi there,

I browsed your interesting paper (which isn't mentioned in the README) and was intrigued by the speed improvement combined with the promise of performance similar to DADA2. Are you going to add the tool to bioconda? That would make it available as a conda package and as a container (bioconda packages are also available as Docker and Singularity containers). Packaging and containerization would make the tool even more useful, because users could run the package or container instead of going through an installation procedure. This allows efficient use in pipelines and makes analyses reproducible.

Small side questions: the README states

  -f FILEPATH, --filepath FILEPATH
                        Path to the file containing 16S sequences. At the
                        moment we support only fatsa file containing the 16S
                        reads.
  • Is that a fasta file, as in not fastq?
  • Does "Path to the file containing 16S sequences." mean "Path to raw sequencing read file in fasta format."?
  • What about multiple samples and therefore multiple raw sequencing read files?
  • Will it work only for 16S rRNA amplicon sequencing data, or for any other amplicon as well? (I would expect the latter, but that isn't reflected in the README.)

Best, Daniel

@hsmurali
Owner

hsmurali commented Mar 22, 2023

Thank you! We don't have it on bioconda yet. We will seriously consider making it available as a bioconda package.

  1. The current version of SCRAPT takes a fasta file as input because it assumes the sequences are already preprocessed and quality filtered. We recommend using QIIME2 to preprocess the sequences from all samples and then combining (multiplexing) the reads that pass the quality filter into a single fasta file.

  2. SCRAPT takes sequences pooled from all samples as input. This is similar to the pooled mode of DADA2. We plan to provide a preprocessing pipeline in future iterations of SCRAPT. We also plan to provide a sample × OTU table that can be used for differential abundance testing.

  3. SCRAPT can be applied to any amplicon sequencing dataset, not just 16S rRNA. In the paper we present some results on the 18S rRNA gene, which is a marker found in microeukaryotes.
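
For illustration, the pooling step described in point 1 could be sketched as below. This is a minimal sketch, not part of SCRAPT or QIIME2: it assumes one quality-filtered fasta file per sample, that the file name (without extension) is the sample name, and that a `sample|read_id` header layout is acceptable. The function names `read_fasta` and `pool_fasta` and the `|` separator are hypothetical choices made for this example.

```python
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a fasta file (multi-line records allowed)."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pool_fasta(sample_files: Iterable[Path], pooled_path: Path) -> int:
    """Write reads from all samples into one fasta file and return the read count.

    Each header is prefixed with the sample name so that reads can be traced
    back to their sample later (assumption: file stem == sample name).
    """
    count = 0
    with open(pooled_path, "w") as out:
        for path in sample_files:
            sample = Path(path).stem
            for header, seq in read_fasta(Path(path)):
                out.write(f">{sample}|{header}\n{seq}\n")
                count += 1
    return count
```

The resulting pooled file is what would be passed through the `-f` / `--filepath` option quoted from the README. Whether the sample prefix survives into SCRAPT's output is not stated in this thread, so treat the header layout as a convenience for your own bookkeeping.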

Please feel free to reach out if you have further questions about preprocessing raw read files.

Thank you.
