A collection of reproducible benchmarks for nail
You'll need the following available on your system path:
- nail
- MMseqs2
- HMMER3
- easel (comes with HMMER3 distributions)
- the create-profmark binary (comes with HMMER3 distributions)
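A quick way to confirm the tools above are on the path before building is sketched below. The executable names are assumptions: `mmseqs` for MMseqs2, `hmmbuild` and `hmmsearch` for HMMER3, and `esl-sfetch` as a representative easel tool. The benchmark scripts may call other executables, so adjust the list as needed.

```python
import shutil

# Assumed executable names; adjust to match what the scripts actually call.
REQUIRED_TOOLS = [
    "nail",
    "mmseqs",           # MMseqs2
    "hmmbuild",         # HMMER3
    "hmmsearch",        # HMMER3
    "esl-sfetch",       # easel miniapp (representative)
    "create-profmark",  # ships with HMMER3 distributions
]


def missing_tools(names):
    """Return the names that cannot be found on the system path."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("missing from path: " + ", ".join(missing))
else:
    print("all required tools found")
```

Anything reported as missing needs to be installed or added to the path before running the scripts.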
This benchmark was originally run using Pfam version 36.0
and Swissprot release-2023_05.
To download the data, run
$ ./scripts/download-data.sh
This places the Pfam seed alignments and Swissprot sequences in the data/
directory:
$ tree data/
data
├── long-seq
│   ├── query
│   │   ├── 1.query.fa
│   │   ├── 2.query.fa
│   │   ├── 3.query.fa
│   │   ├── 4.query.fa
│   │   ├── 5.query.fa
│   │   └── 6.query.fa
│   └── target
│       ├── 1.target.fa
│       ├── 2.target.fa
│       ├── 3.target.fa
│       ├── 4.target.fa
│       ├── 5.target.fa
│       └── 6.target.fa
├── pfam.sto
├── uniprot.tar.gz
├── uniprot_sprot.dat.gz
├── uniprot_sprot.fasta
├── uniprot_sprot.fasta.ssi
├── uniprot_sprot.xml.gz
└── uniprot_sprot_varsplic.fasta.gz
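To check that the download produced the layout shown above, something like the following sketch may help. It looks only for the Pfam alignment, the Swissprot FASTA file, and the six long-seq query/target pairs. Which of the listed files the build step actually reads is an assumption here, and the helper names are hypothetical.

```python
from pathlib import Path


def expected_data_files(data_dir="data"):
    """List the files from the tree listing that this sketch checks for."""
    root = Path(data_dir)
    files = [root / "pfam.sto", root / "uniprot_sprot.fasta"]
    for i in range(1, 7):
        files.append(root / "long-seq" / "query" / f"{i}.query.fa")
        files.append(root / "long-seq" / "target" / f"{i}.target.fa")
    return files


def missing_data_files(data_dir="data"):
    """Return the expected paths that are not present as regular files."""
    return [p for p in expected_data_files(data_dir) if not p.is_file()]


for path in missing_data_files("data"):
    print(f"missing: {path}")
```

Run from the repository root, this prints nothing when all of the checked files are present.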
To build the benchmark, run
$ ./scripts/build-benchmark.sh
To run the benchmark, run
$ ./scripts/run-all.sh
To produce the plots, run
$ python ./scripts/plots.py ./benchmark/
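The four steps above can also be chained from a single script. The following is a minimal sketch, assuming it is run from the repository root and that each step should stop the run if it fails. The function names are hypothetical, and `sys.executable` stands in for the `python` command shown above.

```python
import subprocess
import sys


def pipeline_commands(benchmark_dir="./benchmark/"):
    """Return the download, build, run, and plot commands in order."""
    return [
        ["./scripts/download-data.sh"],
        ["./scripts/build-benchmark.sh"],
        ["./scripts/run-all.sh"],
        [sys.executable, "./scripts/plots.py", benchmark_dir],
    ]


def run_pipeline(benchmark_dir="./benchmark/"):
    """Run each step in turn; check=True raises if a step exits non-zero."""
    for cmd in pipeline_commands(benchmark_dir):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` executes the steps in sequence and raises `subprocess.CalledProcessError` at the first step that fails, so later steps never run on incomplete output.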