
Parameters summary

jmestret edited this page Sep 13, 2022 · 2 revisions

Modes

classif

| Parameter | Required | Type | Description | Default |
|---|---|---|---|---|
| --gtf | T | str | Reference annotation in GTF format | |
| -o/--output | F | str | Prefix for output index file | sqanti-sim |
| -d/--dir | F | str | Directory for output files | . |
| -k/--cores | F | int | Number of cores to run in parallel | 1 |
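
As an illustration of how the classif parameters combine, here is a minimal Python sketch that assembles an argument list from the options listed above. The executable name `sqanti-sim.py`, the convention of passing the mode as the first positional argument, and the file name `reference.gtf` are assumptions made for this example and are not stated on this page.

```python
import shlex


def build_classif_command(gtf, output="sqanti-sim", directory=".", cores=1,
                          executable="sqanti-sim.py"):
    """Return an argument list for a classif run.

    `executable` and the leading "classif" positional are assumptions;
    the flags and their defaults are taken from the classif parameters.
    """
    if not gtf:
        raise ValueError("--gtf is required")
    return [
        executable, "classif",
        "--gtf", gtf,
        "-o", output,
        "-d", directory,
        "-k", str(cores),
    ]


# Hypothetical input file; replace with your own reference annotation.
cmd = build_classif_command("reference.gtf", cores=4)
print(shlex.join(cmd))
```

Building the command as a list (rather than a single string) keeps paths with spaces intact if the list is later passed to `subprocess.run`.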

design

This mode has three sub-modes (equal, custom and sample). They share a set of common arguments, and each one also has its own arguments.

Common arguments

| Parameter | Required | Type | Description | Default |
|---|---|---|---|---|
| -i/--trans_index | T | str | File with transcript information generated by SQANTI-SIM (*_index.tsv) | |
| --gtf | T | str | Complete reference annotation in GTF format | |
| -o/--output | F | str | Prefix for output files | Same as -i |
| -d/--dir | F | str | Directory for output files | . |
| -nt/--trans_number | F | int | Total number of transcripts to simulate | 10000 |
| --ISM | F | int | Number of incomplete-splice-match transcripts to simulate | 0 |
| --NIC | F | int | Number of novel-in-catalog transcripts to simulate | 0 |
| --NNC | F | int | Number of novel-not-in-catalog transcripts to simulate | 0 |
| --Fusion | F | int | Number of Fusion transcripts to simulate | 0 |
| --Antisense | F | int | Number of Antisense transcripts to simulate | 0 |
| --GG | F | int | Number of Genic-genomic transcripts to simulate | 0 |
| --GI | F | int | Number of Genic-intron transcripts to simulate | 0 |
| --Intergenic | F | int | Number of Intergenic transcripts to simulate | 0 |
| -k/--cores | F | int | Number of cores to run in parallel | 1 |
| -s/--seed | F | int | Randomizer seed | None |

equal arguments

| Parameter | Required | Type | Description | Default |
|---|---|---|---|---|
| --read_count | F | int | Number of reads to simulate | 50000 |

custom arguments

| Parameter | Required | Type | Description | Default |
|---|---|---|---|---|
| --nbn_known | F | float | Average read count per known transcript to simulate (the parameter 'n' of the Negative Binomial distribution) | 15 |
| --nbp_known | F | float | The parameter 'p' of the Negative Binomial distribution for known transcripts | 0.5 |
| --nbn_novel | F | float | Average read count per novel transcript to simulate (the parameter 'n' of the Negative Binomial distribution) | 5 |
| --nbp_novel | F | float | The parameter 'p' of the Negative Binomial distribution for novel transcripts | 0.5 |
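
To see how the 'n' and 'p' parameters relate to the average read count, the sketch below computes the mean and variance of a Negative Binomial distribution. It assumes the common parameterization (number of successes n, success probability p, counting failures), as used for example by `numpy.random.negative_binomial`; whether SQANTI-SIM uses exactly this parameterization is an assumption. Under it, the mean is n(1-p)/p, so 'n' equals the average read count only when p = 0.5, which is the default.

```python
def negative_binomial_moments(n, p):
    """Mean and variance of NB(n, p), counting failures before n successes.

    Assumed parameterization: n > 0 successes, success probability 0 < p <= 1.
    """
    if n <= 0 or not 0 < p <= 1:
        raise ValueError("need n > 0 and 0 < p <= 1")
    mean = n * (1 - p) / p
    variance = n * (1 - p) / p ** 2
    return mean, variance


# Defaults from the custom arguments: n = 15 (known), n = 5 (novel), p = 0.5.
for label, n, p in [("known", 15, 0.5), ("novel", 5, 0.5)]:
    mean, variance = negative_binomial_moments(n, p)
    print(f"{label}: mean={mean}, variance={variance}")
```

With the defaults, the expected count per transcript matches 'n' (15 and 5). Lowering 'p' while keeping 'n' fixed raises both the mean and the spread under this parameterization.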

sample arguments

| Parameter | Required | Type | Description | Default |
|---|---|---|---|---|
| --genome | T | str | Reference genome FASTA | |
| --pb_reads/--ont_reads/--mapped_reads | T | str | PacBio or ONT reads for quantification in FASTA, FASTQ or aligned SAM format | |
| --iso_complex | F | | If used the program will approximate the expressed isoform complexity (number of isoforms per gene) | |
| --diff_exp | F | | If used the program will assign different expression values for novel and known transcripts | |
| --low_prob | F | float | Low value of prob vector (used if --diff_exp) | 0.1 |
| --high_prob | F | float | High value of prob vector (used if --diff_exp) | 0.9 |

sim

| Parameter | Required | Type | Description | Default |
|---|---|---|---|---|
| -i/--trans_index | T | str | File with transcript information generated with SQANTI-SIM (*_index.tsv) | |
| --gtf | T | str | Complete reference annotation in GTF format | |
| --genome | T | str | Reference genome FASTA | |
| --pb/--ont | T | | Choose to simulate ONT or PacBio reads | |
| --read_type | F | str | Read type for NanoSim simulation. Choose between "cDNA" or "dRNA" | cDNA |
| --illumina | F | | If used the program will simulate Illumina reads with Polyester | |
| --long_count | F | int | Number of long reads to simulate (if not given it will use the requested_counts from the --trans_index file) | |
| --short_count | F | int | Number of short reads to simulate (if not given it will use the requested_counts from the --trans_index file) | |
| -d/--dir | F | str | Directory for output files | . |
| -k/--cores | F | int | Number of cores to run in parallel | 1 |
| -s/--seed | F | int | Randomizer seed | None |
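
The sim parameters can be mirrored with Python's `argparse` to check a set of options before launching a run. This is only a sketch of how such an interface could be declared, not SQANTI-SIM's actual source. Treating `--pb` and `--ont` as mutually exclusive is an assumption drawn from the word "choose" in the description, and the file names are hypothetical.

```python
import argparse


def build_sim_parser():
    """Declare the sim options with their documented types and defaults."""
    parser = argparse.ArgumentParser(prog="sim")
    parser.add_argument("-i", "--trans_index", required=True)
    parser.add_argument("--gtf", required=True)
    parser.add_argument("--genome", required=True)
    # Assumption: exactly one sequencing platform must be selected.
    platform = parser.add_mutually_exclusive_group(required=True)
    platform.add_argument("--pb", action="store_true")
    platform.add_argument("--ont", action="store_true")
    parser.add_argument("--read_type", choices=["cDNA", "dRNA"], default="cDNA")
    parser.add_argument("--illumina", action="store_true")
    parser.add_argument("--long_count", type=int, default=None)
    parser.add_argument("--short_count", type=int, default=None)
    parser.add_argument("-d", "--dir", default=".")
    parser.add_argument("-k", "--cores", type=int, default=1)
    parser.add_argument("-s", "--seed", type=int, default=None)
    return parser


# Hypothetical file names, used only to exercise the parser.
args = build_sim_parser().parse_args([
    "-i", "example_index.tsv",
    "--gtf", "reference.gtf",
    "--genome", "genome.fa",
    "--ont", "--read_type", "dRNA",
])
print(args.ont, args.read_type, args.cores, args.seed)
```

Omitting both platform flags, or passing both, makes `argparse` exit with a usage error in this sketch.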

eval

| Parameter | Required | Type | Description | Default |
|---|---|---|---|---|
| --transcriptome | T | str | Long-read-defined transcriptome reconstructed with your pipeline in GTF, FASTA or FASTQ format | |
| -i/--trans_index | T | str | File with transcript information generated with SQANTI-SIM (*_index.tsv) | |
| --gtf | T | str | Reduced reference annotation in GTF format | |
| --genome | T | str | Reference genome FASTA | |
| -o/--output | F | str | Prefix for output index file | sqanti-sim |
| -d/--dir | F | str | Directory for output files | . |
| -e/--expression | F | str | Expression of transcript models (tab-separated file without header, with two columns: transcript ID and quantified number of reads) | None |
| -c/--coverage | F | str | Junction coverage files (provide a single file, comma-delimited filenames, or a file pattern, e.g. "mydir/*.junctions") | None |
| --SR_bam | F | str | Directory or fofn file with the sorted BAM files of short-read RNA-Seq mapped against the genome | None |
| --short_reads | F | str | File Of File Names (fofn, space separated) with paths to FASTA or FASTQ files from short-read RNA-Seq | None |
| --CAGE_peak | F | str | CAGE peak file in BED format (e.g. FANTOM5) | None |
| --fasta | F | | Use when the input given to SQANTI-SIM is a FASTA/FASTQ file with the isoform sequences | |
| --aligner_choice | F | str | If --fasta is used, choose the aligner to map your isoforms (minimap2, deSALT, gmap, uLTRA) | minimap2 |
| --min_support | F | int | Minimum number of supporting reads for an isoform | 3 |
| -k/--cores | F | int | Number of cores to run in parallel | 1 |
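
The file expected by `-e/--expression` is described as two tab-separated columns without a header: an ID and a quantified number of reads. The Python sketch below writes and re-reads a file in that layout. The transcript IDs, the file name, and the choice to parse counts as floats are assumptions made for illustration.

```python
import csv
import os
import tempfile


def write_expression_file(path, counts):
    """Write one 'id<TAB>reads' line per transcript, with no header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for transcript_id, n_reads in counts.items():
            writer.writerow([transcript_id, n_reads])


def read_expression_file(path):
    """Read the two-column layout back into a dict of id -> read count."""
    counts = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row:
                continue  # tolerate blank lines
            if len(row) != 2:
                raise ValueError(f"expected 2 columns, got {len(row)}")
            counts[row[0]] = float(row[1])
    return counts


# Hypothetical transcript IDs and counts.
example = {"transcript_A": 12, "transcript_B": 3}
with tempfile.TemporaryDirectory() as tmp:
    expression_path = os.path.join(tmp, "expression.tsv")
    write_expression_file(expression_path, example)
    with open(expression_path) as handle:
        raw_text = handle.read()
    parsed = read_expression_file(expression_path)
print(raw_text, end="")
```

Reading the file back before passing it to the tool is a cheap way to catch a stray header line or a wrong delimiter.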