Parameters summary

Modes

classif

Parameter	Required	Type	Description	Default
--gtf	T	str	Reference annotation in GTF format
-o/--output	F	str	Prefix for output index file	sqanti-sim
-d/--dir	F	str	Directory for output files	.
-k/--cores	F	int	Number of cores to run in parallel	1
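
As a reading aid, the table above can be mirrored in a small `argparse` parser. This is only a sketch: the function name `build_parser` and the `prog` string are hypothetical, and it is not the tool's own parser. Flag names, types, and defaults are taken from the table.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the table above; not SQANTI-SIM's own code.
    parser = argparse.ArgumentParser(prog="params-sketch")
    parser.add_argument("--gtf", required=True, type=str,
                        help="Reference annotation in GTF format")
    parser.add_argument("-o", "--output", type=str, default="sqanti-sim",
                        help="Prefix for output index file")
    parser.add_argument("-d", "--dir", type=str, default=".",
                        help="Directory for output files")
    parser.add_argument("-k", "--cores", type=int, default=1,
                        help="Number of cores to run in parallel")
    return parser


# Only --gtf is required; every other option falls back to its default.
args = build_parser().parse_args(["--gtf", "reference.gtf", "-k", "4"])
print(args.gtf, args.output, args.dir, args.cores)
```

Here `reference.gtf` is a placeholder file name used for illustration.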

design

The design mode has three sub-modes (equal, custom and sample). They share a set of common arguments, and each has its own additional arguments.

Common arguments

Parameter	Required	Type	Description	Default
-i/--trans_index	T	str	File with transcript information generated by SQANTI-SIM (*_index.tsv)
--gtf	T	str	Complete reference annotation in GTF format
-o/--output	F	str	Prefix for output files	Same as -i
-d/--dir	F	str	Directory for output files	.
-nt/--trans_number	F	int	Total number of transcripts to simulate	10000
--ISM	F	int	Number of incomplete-splice-matches to simulate	0
--NIC	F	int	Number of novel-in-catalog to simulate	0
--NNC	F	int	Number of novel-not-in-catalog to simulate	0
--Fusion	F	int	Number of Fusion to simulate	0
--Antisense	F	int	Number of Antisense to simulate	0
--GG	F	int	Number of Genic-genomic to simulate	0
--GI	F	int	Number of Genic-intron to simulate	0
--Intergenic	F	int	Number of Intergenic to simulate	0
-k/--cores	F	int	Number of cores to run in parallel	1
-s/--seed	F	int	Randomizer seed	None
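
The per-category counts and `--trans_number` can be sanity-checked before a run. The sketch below assumes that `--trans_number` is the total of known plus novel transcripts, so the novel counts must not exceed it. That interpretation is an assumption, not something the table states, and the helper name `known_transcript_budget` is hypothetical.

```python
# Novel structural categories listed in the common arguments table.
NOVEL_CATEGORIES = ("ISM", "NIC", "NNC", "Fusion", "Antisense",
                    "GG", "GI", "Intergenic")


def known_transcript_budget(trans_number=10000, **novel_counts):
    """Return how many transcripts remain for the known set.

    Assumption: trans_number counts known and novel transcripts together.
    """
    unknown = set(novel_counts) - set(NOVEL_CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")
    if any(count < 0 for count in novel_counts.values()):
        raise ValueError("Counts must be non-negative")
    total_novel = sum(novel_counts.values())
    if total_novel > trans_number:
        raise ValueError("Novel counts exceed the total number of transcripts")
    return trans_number - total_novel


print(known_transcript_budget(10000, ISM=500, NIC=250))
```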

equal arguments

Parameter	Required	Type	Description	Default
--read_count	F	int	Number of reads to simulate	50000

custom arguments

Parameter	Required	Type	Description	Default
--nbn_known	F	float	Average read count per known transcript to simulate (the parameter 'n' of the Negative Binomial distribution)	15
--nbp_known	F	float	The parameter 'p' of the Negative Binomial distribution for known transcripts	0.5
--nbn_novel	F	float	Average read count per novel transcript to simulate (the parameter 'n' of the Negative Binomial distribution)	5
--nbp_novel	F	float	The parameter 'p' of the Negative Binomial distribution for novel transcripts	0.5
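
To see how 'n' and 'p' interact, the sketch below computes the mean and variance of a Negative Binomial distribution. It assumes the common "failures before n successes" parameterization (the one used by `numpy.random.negative_binomial`); whether the tool uses exactly this form is an assumption. Under it, the mean equals 'n' only when 'p' is 0.5, which matches the defaults in the table. The function name `nb_mean_var` is hypothetical.

```python
def nb_mean_var(n, p):
    """Mean and variance of NB(n, p): failures before n successes.

    Assumed parameterization: mean = n(1 - p)/p, variance = n(1 - p)/p^2.
    """
    if n <= 0 or not 0 < p <= 1:
        raise ValueError("n must be > 0 and p must be in (0, 1]")
    mean = n * (1 - p) / p
    variance = mean / p
    return mean, variance


print(nb_mean_var(15, 0.5))   # defaults for known transcripts -> (15.0, 30.0)
print(nb_mean_var(5, 0.5))    # defaults for novel transcripts -> (5.0, 10.0)
print(nb_mean_var(15, 0.25))  # lower p raises the mean -> (45.0, 180.0)
```

Under this assumption, changing 'p' away from 0.5 shifts the expected read count away from 'n', so the two parameters are best tuned together.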

sample arguments

Parameter	Required	Type	Description	Default
--genome	T	str	Reference genome FASTA
--pb_reads/--ont_reads/--mapped_reads	T	str	PacBio or ONT reads for quantification in FASTA, FASTQ or aligned SAM format
--iso_complex	F		If used, the program will approximate the expressed isoform complexity (number of isoforms per gene)
--diff_exp	F		If used, the program will assign different expression values to novel and known transcripts
--low_prob	F	float	Low value of prob vector (used if --diff_exp)	0.1
--high_prob	F	float	High value of prob vector (used if --diff_exp)	0.9

sim

Parameter	Required	Type	Description	Default
-i/--trans_index	T	str	File with transcript information generated with SQANTI-SIM (*_index.tsv)
--gtf	T	str	Complete reference annotation in GTF format
--genome	T	str	Reference genome FASTA
--pb/--ont	T		Choose to simulate ONT or PacBio reads
--read_type	F	str	Read type for NanoSim simulation. Choose between "cDNA" or "dRNA"	cDNA
--illumina	F		If used the program will simulate Illumina reads with Polyester
--long_count	F	int	Number of long reads to simulate (if not given it will use the requested_counts from the --trans_index file)
--short_count	F	int	Number of short reads to simulate (if not given it will use the requested_counts from the --trans_index file)
-d/--dir	F	str	Directory for output files	.
-k/--cores	F	int	Number of cores to run in parallel	1
-s/--seed	F	int	Randomizer seed	None

eval

Parameter	Required	Type	Description	Default
--transcriptome	T	str	Long-read-defined transcriptome reconstructed with your pipeline in GTF, FASTA or FASTQ format
-i/--trans_index	T	str	File with transcript information generated with SQANTI-SIM (*_index.tsv)
--gtf	T	str	Reduced reference annotation in GTF format
--genome	T	str	Reference genome FASTA
-o/--output	F	str	Prefix for output index file	sqanti-sim
-d/--dir	F	str	Directory for output files	.
-e/--expression	F	str	Expression of transcript models (tab-separated file with two columns and no header: transcript ID, then quantified number of reads)	None
-c/--coverage	F	str	Junction coverage files (provide a single file, comma-delimited filenames, or a file pattern, e.g. "mydir/*.junctions")	None
--SR_bam	F	str	Directory or fofn file with the sorted bam files of Short Reads RNA-Seq mapped against the genome	None
--short_reads	F	str	File Of File Names (fofn, space separated) with paths to FASTA or FASTQ from Short-Read RNA-Seq	None
--CAGE_peak	F	str	CAGE peak file in BED format (e.g. FANTOM5)	None
--fasta	F		Use when running SQANTI-SIM by using as input a FASTA/FASTQ with the sequences of isoforms
--aligner_choice	F	str	If --fasta used, choose the aligner to map your isoforms (minimap2, deSALT, gmap, uLTRA)	minimap2
--min_support	F	int	Minimum number of supporting reads for an isoform	3
-k/--cores	F	int	Number of cores to run in parallel	1
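
The file format expected by `-e/--expression` (two tab-separated columns, no header) can be checked with a short reader before running the tool. This is a minimal sketch: the function name `read_expression` and the transcript IDs are hypothetical, and counts are parsed as floats on the assumption that quantifiers may report fractional values.

```python
import csv
import io


def read_expression(handle):
    """Parse a headerless two-column TSV of transcript ID and read count."""
    counts = {}
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(row)}")
        transcript_id, value = row
        counts[transcript_id] = float(value)  # raises ValueError on a header row
    return counts


# Hypothetical file content: ID, then quantified number of reads.
example = io.StringIO("tx1\t12\ntx2\t0\n")
print(read_expression(example))
```

A file that still carries a header line fails at the `float` conversion, which makes the "no header" requirement easy to catch early.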