PB sdm_opt update. Added experimental (!!) ONT sdm_opt
Falk Hildebrand (QIB) committed Sep 2, 2023
1 parent 47362e3 commit 2166dcb
Showing 3 changed files with 78 additions and 2 deletions.
76 changes: 76 additions & 0 deletions configs/sdm_ONT_LSSU.txt
@@ -0,0 +1,76 @@
#sdm options file to control sequence quality filtering, demultiplexing and preparation (can also be used without demultiplexing)
#* indicates alternative quality filtering options, saved in *.add.fna etc. files separately from initial quality filtered dataset
#sequence length refers to sequence length AFTER removal of Primers, Barcodes and trimming. This ensures that downstream analysis tools will have appropriate sequence information
#experimental options for ONT (Q20!) data
#use at your own risk
minSeqLength 700
maxSeqLength 2000
minAvgQuality 27
*minSeqLength 500
*minAvgQuality 20
#truncate total Sequence length to X (length after Barcode, Adapter and Primer removals, set to -1 to deactivate)
TruncateSequenceLength -1

#Ambiguous bases in Sequence
maxAmbiguousNT 10
*maxAmbiguousNT 15

#sequence is discarded if it contains a homonucleotide run longer than this value
maxHomonucleotide 22

#Filter whole sequence if the average quality score of any one window falls below the threshold
QualWindowWidth 150
QualWindowThreshhold 18

#Trim the end of a sequence if a window falls below quality threshold. Useful for removing low quality trailing ends of sequence
TrimWindowWidth 100
TrimWindowThreshhold -1

#Probabilistic max number of accumulated sequencing errors. After this length, the rest of the sequence will be deleted. Complementary to TrimWindowThreshhold. (-1) deactivates this option.
maxAccumulatedError -1
#mid qual option
*maxAccumulatedError -1


#Max Barcode Errors
maxBarcodeErrs 1
maxPrimerErrs 2

#keep Barcode / Primer Sequence in the output fasta file - in a normal 16S analysis this should be deactivated (0) for both Barcode and Primer
keepBarcodeSeq 0
keepPrimerSeq 0


#set fastqVersion to 1 if you use Sanger, Illumina 1.8+ or NCBI SRA files. Set fastqVersion to 2, if you use Illumina 1.3+ - 1.7+ or Solexa fastq files.
fastqVersion 1

#if one or more files have a technical adapter still included (e.g. TCAG 454) this can be removed by setting this option
TechnicalAdapter

#delete X NTs (e.g. if the first 5 bases are known to have strange biases)
TrimStartNTs 0

#correct PE header format (1/2); this is to accommodate the Illumina MiSeq paired-end annotations: 2="@XXX 1:0:4" instead of 1="@XXX/1". Note that the format will be automatically detected
PEheaderPairFmt 1

#sets if sequences without match to reverse primer will be accepted (T=reject ; F=accept all); default=F
RejectSeqWithoutRevPrim T
*RejectSeqWithoutRevPrim F

#sets if sequences without a forward (LinkerPrimerSequence) primer will be accepted (T=reject ; F=accept all); default=T
RejectSeqWithoutFwdPrim T
*RejectSeqWithoutFwdPrim T


#checks if pair1 and pair2 were switched (ignore if single read data)
CheckForMixedPairs T
#checks if whole amplicon was sequenced in reverse-complement orientation (pairs not switched, just reverse-complemented)
CheckForReversedSeqs T

#this option should be "T" if your amplicons are possibly shorter than a single read in a paired-end sequencing run (e.g. if the 16S amplicon length is 200bp in a 250x2 MiSeq run, set this to "T"). This option increases runtime by 10%; if in doubt just set to "T". *Requires* LinkerPrimerSequence and ReversePrimer to be defined in mapping file. Also used to remove fwd and rev primer from long PacBio sequences
AmpliconShortPE T


#mostly used for PacBio data: check if the reverse primer occurs more than once on a read; if yes, cut length to first fwd-rev primer pairing
ExtensivePrimerChecks T
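
For readers who want to inspect or compare these option files programmatically, here is a minimal Python sketch of how the format shown above could be read. It assumes, based only on the comments in the file, that `#` starts a comment line, that a leading `*` marks an option belonging to the alternative filter set, and that key and value are separated by whitespace. The function name `parse_sdm_options` is hypothetical; sdm's own parser may behave differently.

```python
def parse_sdm_options(text):
    """Split sdm-style option text into (main, alternative) dicts.

    Assumptions (not taken from sdm's source): '#' lines are comments,
    a leading '*' marks the alternative filter set, and a key without a
    value (e.g. 'TechnicalAdapter') maps to an empty string.
    """
    main, alt = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        target = main
        if line.startswith("*"):
            target = alt
            line = line[1:].lstrip()
        parts = line.split(None, 1)
        if not parts:
            continue  # a lone '*' carries no option
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        target[key] = value
    return main, alt


example = """\
#experimental options for ONT data
minSeqLength 700
*minSeqLength 500
TechnicalAdapter
TrimWindowThreshhold -1
"""

main_opts, alt_opts = parse_sdm_options(example)
print(main_opts)
print(alt_opts)
```

Values are kept as strings on purpose, since the file mixes integers, `T`/`F` flags and empty values; any type conversion would be a further assumption.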

2 changes: 1 addition & 1 deletion configs/sdm_PacBio_ITS.txt
@@ -18,7 +18,7 @@ maxHomonucleotide 16

#Filter whole sequence if one window of quality scores is below average
QualWindowWidth 50
-QualWindowThreshhold 25
+QualWindowThreshhold -1

#Trim the end of a sequence if a window falls below quality threshold. Useful for removing low quality trailing ends of sequence
TrimWindowWidth 20
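
This hunk changes QualWindowThreshhold from 25 to -1. As a rough illustration of why that effectively switches the window filter off, the Python sketch below assumes the filter rejects a read when the mean quality of any full window of QualWindowWidth bases falls below the threshold. Phred scores are non-negative, so no window mean can fall below -1. `fails_window_filter` is a hypothetical helper, not sdm code, and sdm's handling of window stepping and of reads shorter than one window may differ.

```python
def fails_window_filter(quals, width, threshold):
    """Return True if any full window of `width` scores averages below `threshold`."""
    if width <= 0 or len(quals) < width:
        return False  # assumption: reads shorter than one window are not tested
    window_sum = sum(quals[:width])
    if window_sum / width < threshold:
        return True
    # Slide the window one base at a time, updating the running sum.
    for i in range(width, len(quals)):
        window_sum += quals[i] - quals[i - width]
        if window_sum / width < threshold:
            return True
    return False


# A read with a 50-base low-quality stretch in the middle.
read = [30] * 60 + [10] * 50 + [30] * 60

print(fails_window_filter(read, 50, 25))  # old setting: the Q10 window averages below 25
print(fails_window_filter(read, 50, -1))  # new setting: no window mean can be below -1
```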
2 changes: 1 addition & 1 deletion configs/sdm_PacBio_LSSU.txt
@@ -22,7 +22,7 @@ QualWindowThreshhold 25

#Trim the end of a sequence if a window falls below quality threshold. Useful for removing low quality trailing ends of sequence
TrimWindowWidth 20
-TrimWindowThreshhold 25
+TrimWindowThreshhold -1

#Probabilistic max number of accumulated sequencing errors. After this length, the rest of the sequence will be deleted. Complementary to TrimWindowThreshhold. (-1) deactivates this option.
maxAccumulatedError -1
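
The maxAccumulatedError comment describes truncating a read once its expected number of errors passes a limit. The Python sketch below shows one way that idea can be expressed, assuming the standard Phred conversion p = 10^(-Q/10) and truncation at the first base where the running sum exceeds the limit, with a negative limit disabling the check as the comment states. `truncate_at_accumulated_error` is a hypothetical name; whether sdm uses a strict or non-strict comparison, or a different error model, is not stated in this commit.

```python
def truncate_at_accumulated_error(quals, max_err):
    """Return the number of leading bases to keep.

    Each Phred score Q contributes an error probability of 10 ** (-Q / 10).
    The read is cut before the first base that pushes the running sum
    above `max_err`. A negative `max_err` disables truncation.
    """
    if max_err < 0:
        return len(quals)
    total = 0.0
    for i, q in enumerate(quals):
        total += 10 ** (-q / 10)
        if total > max_err:
            return i
    return len(quals)


low_quality = [10] * 20   # each base has an error probability of about 0.1
print(truncate_at_accumulated_error(low_quality, 0.55))
print(truncate_at_accumulated_error(low_quality, -1))
```

With the value -1 used in these configs the function keeps every base, which mirrors the "(-1) deactivates this option" note.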