NOTE: These tools are deprecated. Please use MASTER-EX-CLT for MASTER data analysis.
This is the prototype data analysis software for the MAssively Systematic Transcript End Readout (MASTER) technique. It analyzes next-generation sequencing results from the DNA template library and from 5' RNA-Seq to obtain the number of RNA reads that start at each position of the DNA template.
The purpose of analyzing the sequencing results of the DNA template library is to associate each 7-bp randomized TSS-region with its corresponding second randomized region, the 15-bp barcode.
The C++ program dna_fastq_parse.cpp analyzes the sequencing results of DNA template libraries. It takes 6 parameters and the sequencing results in FASTQ format. From left to right, the 6 parameters are: the sequencing quality score cutoff, the length of the randomized TSS-region, the number of extra positions to consider after the randomized TSS-region, the length of the barcode region, the name of the output file for the filtered TSS-regions and barcodes, and the name of the output file for the discarded DNA reads.
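To illustrate how a quality score cutoff can be applied to a region of a read, here is a minimal sketch. It is not taken from dna_fastq_parse.cpp. The function name window_passes_quality, the Phred+33 encoding, and the rule that every base in the window must meet the cutoff are all assumptions made for this example; the original program may filter differently.

```cpp
#include <cstddef>
#include <string>

// Assumption: quality characters use Phred+33 ASCII encoding.
// The encoding expected by the original program is not stated here.
constexpr int kPhredOffset = 33;

// Hypothetical helper: returns true if every base in
// quality[start, start + length) has a Phred score >= cutoff.
// Returns false if the window extends past the end of the read.
bool window_passes_quality(const std::string& quality,
                           std::size_t start,
                           std::size_t length,
                           int cutoff) {
    if (start > quality.size() || length > quality.size() - start) {
        return false;
    }
    for (std::size_t i = start; i < start + length; ++i) {
        const int score =
            static_cast<unsigned char>(quality[i]) - kPhredOffset;
        if (score < cutoff) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, a read would be kept only if the windows covering the TSS-region (plus any extra positions) and the barcode both pass; reads that fail would go to the discarded-reads file.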
The output TSS-regions and barcodes file is a tab-delimited file with the barcode, TSS-region, and count written from left to right. The program also outputs a stats file for the analyzed DNA template library.
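The barcode, TSS-region, count layout described above can be read back into a lookup table keyed by barcode. The following is a sketch only: TemplateEntry and load_barcode_table are hypothetical names, and it assumes one record per line, no header row, and that a barcode appearing more than once should simply keep its last row.

```cpp
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical record type for one row of the TSS-regions and barcodes file.
struct TemplateEntry {
    std::string tss_region;
    long count;
};

// Reads lines of "barcode<TAB>TSS-region<TAB>count" and returns a
// barcode -> entry map. Lines that do not have three fields, or whose
// count is not numeric, are skipped.
std::unordered_map<std::string, TemplateEntry>
load_barcode_table(std::istream& in) {
    std::unordered_map<std::string, TemplateEntry> table;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string barcode, tss, count_text;
        if (!std::getline(fields, barcode, '\t') ||
            !std::getline(fields, tss, '\t') ||
            !std::getline(fields, count_text, '\t')) {
            continue;  // malformed line: fewer than three fields
        }
        try {
            const long count = std::stol(count_text);
            table[barcode] = TemplateEntry{tss, count};
        } catch (const std::exception&) {
            continue;  // count field was not a valid integer
        }
    }
    return table;
}
```

Passing a std::ifstream opened on the output file gives a table in which the TSS-region for a given barcode can be looked up directly.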
The purpose of analyzing the 5' RNA-Seq results is to identify the DNA template and the transcription start position of each RNA read.
The C++ program rna_fastq_parse.cpp performs the 5' RNA-Seq analysis. It takes 8 parameters and the sequencing results in FASTQ format. From left to right, the 8 parameters are: the sequencing quality score cutoff, the length of the digital tag region, the length of the randomized TSS-region, the number of extra positions to consider after the randomized TSS-region, the length of the barcode region, the name of the output tag record file, the name of the output stats file for the analyzed 5' RNA-Seq results, and the TSS-regions and barcodes file produced by the DNA template library analysis.
Importantly, rna_fastq_parse.cpp outputs a tag record file that records all transcribed RNA TSS-region sequences together with their DNA templates and counts. This file is tab-delimited, with the transcribed RNA TSS-region sequence, the DNA template TSS-region, the start position on the lacCONS template, the read count, the number of distinct digital tags across all reads, and whether the sequence matches or mismatches the DNA template TSS-region.
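For downstream processing, one line of the tag record file could be parsed as sketched below. TagRecord, split_tabs, and parse_tag_record are hypothetical names. The sketch assumes the six fields appear in the order listed above, that the start position and the two counts are integers, and it keeps the match/mismatch field as raw text because its exact encoding is not specified here.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for one line of the tag record file.
struct TagRecord {
    std::string rna_tss_sequence;     // transcribed RNA TSS-region sequence
    std::string template_tss_region;  // DNA template TSS-region
    long start_position;              // start position on the lacCONS template
    long read_count;                  // number of reads
    long distinct_tags;               // number of different digital tags
    std::string match_status;         // match or mismatch, kept as raw text
};

// Splits a line on tab characters.
std::vector<std::string> split_tabs(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, '\t')) {
        fields.push_back(field);
    }
    return fields;
}

// Fills `out` and returns true if the line has exactly six fields and the
// three numeric fields parse as integers; otherwise returns false and
// leaves `out` unchanged.
bool parse_tag_record(const std::string& line, TagRecord& out) {
    const std::vector<std::string> f = split_tabs(line);
    if (f.size() != 6) {
        return false;
    }
    try {
        out = TagRecord{f[0], f[1], std::stol(f[2]),
                        std::stol(f[3]), std::stol(f[4]), f[5]};
    } catch (const std::exception&) {
        return false;
    }
    return true;
}
```

Calling parse_tag_record on each line of the file and collecting the records that return true gives a typed view of the table that can be filtered, for example by match status or start position.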