Absolute quantitative and base-resolution sequencing reveals comprehensive landscape of pseudouridine across the human transcriptome
Authors: Haiqi Xu1,2,*, Linzhen Kong1,2,*, Jingfei Cheng1,2, Khatoun Al Moussawi1, Xiufei Chen1,2, Aleema Iqbal1,2, Peter A. C. Wing3, James M. Harris4, Senko Tsukuda4, Azman Embarc-Buh5, Guifeng Wei6, Alfredo Castello5, Skirmantas Kriaucionis1, Jane A. McKeating4, Xin Lu1, and Chun-Xiao Song1,2,†
These scripts are for BACS data analysis. The following steps are included:
- Data preprocessing
- Alignment and filtering
- Mutation counts and site calling
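
As a rough illustration of the last step, the sketch below turns per-site base counts into a mutation ratio and a site call. It is not the logic in the repository's scripts. It assumes pseudouridine is read out as T-to-C mismatches, and the background rate, depth cutoff, ratio cutoff and significance level are hypothetical placeholders.

```python
from math import exp, lgamma, log


def mutation_ratio(t_count, c_count):
    """Fraction of C reads among T + C reads at a reference T position."""
    depth = t_count + c_count
    return c_count / depth if depth else 0.0


def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return min(1.0, sum(exp(log_pmf(i)) for i in range(k, n + 1)))


def call_site(t_count, c_count, background=0.01, min_depth=20,
              min_ratio=0.05, alpha=0.001):
    """Call a site if it is covered, converted, and above background.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    depth = t_count + c_count
    if depth < min_depth:
        return False
    ratio = mutation_ratio(t_count, c_count)
    p_value = binomial_sf(c_count, depth, background)
    return ratio >= min_ratio and p_value < alpha
```

For example, `call_site(50, 50)` passes all three filters, while `call_site(5, 5)` is rejected for low depth.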
Spike-in sequences:
NNUNN1 (fully modified NNΨNN): GCTTCAAGTTGANNTNNCATCGCAAGTGCA
NNUNN2 (unmodified NNUNN): ATGTCTCGACGTNNTNNGTTACAGTACCGT
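
The two NNUNN sequences above each contain a degenerate five-base window. The sketch below shows one way to assign a read to a sequence and recover its sequence context and the base observed at the central position. It is not taken from the repository: the function names are hypothetical, and it assumes reads are in the same orientation as the templates and match the fixed flanks exactly.

```python
import re

# Templates copied from the two NNUNN sequences listed above.
SPIKE_INS = {
    "NNUNN1": "GCTTCAAGTTGANNTNNCATCGCAAGTGCA",
    "NNUNN2": "ATGTCTCGACGTNNTNNGTTACAGTACCGT",
}


def build_pattern(template):
    """Compile a regex with exact flanks and a captured 2 + 1 + 2 base window."""
    left, right = template.split("NNTNN")
    return re.compile(
        re.escape(left) + "([ACGT]{2})([ACGT])([ACGT]{2})" + re.escape(right)
    )


PATTERNS = {name: build_pattern(seq) for name, seq in SPIKE_INS.items()}


def classify_read(read):
    """Return (template name, reference 5-mer context, observed central base).

    Returns None if the read matches neither template.
    """
    read = read.upper()
    for name, pattern in PATTERNS.items():
        match = pattern.search(read)
        if match:
            upstream, centre, downstream = match.groups()
            return name, upstream + "T" + downstream, centre
    return None
```

Tallying the observed central base per context for each template would give a per-motif conversion rate for the fully modified sequence and a background rate for the unmodified one.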
Reference genomes: hg38 was used for human samples. For RNA viruses, the following reference genomes were used:
Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1 (NC_045512.2),
Recombinant Hepatitis C virus J6(5’UTR-NS2)/JFH1 (JF343782.1),
Zika virus isolate ZIKV/H.sapiens/Brazil/Natal/2015 (NC_035889.1),
Hepatitis delta virus sequence from the pSVL(D3) plasmid (Addgene plasmid #29335, https://www.addgene.org/29335/),
and Sindbis virus (NC_001547.1).
For EBV samples, reads were aligned to the Epstein-Barr virus (EBV) genome, strain B95-8 (V01555.2).
For ribo-depletion libraries: bacs_smallrna.smk and bacs_smallrna_callsite.smk
For polyA libraries: bacs_alignment1.smk, bacs_mRNA_part2.smk, and bacs_mRNA_site.smk
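
A minimal sketch of how the workflows above might be selected by library type and turned into Snakemake commands. It assumes the workflows are run in the order listed and need no options beyond `--snakefile` and `--cores`. The real workflows may require config files or edited paths, so check each .smk file first. The `WORKFLOWS` mapping and function name are hypothetical.

```python
# Workflow files per library type, in the order they are listed above.
WORKFLOWS = {
    "ribo-depletion": ["bacs_smallrna.smk", "bacs_smallrna_callsite.smk"],
    "polyA": ["bacs_alignment1.smk", "bacs_mRNA_part2.smk", "bacs_mRNA_site.smk"],
}


def snakemake_commands(library_type, cores=4):
    """Build one Snakemake command (as an argument list) per workflow file."""
    if library_type not in WORKFLOWS:
        raise ValueError(f"unknown library type: {library_type!r}")
    return [
        ["snakemake", "--snakefile", smk, "--cores", str(cores)]
        for smk in WORKFLOWS[library_type]
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` so that a failing workflow stops the chain before the next one starts.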
calling_mRNA_ivt1.r and site_calling.r
figure_spikein.r
figure2.r
figure_snoRNA.r
figure_tRNA.r
figure_mRNA.r
inosine_plot.r
comparision_method.r
knockout.r
virus_final.r