SziKayLeung/LOGen

Welcome to the LOng-read General GitHub repo!

This is a GitHub repo of (mostly independent) Python/R scripts that I developed to analyse data from long-read sequencing experiments. The scripts range from generating txt files for running community tools (example pipelines), to generating plots post-SQANTI and running differential expression analyses, to more custom applications.

A pipeline for processing raw ONT reads from transcriptome cDNA sequencing, using community research tools (e.g. Porechop, Minimap2, SQANTI3) and my own custom scripts.

Data exploration post-SQANTI

Listed below are the features that can be explored in the <sample>_classification.txt file generated by SQANTI.

To run the functions, first read in the <sample>_classification.txt file using:

  • SQANTI_class_preparation(<sample>_classification.txt, standard) if expression columns are included in the file (i.e. SQANTI was run with --FL_count)
  • SQANTI_class_preparation(<sample>_classification.txt, nstandard) if expression columns are not included

  • subset_targetgenes_classfiles.py: Subset SQANTI classification file based on genes and reads
  • colour_transcripts_by_countandpotential.py: Colour bed file by abundance and coding potential
  • extract_fasta_bestorf.py: Create a fasta file based on best ORF defined from CPAT
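
As an illustration of the kind of subsetting that subset_targetgenes_classfiles.py describes, here is a minimal sketch that keeps only the rows of a classification table belonging to a set of target genes. It is not the script's actual implementation. The function names `read_classification` and `subset_by_genes` are hypothetical, the example rows are made up, and the sketch assumes the file is tab-delimited with an `associated_gene` column, as in SQANTI classification output.

```python
import csv
import io


def read_classification(handle):
    """Read a tab-delimited classification table into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def subset_by_genes(rows, target_genes, gene_column="associated_gene"):
    """Keep only rows whose gene column matches one of the target genes."""
    targets = set(target_genes)
    return [row for row in rows if row.get(gene_column) in targets]


# Made-up example table standing in for <sample>_classification.txt
example = io.StringIO(
    "isoform\tassociated_gene\tstructural_category\n"
    "PB.1.1\tApp\tfull-splice_match\n"
    "PB.2.1\tMapt\tnovel_in_catalog\n"
    "PB.2.2\tMapt\tfull-splice_match\n"
)

rows = read_classification(example)
mapt_rows = subset_by_genes(rows, ["Mapt"])
print([row["isoform"] for row in mapt_rows])
```

For a real file, the `io.StringIO` object would be replaced by `open(path, newline="")`.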

Differential expression analysis

Currently a loose collection of scripts that still needs maintaining. The scripts read in results after running tappAS, linear regression, etc.

  • replace_filenames_with_csv.py: Replace multiple file names in a directory using reference csv file
  • search_fasta_by_sequence.py: Subset fasta based on sequence
  • subset_fasta_gtf.py: Subset gtf, fasta and bed files based on list of transcript IDs
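
The renaming task described for replace_filenames_with_csv.py could look roughly like the sketch below. It is only an illustration under stated assumptions, not the script itself: the reference csv is assumed to have two columns (old name, new name) with no header, the function name `rename_from_csv` is hypothetical, and the file names are invented.

```python
import csv
import os
import tempfile


def rename_from_csv(directory, csv_path):
    """Rename files in `directory` using (old_name, new_name) rows from a CSV.

    Rows are skipped if they do not have exactly two columns, if the old
    file is missing, or if the new name is already taken.
    """
    renamed = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) != 2:
                continue
            old_name, new_name = row
            source = os.path.join(directory, old_name)
            target = os.path.join(directory, new_name)
            if os.path.isfile(source) and not os.path.exists(target):
                os.rename(source, target)
                renamed.append((old_name, new_name))
    return renamed


# Demonstration in a throwaway directory with invented file names
with tempfile.TemporaryDirectory() as tmp:
    data_dir = os.path.join(tmp, "data")
    os.mkdir(data_dir)
    for name in ("barcode01.fastq", "barcode02.fastq"):
        open(os.path.join(data_dir, name), "w").close()

    mapping_csv = os.path.join(tmp, "mapping.csv")
    with open(mapping_csv, "w", newline="") as handle:
        csv.writer(handle).writerows([
            ["barcode01.fastq", "sampleA.fastq"],
            ["barcode02.fastq", "sampleB.fastq"],
            ["missing.fastq", "sampleC.fastq"],  # no such file, so skipped
        ])

    renamed = rename_from_csv(data_dir, mapping_csv)
    remaining = sorted(os.listdir(data_dir))

print(renamed)
print(remaining)
```

Refusing to overwrite an existing target is a deliberate choice in this sketch, so that a mistake in the reference csv cannot silently destroy data.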
