A personal scRNA-seq project that characterises the tumour microenvironment (TME) of a non-small cell lung cancer (NSCLC) tumour sample through cell-type clustering. The project is still under development. So far, clustering has been performed to identify the cell types present in the sample. The plan is to extend the analysis with trajectory inference and cell-cell interaction analysis to identify potential therapeutics.
Dependencies: R, tidyverse, Seurat, ggplot2, DoubletFinder, hdf5r
The raw input data file 20k_NSCLC_DTC_3p_nextgem_Multiplex_count_raw_feature_bc_matrix.h5 was obtained from the 10x Genomics website. It is not stored in this repository because of its size. For more information, feel free to contact me.
The processed Seurat object, with cell-type clusters, is saved in the results/ directory as nsclc_seurat_obj_dblt_adj.rds.
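As a quick check that the saved object is usable, it can be reloaded and inspected. This is a minimal sketch that assumes the working directory is the repository root and that the pipeline has already been run; the variable names are illustrative.

```r
library(Seurat)

# Path taken from the description above; assumes the repo root is the working directory.
rds_path <- file.path("results", "nsclc_seurat_obj_dblt_adj.rds")

if (file.exists(rds_path)) {
  nsclc <- readRDS(rds_path)
  print(nsclc)                 # summary: assays, number of features and cells
  print(table(Idents(nsclc)))  # number of cells per cluster
} else {
  message("No saved object found at ", rds_path, "; run the pipeline first.")
}
```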
- Data Loading: The raw scRNA-seq data is loaded from an HDF5 file using the Read10X_h5 function.
- Quality Control: Cells are filtered on the number of detected features and the percentage of mitochondrial gene counts, so that only high-quality cells are analysed.
- Normalization: The data is normalized using the NormalizeData function to make gene expression levels comparable across cells.
- Feature Selection: Highly variable features are identified using the FindVariableFeatures function and used in the downstream analyses.
- Scaling: The data is scaled using the ScaleData function, which centres and standardizes the expression values of each gene.
- Dimensionality Reduction: Principal Component Analysis (PCA) is performed using the RunPCA function to reduce the dimensionality of the data and capture its main sources of variation.
- Doublet Detection and Filtration: Doublets are detected using the DoubletFinder package, and the predicted doublets are removed from the data.
- Clustering: Cells are clustered with a graph-based approach using the FindNeighbors and FindClusters functions.
- Visualization: The clusters are visualized on a UMAP embedding computed with the RunUMAP function, which helps identify distinct cell populations.
- Save Results: The final Seurat object, with doublets removed, is saved as an RDS file for future use.
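The steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the project's actual script: the function name run_nsclc_pipeline, the QC thresholds, the number of PCs, the expected doublet rate, and the clustering resolution are all placeholders. It also assumes a DoubletFinder release that exports paramSweep and doubletFinder (older releases name them paramSweep_v3 and doubletFinder_v3), and that the multiplexed 10x file contains a "Gene Expression" matrix.

```r
library(Seurat)
library(DoubletFinder)

# Hypothetical wrapper around the listed steps; all defaults are assumptions.
run_nsclc_pipeline <- function(h5_path,
                               out_path = file.path("results", "nsclc_seurat_obj_dblt_adj.rds"),
                               min_features = 200,    # assumed QC threshold
                               max_features = 2500,   # assumed QC threshold
                               max_percent_mt = 5,    # assumed QC threshold
                               dims = 1:20,           # assumed number of PCs
                               doublet_rate = 0.075,  # assumed expected doublet fraction
                               resolution = 0.5) {    # assumed clustering resolution
  # Data loading: files with several modalities come back as a named list.
  counts <- Read10X_h5(filename = h5_path)
  if (is.list(counts)) counts <- counts[["Gene Expression"]]
  obj <- CreateSeuratObject(counts = counts, project = "NSCLC")

  # Quality control: filter on detected features and mitochondrial percentage.
  obj[["percent.mt"]] <- PercentageFeatureSet(obj, pattern = "^MT-")
  keep <- obj$nFeature_RNA > min_features &
    obj$nFeature_RNA < max_features &
    obj$percent.mt < max_percent_mt
  obj <- subset(obj, cells = colnames(obj)[which(keep)])

  # Normalization, feature selection, scaling, PCA.
  obj <- NormalizeData(obj)
  obj <- FindVariableFeatures(obj, selection.method = "vst", nfeatures = 2000)
  obj <- ScaleData(obj)
  obj <- RunPCA(obj, features = VariableFeatures(obj))

  # Doublet detection: sweep pK, pick the value with the highest BCmetric.
  sweep <- paramSweep(obj, PCs = dims, sct = FALSE)
  sweep_stats <- summarizeSweep(sweep, GT = FALSE)
  bcmvn <- find.pK(sweep_stats)
  best_pk <- as.numeric(as.character(bcmvn$pK[which.max(bcmvn$BCmetric)]))
  n_exp <- round(doublet_rate * ncol(obj))
  obj <- doubletFinder(obj, PCs = dims, pN = 0.25, pK = best_pk,
                       nExp = n_exp, sct = FALSE)

  # Doublet filtration: keep cells classified as singlets.
  df_col <- grep("^DF.classifications", colnames(obj@meta.data), value = TRUE)[1]
  singlets <- colnames(obj)[obj@meta.data[[df_col]] == "Singlet"]
  obj <- subset(obj, cells = singlets)

  # Graph-based clustering and UMAP embedding.
  obj <- FindNeighbors(obj, dims = dims)
  obj <- FindClusters(obj, resolution = resolution)
  obj <- RunUMAP(obj, dims = dims)

  # Save the doublet-filtered object.
  dir.create(dirname(out_path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(obj, file = out_path)
  obj
}

# Runs only if the raw file has been downloaded next to the script.
h5_path <- "20k_NSCLC_DTC_3p_nextgem_Multiplex_count_raw_feature_bc_matrix.h5"
if (file.exists(h5_path)) {
  nsclc <- run_nsclc_pipeline(h5_path)
  print(DimPlot(nsclc, reduction = "umap", label = TRUE))
}
```

The Seurat functions are called unqualified on purpose: DoubletFinder reads the preprocessing parameters from the commands logged on the object, so the calls are kept in the form used in the standard Seurat workflow. The thresholds and the doublet rate should be tuned to the sample rather than taken from this sketch.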