Update README.md

Omabekee · Sep 15, 2024 · 131512e
1 parent 372c01b
commit 131512e
Showing 1 changed file with 8 additions and 4 deletions.
diff --git a/Stage-2/README.md b/Stage-2/README.md
@@ -23,12 +23,16 @@ To load libraries after installation, use  *library(packagename)*
 
 2. **Heatmap Generation**: Heatmaps were generated to visualise the gene expression patterns in the samples using either a diverging or a sequential colour palette. A diverging palette highlights the contrast between upregulated and downregulated genes, while a sequential palette shows gradual changes in expression levels.  
 
-   Performed clustering by rows (genes), columns (samples) and both. This approach helps in understanding which samples are similar and which genes have similar expression patterns across these samples.  
+   Performed clustering by rows (genes), columns (samples) and both. This approach helps in understanding which samples are similar and which genes have similar expression patterns across these samples.
+   i). **Clustering Genes (Rows)**: helps identify gene sets with similar behaviour, which may indicate co-regulated genes or genes involved in the same biological processes.
+   ii). **Clustering Samples (Columns)**: helps find sample subtypes within the dataset based on gene expression patterns.
+   iii). **Clustering Both Genes and Samples (Rows and Columns)**: provides a comprehensive view of the data, showing how gene groups and sample profiles relate to each other.
+
      
    **R Function**: *heatmap.2()* is a function used to generate heatmaps.  
    *colorRampPalette()* is used to create custom colour palettes that visually distinguish data patterns.  
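
The three clustering modes described in this step can be sketched as follows. This is a minimal illustration, not the repository's script: the matrix `expr` is randomly generated toy data, and the palette colours and the `scale = "row"` setting are assumptions.

```r
library(gplots)

# Hypothetical toy expression matrix: 10 genes (rows) x 6 samples (columns)
set.seed(42)
expr <- matrix(rnorm(60), nrow = 10,
               dimnames = list(paste0("gene", 1:10), paste0("sample", 1:6)))

# Diverging palette (assumed colours): low = blue, mid = white, high = red
pal <- colorRampPalette(c("blue", "white", "red"))(100)

# Cluster rows (genes) only
heatmap.2(expr, col = pal, Rowv = TRUE, Colv = FALSE,
          dendrogram = "row", scale = "row", trace = "none")

# Cluster columns (samples) only
heatmap.2(expr, col = pal, Rowv = FALSE, Colv = TRUE,
          dendrogram = "column", scale = "row", trace = "none")

# Cluster both rows and columns
hm <- heatmap.2(expr, col = pal, Rowv = TRUE, Colv = TRUE,
                dendrogram = "both", scale = "row", trace = "none")
```

A sequential palette can be swapped in by passing two colours of the same hue to `colorRampPalette()`, for example `c("white", "darkred")`.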
 
-3. **Identification of Significant Genes:** Using the clusters generated from figure above, the samples were divided into two groups. Subsequently, log fold change was calculated by taking the log2 difference between the mean gene expression of Group B and Group A, with a small constant added for stability.  
+4. **Identification of Significant Genes:** Using the clusters generated from the figure above, the samples were divided into two groups. Subsequently, the log2 fold change was calculated as the difference between the log2 of the mean gene expression of Group B and that of Group A, with a small constant (0.5) added to each mean for numerical stability.  
    **LogFC formula**: *log2(groupB\_mean \+ 0.5) \- log2(groupA\_mean \+ 0.5)*  
 
    The p-value was calculated using the Wilcoxon test to compare the gene counts between Group A and Group B.  
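
As a rough sketch of the LogFC formula and the per-gene Wilcoxon test described here: the count matrix, the sample names and the group assignments below are hypothetical toy values, and `exact = FALSE` is an assumption made to avoid warnings when counts are tied.

```r
# Hypothetical toy count matrix: rows are genes, columns are samples
set.seed(1)
counts <- matrix(rpois(40, lambda = 20), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("s", 1:8)))

# Assumed group membership (in the README this comes from the heatmap clusters)
group_a <- c("s1", "s2", "s3", "s4")
group_b <- c("s5", "s6", "s7", "s8")

# log2 fold change with a 0.5 pseudocount, following the LogFC formula
groupA_mean <- rowMeans(counts[, group_a])
groupB_mean <- rowMeans(counts[, group_b])
logFC <- log2(groupB_mean + 0.5) - log2(groupA_mean + 0.5)

# Per-gene Wilcoxon test comparing Group B against Group A
pvals <- sapply(rownames(counts), function(g) {
  wilcox.test(counts[g, group_b], counts[g, group_a], exact = FALSE)$p.value
})

results <- data.frame(logFC = logFC, p_value = pvals)
```

A positive `logFC` means the gene has higher mean expression in Group B than in Group A. The pseudocount keeps the logarithm defined when a group mean is zero.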
@@ -41,9 +45,9 @@ To load libraries after installation, use  *library(packagename)*
    **R Function**: *useMart()* is used to connect to a BioMart database for querying.  
    *getBM()* is used to retrieve data from the connected BioMart database.  
 
-4. **Functional Enrichment Analysis**: Used ShinyGO (with default parameters and GO biological process database) to identify pathways enriched in the upregulated genes. The results obtained from ShinyGO were saved in a .csv file and used for visualisation (find file in *output* directory).  
+5. **Functional Enrichment Analysis**: Used ShinyGO (with default parameters and GO biological process database) to identify pathways enriched in the upregulated genes. The results obtained from ShinyGO were saved in a .csv file and used for visualisation (find file in *output* directory).  
 
-5. **Visualisation of Pathways**: Visualised the top 5 upregulated pathways using the results obtained from the enrichment analysis to generate a lollipop plot (scaled by the \-log10 of the p-value to indicate pathway significance) for better clarity and interpretation of the enriched pathways.  
+6. **Visualisation of Pathways**: Visualised the top 5 upregulated pathways using the results obtained from the enrichment analysis to generate a lollipop plot (scaled by the \-log10 of the p-value to indicate pathway significance) for better clarity and interpretation of the enriched pathways.  
    **R Function**: *ggplot()* is used to create various plots in R, such as scatter plots, bar plots and lollipop plots, with customisations for layers, scales and themes.
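
A lollipop plot of this kind can be sketched with a segment layer plus a point layer. The pathway names and p-values below are placeholders, not results from this analysis; the real values would be read from the ShinyGO .csv file.

```r
library(ggplot2)

# Hypothetical enrichment results (placeholder names and p-values)
enrich <- data.frame(
  pathway = paste("Pathway", LETTERS[1:5]),
  p_value = c(1e-8, 5e-7, 2e-6, 1e-4, 3e-3)
)

# Significance score used for both stem length and point size
enrich$neg_log10_p <- -log10(enrich$p_value)

# Order pathways by significance so the most significant appears at the top
enrich$pathway <- reorder(enrich$pathway, enrich$neg_log10_p)

p <- ggplot(enrich, aes(y = pathway)) +
  # Stem of the lollipop: from 0 to the -log10(p-value)
  geom_segment(aes(x = 0, xend = neg_log10_p, yend = pathway)) +
  # Head of the lollipop, scaled by significance
  geom_point(aes(x = neg_log10_p, size = neg_log10_p)) +
  labs(x = "-log10(p-value)", y = "Pathway", size = "-log10(p-value)") +
  theme_minimal()

print(p)
```

Larger `-log10(p-value)` values correspond to smaller p-values, so longer stems and larger points mark the more significant pathways.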
 
 ### Biological Significance of Top Upregulated Pathways