clustering

Yang Eric Li edited this page Apr 23, 2018 · 1 revision

4. Clustering

-- Yang Eric Li

Once we have normalized the data and performed the differential expression analysis, we can cluster the samples in ways relevant to the biological question. Unsupervised clustering is a hard problem: without prior knowledge, we have to identify groups of samples based only on the similarity of their transcriptomes. Moreover, in most situations we do not even know the number of clusters a priori. The problem is made more challenging still by the high level of noise (both technical and biological) and the large number of dimensions (i.e. genes).

4.1 Dimensionality reduction

PCA
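
As a rough illustration of how PCA can reduce the number of dimensions (genes) before clustering, here is a minimal sketch using NumPy's singular value decomposition. The function name `pca`, the samples-by-genes layout, and the toy expression matrix are assumptions made for this example and do not come from a real dataset.

```python
import numpy as np

def pca(expr, n_components=2):
    """Project samples (rows) onto the top principal components.

    expr: array of shape (n_samples, n_genes), assumed already normalized.
    Returns (scores, explained_variance_ratio).
    """
    expr = np.asarray(expr, dtype=float)
    centered = expr - expr.mean(axis=0)  # center each gene at zero
    # Thin SVD: centered = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # sample coordinates
    variance = s ** 2
    ratio = variance[:n_components] / variance.sum()
    return scores, ratio

# Toy data (made up): 6 samples x 50 genes, two groups with shifted means
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, size=(3, 50))
group_b = rng.normal(3.0, 1.0, size=(3, 50))
expr = np.vstack([group_a, group_b])

scores, ratio = pca(expr, n_components=2)
print(scores[:, 0])  # first component should separate the two groups
print(ratio)         # fraction of variance explained by each component
```

In this toy case the first component alone should split the two groups; with real transcriptomes, several components are usually inspected before deciding how many to keep.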

t-SNE

4.2 Clustering methods

4.2.1 Hierarchical clustering
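
To make the idea concrete, below is a deliberately naive sketch of agglomerative (bottom-up) hierarchical clustering with average linkage and Euclidean distance. Both choices, the function name `average_linkage`, and the toy samples are assumptions for illustration only. A library implementation would be preferable for real data, since this version recomputes every cluster pair at each merge.

```python
import numpy as np

def average_linkage(points, n_clusters):
    """Naive agglomerative clustering with average linkage.

    Starts with every sample in its own cluster and repeatedly merges the
    two clusters with the smallest mean pairwise Euclidean distance until
    n_clusters remain. Returns one integer label per sample.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all samples
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]

    labels = np.empty(len(points), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels

# Toy data (made up): 6 samples described by 2 hypothetical genes
samples = np.array([[0, 0], [0, 1], [1, 0],
                    [10, 10], [10, 11], [11, 10]])
labels = average_linkage(samples, n_clusters=2)
print(labels)
```

Stopping at a fixed `n_clusters` is a simplification. In practice the full merge tree (dendrogram) is often kept and cut at different heights, which helps when the number of clusters is not known in advance.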

4.2.2 K-means
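
The following is a minimal sketch of K-means using Lloyd's algorithm: assign each sample to its nearest centroid, recompute the centroids, and repeat until they stop moving. The function name `kmeans`, the random initialization from the data, and the toy samples are assumptions for this example. Note that `k` must be chosen up front, which is exactly the difficulty when the number of clusters is unknown.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Cluster samples (rows of points) into k groups with Lloyd's algorithm.

    Returns (labels, centroids).
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct, randomly chosen samples
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    def assign(centers):
        # Squared Euclidean distance from every sample to every centroid
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        # Recompute each centroid; keep the old one if its cluster is empty
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j)
            else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged
        centroids = new_centroids

    return assign(centroids), centroids

# Toy data (made up): 6 samples described by 2 hypothetical genes
samples = np.array([[0, 0], [0, 1], [1, 0],
                    [10, 10], [10, 11], [11, 10]])
labels, centroids = kmeans(samples, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 depends on the random initialization, so only the grouping itself is meaningful. On noisy, high-dimensional expression data the result can also depend on the starting centroids, so running several seeds and comparing is a common precaution.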