Change transparent image in Clustering Tutorial so it is readable in dark mode #4258

Closed · wants to merge 2 commits

@@ -74,7 +74,7 @@

The choice of a distance measure is crucial in clustering. It defines how the similarity of two elements `(x, y)` is calculated, and it influences the shape of the clusters. The classical distance measures are the [euclidean](https://en.wikipedia.org/wiki/Euclidean_distance) and [manhattan](https://en.wikipedia.org/wiki/Taxicab_geometry) distances. For the most common clustering algorithms, the default distance measure is euclidean. If euclidean distance is chosen, observations with high magnitudes of their respective features are clustered together, and the same holds for observations with low magnitudes. In Figure 3, we group the cells using euclidean distance and their distance matrix.
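
To make the idea of a distance matrix concrete, here is a minimal sketch in Python using SciPy. The expression values for the three cells (R, P, V) across the three genes (G1, G2, G3) are invented toy numbers, not the values behind Figure 3, and the snippet only illustrates the concept rather than the workflow used in the tutorial.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Rows are cells (R, P, V), columns are features/genes (G1, G2, G3).
# These expression values are made up purely for illustration.
cells = np.array([
    [10.0, 12.0, 11.0],   # R: high-magnitude features
    [ 9.0, 11.0, 10.0],   # P: also high magnitude, so close to R
    [ 1.0,  2.0,  1.5],   # V: low magnitude, so far from R and P
])

# Pairwise euclidean distances (the usual default) as a 3x3 matrix
print(squareform(pdist(cells, metric="euclidean")))

# The same cells under manhattan (city-block) distance, for comparison
print(squareform(pdist(cells, metric="cityblock")))
```

With euclidean distance, R and P end up close to each other and far from V, which is exactly the behaviour described above for observations with similar feature magnitudes.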

-![Distances](images/raceid_distance.svg "Euclidean distance between three points (R, P, V) across three features (G1, G2, G3)")
+![Distances](images/raceid_distance.png "Euclidean distance between three points (R, P, V) across three features (G1, G2, G3)")


> <question-title></question-title>
@@ -119,7 +119,7 @@

By looking at the dendrogram, we can see how the observations are grouped into clusters. The optimal number of clusters is the number of vertical lines in the dendrogram cut by a horizontal line that can traverse the maximum vertical distance without intersecting a cluster.

In the above example, the best choice for the number of clusters is 4, as the red horizontal line in the dendrogram below covers the maximum vertical distance AB. For more details, please read [here](https://www.analyticsvidhya.com/blog/2016/11/an-introduction-to-clustering-and-different-methods-of-clustering/).

Check failure on line 122 in topics/statistics/tutorials/clustering_machinelearning/tutorial.md

GitHub Actions / lint

[rdjsonl] reported by reviewdog 🐶 Please do not use 'here' as your link title, it is [bad for accessibility](https://usability.yale.edu/web-accessibility/articles/links#link-text). Instead try restructuring your sentence to have useful descriptive text in the link. (GTN:005)
![data](images/Hierarchical_clustering_2.png "Hierarchical clustering")
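
As an illustration of the cut described above, the sketch below builds a dendrogram from toy data with SciPy and then cuts it into four clusters. The generated data, the Ward linkage, and the choice of four clusters are assumptions made only for this demonstration, not the data or settings of the tutorial.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster  # dendrogram() would draw the tree

# Toy data: four loosely separated groups of 2-D points (illustrative only)
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[5.0, 0.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(10, 2)),
])

# Agglomerative clustering with Ward linkage on euclidean distances
Z = linkage(data, method="ward")

# Cutting the tree into 4 clusters is the programmatic equivalent of drawing a
# horizontal line through the dendrogram where the vertical gap is largest
labels = fcluster(Z, t=4, criterion="maxclust")
print(np.bincount(labels)[1:])   # number of points in each of the 4 clusters
```

Calling `dendrogram(Z)` (with matplotlib available) draws the tree itself, so the horizontal cut can also be inspected visually, as in the figure above.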

