Change transparent image in Clustering Tutorial so it is readable in dark mode #4258

Closed · wants to merge 2 commits

@@ -74,7 +74,7 @@

The choice of a distance measure is crucial in clustering. It defines how the similarity of two elements `(x, y)` is calculated, and it influences the shape of the clusters. The classical distance measures are the [euclidean](https://en.wikipedia.org/wiki/Euclidean_distance) and [manhattan](https://en.wikipedia.org/wiki/Taxicab_geometry) distances. For the most common clustering algorithms, the default distance measure is euclidean. If euclidean distance is chosen, observations with high magnitudes of their respective features are clustered together, and the same holds for observations with low magnitudes. In Figure 3, we group the cells using euclidean distance and their distance matrix.
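
To make the idea of a distance matrix concrete, here is a minimal sketch in Python using SciPy. The expression values for the three cells (R, P, V) across the three genes (G1, G2, G3) are invented toy numbers, not the values behind Figure 3, and the snippet only illustrates the concept rather than the workflow used in the tutorial.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Rows are cells (R, P, V), columns are features/genes (G1, G2, G3).
# These expression values are made up purely for illustration.
cells = np.array([
    [10.0, 12.0, 11.0],   # R: high-magnitude features
    [ 9.0, 11.0, 10.0],   # P: also high magnitude, so close to R
    [ 1.0,  2.0,  1.5],   # V: low magnitude, so far from R and P
])

# Pairwise euclidean distances (the usual default) as a 3x3 matrix
print(squareform(pdist(cells, metric="euclidean")))

# The same cells under manhattan (city-block) distance, for comparison
print(squareform(pdist(cells, metric="cityblock")))
```

With euclidean distance, R and P end up close to each other and far from V, which is exactly the behaviour described above for observations with similar feature magnitudes.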

-![Distances](images/raceid_distance.svg "Euclidean distance between three points (R, P, V) across three features (G1, G2, G3)")
+![Distances](images/raceid_distance.png "Euclidean distance between three points (R, P, V) across three features (G1, G2, G3)")


> <question-title></question-title>
@@ -119,7 +119,7 @@

By looking at the dendrogram, we can see how the observations are grouped into clusters. The optimal number of clusters is the number of vertical lines in the dendrogram cut by a horizontal line that can traverse the maximum vertical distance without intersecting a cluster.

In the above example, the best choice for the number of clusters is 4, as the red horizontal line in the dendrogram below covers the maximum vertical distance AB. For more details, please read [here](https://www.analyticsvidhya.com/blog/2016/11/an-introduction-to-clustering-and-different-methods-of-clustering/).

Check failure on line 122 in topics/statistics/tutorials/clustering_machinelearning/tutorial.md

GitHub Actions / lint

[rdjsonl] reported by reviewdog 🐶 Please do not use 'here' as your link title, it is [bad for accessibility](https://usability.yale.edu/web-accessibility/articles/links#link-text). Instead try restructuring your sentence to have useful descriptive text in the link. (GTN:005)
![data](images/Hierarchical_clustering_2.png "Hierarchical clustering")
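
As an illustration of the cut described above, the sketch below builds a dendrogram from toy data with SciPy and then cuts it into four clusters. The generated data, the Ward linkage, and the choice of four clusters are assumptions made only for this demonstration, not the data or settings of the tutorial.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster  # dendrogram() would draw the tree

# Toy data: four loosely separated groups of 2-D points (illustrative only)
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[5.0, 0.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(10, 2)),
])

# Agglomerative clustering with Ward linkage on euclidean distances
Z = linkage(data, method="ward")

# Cutting the tree into 4 clusters is the programmatic equivalent of drawing a
# horizontal line through the dendrogram where the vertical gap is largest
labels = fcluster(Z, t=4, criterion="maxclust")
print(np.bincount(labels)[1:])   # number of points in each of the 4 clusters
```

Calling `dendrogram(Z)` (with matplotlib available) draws the tree itself, so the horizontal cut can also be inspected visually, as in the figure above.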

