When a dataset contains a large number of data points, it is often useful to group the data into subclasses. The traditional machine learning approach to this is a form of unsupervised learning known as clustering. Clustering methods aim to determine the optimal division of the data points into groups, which facilitates the labelling of these groups. One benefit of clustering methods is that they can be applied to unlabelled data.
In this section, we will first introduce k-means clustering, which is available in the scikit-learn
package, before going on to look at the more flexible Gaussian mixture models.
Finally, you will be given the chance to apply these clustering approaches to the conversion of absolute to relative time-of-flight scattering.
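
To make the idea of grouping unlabelled data concrete, the following is a minimal sketch of Lloyd's algorithm, the iteration usually associated with k-means, written with the Python standard library only. It is an illustration of the general idea rather than the scikit-learn implementation; the function name `kmeans_1d`, the toy data, and the hand-picked starting centres are all assumptions made for this example.

```python
def kmeans_1d(points, centres, n_iter=20):
    """Toy k-means (Lloyd's algorithm) for 1D data with given starting centres."""
    centres = list(centres)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: label each point with the index of its nearest centre.
        labels = [
            min(range(len(centres)), key=lambda k: abs(p - centres[k]))
            for p in points
        ]
        # Update step: move each centre to the mean of its assigned points.
        new_centres = []
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # Keep the old centre if no points were assigned to it.
            new_centres.append(sum(members) / len(members) if members else centres[k])
        # Stop early once the centres no longer move.
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


# Hypothetical unlabelled data: two well-separated groups.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels, centres = kmeans_1d(data, centres=[0.0, 6.0])
```

Here `labels` assigns each point to one of the two groups without any labels having been supplied in advance, and `centres` holds the mean of each group. In practice the starting centres are usually chosen automatically, and the result can depend on that choice.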