This is a warm-up assignment. It has two purposes: it gives you an idea of what it is like to work with unlabeled data, and it gives me an idea of what I can (realistically) expect from the participants.
For this assignment, you will use four artificially generated data sets, each stored as a CSV file.
Each row in the data files corresponds to a data point, and each column corresponds to a feature.
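As a starting point, the row/column layout described above can be read with Python's standard `csv` module. This is only a minimal sketch: the helper name `load_points` is hypothetical, and it assumes that every data row is purely numeric and that any non-numeric row is a header that can be skipped. The in-memory sample stands in for one of the real files.

```python
import csv
import io


def load_points(source):
    """Read CSV text where each row is a data point and each column a feature.

    `source` is any iterable of lines, such as an open file object.
    """
    rows = []
    for record in csv.reader(source):
        if not record:
            continue  # ignore blank lines
        try:
            rows.append([float(value) for value in record])
        except ValueError:
            # Assumption: a row that is not fully numeric is a header line.
            continue
    return rows


# Tiny in-memory stand-in for a data file: 3 data points, 2 features.
sample = io.StringIO("0.5,1.2\n0.7,0.9\n3.1,4.0\n")
points = load_points(sample)
n_points, n_features = len(points), len(points[0])
print(n_points, "data points with", n_features, "features each")
```

With a real file, the same helper would be called on an open file object, for example `with open(path, newline="") as f: points = load_points(f)`, where `path` is whichever CSV file you are inspecting.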
Your task is to discover the structure in these data sets. Each data set contains 2000 data points with an underlying structure. In particular, the data in every data set are drawn from multiple multivariate random variables. In other words, each data point belongs to a group that is not indicated in the data. You are free to use any method, including plotting and eyeballing the data. If you do not use an automated (machine learning) method for your solution, however, describe which method(s) would be useful, for example, for assigning each data point to a sensible group or cluster.
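To illustrate what assigning each data point to a group can look like, here is a small k-means sketch in plain Python. It is one possible automated method, not the expected solution: k-means assumes roughly round, similarly sized groups, and a mixture model may match the "multiple random variables" description more closely. The function names, the farthest-point initialisation, and the synthetic two-group data are all assumptions made for this example; the number of groups `k` has to be chosen by you, for instance after plotting the data.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point
    farthest from all centers chosen so far (an assumed design choice)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centers))
        centers.append(list(far))
    return centers


def kmeans(points, k, n_iter=100):
    """Plain k-means: alternate assignment and center updates."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change, so we have converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Synthetic stand-in data: two well-separated 2-D Gaussian groups.
rng = random.Random(0)
points = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(100)]
points += [[rng.gauss(10.0, 0.5), rng.gauss(10.0, 0.5)] for _ in range(100)]

labels, centers = kmeans(points, k=2)
print("group sizes:", [labels.count(j) for j in range(2)])
print("centers:", [[round(c, 2) for c in center] for center in centers])
```

On real data the groups may overlap or differ in shape, so it is worth comparing the resulting labels against a scatter plot before trusting them.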
Provide your answer by editing this file and briefly explaining your solution for each data set. Figures and other visual material are welcome. You can provide short code segments inline below, or check in the code as separate files in your repository.