Question 2. Datasets
(a) The 1985 Auto Imports Database
- import-85.names is the helper file which contains information regarding the dataset
- import-85.data is a CSV file with the actual data
Question 3. Datasets
(a) Airfare and demand:
# Note: target is price
(b) Wine Quality:
# Note: target is quality
(c) Parkinsons Dataset:
# Note: target is total UPDRS
Question 4. Datasets
Classification dataset:
(a) Tic Tac Toe
Question 5. Datasets
Classification datasets:
(a) Bank Marketing / bank.csv (https://archive.ics.uci.edu/ml/datasets/Bank+Marketing)
Regression datasets:
(b) Wine Quality (http://archive.ics.uci.edu/ml/datasets/Wine+Quality)
Question 6. Datasets
(a) Generate a sample dataset called D1:
i. Initialize the matrix from a uniform distribution with μ = 1 and σ = 0.05
ii. Generate the target using y = 1.3x^2 + 4.8x + 8 + ψ, where ψ is randomly initialized.
(b) Wine Quality called D2:
winequality-red.csv (http://archive.ics.uci.edu/ml/datasets/Wine+Quality)
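The recipe for D1 in (a) can be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed implementation: it reads "μ = 1 and σ = 0.05" as a uniform distribution whose mean is 1 and whose standard deviation is 0.05 (bounds μ ± √3·σ), treats x as a single feature column, and draws ψ from a zero-mean Gaussian. The names generate_d1, n_samples, noise_std and seed are hypothetical and do not come from the assignment.

```python
import math
import random


def generate_d1(n_samples=100, mu=1.0, sigma=0.05, noise_std=0.1, seed=0):
    """Return (X, y) for the synthetic dataset D1.

    Assumptions (not stated in the assignment): one feature column,
    Gaussian noise psi with standard deviation `noise_std`.
    """
    rng = random.Random(seed)
    # A uniform distribution on [a, b] has mean (a + b) / 2 and
    # standard deviation (b - a) / sqrt(12), so these bounds give
    # mean `mu` and standard deviation `sigma`.
    half_width = math.sqrt(3.0) * sigma
    low, high = mu - half_width, mu + half_width

    # X is an n_samples x 1 matrix stored as a list of rows.
    X = [[rng.uniform(low, high)] for _ in range(n_samples)]
    # y = 1.3 x^2 + 4.8 x + 8 + psi
    y = [1.3 * row[0] ** 2 + 4.8 * row[0] + 8.0 + rng.gauss(0.0, noise_std)
         for row in X]
    return X, y


X, y = generate_d1(n_samples=5)
for row, target in zip(X, y):
    print(f"x = {row[0]:.4f}, y = {target:.4f}")
```

Fixing the seed keeps D1 reproducible across runs. If the assignment intended a Gaussian initialization instead, only the line that draws X would need to change.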
Question 7. Datasets
Classification Datasets:
(a) Iris dataset D1: (https://archive.ics.uci.edu/ml/datasets/Iris)
# Note: Target attribute classes are Iris Setosa, Iris Versicolour and Iris Virginica
(b) Wine Quality called D2: winequality-red.csv (http://archive.ics.uci.edu/ml/datasets/Wine+Quality)
Question 9. Datasets
(a) Sparse Dataset / w8a dataset D1: (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#w8a)
(b) UCI Dataset / SMS Spam D2: (https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection)
(c) UCI Dataset / Spambase D3: (https://archive.ics.uci.edu/ml/datasets/Spambase)
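The w8a file in (a) is distributed in the LIBSVM sparse text format, where each line reads `<label> <index>:<value> ...` with 1-based feature indices and zero-valued features omitted. Below is a minimal sketch of a parser for that format. The function name parse_libsvm_line and the two sample lines are made up for illustration and are not rows from w8a. In practice a library loader such as scikit-learn's load_svmlight_file is likely the more convenient route.

```python
def parse_libsvm_line(line):
    """Parse one LIBSVM-format line into (label, {index: value}).

    Indices are kept 1-based, as they appear in the file; features
    that are not listed are implicitly zero.
    """
    parts = line.split()
    label = float(parts[0])
    features = {}
    for token in parts[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return label, features


# Made-up sample lines in the same format (not real w8a rows).
sample = [
    "+1 3:1 11:1 14:1",
    "-1 2:0.5 7:1",
]
for line in sample:
    label, features = parse_libsvm_line(line)
    print(label, features)
```

Storing each row as a dictionary avoids materializing the zeros, which is the point of a sparse representation.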
Question 10. Datasets
Recommender Datasets:
(a) MovieLens 100k dataset D1: Rating prediction dataset (http://grouplens.org/datasets/movielens/100k/)
(b) Reference RMSE scores for rating prediction are available on the MyMediaLite website (http://www.mymedialite.net/examples/datasets.html)
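For comparison against those published scores, RMSE is the square root of the mean squared difference between predicted and true ratings. A minimal sketch follows. The function name rmse and the ratings below are illustrative only and are not MovieLens data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative ratings on a 1-5 scale (made up, not from MovieLens).
true_ratings = [4, 3, 5, 2]
predicted = [3.5, 3.0, 4.0, 2.5]
print(rmse(true_ratings, predicted))
```

Lower is better, and a score is only comparable to a published one when it is computed on the same train/test split.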
Question 10. Datasets
Sparse datasets:
(a) IRIS dataset D1: (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/iris.scale)
(b) rcv1v2 (topics; subsets) D2: (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multilabel.html)
(c) 20Newsgroups dataset D3: (http://qwone.com/~jason/20Newsgroups/)