Datasets

MICROSOFT PROVIDES THE DATASETS ON AN "AS IS" BASIS. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, GUARANTEES OR CONDITIONS WITH RESPECT TO YOUR USE OF THE DATASETS. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAW, MICROSOFT DISCLAIMS ALL LIABILITY FOR ANY DAMAGES OR LOSSES, INCLUDING DIRECT, CONSEQUENTIAL, SPECIAL, INDIRECT, INCIDENTAL OR PUNITIVE, RESULTING FROM YOUR USE OF THE DATASETS.

The datasets are provided under the original terms under which Microsoft received them. See below for more information about each dataset.

Wikipedia Detox

This dataset is provided under CC0. Redistributing the dataset "wikipedia-detox-250-line-data.tsv" with attribution:

Wulczyn, Ellery; Thain, Nithum; Dixon, Lucas (2016): Wikipedia Detox. figshare.

With modifications by taking a sample of rows and reducing rough language.

Original source: https://doi.org/10.6084/m9.figshare.4054689

Original readme: https://meta.wikimedia.org/wiki/Research:Detox

UCI Iris Flower Dataset

Redistributing the dataset "iris.txt" with attribution:

Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [https://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

With modifications by changing the separator character, order of columns, and numerical encoding of labels.

Original source: https://archive.ics.uci.edu/ml/datasets/iris

Breast Cancer Wisconsin

Redistributing the dataset "breast-cancer.txt" with attribution:

O. L. Mangasarian and W. H. Wolberg: "Cancer diagnosis via linear programming", SIAM News, Volume 23, Number 5, September 1990, pp 1 & 18.

Original source: http://ftp.cs.wisc.edu:80/math-prog/cpo-dataset/machine-learn/cancer/cancer1/datacum

Original readme: http://ftp.cs.wisc.edu/math-prog/cpo-dataset/machine-learn/cancer/cancer1/data.doc

UCI Adult Dataset

Redistributing the dataset "adult.tiny.with-schema.txt" with attribution:

Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [https://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

Original source: https://archive.ics.uci.edu/ml/datasets/Adult