Define the interface of Dataset.py #4
Comments
I think you are pretty much asking the question we were not quite able to answer in mldata. For starters, I think we should only support the basic classification case: a tuple (input, target) where input is an array and target is a one-hot vector. After that, a small modification would let us support the "unsupervised kind" (input, input), and we could slowly but surely add support for new dataset types. For now, I suggest we focus only on answering "Are trainset, validset and testset only three instances of Dataset or the split should be part of the class?", which we will need for the basic case.
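To make the proposal above concrete, here is a minimal sketch of the basic (input, target) case and its "unsupervised kind" variant. The class layout, the `one_hot` helper, and the use of plain Python lists in place of arrays are all assumptions made for illustration; none of this is the actual smartpy interface.

```python
from typing import List, Optional, Sequence, Tuple


def one_hot(label: int, nb_classes: int) -> List[int]:
    """Encode an integer class label as a one-hot vector (hypothetical helper)."""
    if not 0 <= label < nb_classes:
        raise ValueError("label must be in [0, nb_classes)")
    return [1 if i == label else 0 for i in range(nb_classes)]


class Dataset:
    """Hypothetical minimal dataset whose examples are (input, target) tuples."""

    def __init__(self, inputs: Sequence, targets: Optional[Sequence] = None):
        # Unsupervised kind: with no targets given, each example is (input, input).
        if targets is None:
            targets = inputs
        if len(inputs) != len(targets):
            raise ValueError("inputs and targets must have the same length")
        self.inputs = inputs
        self.targets = targets

    def __len__(self) -> int:
        return len(self.inputs)

    def __getitem__(self, index: int) -> Tuple:
        return self.inputs[index], self.targets[index]


# Supervised classification: two examples, three classes.
inputs = [[0.1, 0.2], [0.3, 0.4]]
supervised = Dataset(inputs, [one_hot(0, 3), one_hot(2, 3)])

# Unsupervised kind: targets default to the inputs themselves.
unsupervised = Dataset(inputs)
```

Treating the unsupervised case as a default rather than a separate class keeps the "small modification" small, at the cost of hiding the distinction from the caller.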
I agree we should be neither too general nor too specific. I say target should simply be another array (not necessarily a one-hot vector!). Regarding the question "Are trainset, validset and testset only three instances of Dataset or the split should be part of the class?", the way I see it is: trainset, validset and testset are sets of data, so they should be different instances of
Good point about the one-hot. As for the valid, train and test sets, I see them more as
On Mon, May 25, 2015 at 11:08 AM, Marc-Alexandre Côté <
We should discuss here the interface of the datasets the smartpy library will be manipulating. In contrast with https://github.com/SMART-Lab/mldata, this is not a generic dataset manager.
Some questions
Are trainset, validset and testset only three instances of Dataset or the split should be part of the class?
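The two options in the question can be sketched side by side. Everything below is hypothetical and only meant to frame the discussion: the `SplitDataset` name, the `nb_train` and `nb_valid` parameters, and the contiguous-slice split are assumptions, not an existing smartpy or mldata API.

```python
class Dataset:
    """Option A building block: a hypothetical Dataset that knows nothing about splits."""

    def __init__(self, inputs, targets):
        self.inputs = list(inputs)
        self.targets = list(targets)

    def __len__(self):
        return len(self.inputs)


class SplitDataset:
    """Option B: one hypothetical object that owns all the data and the split sizes."""

    def __init__(self, inputs, targets, nb_train, nb_valid):
        self.inputs = list(inputs)
        self.targets = list(targets)
        self.nb_train = nb_train
        self.nb_valid = nb_valid

    def _subset(self, start, stop):
        # Each split is exposed as a plain Dataset over a contiguous slice.
        return Dataset(self.inputs[start:stop], self.targets[start:stop])

    @property
    def trainset(self):
        return self._subset(0, self.nb_train)

    @property
    def validset(self):
        return self._subset(self.nb_train, self.nb_train + self.nb_valid)

    @property
    def testset(self):
        return self._subset(self.nb_train + self.nb_valid, len(self.inputs))


inputs = [[i] for i in range(10)]
targets = [i % 2 for i in range(10)]

# Option A: the caller splits the data and holds three independent instances.
trainset = Dataset(inputs[:6], targets[:6])
validset = Dataset(inputs[6:8], targets[6:8])
testset = Dataset(inputs[8:], targets[8:])

# Option B: the split is part of the class.
data = SplitDataset(inputs, targets, nb_train=6, nb_valid=2)
```

In this sketch Option B still returns plain `Dataset` objects for each split, so the two options are not mutually exclusive: the split logic can live in a thin wrapper while models only ever consume `Dataset` instances.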