Addition of popular benchmark datasets #722
Comments
Thanks for the suggestions!
Would the dataset classes download the datasets? Are those datasets readily available for download these days?
Could you give an example of how this might work?
Ideally, yes. I would like it to mimic PyTorch, since that interface is familiar. This would mean you can specify the root, split, and download arguments (and possibly others I've missed). I've already implemented Cars196 and CUB on my fork, so you can have a look at what I had in mind: https://github.com/ir2718/pytorch-metric-learning/tree/dataset. If you think this is a step in the right direction, do say so.
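For illustration only, here is a minimal sketch of what a constructor taking root, split, and download could look like. It is not the code from the fork: the class name `ToyRetrievalDataset`, the index-file layout, and the synthetic "download" are all hypothetical, and a real class would subclass `torch.utils.data.Dataset` and load actual images.

```python
import os


class ToyRetrievalDataset:
    """Hypothetical map-style dataset with a root/split/download constructor.

    Kept free of third-party imports so it runs anywhere; a real version
    would subclass torch.utils.data.Dataset and return image tensors.
    """

    VALID_SPLITS = ("train", "test")

    def __init__(self, root, split="train", transform=None, download=False):
        if split not in self.VALID_SPLITS:
            raise ValueError(f"split must be one of {self.VALID_SPLITS}, got {split!r}")
        self.root = root
        self.split = split
        self.transform = transform
        if download:
            self._download()
        if not self._check_exists():
            raise RuntimeError("Dataset not found. Pass download=True to fetch it.")
        self.samples = self._load_split()

    def _index_path(self):
        # Assumed layout: one "<path> <label>" index file per split.
        return os.path.join(self.root, f"{self.split}.txt")

    def _check_exists(self):
        return os.path.isfile(self._index_path())

    def _download(self):
        # Placeholder: a real class would fetch and extract an archive here.
        # This sketch writes a tiny synthetic index so it works offline.
        if self._check_exists():
            return
        os.makedirs(self.root, exist_ok=True)
        with open(self._index_path(), "w") as f:
            for i in range(4):
                f.write(f"img_{i}.jpg {i % 2}\n")

    def _load_split(self):
        samples = []
        with open(self._index_path()) as f:
            for line in f:
                path, label = line.split()
                samples.append((path, int(label)))
        return samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        item = path  # a real class would open the image at this path
        if self.transform is not None:
            item = self.transform(item)
        return item, label
```

With this shape, `ToyRetrievalDataset(root, split="train", download=True)` creates the data if it is missing, while omitting `download=True` on an empty root raises an error instead of silently fetching anything.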
I haven't given it that much thought, but for the sake of example, maybe a function that generates a PyTorch dataset from a given HuggingFace dataset name, input column, and output column.
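As a rough sketch of that idea, and nothing more settled than that: the names `ColumnPairDataset` and `from_huggingface` below are hypothetical. The wrapper accepts any indexable collection of row dictionaries, which is how a HuggingFace `Dataset` behaves under integer indexing, and the `datasets` import is deferred so the external dependency stays optional.

```python
class ColumnPairDataset:
    """Hypothetical map-style wrapper yielding (input, output) pairs.

    `rows` can be any object supporting len() and integer indexing that
    returns a dict per row, e.g. a list of dicts or a HuggingFace Dataset.
    """

    def __init__(self, rows, input_column, output_column):
        self.rows = rows
        self.input_column = input_column
        self.output_column = output_column
        # Fail early on a misspelled column name instead of at training time.
        if len(rows) > 0:
            first = rows[0]
            for column in (input_column, output_column):
                if column not in first:
                    raise KeyError(f"column {column!r} not found in dataset rows")

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        return row[self.input_column], row[self.output_column]


def from_huggingface(name, input_column, output_column, split="train"):
    """Build a ColumnPairDataset from a HuggingFace dataset name.

    Requires the optional `datasets` package; imported lazily so the
    rest of the module works without it.
    """
    from datasets import load_dataset

    rows = load_dataset(name, split=split)
    return ColumnPairDataset(rows, input_column, output_column)
```

Because the wrapper only relies on `__len__` and `__getitem__`, it can be exercised with a plain list of dictionaries, without touching the network.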
Looks good! I don't think there's any harm in adding them, and I think some people will find it convenient. Feel free to open a PR for those dataset classes.
Hmm, I don't have any thoughts on this now. We can keep this discussion open though.
Hi,
I find that it's nice to have a few benchmark datasets integrated into libraries for easier research. My feature request boils down to the implementation of a few image retrieval datasets, namely: CUB, Cars196, Stanford Online Products, and iNaturalist. Most image retrieval papers use these datasets to benchmark new methods and models. @KevinMusgrave, if you agree with this request, I can create a PR.
Additionally, some kind of integration with HuggingFace datasets might be nice for text retrieval/text similarity, but I'm not sure if this is of any use since sentence-transformers is probably the most often used library for such things. It also introduces an external dependency, so I'd like to hear your opinion on this.