
Data prefetching? #75

Closed
Ebanflo42 opened this issue Apr 4, 2024 · 1 comment

Comments

@Ebanflo42
Contributor

Question mark because it might be most reasonable to depend on another crate for this.

At the most abstract level, I imagine us having a Training struct that takes a model context and an optimizer context but also, very importantly, an example generator. The generator should have a specified number of threads, which it uses to load and preprocess a specified number of samples before they are required by the training loop. It should be known whether the generator is finite or infinite (counting epochs on an infinite training set would, of course, be silly).

Here is the simplest implementation in Python: https://github.com/justheuristic/prefetch_generator. Note that all loading and preprocessing is user-defined; it basically just implements the multiprocessing aspect. Is there already a crate we can use for this?
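
For illustration, here is a rough sketch of what such a generator could look like using only the standard library. The name `Prefetch` and its `buffer` parameter are hypothetical and not part of this project. It uses a single background thread rather than a configurable number of workers, and the bounded channel caps how far ahead of the training loop the loader can run. A finite source simply ends the iteration, while an infinite one keeps producing until the consumer is dropped.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical wrapper (not an existing API in this project): runs `source`
/// on one background thread and buffers roughly `buffer` items ahead of the
/// consumer.
pub struct Prefetch<T> {
    rx: Receiver<T>,
}

impl<T: Send + 'static> Prefetch<T> {
    pub fn new<I>(source: I, buffer: usize) -> Self
    where
        I: Iterator<Item = T> + Send + 'static,
    {
        // Bounded channel: `send` blocks once `buffer` items are queued.
        let (tx, rx) = sync_channel(buffer);
        thread::spawn(move || {
            for item in source {
                // `send` fails once the consumer side has been dropped,
                // which lets the worker stop even for an infinite source.
                if tx.send(item).is_err() {
                    break;
                }
            }
        });
        Prefetch { rx }
    }
}

impl<T> Iterator for Prefetch<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // `recv` returns Err when the worker has finished and the queue is
        // empty, so a finite source ends the iteration with `None`.
        self.rx.recv().ok()
    }
}
```

Since all loading and preprocessing lives in the wrapped iterator, a training loop would only need to iterate, for example `for batch in Prefetch::new(loader, 8) { ... }`, where `loader` stands in for any user-defined iterator.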

@Ebanflo42
Contributor Author

We can just use Rust iterators.
