Integrate pipeline with activeloop hub #1

Open
wants to merge 1 commit into
base: main

etiennedupont

Work performed

After exporting a training dataset in YOLO format from Labelflow and sending it as a Hub dataset using labelflow/labelflow#659, I added a training script that can train on any Hub dataset, instead of on a local COCO dataset as was done originally. I also added a benchmark script, "speed_test_hub.py", that compares data-loading speed between a local dataset and a remote one served through Hub. Results are encouraging.
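For reviewers who want a feel for this kind of comparison, here is a minimal sketch of timing iteration over two loaders. It is not the code in "speed_test_hub.py". The names `time_batches`, `compare_loaders` and `fake_loader` are hypothetical, and the fake loaders only simulate per-batch latency with `time.sleep`. In practice the two arguments would be the local and the Hub-backed data loaders.

```python
import time
from typing import Dict, Iterable, Iterator, Optional, Tuple


def time_batches(loader: Iterable, max_batches: Optional[int] = None) -> Tuple[int, float]:
    """Iterate over `loader` and return (batches consumed, elapsed seconds)."""
    start = time.perf_counter()
    count = 0
    for _ in loader:
        count += 1
        if max_batches is not None and count >= max_batches:
            break
    return count, time.perf_counter() - start


def compare_loaders(local_loader: Iterable, remote_loader: Iterable,
                    max_batches: Optional[int] = None) -> Dict[str, float]:
    """Return the average seconds per batch for each loader."""
    results = {}
    for name, loader in (("local", local_loader), ("remote", remote_loader)):
        count, elapsed = time_batches(loader, max_batches)
        results[name] = elapsed / count if count else float("nan")
    return results


def fake_loader(n_batches: int, delay_s: float) -> Iterator[int]:
    """Hypothetical stand-in for a data loader: one batch every `delay_s` seconds."""
    for i in range(n_batches):
        time.sleep(delay_s)
        yield i


if __name__ == "__main__":
    # Simulated latencies only; these numbers say nothing about Hub itself.
    stats = compare_loaders(fake_loader(20, 0.001), fake_loader(20, 0.002), max_batches=10)
    for name, seconds in stats.items():
        print(f"{name}: {seconds * 1000:.2f} ms/batch")
```

Capping the run with `max_batches` keeps the comparison short and makes both loaders process the same number of batches. The first batches of a real run also include worker start-up and caching effects, so a warm-up pass before timing may give more stable numbers.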

Problems encountered

On macOS, a PyTorch issue prevents multiprocessing for data loading, and I couldn't find a proper workaround (see https://stackoverflow.com/questions/64772335/pytorch-w-parallelnative-cpp206). Interestingly, the issue does not appear in regular training with a local COCO dataset, so it seems to be a side effect of using Hub.
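Until a proper fix is found, one possible stopgap is to fall back to single-process data loading on macOS only. The sketch below is an assumption, not part of this PR: `safe_num_workers` is a hypothetical helper, and it sidesteps the issue rather than solving it, at the cost of slower loading on macOS.

```python
import platform
from typing import Optional


def safe_num_workers(requested: int, system: Optional[str] = None) -> int:
    """Return a worker count that avoids multiprocess data loading on macOS.

    `system` defaults to the current platform name; it is a parameter only
    so the behaviour can be checked without running on each OS.
    """
    system = system or platform.system()
    if system == "Darwin":  # platform.system() reports "Darwin" on macOS
        return 0  # 0 workers means batches are loaded in the main process
    return max(0, requested)


if __name__ == "__main__":
    print(safe_num_workers(4))
```

The returned value would be passed as the `num_workers` argument of `torch.utils.data.DataLoader`, where `0` means data is loaded in the main process.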

@etiennedupont etiennedupont marked this pull request as ready for review January 6, 2022 09:29