Integrate pipeline with activeloop hub #1

etiennedupont · 2022-01-06T09:29:11Z

Work performed

After exporting a training dataset in YOLO format from Labelflow and sending it to Hub as a dataset (using labelflow/labelflow#659), I added a training script that can train on any Hub dataset, instead of the local COCO dataset used originally. I also added a benchmark script, "speed_test_hub.py", that compares data loading speed between a local dataset and a remote one served through Hub. Results are encouraging.
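For readers who want to reproduce this kind of comparison, here is a minimal sketch of a loader timing helper. It is not the content of "speed_test_hub.py"; the names `time_loader` and `compare_loaders` are hypothetical, and the helper only assumes that each loader is an iterable of batches.

```python
import time
from typing import Dict, Iterable, Optional


def time_loader(loader: Iterable, max_batches: Optional[int] = None) -> dict:
    """Iterate over a loader once and report how long the pass took.

    Hypothetical helper: it only measures iteration time, so any
    per-batch training work is excluded from the result.
    """
    start = time.perf_counter()
    batches = 0
    for _ in loader:
        batches += 1
        # Stop early so a long remote dataset does not dominate the test.
        if max_batches is not None and batches >= max_batches:
            break
    elapsed = time.perf_counter() - start
    rate = batches / elapsed if elapsed > 0 else float("inf")
    return {"batches": batches, "seconds": elapsed, "batches_per_second": rate}


def compare_loaders(loaders: Dict[str, Iterable],
                    max_batches: Optional[int] = None) -> Dict[str, dict]:
    """Time each named loader with the same batch budget."""
    return {name: time_loader(loader, max_batches)
            for name, loader in loaders.items()}
```

In practice the two entries would be a regular PyTorch `DataLoader` over the local dataset and a Hub-backed loader (Hub 2.x provides `ds.pytorch(...)` for this; the exact arguments used in this PR are not shown here), passed as `compare_loaders({"local": local_loader, "hub": hub_loader}, max_batches=50)`.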

Problems encountered

On macOS, a PyTorch issue prevents multiprocessing from working for data loading. I couldn't find a proper workaround (see https://stackoverflow.com/questions/64772335/pytorch-w-parallelnative-cpp206). Interestingly, the issue does not appear during regular training (with a local COCO dataset), so it seems to be a side effect of using Hub.
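Until a proper fix is found, one possible stopgap is to disable worker processes on macOS only. This is a sketch under that assumption and has not been verified against this pipeline; `safe_num_workers` is a hypothetical helper, and falling back to 0 workers trades loading speed for avoiding multiprocessing altogether.

```python
import platform
from typing import Optional


def safe_num_workers(requested: int, system: Optional[str] = None) -> int:
    """Return a worker count for data loading, forced to 0 on macOS.

    Assumption: loading in the main process (0 workers) sidesteps the
    multiprocessing problem. This is a fallback, not a real fix.
    """
    # platform.system() returns "Darwin" on macOS.
    system = platform.system() if system is None else system
    if system == "Darwin":
        return 0
    # Guard against negative values on other platforms.
    return max(0, requested)
```

The returned value can then be passed as the `num_workers` argument when building the loader, for example `num_workers=safe_num_workers(4)`.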

Add benchmark and training scripts

85c696c

etiennedupont marked this pull request as ready for review January 6, 2022 09:29
