
Don't reshuffle eval data each "epoch" #229

Merged 4 commits into main on Jul 11, 2023
Conversation

epwalsh (Member) commented Jul 10, 2023

Keep eval data order consistent across iterations. Add paths to the metadata in the MemMapDataset class.

epwalsh requested a review from dirkgr on July 10, 2023 at 23:00
olmo/util.py (Outdated)
Comment on lines 341 to 345:

-    if isinstance(dataloader.sampler, DistributedSampler):
+    if update_epoch_seed and isinstance(dataloader.sampler, DistributedSampler):
         epoch = dataloader.sampler.epoch + 1
         dataloader.sampler.set_epoch(epoch)
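For context, here is a minimal sketch of how a data-loader cycling helper might use this flag. The name cycle_through_epochs and the update_epoch_seed parameter are taken from the surrounding discussion; the body is an illustrative assumption, not the repository's actual implementation.

```python
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler


def cycle_through_epochs(dataloader: DataLoader, update_epoch_seed: bool = True):
    """Yield batches forever, looping over the dataloader pass after pass.

    Illustrative sketch only: when ``update_epoch_seed`` is False, the
    DistributedSampler's epoch (and hence its shuffle seed) is never advanced,
    so every pass through the data uses the same order.
    """
    while True:
        for batch in dataloader:
            yield batch
        # Only reshuffle between passes when explicitly requested.
        if update_epoch_seed and isinstance(dataloader.sampler, DistributedSampler):
            epoch = dataloader.sampler.epoch + 1
            dataloader.sampler.set_epoch(epoch)
```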
dirkgr (Member):

I don't understand how skipping this makes sure it starts from scratch every epoch. I thought we'd have to reset the data loader somehow every time.

epwalsh (Member, Author):

Each validation loop through a dataset is a complete loop / epoch over the dataset (less the extra examples, since drop_last=True). The DistributedSampler reshuffles each epoch with a specific seed that's shared across all processes. That seed never changes unless you call DistributedSampler.set_epoch(). See https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler.
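A minimal single-process sketch of that behavior (num_replicas=1 and rank=0 are passed explicitly since no process group is initialized here): the sampler's shuffle order is a function of (seed, epoch), so repeated iteration yields the same order until set_epoch() is called.

```python
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(10))
sampler = DistributedSampler(
    dataset, num_replicas=1, rank=0, shuffle=True, seed=0, drop_last=True
)

# Two passes without touching the epoch: identical order.
first_pass = list(iter(sampler))
second_pass = list(iter(sampler))
assert first_pass == second_pass

# Advancing the epoch changes the shuffle.
sampler.set_epoch(1)
third_pass = list(iter(sampler))
assert third_pass != first_pass  # different order (with overwhelming probability)
```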

epwalsh (Member, Author):

But the fact that this is confusing, and only works by exploiting a nuance of the DistributedSampler, tells me that this probably isn't the best way to do it.

epwalsh (Member, Author):

Updated: 9b52525

dirkgr (Member):

Do we still need cycle_through_epoch then? Or at least, do we still need this change?

epwalsh (Member, Author):

Removed: ac8322c

epwalsh requested a review from dirkgr on July 11, 2023 at 16:50
dirkgr (Member) left a comment:

I'm assuming then that we don't use cycle_through_epochs for the training set?

epwalsh (Member, Author) commented Jul 11, 2023

> I'm assuming then that we don't use cycle_through_epochs for the training set?

Right

epwalsh merged commit 952819b into main on Jul 11, 2023
10 checks passed
epwalsh deleted the eval-order branch on July 11, 2023 at 22:46