Consistent ordering of cached data files between deeplearning and Great Lakes #1075
`os.listdir()` does not necessarily list files in the same order on Great Lakes as it does on the deeplearning server, even if the contents of the directory are the same in both places and the same random seed is set in both places. This behavior is documented, e.g., here and here.

This issue is relevant for any BLISS case study that trains a network on Great Lakes and then evaluates the trained network on the deeplearning server. In `CachedSimulatedDataModule`, files are assigned to train/val/test based on the order in which they are read in by `os.listdir()`. Because of the behavior mentioned above, one could obtain different train/val/test splits on deeplearning and Great Lakes even if the directory contents and random seed are the same in both places.

The proposed fix just wraps `sorted()` around the list of file names.
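
For illustration, the idea can be sketched as follows. This is not the code in `CachedSimulatedDataModule`: the helper name `split_cached_files`, its parameters, the split fractions, and the seeded shuffle step are all assumptions made for the example. The only part taken from this PR is wrapping `sorted()` around the result of `os.listdir()`.

```python
import os
import random


def split_cached_files(cache_dir, seed=0, fractions=(0.8, 0.1, 0.1)):
    """Hypothetical helper: assign cached files to train/val/test splits.

    Illustrative only; names, fractions, and the shuffle are assumptions.
    """
    # os.listdir() returns entries in arbitrary order, so sort the names
    # first to get the same starting order on every machine.
    file_names = sorted(os.listdir(cache_dir))

    # Assumed step: shuffle with a local, seeded generator. With a sorted
    # input, the result depends only on the file names and the seed.
    rng = random.Random(seed)
    rng.shuffle(file_names)

    n_files = len(file_names)
    n_train = int(fractions[0] * n_files)
    n_val = int(fractions[1] * n_files)

    return {
        "train": file_names[:n_train],
        "val": file_names[n_train:n_train + n_val],
        "test": file_names[n_train + n_val:],
    }
```

Without the `sorted()` call, two machines holding identical files could pass differently ordered lists into the same seeded shuffle and end up with different splits.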