Evaluation of PoET in distributed training mode #12

tgjantos · 2023-07-04T11:54:48Z

Currently, when PoET is trained in distributed mode, evaluation appears to be based only on the data seen by GPU 1, i.e. 1/n of the dataset when training on n GPUs. A possible solution might be to use Hugging Face Accelerate.
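
To make the suspected problem concrete, here is a minimal sketch in plain Python. It only simulates sharding: the strided sharding rule, the per-sample results, and the helper names `shard_indices` and `accuracy` are assumptions made for illustration, not PoET's actual evaluation code.

```python
# Illustration only: simulate how an evaluation set is split across n processes.
# The strided sharding rule and the accuracy metric are assumptions for this
# sketch; they are not taken from the PoET code base.

def shard_indices(num_samples: int, world_size: int, rank: int) -> list[int]:
    """Return the sample indices one rank would see under strided sharding."""
    return list(range(rank, num_samples, world_size))


def accuracy(correct_flags: list[bool]) -> float:
    """Fraction of samples marked as correct."""
    return sum(correct_flags) / len(correct_flags)


# Hypothetical per-sample outcomes: True means the prediction was correct.
results = [True, True, False, True, True, False, False, False]
world_size = 4

# Evaluating on a single rank covers only 1/n of the samples.
rank0 = [results[i] for i in shard_indices(len(results), world_size, rank=0)]
single_rank_accuracy = accuracy(rank0)

# Combining every rank's shard recovers the metric over the full dataset.
gathered = [
    results[i]
    for rank in range(world_size)
    for i in shard_indices(len(results), world_size, rank)
]
full_accuracy = accuracy(gathered)

print(f"rank 0 only: {single_rank_accuracy:.2f} on {len(rank0)} samples")
print(f"all ranks:   {full_accuracy:.2f} on {len(gathered)} samples")
```

In this toy example the single-rank metric differs from the full-dataset metric, which is the kind of bias the issue describes. In a real distributed run the combining step has to happen across processes; Accelerate provides `Accelerator.gather_for_metrics` for that purpose, though whether it fits PoET's evaluation loop has not been checked here.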
