Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? #3077
| Job | Run time |
|---|---|
| | 56s |
| | 4m 19s |
| | 4m 5s |
| | 3m 31s |
| | 10m 16s |
| | 10m 22s |
| | 33m 29s |