Enable autograd graph to propagate after multi-device syncing (for loss functions in ddp)
#9542
| Job | Run time |
|---|---|
| | 2m 24s |
| | 14s |
| | 14s |
| | 3m 47s |
| | 15s |
| | 2m 46s |
| | 2m 26s |
| | 1m 38s |
| | 1m 28s |
| | 3m 15s |
| | 3m 18s |
| | 1s |
| Total | 21m 46s |