I noticed I was getting 4 wandb runs when I trained with `accelerate launch` on a machine with 4 GPUs. All 4 of the wandb runs include system statistics, like GPU temp. But only one of them includes the training/evaluation panels, because of our check for `is_main_process`.

All 4 of the runs highlighted in the image below are from a single `accelerate launch train.py ...`

![image](https://private-user-images.githubusercontent.com/1719584/282852344-f11ae0f6-0554-4bd9-9163-f3f2168b1a26.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5MzgzMDgsIm5iZiI6MTczOTkzODAwOCwicGF0aCI6Ii8xNzE5NTg0LzI4Mjg1MjM0NC1mMTFhZTBmNi0wNTU0LTRiZDktOTE2My1mM2YyMTY4YjFhMjYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxOSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTlUMDQwNjQ4WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZDBhNDM1ZDI5MTQ4OTRhZDgzYTRlMDdmMTNmZmUwY2Y4ODA1NDQ2YzlkM2Q3MDRmYmNiYmYxNGI2ZmE2MGI4MiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.BPnuB_0y5SG3b3GKA1OybCPM-jQ9l_oJIVSbQWc0sak)

It looks like the way to use wandb with Accelerator is to use the `log_with` argument when instantiating the accelerator: `accelerator = Accelerator(log_with="wandb")` and then call `accelerator.init_trackers` under a condition of `is_main_process`.

This issue has some discussion on the 🤗 forums: https://discuss.huggingface.co/t/multiple-wandb-outputs/21394
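For reference, here is a minimal sketch of the setup described above, so the report is easier to reproduce. It only mirrors the pattern in question (`log_with="wandb"` plus `init_trackers` behind an `is_main_process` check) and is not a fix for the extra runs. The helper name `init_wandb_tracking`, the project name `my-project`, the config values, and the logged metric are all hypothetical placeholders, not taken from the original `train.py`.

```python
def init_wandb_tracking(accelerator, project_name, config):
    """Start trackers only on the main process (hypothetical helper).

    Returns True if this process initialized the trackers, False otherwise.
    """
    if accelerator.is_main_process:
        accelerator.init_trackers(project_name, config=config)
        return True
    return False


def main():
    # Imported here so the helper above can be exercised without accelerate installed.
    from accelerate import Accelerator

    # log_with="wandb" asks Accelerate to use its wandb tracker.
    accelerator = Accelerator(log_with="wandb")

    # Placeholder project name and hyperparameters (assumptions for illustration).
    init_wandb_tracking(accelerator, "my-project", {"learning_rate": 1e-4})

    # ... training loop would go here ...
    accelerator.log({"train_loss": 0.0}, step=0)  # placeholder metric

    # Lets the trackers finish and flush.
    accelerator.end_training()
```

To try it, call `main()` at the bottom of `train.py` and start it with `accelerate launch train.py ...` on a multi-GPU machine, then check how many runs show up in the wandb project.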