Call wandb.finish when the tracker is destructed if wandb is in use (#…
…191)

Runs always show as "crashed" in my wandb, despite finishing successfully.
"Crashed" indicates that wandb did not finish sending the "success"
signal to the server, so the server believes the client was terminated
unexpectedly. Furthermore, the wandb log is incomplete (the last lines are missing).

This PR adds a call to `wandb.finish` when the Tracker is destructed
(often when `trainer.fit` finishes) so that the completion signal is sent
to the server and a data sync is performed.

Without this change:
<img width="526" alt="image"
src="https://github.com/user-attachments/assets/869da24e-c5b8-415c-b15a-bb79c49f96ce"
/>

With this change:
<img width="548" alt="image"
src="https://github.com/user-attachments/assets/16f0a40d-ea3b-48ed-93a4-f40ee01cb7c6"
/>
TonyLianLong authored Feb 4, 2025
1 parent 4d420fe commit 483fa8a
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions verl/utils/tracking.py
@@ -57,6 +57,10 @@ def log(self, data, step, backend=None):
             if backend is None or default_backend in backend:
                 logger_instance.log(data=data, step=step)

+    def __del__(self):
+        if 'wandb' in self.logger:
+            self.logger['wandb'].finish(exit_code=0)
+

 class _MlflowLoggingAdapter:
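
The destructor pattern in this hunk can be tried in isolation with a stand-in backend. The following is a minimal sketch, not the verl implementation: `FakeWandbRun` and the simplified `Tracker` constructor are hypothetical names introduced only for illustration, and the `getattr` guard is an extra precaution that is not part of the commit.

```python
import gc


class FakeWandbRun:
    """Hypothetical stand-in for the object stored under logger['wandb']."""

    def __init__(self):
        self.finished = False
        self.exit_code = None

    def log(self, data, step):
        pass  # a real backend would upload metrics here

    def finish(self, exit_code=0):
        # Mirrors the finish(exit_code=0) call made in the diff.
        self.finished = True
        self.exit_code = exit_code


class Tracker:
    """Simplified sketch of a tracker that owns a dict of logging backends."""

    def __init__(self, backends):
        self.logger = dict(backends)

    def log(self, data, step, backend=None):
        for default_backend, logger_instance in self.logger.items():
            if backend is None or default_backend in backend:
                logger_instance.log(data=data, step=step)

    def __del__(self):
        # Assumption beyond the commit: guard against __init__ having
        # failed before self.logger was assigned.
        logger = getattr(self, "logger", {})
        if "wandb" in logger:
            logger["wandb"].finish(exit_code=0)


run = FakeWandbRun()
tracker = Tracker({"wandb": run})
tracker.log({"loss": 0.5}, step=1)
del tracker   # drops the last reference to the tracker
gc.collect()  # not required on CPython, harmless elsewhere
print(run.finished, run.exit_code)
```

Python does not guarantee that `__del__` runs for objects still alive when the interpreter exits, so a caller who needs a deterministic flush may prefer to call `finish` explicitly at the end of training as well.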
