How can I save the train_loss at each step? I want to plot it to observe how the model training converges. #10

SunriseEastSea · 2024-09-28T06:20:16Z

No description provided.

zhao-zilong · 2024-09-28T14:33:14Z

Here is how you can do that.

First, define a callback class:

from transformers import TrainerCallback

class LossLoggingCallback(TrainerCallback):
    # on_log is called by the Trainer every time it emits a log entry
    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs is not None:
            # Training log entries carry the loss under the "loss" key
            loss = logs.get("loss")
            if loss is not None:
                print(f"Step {state.global_step}: Loss: {loss}")
                # Append the loss to a file so it can be processed or plotted later
                with open("training_loss_log.txt", "a") as log_file:
                    log_file.write(f"Step {state.global_step}: Loss: {loss}\n")

Then add the callback to the trainer:

from transformers import TrainingArguments

training_args = TrainingArguments(
    self.experiment_dir,
    num_train_epochs=self.epochs,
    per_device_train_batch_size=self.batch_size,
    save_strategy="no",
    **self.train_hyperparameters
)

# Instantiate the custom callback
loss_logging_callback = LossLoggingCallback()

# Create the trainer with the callback
tabula_trainer = TabulaTrainer(
    self.model,
    training_args,
    train_dataset=tabula_ds,
    tokenizer=self.tokenizer,
    data_collator=TabulaDataCollator(self.tokenizer),
    callbacks=[loss_logging_callback]  # Add the callback here
)
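
One caveat on "each step": the callback only fires when the Trainer logs, and the default in TrainingArguments is logging_steps=500, so a value such as logging_steps=1 (for example via self.train_hyperparameters) is needed to get one entry per step. As an alternative to writing a file, the Trainer also keeps its log entries in tabula_trainer.state.log_history after training. The sketch below shows one way to pull the step/loss pairs out of such a list; the helper name extract_step_losses and the example entries are made up for illustration.

```python
def extract_step_losses(log_history):
    """Return (steps, losses) from a Trainer-style list of log dicts.

    Only entries that contain both "loss" and "step" are kept, so the
    final summary entry (which has "train_loss" instead) and any
    evaluation entries (which have "eval_loss") are skipped.
    """
    steps, losses = [], []
    for entry in log_history:
        if "loss" in entry and "step" in entry:
            steps.append(entry["step"])
            losses.append(entry["loss"])
    return steps, losses


# Hand-written entries shaped like the ones the Trainer records;
# after a real run you would pass tabula_trainer.state.log_history instead.
example_history = [
    {"loss": 2.31, "learning_rate": 5e-05, "epoch": 0.1, "step": 1},
    {"loss": 1.87, "learning_rate": 4e-05, "epoch": 0.2, "step": 2},
    {"train_runtime": 12.3, "train_loss": 2.09, "epoch": 0.2, "step": 2},
]

steps, losses = extract_step_losses(example_history)
print(steps, losses)
```

This avoids parsing text, but the history only lives in memory (and in saved checkpoints), so the file-based callback is still the safer choice with save_strategy="no" if the process might be interrupted.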

You can implement this, and if it is bug-free, feel free to open a PR and I will merge it.

SunriseEastSea · 2024-09-29T12:27:08Z

Thank you very much. Your solution successfully saved the training losses to the file.
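
For the original goal of drawing the convergence curve, the saved file can be read back and plotted. This is a minimal sketch, assuming the "Step N: Loss: X" line format written by the callback above and that matplotlib is installed; the function names parse_loss_lines and plot_loss_curve and the output file name training_loss.png are illustrative choices, not part of the repository.

```python
import re

# Matches the lines written by LossLoggingCallback, e.g. "Step 10: Loss: 2.5"
LINE_PATTERN = re.compile(r"Step (\d+): Loss: (\S+)")


def parse_loss_lines(lines):
    """Parse 'Step N: Loss: X' lines into parallel lists of steps and losses."""
    steps, losses = [], []
    for line in lines:
        match = LINE_PATTERN.fullmatch(line.strip())
        if match is None:
            continue  # skip blank or unrelated lines
        try:
            loss = float(match.group(2))
        except ValueError:
            continue  # skip lines whose loss value is not a number
        steps.append(int(match.group(1)))
        losses.append(loss)
    return steps, losses


def plot_loss_curve(steps, losses, out_path="training_loss.png"):
    """Save a simple loss-versus-step line plot to out_path."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(steps, losses)
    ax.set_xlabel("Global step")
    ax.set_ylabel("Training loss")
    ax.set_title("Training loss per logged step")
    fig.savefig(out_path)
    plt.close(fig)


if __name__ == "__main__":
    with open("training_loss_log.txt") as log_file:
        steps, losses = parse_loss_lines(log_file)
    plot_loss_curve(steps, losses)
```

Because the callback opens training_loss_log.txt in append mode, losses from earlier runs stay in the file; deleting or renaming it before a new run keeps the curve limited to a single training session.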
