Replies: 2 comments 1 reply
-
@CoCoNuTeK let's use discussions for such open-ended questions. You can check the implementation of the T5 model in transformers and modify it as needed.
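One common way to "modify as needed" is to subclass the model and override only the loss computation. The sketch below is purely illustrative: `BaseSeq2SeqModel` is a hypothetical stand-in for something like transformers' `T5ForConditionalGeneration`, and the loss formulas are dummies that just show where a custom loss would plug in.

```python
# Hypothetical sketch: subclass the model and override how the loss is
# computed, keeping the parent's forward pass for everything else.
# BaseSeq2SeqModel stands in for a real seq2seq model class.

class BaseSeq2SeqModel:
    def forward(self, input_ids, labels):
        # Stand-in for the real forward pass: pretend these are logits
        # of shape [seq_len, vocab_size].
        logits = [[0.1, 0.9], [0.8, 0.2]]
        loss = self.compute_loss(logits, labels)
        return {"loss": loss, "logits": logits}

    def compute_loss(self, logits, labels):
        # Default placeholder loss.
        return sum(1.0 - row[y] for row, y in zip(logits, labels))

class CustomLossModel(BaseSeq2SeqModel):
    def compute_loss(self, logits, labels):
        # Replace with any loss you like, e.g. one computed on
        # decoded/predicted values instead of raw logits.
        return sum(abs(row[y] - 1.0) for row, y in zip(logits, labels))

out = CustomLossModel().forward(input_ids=None, labels=[1, 0])
print(round(out["loss"], 6))  # 0.3
```

The same pattern works with the real T5 class: override `forward` (or factor the loss into a helper), call `super()` for the heavy lifting, and swap in your own loss term.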
-
The output from the model is a logits tensor of shape [batch_size, pred_len, 4096]. To get the predicted token I can just commit to the highest logit and get shape [batch_size, pred_len], but then I am still stuck with token values. The tokenizer I used here creates the tokens from float values; does it have a backwards operation? I don't see how that would work, since each token encodes a range of values, not just one value.
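A minimal sketch of what that "backwards operation" typically looks like, assuming the tokenizer quantizes floats into uniform bins (all names, bin ranges, and the binning scheme here are assumptions, not taken from any specific library). You cannot recover the exact float, but mapping each token id to the center of its bin is the usual approximation:

```python
# Hypothetical sketch: assumes floats were quantized into n_bins
# uniform bins over [low, high). The inverse maps a token id to the
# center of its bin, which is an approximation, not the exact value.

def detokenize(token_ids, n_bins, low, high):
    """Map token ids back to approximate float values (bin centers)."""
    width = (high - low) / n_bins
    return [low + (t + 0.5) * width for t in token_ids]

def argmax(row):
    """Index of the largest logit in one vocabulary-sized row."""
    return max(range(len(row)), key=lambda i: row[i])

# Toy example: logits for 2 timesteps over a tiny 4-token vocabulary
# (in the real case the last dimension would be 4096).
logits = [
    [0.1, 2.0, -1.0, 0.3],   # argmax -> token 1
    [0.0, -0.5, 3.0, 0.2],   # argmax -> token 2
]
tokens = [argmax(row) for row in logits]
values = detokenize(tokens, n_bins=4, low=0.0, high=4.0)
print(tokens)   # [1, 2]
print(values)   # [1.5, 2.5]
```

So the information lost to quantization stays lost, but bin centers give usable predicted values.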
-
Hello there,
I would like to ask about the loss function. I want to create my own loss function, let's say a moving-average MASE loss, for the model. Everything is in place except that the model outputs contain loss, logits, and other fields, but no predicted values directly.
Is there a way to use the tokenizer that created the input_ids, labels, and attention_mask to turn the logits back into predicted values, i.e. the reverse operation?
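If the loss has to stay differentiable, a hard argmax decode won't work inside training. One common workaround (a sketch under the same uniform-binning assumption as above; function names are illustrative) is to decode a probability-weighted "expected value": softmax over the vocabulary, then average the bin centers under that distribution. That quantity is a smooth function of the logits, so any value-space loss such as MASE can be computed on it:

```python
import math

# Hypothetical sketch: decode a differentiable predicted value from
# token logits by taking the softmax over the vocabulary and averaging
# the bin-center values under that distribution.

def softmax(logits):
    m = max(logits)                      # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def expected_value(logits, bin_centers):
    """Probability-weighted average of bin centers (soft decode)."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, bin_centers))

# Toy vocabulary of 4 bins over [0, 4): centers 0.5, 1.5, 2.5, 3.5.
centers = [0.5, 1.5, 2.5, 3.5]
pred = expected_value([0.0, 5.0, 0.0, 0.0], centers)
print(round(pred, 3))  # close to 1.5, since token 1 dominates
```

In a real training loop you would do the same with tensor ops (softmax plus a matmul against the bin-center vector) so gradients flow from the loss back into the logits.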