Unit 5, Exercise 2 #73
-
Hi @rasbt, thank you for putting together the tutorial! Lightning noob here, sorry if this is obvious. I tried changing the
I was surprised to see that that was not the case. IIUC, it should just be a forward pass, so either the data or the weights must be different. I am assuming the weights should not differ, since they aren't updated during a validation step, and I can't see why the data would differ since I have shuffling turned off.
Would appreciate your guidance on what I might be missing, or whether this behavior is what you would expect. The metrics .csv produced by logging is also attached.
-
Hi there! This is actually a great observation. 100% agree with you that you'd expect 0%. The reason is a tiny implementation detail here. When you compute the classification accuracy on the training dataset during training, it is computed on each batch, and the model is then updated after each batch. What I mean is roughly the following:
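Here is a minimal sketch of the idea, assuming a standard LightningModule with torchmetrics (the class and variable names are just illustrative, not the exact exercise code):

```python
import torch
import torch.nn.functional as F
import lightning as L
from torchmetrics.classification import MulticlassAccuracy


class LitClassifier(L.LightningModule):
    def __init__(self, model, num_classes):
        super().__init__()
        self.model = model
        self.train_acc = MulticlassAccuracy(num_classes=num_classes)
        self.val_acc = MulticlassAccuracy(num_classes=num_classes)

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self.model(x)
        loss = F.cross_entropy(logits, y)
        # Accuracy of the *current* weights on the *current* batch.
        # Right after this step the optimizer updates the weights, so the
        # next batch is scored by a slightly different model. The epoch-level
        # train accuracy therefore aggregates predictions made while the
        # weights were still changing.
        self.train_acc(logits, y)
        self.log("train_acc", self.train_acc, on_step=False, on_epoch=True)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self.model(x)
        # During validation the weights stay fixed for the whole loop,
        # so every batch is scored by the same (end-of-epoch) model.
        self.val_acc(logits, y)
        self.log("val_acc", self.val_acc, on_step=False, on_epoch=True)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.05)
```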
This is a workaround, because otherwise you would need an extra full pass over the training set at the end of each epoch, which would be too expensive if the training set is very large. During validation, however, you iterate through the dataset without updating the model. So that's likely where the discrepancy comes from. I hope that makes sense, and please let me know if it's still confusing. It's a bit difficult to describe via text 😅
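For instance (made-up numbers): if the per-batch training accuracies over an epoch are 60%, 70%, and 80% because the model keeps improving, the logged training accuracy for that epoch ends up around 70%, while evaluating the final weights on the same data during validation could give something closer to 80%.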