Add IJEPA task #25

Merged
fcogidi merged 35 commits into main from ijepa_training on Dec 5, 2024

Conversation

@vahid0001 (Collaborator) commented on Oct 9, 2024

PR Type

[Feature]

Short Description

Add first version of IJEPA pretraining task

Tests Added

N/A


@fcogidi (Collaborator) left a comment

Reviewable status: 0 of 1 files reviewed, 14 unresolved discussions (waiting on @vahid0001)


mmlearn/tasks/ijepa_pretraining.py line 19 at r1 (raw file):

class IJEPAPretraining(L.LightningModule):

Suggestion:

class IJEPA(L.LightningModule):

mmlearn/tasks/ijepa_pretraining.py line 38 at r1 (raw file):

    pred_depth : int
        Depth of the predictor.
    optimizer : Optional[Any], optional

Suggestion:

optimizer : Optional[torch.optim.Optimizer], optional

mmlearn/tasks/ijepa_pretraining.py line 39 at r1 (raw file):

        Depth of the predictor.
    optimizer : Optional[Any], optional
        Optimizer configuration, by default None.

What you will get here is not a config, but an initialized optimizer.


mmlearn/tasks/ijepa_pretraining.py line 41 at r1 (raw file):

        Optimizer configuration, by default None.
    lr_scheduler : Optional[Any], optional
        Learning rate scheduler configuration, by default None.

Same as the optimizer. This will be an instantiated learning rate scheduler, if one is provided.
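
Combining both points, the two docstring entries could read something like this (just a sketch of the wording; the exact scheduler type annotation may differ in the codebase):

optimizer : Optional[torch.optim.Optimizer], optional
    The instantiated optimizer to use for training, by default None.
lr_scheduler : Optional[torch.optim.lr_scheduler.LRScheduler], optional
    The instantiated learning rate scheduler, if one is provided, by default None.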


mmlearn/tasks/ijepa_pretraining.py line 91 at r1 (raw file):

        self.total_steps = None

        self.encoder = VisionTransformer.__dict__[model_name](

Like we discussed, I think this should be passed in already initialized. Same with the predictor.
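
Roughly what I have in mind (a sketch only, not the final signature; the names are placeholders):

import copy

import lightning as L
import torch.nn as nn


class IJEPA(L.LightningModule):
    def __init__(self, encoder: nn.Module, predictor: nn.Module) -> None:
        super().__init__()
        # Hydra (or the caller) instantiates these; the task only wires them together.
        self.encoder = encoder
        self.predictor = predictor
        # The target encoder starts as a frozen copy of the context encoder.
        self.target_encoder = copy.deepcopy(encoder)
        for p in self.target_encoder.parameters():
            p.requires_grad = False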


mmlearn/tasks/ijepa_pretraining.py line 106 at r1 (raw file):

        if checkpoint_path != "":
            self.encoder, self.predictor, self.target_encoder, _, _, _ = (

Also, as we discussed, I think each module should handle loading the pretrained checkpoints individually. An example use case for why is being able to use an iJEPA-pretrained encoder (the original one) for contrastive pretraining.
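
As a sketch of the idea (not the actual mmlearn API; the checkpoint key names are assumptions based on the original I-JEPA checkpoints and may differ):

import torch
import torch.nn as nn


def load_module_checkpoint(module: nn.Module, path: str, key: str = "encoder") -> nn.Module:
    """Load only this module's weights from a full pretraining checkpoint."""
    ckpt = torch.load(path, map_location="cpu")
    state_dict = ckpt.get(key, ckpt)  # fall back to a flat state dict
    # Strip a possible "module." prefix left over from DDP training.
    state_dict = {k.removeprefix("module."): v for k, v in state_dict.items()}
    module.load_state_dict(state_dict, strict=False)
    return module

Then the encoder could load its weights with key="encoder" and the predictor with key="predictor", and the encoder can be reused on its own for contrastive pretraining.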


mmlearn/tasks/ijepa_pretraining.py line 171 at r1 (raw file):

        return encoder, predictor, target_encoder, opt, scaler, epoch

    def forward(

This forward pass is currently not being used by this module. Everything is in training_step right now. Remember that the LightningModule is also an nn.Module.
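
As a rough sketch of what that could look like (placeholder logic, not the final implementation; the smooth L1 loss follows the original I-JEPA reference implementation):

def forward(self, batch: Dict[str, Any]) -> torch.Tensor:
    """Predict target-encoder features from the context encoder's output."""
    context_features = self.encoder(batch)
    return self.predictor(context_features)

def training_step(self, batch: Dict[str, Any], batch_idx: int) -> torch.Tensor:
    """Perform a single training step."""
    # A LightningModule is also an nn.Module, so self(batch) routes through forward.
    predictions = self(batch)
    with torch.no_grad():
        targets = self.target_encoder(batch)
    return torch.nn.functional.smooth_l1_loss(predictions, targets)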


mmlearn/tasks/ijepa_pretraining.py line 177 at r1 (raw file):

    ) -> torch.Tensor:
        """Forward pass through the encoder."""
        return self.encode(x, masks)

The encode method doesn't exist. Please make sure to run the code to verify that things work properly.
Also, please take a look at the other encoders in the library: the current convention is to pass the entire batch dictionary to the encoder.


mmlearn/tasks/ijepa_pretraining.py line 181 at r1 (raw file):

    def training_step(self, batch: Dict[str, Any], batch_idx: int) -> torch.Tensor:
        """Perform a single training step."""
        images = batch[Modalities.RGB]

The format for this has changed. See this PR.

Suggestion:

images = batch[Modalities.RGB.name]

mmlearn/tasks/ijepa_pretraining.py line 191 at r1 (raw file):

        # Move images and masks to device
        images = images.to(self.device)

You don't need to do this for anything inside the batch dictionary. Lightning will handle that.


mmlearn/tasks/ijepa_pretraining.py line 216 at r1 (raw file):

            "train/loss",
            loss,
            on_step=True,

Why log both on_step and on_epoch? During training, Lightning's default is to log on_step but not on_epoch (for validation and testing, the default is to log on_epoch but not on_step).
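
In other words, inside training_step this would already log per step (sketch):

self.log("train/loss", loss)  # defaults in training_step: on_step=True, on_epoch=False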


mmlearn/tasks/ijepa_pretraining.py line 227 at r1 (raw file):

        return loss

    def _update_target_encoder(self) -> None:

Please look into reusing the existing EMA module.
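
For reference, the update itself is just the usual parameter-wise interpolation, which is what the existing EMA module wraps (generic sketch, not mmlearn's API):

import torch
import torch.nn as nn


@torch.no_grad()
def ema_update(target: nn.Module, online: nn.Module, decay: float) -> None:
    """In-place update: target = decay * target + (1 - decay) * online."""
    for t_param, o_param in zip(target.parameters(), online.parameters()):
        t_param.mul_(decay).add_(o_param.detach(), alpha=1.0 - decay)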


mmlearn/tasks/ijepa_pretraining.py line 247 at r1 (raw file):

                "params": (p for n, p in self.encoder.named_parameters()
                           if ("bias" not in n) and (len(p.shape) != 1)),
                "weight_decay": self.optimizer_cfg.get("weight_decay", 0.0)

Like I mentioned earlier, you will be getting an instantiated Optimizer object, not the config. Please take a look at the contrastive pretraining task for how to get the weight decay value from the instantiated Optimizer object.
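
For example (a sketch assuming the instance is stored as self.optimizer and is a plain torch.optim.Optimizer):

weight_decay = self.optimizer.defaults.get("weight_decay", 0.0)
# or, per parameter group:
weight_decay = self.optimizer.param_groups[0].get("weight_decay", 0.0)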


mmlearn/tasks/ijepa_pretraining.py line 309 at r1 (raw file):

        return self._shared_eval_step(batch, batch_idx, "test")

    def _shared_eval_step(

Notice that most of the code in this method is repeated in training_step. You can define it once and call it multiple times for training and eval.
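
Something along these lines (a sketch; _compute_loss is a hypothetical helper holding the logic that currently lives in training_step):

def _shared_step(self, batch: Dict[str, Any], batch_idx: int, step_type: str) -> torch.Tensor:
    loss = self._compute_loss(batch)  # hypothetical helper with the shared loss computation
    self.log(f"{step_type}/loss", loss)
    if step_type == "train":
        self._update_target_encoder()
    return loss

def training_step(self, batch: Dict[str, Any], batch_idx: int) -> torch.Tensor:
    return self._shared_step(batch, batch_idx, "train")

def validation_step(self, batch: Dict[str, Any], batch_idx: int) -> torch.Tensor:
    return self._shared_step(batch, batch_idx, "val")

def test_step(self, batch: Dict[str, Any], batch_idx: int) -> torch.Tensor:
    return self._shared_step(batch, batch_idx, "test")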

@fcogidi (Collaborator) left a comment

Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @vahid0001)


mmlearn/modules/encoders/vision.py line 313 at r2 (raw file):

                nn.init.constant_(m.bias, 0)

    def load_checkpoint(

How's this intended to be used?

It looks like both the encoder and predictor need to be instantiated first and then passed to this method. When I first made the suggestion to move the checkpoint-loading functionality to the encoder, I imagined the VisionTransformer having its own checkpoint-loading logic (just for the encoder) and the predictor having its own (loading the target encoder may not be necessary, I think, if one knows the exact EMA value it stopped at).


mmlearn/tasks/ijepa_pretraining.py line 79 at r2 (raw file):

        self.predictor = predictor

        self.ema = ExponentialMovingAverage(encoder, ema_decay, ema_decay_end, 1000)

Should the ema_anneal_end_step, currently fixed at 1000, be a user-defined value?

Also, device_id might be important to set, especially in a distributed setting. It can be set to self.device.


mmlearn/tasks/ijepa_pretraining.py line 138 at r2 (raw file):

        )

        if is_training:

Is the is_training flag necessary? Why not just check if step_type == "train"?


mmlearn/tasks/ijepa_pretraining.py line 230 at r2 (raw file):

    def on_validation_epoch_start(self) -> None:
        """Prepare for the validation epoch."""
        self._on_eval_epoch_start("val")

The self._on_eval_epoch_{start/end} methods are not defined.

@fcogidi (Collaborator) left a comment

Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @vahid0001)


mmlearn/modules/encoders/vision.py line 313 at r2 (raw file):

Previously, fcogidi (Franklin) wrote…

How's this intended to be used?

It looks like both the encoder and predictor need to be instantiated first and then passed to this method. When I first made the suggestion to move the checkpoint-loading functionality to the encoder, I imagined the VisionTransformer having its own checkpoint-loading logic (just for the encoder) and the predictor having its own (loading the target encoder may not be necessary, I think, if one knows the exact EMA value it stopped at).

Speaking of, please check this out.

@fcogidi changed the title from "add IJEPA task" to "Add IJEPA task" on Dec 5, 2024
@fcogidi merged commit c6b07e0 into main on Dec 5, 2024 (4 checks passed)
@fcogidi deleted the ijepa_training branch on December 5, 2024 at 16:59