[NeurIPS2024] Code for the Asynchronous Perception Machines: Inspired by Geoffrey Hinton's GLOM

rajatmodi62/apm

[NeurIPS'24] Asynchronous Perception Machine For Efficient Test Time Training

Paper | Openreview | Blog | Project Page

This repository contains the code for the paper Asynchronous Perception Machine For Efficient Test Time Training by Rajat Modi and Yogesh Singh Rawat.

Our proposed Asynchronous Perception Machine (APM) represents a new way to do machine perception: asynchronous perception. The network processes patches of an image one at a time, in any order, while still encoding semantic awareness. This moves us towards architectures that consume fewer FLOPs, occupy less on-device memory, and predict almost the same features a transformer predicts. It also lets us achieve strong performance on test-time-training benchmarks.
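As a rough illustration of the idea (a hypothetical sketch, not the actual APM architecture; all names and dimensions here are made up): a shared MLP can be queried with one positional code at a time, in any order, and produce a feature for that location independently of the other patches.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of asynchronous perception: a shared MLP is queried
# with one positional code at a time, in any order, and predicts the
# feature at that location. Not the repo's actual implementation.
class LocationMLP(nn.Module):
    def __init__(self, pos_dim=64, hidden=256, feat_dim=384):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pos_dim, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, feat_dim),
        )

    def forward(self, pos_code):       # pos_code: (pos_dim,)
        return self.net(pos_code)      # feature at that location

model = LocationMLP()
positions = torch.randn(16, 64)        # positional codes for a 4x4 patch grid
order = torch.randperm(16)             # any visiting order works
feats = torch.stack([model(positions[i]) for i in order])
print(feats.shape)                     # torch.Size([16, 384])
```

Because each location is queried independently, only one query needs to be resident in memory at a time, which is where the FLOP and memory savings come from.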

This is the official public release of our model and COCO checkpoints, and we urge people across the world to try more of GLOM's ideas. We will add more code here as we make progress.

Setup

  • Install conda

  • Install PyTorch. We used version 1.13.0 with an A6000 GPU on Ubuntu 22.04. However, our codebase is pretty simple and has minimal dependencies, so it should remain robust to future library changes.

  • Run the download script to download the checkpoints and the COCO dataset (validation set):

bash download.sh

Run Experiments

  • Visualize semantic clusterings on the COCO val set. Note that the model was trained on the COCO train set.
python visualize_coco.py
  • Visualize islands of agreement on any image in the wild.
python predict_test_image.py
  • Interpolate between any two images in the wild. Similar results were shown for GANs and diffusion models; we can now do such interpolation in an MLP.
python interpolate.py
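The core operation can be sketched as follows (a minimal illustration, not the repo's interpolate.py; the encode/decode networks and the 384-dim latent are assumptions): encode both images to feature vectors, blend linearly, and decode each blend.

```python
import torch

# Minimal sketch of latent interpolation between two images. z1 and z2
# stand in for encoded image features (shapes are illustrative); in the
# real pipeline each blended latent would then be decoded back to pixels.
def interpolate_latents(z1, z2, steps=5):
    alphas = torch.linspace(0.0, 1.0, steps)
    return torch.stack([(1 - a) * z1 + a * z2 for a in alphas])

z1, z2 = torch.randn(384), torch.randn(384)
path = interpolate_latents(z1, z2)
print(path.shape)              # torch.Size([5, 384])
```

The endpoints of the path are exactly the two input latents, so the interpolation sweeps smoothly from one image's representation to the other's.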
  • One-sample learning, which is used in test-time training. This illustrates the ability of the APM to learn from a single CLS token distilled from a teacher, e.g., CLIP.

In practice, we observed that a teacher with more parameters leads to higher performance.

cd single_token_segmentation
python train_tta.py

Please follow the installation setup of the original CLIP repo to run this particular part of the code. You can find those installation instructions here.
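The one-sample learning step can be sketched roughly as follows (a hedged illustration, not the repo's train_tta.py; the student architecture, query vector, dimensions, and hyperparameters are all hypothetical): fit a small student for a few steps so its predicted token matches the single CLS token distilled from the teacher.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of one-sample test-time training: overfit a small
# student on one distilled teacher CLS token (e.g. from CLIP). Names,
# shapes, and hyperparameters are illustrative only.
teacher_cls = torch.randn(512)                 # one distilled CLS token
student = nn.Sequential(
    nn.Linear(512, 512), nn.GELU(), nn.Linear(512, 512),
)
query = torch.randn(512)                       # fixed query fed to the student
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

with torch.no_grad():                          # loss before any adaptation
    init_loss = nn.functional.mse_loss(student(query), teacher_cls).item()

for step in range(200):                        # a few steps on one sample
    opt.zero_grad()
    loss = nn.functional.mse_loss(student(query), teacher_cls)
    loss.backward()
    opt.step()

print(f"loss: {init_loss:.4f} -> {loss.item():.4f}")
```

Since there is only a single sample, the student can adapt in very few optimization steps, which is what makes this practical at test time.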

  • Computational Analysis
cd flop_analysis
python count_flops.py
python count_memory.py
python count_parameters.py

This should reproduce the numbers in the computational analysis table, i.e. Table 4 in the APM paper.
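For reference, parameter counting in PyTorch typically boils down to the following (a generic sketch on a toy model, not the repo's count_parameters.py; the toy layer sizes are arbitrary):

```python
import torch.nn as nn

# Generic trainable-parameter count; the model below is a toy stand-in.
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

toy = nn.Sequential(nn.Linear(384, 256), nn.GELU(), nn.Linear(256, 384))
# (384*256 + 256) + (256*384 + 384) = 197248
print(count_parameters(toy))   # 197248
```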

  • Scaling-up experiments on the COCO dataset
cd misc_scripts
python resize_coco_images.py
python 1_extract_coco_features.py
python train.py

Here we share the training code for the COCO dataset. We first dump features from the DINOv2 backbone on the COCO train set. You may need to download the COCO train set and save it in the data/ directory for training. Alternatively, you can finetune from the checkpoints we have shared.
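The feature-dumping step can be sketched as follows (a hedged illustration, not the repo's 1_extract_coco_features.py: the backbone here is a stand-in patch encoder rather than DINOv2, and shapes are illustrative): run a frozen backbone over each image and collect per-patch features.

```python
import torch
import torch.nn as nn

# Hedged sketch of feature dumping with a frozen backbone. The Conv2d
# below is a stand-in patch encoder (the repo uses DINOv2); in practice
# each feature tensor would be saved to disk for later APM training.
backbone = nn.Conv2d(3, 384, kernel_size=14, stride=14)
backbone.eval()

@torch.no_grad()
def dump_features(images):
    feats = []
    for img in images:                           # one image at a time
        f = backbone(img.unsqueeze(0))           # (1, 384, H/14, W/14)
        feats.append(f.flatten(2).squeeze(0).T)  # (num_patches, 384)
    return feats

batch = [torch.randn(3, 224, 224) for _ in range(2)]
feats = dump_features(batch)
print(feats[0].shape)   # torch.Size([256, 384]), i.e. a 16x16 patch grid
```

Training then fits the APM to regress these cached features, so the expensive backbone only runs once per image.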

Islands of Agreement

We illustrate that the idea of islands of agreement from the GLOM paper actually works. The video below is shared with permission from Geoffrey Hinton.

Hinton's Islands of agreement

To plot similar islands for any image in the wild, please follow the steps here.

Citation

When using this code, please cite our paper:

@article{modi2024apm,
  title={Asynchronous Perception Machine For Efficient Test-Time-Training},
  author={Modi, Rajat and Rawat, Yogesh},
  journal={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2024}
}

Contact

For questions and suggestions, feel free to open an issue on GitHub or send an email to [email protected]. I will get to it as soon as possible.

Acknowledgements

This achievement reflects the collective effort of many brilliant minds, and we are deeply grateful for their contributions.
