PyTorch implementation and pretrained models for LTRP. For details, see Learning to Rank Patches for Unbiased Image Redundancy Reduction.
Please install PyTorch and download the ImageNet dataset. This codebase was developed with Python 3.8, PyTorch 1.12.1, CUDA 11.3, and torchvision 0.13.1. The exact arguments used to reproduce the models presented in our paper can be found in the args column of the pretrained models section.
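One way to pin the versions above (an assumption on our part — the repository does not ship a requirements file, and the cu113 builds must be installed from PyTorch's CUDA 11.3 wheel index) is a requirements fragment like:

```
# Versions this codebase was developed with; install the +cu113 builds
# via PyTorch's extra wheel index for CUDA 11.3 support.
torch==1.12.1
torchvision==0.13.1
```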
To train LTRP with a ViT-small scoring model on a single node with 8 GPUs for 400 epochs, run the following command. We provide the training logs for this run to aid reproducibility.
python -m torch.distributed.launch --nproc_per_node=8 ltrp/main_ltrp.py \
--data_path yourpath/imagenet_2012 \
--batch_size 512 \
--model ltrp_base_and_vs \
--mask_ratio 0.9 \
--epochs 400 \
--resume_from_mae yourpath/mae_visualize_vit_base.pth \
--ltr_loss list_mleEx \
--list_mle_k 20 \
--asymmetric
To fine-tune for multi-label classification on COCO 2017 (80 classes) on a single node with 4 GPUs, run:
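As a sanity check on how the hyperparameters above relate, the following sketch assumes the standard ViT patching of 224x224 inputs into 16x16 patches (an assumption; the README does not state the input resolution):

```python
# Assumed ViT setup: 224x224 input, 16x16 patches (not stated in this README).
img_size, patch_size = 224, 16
num_patches = (img_size // patch_size) ** 2   # 14 * 14 = 196 patches

mask_ratio = 0.9                              # matches --mask_ratio 0.9
num_masked = int(num_patches * mask_ratio)    # 176 patches masked out
num_visible = num_patches - num_masked        # 20 patches remain visible

print(num_patches, num_masked, num_visible)   # 196 176 20
```

Under these assumptions, the 20 visible patches line up with `--list_mle_k 20`, i.e. the ListMLE loss is computed over the visible patches.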
python -m torch.distributed.launch --nproc_per_node=4 ltrp/main_ml.py \
--finetune_ltrp yourpath/pretrained_ckpt.pth \
--finetune yourpath/mae_pretrain_vit_base.pth \
--data_path yourpath/coco2017/ \
--batch_size 256 \
--decoder_embedding 768 \
--epochs 100 \
--dist_eval \
--score_net ltrp_cluster \
--keep_nums 147 \
--nb_classes 80 \
--ltrp_cluster_ratio 0.7
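The `--keep_nums 147` setting can likewise be read as a keep ratio, again assuming the standard 196-patch ViT layout (an assumption; the repository may compute this differently):

```python
# Assumed: 224x224 input with 16x16 patches -> 196 patches total.
num_patches = (224 // 16) ** 2       # 196
keep_nums = 147                      # matches --keep_nums 147
keep_ratio = keep_nums / num_patches # fraction of patches the score net keeps

print(keep_ratio)                    # 0.75
```

That is, the scoring network retains 75% of the patches for the downstream task; `--ltrp_cluster_ratio 0.7` is a separate knob of the `ltrp_cluster` score net.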