Vision benchmark

zhezhaoa edited this page Dec 11, 2022 · 3 revisions

This page summarizes our solutions for vision benchmarks. The pre-trained models used below can be downloaded from here.

CIFAR10

Fine-tuning and inference on the CIFAR10 dataset with ViT-base-patch16-224-in21k:

python3 finetune/run_image_classifier.py --pretrained_model_path models/vit_base_patch16_224_model.bin \
                                         --tokenizer virtual \
                                         --config_path models/vit/base-16-224_config.json \
                                         --train_path datasets/cifar10/train.tsv \
                                         --dev_path datasets/cifar10/test.tsv \
                                         --output_model_path models/image_classifier_model.bin \
                                         --epochs_num 3 --batch_size 64

python3 inference/run_image_classifier_infer.py --load_model_path models/image_classifier_model.bin \
                                                --tokenizer virtual \
                                                --config_path models/vit/base-16-224_config.json \
                                                --test_path datasets/cifar10/test.tsv \
                                                --prediction_path datasets/cifar10/prediction.tsv \
                                                --labels_num 10

Fine-tuning and inference on the CIFAR10 dataset with ViT-large-patch16-224-in21k:

python3 finetune/run_image_classifier.py --pretrained_model_path models/vit_large_patch16_224_model.bin \
                                         --tokenizer virtual \
                                         --config_path models/vit/large-16-224_config.json \
                                         --train_path datasets/cifar10/train.tsv \
                                         --dev_path datasets/cifar10/test.tsv \
                                         --output_model_path models/image_classifier_model.bin \
                                         --epochs_num 3 --batch_size 64

python3 inference/run_image_classifier_infer.py --load_model_path models/image_classifier_model.bin \
                                                --tokenizer virtual \
                                                --config_path models/vit/large-16-224_config.json \
                                                --test_path datasets/cifar10/test.tsv \
                                                --prediction_path datasets/cifar10/prediction.tsv \
                                                --labels_num 10

CIFAR100

Fine-tuning and inference on the CIFAR100 dataset with ViT-base-patch16-224-in21k:

python3 finetune/run_image_classifier.py --pretrained_model_path models/vit_base_patch16_224_model.bin \
                                         --tokenizer virtual \
                                         --config_path models/vit/base-16-224_config.json \
                                         --train_path datasets/cifar100/train.tsv \
                                         --dev_path datasets/cifar100/test.tsv \
                                         --output_model_path models/image_classifier_model.bin \
                                         --epochs_num 3 --batch_size 64

python3 inference/run_image_classifier_infer.py --load_model_path models/image_classifier_model.bin \
                                                --tokenizer virtual \
                                                --config_path models/vit/base-16-224_config.json \
                                                --test_path datasets/cifar100/test.tsv \
                                                --prediction_path datasets/cifar100/prediction.tsv \
                                                --labels_num 100

Fine-tuning and inference on the CIFAR100 dataset with ViT-large-patch16-224-in21k:

python3 finetune/run_image_classifier.py --pretrained_model_path models/vit_large_patch16_224_model.bin \
                                         --tokenizer virtual \
                                         --config_path models/vit/large-16-224_config.json \
                                         --train_path datasets/cifar100/train.tsv \
                                         --dev_path datasets/cifar100/test.tsv \
                                         --output_model_path models/image_classifier_model.bin \
                                         --epochs_num 3 --batch_size 64

python3 inference/run_image_classifier_infer.py --load_model_path models/image_classifier_model.bin \
                                                --tokenizer virtual \
                                                --config_path models/vit/large-16-224_config.json \
                                                --test_path datasets/cifar100/test.tsv \
                                                --prediction_path datasets/cifar100/prediction.tsv \
                                                --labels_num 100
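
The inference commands above only write a prediction file. To check it against the ground truth, a small script along the following lines may help. This is a minimal sketch, not part of the repository: the helper names `read_labels` and `accuracy` are hypothetical, and it assumes that both the test file and the prediction file are tab-separated with a header row containing a `label` column, and that rows appear in the same order. Adjust the column name if your files differ.

```python
import csv


def read_labels(path, column="label"):
    """Read one column from a tab-separated file that has a header row.

    Assumption: the file's first line is a header that includes `column`.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row[column].strip() for row in reader]


def accuracy(gold_path, pred_path, column="label"):
    """Return the fraction of rows whose predicted label equals the gold label.

    Assumption: both files list examples in the same order.
    """
    gold = read_labels(gold_path, column)
    pred = read_labels(pred_path, column)
    if len(gold) != len(pred):
        raise ValueError(
            f"row count mismatch: {len(gold)} gold vs {len(pred)} predicted"
        )
    if not gold:
        raise ValueError("no examples found")
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


# Example call (paths taken from the commands above):
# print(accuracy("datasets/cifar100/test.tsv", "datasets/cifar100/prediction.tsv"))
```

Labels are compared as strings, so the check does not depend on whether the class ids are stored as integers or names, as long as both files use the same encoding.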