
Swin/ViT training #16

Open
JohnMBrandt opened this issue Aug 15, 2024 · 1 comment
JohnMBrandt commented Aug 15, 2024

Hello -- really appreciate your work! I was able to train a ResNet-50 model perfectly well on my custom dataset using your config files, and confirmed that Stable DINO / R50 outperforms DINO / R50 on my COCO-like dataset.

However, when switching the backbone to Swin or ViT, Stable DINO does not train properly, while DINO does.

I have tried:

  • modifying the positional encoding temperature/offset to match the MMDetection values for transformer backbones (20/0) instead of (10000/-0.5)
  • confirming that backbone model weights load exactly as they do in MMDetection and have equivalent values
  • confirming that channelmapper / neck inputs and outputs are exactly as expected
  • confirming that batch size, weight decay, optimizer, learning rate, etc are all exactly the same
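For context, the temperature/offset pair in the first bullet enters the sine positional encoding roughly as follows. This is a minimal NumPy sketch of the standard DETR-style formula, not the actual detrex `PositionEmbeddingSine` or MMDetection `SinePositionalEncoding` code; the function name and the assumption that positions are pre-normalized to [0, 1] are mine:

```python
import numpy as np

def sine_pos_embed(pos, num_feats=128, temperature=10000.0, offset=-0.5,
                   scale=2 * np.pi):
    """Sketch of a DETR-style 1-D sine positional encoding.

    pos: positions already normalized to [0, 1], shape (N,).
    """
    pos = (pos + offset) * scale
    # Geometric frequency schedule controlled by `temperature`.
    dim_t = temperature ** (2 * (np.arange(num_feats) // 2) / num_feats)
    emb = pos[:, None] / dim_t[None, :]   # (N, num_feats)
    emb[:, 0::2] = np.sin(emb[:, 0::2])   # even channels: sine
    emb[:, 1::2] = np.cos(emb[:, 1::2])   # odd channels: cosine
    return emb

# ResNet-style defaults vs. the MMDetection transformer-backbone setting:
resnet_style = sine_pos_embed(np.linspace(0, 1, 4))  # temperature 10000, offset -0.5
vit_style = sine_pos_embed(np.linspace(0, 1, 4), temperature=20.0, offset=0.0)
```

A mismatch here between the backbone checkpoint's pretraining setup and the detection config is one of the classic silent accuracy killers, which is why it was worth ruling out first.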
JohnMBrandt commented Aug 17, 2024

mAP of Stable DINO implemented in detrex with a ViT-H backbone on my custom dataset, compared to DINO with the same backbone:

| Model | AP | AP50 | AP75 | APs | APm | APl |
|---|---|---|---|---|---|---|
| Stable DINO / ViT-H | 15.9893 | 24.5232 | 18.4653 | 10.0865 | 35.2932 | 51.1695 |
| DINO / ViT-H | 24.1312 | 49.2856 | 21.1866 | 18.3770 | 38.8828 | 57.5000 |

Not quite sure what's causing the huge difference. The only differences between the cfg files are:

  1. Changing `encoder=L(DINOTransformerEncoder)` to `encoder=L(StableDINOTransformerEncoder)`
  2. Adding `multi_level_fusion="dense-fusion"` to the encoder
  3. Using the Stable DINO criterion and matcher
  4. Adjusting the classification loss weight from 1 to 6, as in this repository
  5. Commenting out the aux weight, as is done in this repository
  6. Adding the additional cfg parameters for Stable DINO as in this repository:

```python
use_ce_loss_type="stable-dino",
ta_alpha=0.0,
ta_beta=2.0,
gdn_k=2,
neg_step_type="none",
no_img_padding=False,
dn_to_matching_block=False,
```
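Put together, that set of overrides would look roughly like this in a detrex LazyConfig. This is a sketch only: the attribute paths (`model.transformer.encoder`, `model.criterion`) and the commented import location follow the usual detrex project layout and may differ in your base config.

```python
from detectron2.config import LazyCall as L

# Illustrative import path -- check the stable_dino project for the real one:
# from projects.stable_dino.modeling import (
#     StableDINOTransformerEncoder, StableDINOCriterion,
# )

model.transformer.encoder = L(StableDINOTransformerEncoder)(  # change (1)
    ...,                                   # existing encoder arguments unchanged
    multi_level_fusion="dense-fusion",     # change (2)
)
model.criterion = L(StableDINOCriterion)(  # change (3), with matching matcher
    ...,
    # change (4): classification loss weight raised from 1 to 6
    # change (5): aux weight entries commented out
    # change (6): Stable DINO extras
    use_ce_loss_type="stable-dino",
    ta_alpha=0.0,
    ta_beta=2.0,
    gdn_k=2,
    neg_step_type="none",
    no_img_padding=False,
    dn_to_matching_block=False,
)
```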

EDIT: Closing the loop on this, I found that the issue was in the StableDINO matcher.
