Image RPE (iRPE) methods are relative position encoding methods designed for 2D images. They model directional relative distance and the interactions between queries and relative position embeddings in the self-attention mechanism. The iRPE methods are simple and lightweight, and plug easily into transformer blocks. Experiments show that the proposed encodings alone give DeiT and DETR stable improvements of up to 1.5% (top-1 accuracy) on ImageNet and 1.3% (mAP) on COCO over their original versions, without tuning any extra hyperparameters such as learning rate or weight decay. Our ablation and analysis also yield interesting findings, some of which run counter to previous understanding.
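The two ideas named above, directional relative distance and query-dependent position terms, can be illustrated with a minimal sketch in plain Python. This is not the code shipped in this repository: the function names `directional_bucket_ids` and `contextual_bias` are hypothetical, and simple clipping of offsets is an assumption made here for brevity in place of whatever distance-to-bucket mapping the actual implementation uses.

```python
def directional_bucket_ids(height, width, max_dist):
    """Assign a bucket id to every (query, key) pixel pair on a 2D grid.

    Signed row and column offsets are clipped to [-max_dist, max_dist]
    and combined into one id, so "left of" and "right of" fall into
    different buckets. Clipping is an assumption of this sketch.
    """
    side = 2 * max_dist + 1
    coords = [(r, c) for r in range(height) for c in range(width)]

    def clip(d):
        return max(-max_dist, min(max_dist, d))

    ids = [
        [(clip(kr - qr) + max_dist) * side + (clip(kc - qc) + max_dist)
         for (kr, kc) in coords]
        for (qr, qc) in coords
    ]
    return ids, side * side


def contextual_bias(queries, rel_table, bucket_ids):
    """Query-dependent bias: bias[i][j] = dot(queries[i], rel_table[bucket_ids[i][j]]).

    The result would be added to the usual query-key attention logits
    before the softmax.
    """
    return [
        [sum(q * r for q, r in zip(queries[i], rel_table[b])) for b in row]
        for i, row in enumerate(bucket_ids)
    ]


# Usage on a 2 x 3 grid with toy values (illustrative only).
ids, num_buckets = directional_bucket_ids(2, 3, max_dist=1)
rel_table = [[float(b), 1.0] for b in range(num_buckets)]  # one embedding per bucket
queries = [[1.0, 0.0] for _ in range(2 * 3)]               # one query per pixel
bias = contextual_bias(queries, rel_table, ids)
```

Because the offsets are signed, the pair (pixel 0, pixel 1) and the pair (pixel 1, pixel 0) receive different bucket ids, which is what makes the encoding directional. Because the bias is a dot product with the query, the same relative offset can contribute differently for different queries.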
forked from microsoft/Cream
Contains Modified DETR-with-iRPE for object detection on custom generated synthetic dataset.
RishiDarkDevil/ViT-RPE
Languages
- Python 93.3%
- C++ 3.4%
- Cuda 3.3%