- Image Recognition
- 2D Object Detection
- 3D Object Detection
- image retrieval
- metric learning, face recognition
- person/vehicle re-identification
- instance segmentation
- semantic segmentation
- tracking
- motion prediction for autonomous driving
- human object interaction detection
- pose estimation
- knowledge distillation
- domain adaptation
- action recognition
- depth estimation
Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation
- paper: https://arxiv.org/abs/2003.08866
- summary:
MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
- paper: https://arxiv.org/abs/1909.12978
- summary:
- code: https://github.com/taoyang1122/MutualNet
Hybrid Models for Open Set Recognition
- paper: https://arxiv.org/abs/2003.12506
- summary:
Gradient Centralization: A New Optimization Technique for Deep Neural Networks
- paper: https://arxiv.org/abs/2004.01461
- summary:
- code: https://github.com/Yonghongwei/Gradient-Centralization
Multi-task Learning Increases Adversarial Robustness
Rethinking Bottleneck Structure for Efficient Mobile Network Design
- paper: https://arxiv.org/abs/2007.02269
- summary:
Negative Margin Matters: Understanding Margin in Few-shot Classification
- paper: https://arxiv.org/abs/2003.12060
- summary:
- code: https://github.com/bl0/negative-margin.few-shot
Dynamic Group Convolution for Accelerating Convolutional Neural Networks
- paper: https://arxiv.org/abs/2007.04242
- summary:
- code: https://github.com/zhuogege1943/dgc
PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets
- paper: https://arxiv.org/abs/2007.09654
- summary:
- code: https://github.com/wutong16/DistributionBalancedLoss
URIE: Universal Image Enhancement for Visual Recognition in the Wild
- paper: https://arxiv.org/abs/2007.08979
- summary:
DADA: Differentiable Automatic Data Augmentation
- paper: https://arxiv.org/abs/2003.03780
- summary:
- code: https://github.com/VDIGPKU/DADA
AutoMix: Mixup Networks for Sample Interpolation via Cooperative Barycenter Learning
Momentum Batch Normalization for Deep Learning with Small Batch Size
Hard negative examples are hard, but useful
Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need?
- paper: https://arxiv.org/abs/2003.11539
- summary:
Topic-aware Multi-Label Classification
Resolution Switchable Networks for Runtime Efficient Image Classification
- paper: https://arxiv.org/abs/2007.09558
- summary:
- code: https://github.com/yikaiw/RS-Nets
Suppressing Mislabeled Data via Grouping and Self-Attention
Attentive Normalization
- paper: https://arxiv.org/abs/1908.01259
- summary:
L2 Norm: A Generic Visualization Approach for Convolutional Neural Networks
- paper: https://arxiv.org/abs/2007.09748
- summary:
- code: https://github.com/ahmdtaha/constrained_attention_filter
FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning
- paper: https://arxiv.org/abs/2007.08505
- summary:
Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches
- paper: https://arxiv.org/abs/2003.03836
- summary:
PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer
- paper: https://arxiv.org/abs/2007.06191
- summary:
- code: https://github.com/d-li14/PSConv
Faster AutoAugment: Learning Augmentation Strategies using Backpropagation
- paper: https://arxiv.org/abs/1911.06987
- summary:
GATCluster: Self-Supervised Gaussian-Attention Network for Image Clustering
- paper: https://arxiv.org/abs/2002.11863
- summary:
Volumetric Transformer Networks
- paper: https://arxiv.org/abs/2007.09433
- summary:
OnlineAugment: Online Data Augmentation with Less Domain Knowledge
- paper: https://arxiv.org/abs/2007.09271
- summary:
End-to-End Object Detection with Transformers
- paper: https://arxiv.org/abs/2005.12872
- summary:
- code: https://github.com/facebookresearch/detr
BorderDet: Border Feature for Dense Object Detection
- paper: https://arxiv.org/abs/2007.11056
- summary:
- code: https://github.com/Megvii-BaseDetection/BorderDet
Side-Aware Boundary Localization for More Precise Object Detection
- paper: https://arxiv.org/abs/1912.04260
- summary:
- code: https://github.com/open-mmlab/mmdetection
PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments
- paper: https://arxiv.org/abs/2007.09584
- summary:
AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling
- paper: https://arxiv.org/abs/2007.09336
- summary:
TIDE: A General Toolbox for Understanding Errors in Object Detection
- paper: https://arxiv.org/abs/2008.08115
- summary: phalanx-hk#5
- code: https://github.com/dbolya/tide
Corner Proposal Network for Anchor-free, Two-stage Object Detection
- paper: https://arxiv.org/abs/2007.13816
- summary: phalanx-hk#4
- code: https://github.com/Duankaiwen/CPNDet
Soft Anchor-Point Object Detection
- paper: https://arxiv.org/abs/1911.12448
- summary:
MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object Detection
Disentangled Non-local Neural Networks
- paper: https://arxiv.org/abs/2006.06668
- summary:
Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training
- paper: https://arxiv.org/abs/2004.06002
- summary:
OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features
- paper: https://arxiv.org/abs/2003.06800
- summary:
- code: https://github.com/aosokin/os2d
New Threats against Object Detector with Non-Local Block
Dense RepPoints: Representing Visual Objects with Dense Point Sets
- paper: https://arxiv.org/abs/1912.11473
- summary:
- code: https://github.com/justimyhxu/Dense-RepPoints
Large Batch Optimization for Object Detection: Training COCO in 12 Minutes
Dive Deeper Into Box for Object Detection
CenterNet Heatmap Propagation for Real-time Video Object Detection
Probabilistic Anchor Assignment with IoU Prediction for Object Detection
- paper: https://arxiv.org/abs/2007.08103
- summary:
- code: https://github.com/kkhoot/PAA
HoughNet: Integrating near and long-range evidence for bottom-up object detection
- paper: https://arxiv.org/abs/2007.02355
- summary:
- code: https://github.com/nerminsamet/houghnet
Domain Adaptive Object Detection via Asymmetric Tri-way Faster-RCNN
- paper: https://arxiv.org/abs/2007.01571
- summary:
Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation
- paper: https://arxiv.org/abs/2007.02846
- summary:
RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving
- paper: https://arxiv.org/abs/2001.03343
- summary:
Finding Your (3D) Center: 3D Object Detection Using a Learned Loss
- paper: https://arxiv.org/abs/2004.02693
- summary:
Rotation-robust Intersection over Union for 3D Object Detection
Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
- paper: https://arxiv.org/abs/1912.12791
- summary:
Automated Data Augmentation Significantly Improves 3D Object Detection
Pillar-based Object Detection for Autonomous Driving
- paper: https://arxiv.org/abs/2007.10323
- summary:
Learning and aggregating deep local descriptors for instance-level recognition
Targeted Attack for Deep Hashing based Retrieval
- paper: https://arxiv.org/abs/2004.07955
- summary:
Online Invariance Selection for Local Feature Descriptors
- paper: https://arxiv.org/abs/2007.08988
- summary:
ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval
- paper: http://www.weixiushen.com/publication/eccv20_ExchNet.pdf
- summary:
S2DNet: Learning accurate correspondences for sparse-to-dense feature matching
- paper: https://arxiv.org/abs/2004.01673
- summary:
Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval
- paper: https://arxiv.org/abs/2007.12163
- summary:
Unifying Deep Local and Global Features for Image Search
- paper: https://arxiv.org/abs/2001.05027
- summary:
SOLAR: Second-Order Loss and Attention for Image Retrieval
- paper: https://arxiv.org/abs/2001.08972
- summary:
Metric learning: cross-entropy vs. pairwise losses
- paper: https://arxiv.org/abs/2003.08983
- summary:
Spherical Feature Transform for Deep Metric Learning
The Group Loss for Deep Metric Learning
- paper: http://arxiv.org/abs/1912.00385
- summary:
BroadFace: Looking at Tens of Thousands of People at Once for Face Recognition
Explainable Face Recognition
Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification
- paper: https://arxiv.org/abs/2007.10315
- summary:
Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network
Identity-Guided Human Semantic Parsing Learning for Person Re-Identification
Faster Person Re-Identification
The Devil is in the Details: Self-Supervised Attention for Vehicle Re-Identification
- paper: https://arxiv.org/abs/2004.06271
- summary:
Generalizing Person Re-Identification by Camera-Aware Invariance Learning and Cross-Domain Mixup
Prediction, Recovery and Identification: Adaptive Low-Resolution Person Re-Identification
Multiple Expert Brainstorming for Domain Adaptive Person Re-identification
- paper: https://arxiv.org/abs/2007.01546
- summary:
- code: https://github.com/YunpengZhai/MEB-Net
Conditional Convolutions for Instance Segmentation
- paper: https://arxiv.org/abs/2003.05664
- summary:
SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation
- paper: https://arxiv.org/abs/2007.14772
- summary:
- code: https://github.com/JialeCao001/SipMask
Learning with Noisy Class Labels for Instance Segmentation
Boundary-preserving Mask R-CNN
- paper: https://arxiv.org/abs/2007.08921
- summary:
- code: https://github.com/hustvl/BMaskR-CNN
The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation
- paper: https://arxiv.org/abs/2007.11978
- summary:
- code: https://github.com/twangnh/SimCal
SOLO: Segmenting Objects by Locations
- paper: https://arxiv.org/abs/1912.04488
- summary: phalanx-hk#2
LevelSet R-CNN: A Deep Variational Method for Instance Segmentation
PatchPerPix for Instance Segmentation
- paper: https://arxiv.org/abs/2001.07626
- summary:
Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation
- paper: https://arxiv.org/abs/2003.08440
- summary:
Semantic Flow for Fast and Accurate Scene Parsing
- paper: https://arxiv.org/abs/2002.10120
- summary:
- code: https://github.com/donnyyou/torchcv
Object-Contextual Representations for Semantic Segmentation
- paper: https://arxiv.org/abs/1909.11065
- summary:
- code: https://github.com/rosinality/ocr-pytorch
GINet: Graph Interaction Network for Scene Parsing
Blended Grammar Network for Human Parsing
EfficientFCN: Holistically-guided Decoding for Semantic Segmentation
Learning to Predict Context-adaptive Convolution for Semantic Segmentation
- paper: https://arxiv.org/abs/2004.08222
- summary:
SegFix: Model-Agnostic Boundary Refinement for Segmentation
- paper: https://arxiv.org/abs/2007.04269
- summary: phalanx-hk#3
- code: https://github.com/openseg-group/openseg.pytorch
Segment as Points for Efficient Online Multi-Object Tracking and Segmentation
- paper: https://arxiv.org/abs/2007.01550
- summary:
- code: https://github.com/detectRecog/PointTrack
Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking
- paper: https://arxiv.org/pdf/2007.14557.pdf
- summary:
- code: https://github.com/pjl1995/CTracker
Tracking objects as points
- paper: https://arxiv.org/abs/2004.01177
- summary:
- code: https://github.com/xingyizhou/CenterTrack
TAO: A Large-scale Benchmark for Tracking Any Object
- paper: https://arxiv.org/abs/2005.10356
- summary:
- project page: http://taodataset.org/
Towards Real-time MOT: A Joint Solution for Detection and Appearance Embedding
Learning Feature Embeddings for Discriminant Model based Tracking
Learning Object-aware Anchor-free Networks for Real-time Object Tracking
PG-Net: Pixel to Global Matching Network for Visual Tracking
Know Your Surroundings: Exploiting Scene Information for Object Tracking
- paper: https://arxiv.org/abs/2003.11014
- summary:
PiP: Planning-informed Trajectory Prediction for Autonomous Driving
- paper: https://arxiv.org/abs/2003.11476
- summary:
Detecting Human-Object Interactions with Action Co-occurrence Priors
- paper: https://arxiv.org/abs/2007.08728
- summary:
- code: https://github.com/Dong-JinKim/ActionCooccurrencePriors/
UnionDet: Union-Level Detector Towards Real-Time Human-Object Interaction Detection
Visual Compositional Learning for Human Object Interaction Detection
Polysemy Deciphering Network for Human-Object Interaction Detection
Self6D: Self-Supervised Monocular 6D Object Pose Estimation
- paper: https://arxiv.org/abs/2004.06468
- summary:
End-to-End Estimation of Multi-Person 3D Poses from Multiple Cameras
- paper: https://arxiv.org/abs/2004.06239
- summary:
HMOR: Hierarchical Multi-person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation
Learning Delicate Local Representations for Multi-Person Pose Estimation
- paper: http://arxiv.org/abs/2003.04030
- summary:
Whole-Body Human Pose Estimation in the Wild
- paper: https://arxiv.org/abs/2007.11858
- summary:
- code: https://github.com/jin-s13/COCO-WholeBody
SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation
Adversarial Semantic Data Augmentation for Human Pose Estimation
Occlusion-Aware Siamese Network for Human Pose Estimation
Temporal Keypoint Matching and Refinement Network for Pose Estimation and Tracking
Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification
- paper: https://arxiv.org/abs/2001.01536
- summary:
Knowledge Transfer via Dense Cross-layer Mutual-distillation
Matching Guided Distillation
Feature Normalized Knowledge Distillation for Image Classification
Learning to Detect Open Classes for Universal Domain Adaptation
On the Effectiveness of Image Rotation for Open Set Domain Adaptation
Self-Supervised CycleGAN for Object-Preserving Image-to-Image Domain Adaptation
Domain Adaptation through Task Distillation
MotionSqueeze: Neural Motion Feature Learning for Video Understanding
- paper: https://arxiv.org/abs/2007.09933
- summary:
- code:
Feature-metric Loss for Self-supervised Learning of Depth and Egomotion
- paper: https://arxiv.org/abs/2007.10603
- summary: