zijinY/CVPR2021-Papers-with-Code

 
 

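CVPR 2021 Papers and Open-Source Projects (Papers with Code)

A collection of CVPR 2021 papers and their open-source projects (papers with code).

CVPR 2021 accepted paper list: http://cvpr2021.thecvf.com/sites/default/files/2021-03/accepted_paper_ids.txt

Note 1: Issues are welcome. Please share CVPR 2021 papers and open-source projects!

Note 2: For papers from previous years' top CV conferences, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision

A WeChat group for authors of accepted CVPR 2021 papers has been set up. If your paper was accepted, add the WeChat ID CVer9999 with the note: CVPR2021 accepted + name + school/company name. Requests must follow this format; you will then be added to the group to discuss conference attendance and related matters.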

[CVPR 2021 Open-Source Papers: Table of Contents]

Backbone

Diverse Branch Block: Building a Convolution as an Inception-like Unit

Scaling Local Self-Attention For Parameter Efficient Visual Backbones

ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network

Involution: Inverting the Inherence of Convolution for Visual Recognition

Coordinate Attention for Efficient Mobile Network Design

Inception Convolution with Efficient Dilation Search

RepVGG: Making VGG-style ConvNets Great Again

NAS

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Transformers

Neural Architecture Search with Random Labels

Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search

Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation

  • Paper: None
  • Code: None

Prioritized Architecture Sampling with Monto-Carlo Tree Search

Contrastive Neural Architecture Search with Neural Architecture Comparators

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

ReNAS: Relativistic Evaluation of Neural Architecture Search

HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens

Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

Inception Convolution with Efficient Dilation Search

GAN

TediGAN: Text-Guided Diverse Image Generation and Manipulation

Generative Hierarchical Features from Synthesizing Image

Teachers Do More Than Teach: Compressing Image-to-Image Models

HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms

pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis

DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

Diverse Semantic Image Synthesis via Probability Distribution Modeling

LOHO: Latent Optimization of Hairstyles via Orthogonalization

PISE: Person Image Synthesis and Editing with Decoupled GAN

DeFLOCNet: Deep Image Editing via Flexible Low-level Controls

PD-GAN: Probabilistic Diverse GAN for Image Inpainting

Efficient Conditional GAN Transfer with Knowledge Propagation across Classes

Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

  • Paper: None
  • Code: None

Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

A 3D GAN for Improved Large-pose Facial Recognition

HumanGAN: A Generative Model of Humans Images

ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

CoMoGAN: continuous model-guided image-to-image translation

Training Generative Adversarial Networks in One Stage

Closed-Form Factorization of Latent Semantics in GANs

Anycost GANs for Interactive Image Synthesis and Editing

Image-to-image Translation via Hierarchical Style Disentanglement

VAE

Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

Visual Transformer

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Transformers

MIST: Multiple Instance Spatial Transformer Network

Multimodal Motion Prediction with Stacked Transformers

Revamping cross-modal recipe retrieval with hierarchical Transformers and self-supervised learning

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Pre-Trained Image Processing Transformer

End-to-End Video Instance Segmentation with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

End-to-End Human Object Interaction Detection with HOI Transformer

Transformer Interpretability Beyond Attention Visualization

Regularization

Regularizing Neural Networks via Adversarial Model Perturbation

Long-Tailed Distribution

Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

Unsupervised/Self-Supervised Learning

Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

Spatially Consistent Representation Learning

VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples

Exploring Simple Siamese Representation Learning

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Semi-Supervised Learning

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Object Detection

2D Object Detection

OTA: Optimal Transport Assignment for Object Detection

Distilling Object Detectors via Decoupled Features

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

Positive-Unlabeled Data Purification in the Wild for Object Detection

  • Paper: None
  • Code: None

Instance Localization for Self-supervised Detection Pretraining

MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection

End-to-End Object Detection with Fully Convolutional Network

Robust and Accurate Object Detection via Adversarial Learning

I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

YOLOF: You Only Look One-level Feature

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

General Instance Distillation for Object Detection

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

Multiple Instance Active Learning for Object Detection

Towards Open World Object Detection

Few-Shot Object Detection

Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

Few-Shot Object Detection via Contrastive Proposal Encoding

Rotated Object Detection

ReDet: A Rotation-equivariant Detector for Aerial Object Detection

Object Tracking (Single/Multi-Object)

Single-Object Tracking

Graph Attention Tracking

Rotation Equivariant Siamese Networks for Tracking

Track to Detect and Segment: An Online Multi-Object Tracker

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

TransT - Transformer Tracking

Multi-Object Tracking

Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

Learning a Proposal Classifier for Multiple Object Tracking

Track to Detect and Segment: An Online Multi-Object Tracker

Semantic Segmentation

Bidirectional Projection Network for Cross Dimension Scene Understanding

Cross-Dataset Collaborative Learning for Semantic Segmentation

Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations

Capturing Omni-Range Context for Omnidirectional Segmentation

Learning Statistical Texture for Semantic Segmentation

PLOP: Learning without Forgetting for Continual Semantic Segmentation

Weakly Supervised Semantic Segmentation

Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation

BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

Semi-Supervised Semantic Segmentation

Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

Domain Adaptive Semantic Segmentation

Coarse-to-Fine Domain Adaptive Semantic Segmentation with Photometric Alignment and Category-Center Regularization

MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation

Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation

Instance Segmentation

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers

End-to-End Video Instance Segmentation with Transformers

Zero-shot instance segmentation (Not Sure)

Panoptic Segmentation

Fully Convolutional Networks for Panoptic Segmentation

Cross-View Regularization for Domain Adaptive Panoptic Segmentation

Medical Image Segmentation

FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

Interactive Video Object Segmentation

Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

Saliency Detection

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion

Person Search

Anchor-Free Person Search

Video Understanding / Action Recognition

Learning Salient Boundary Feature for Anchor-free Temporal Action Localization

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

ACTION-Net: Multipath Excitation for Action Recognition

Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

TDN: Temporal Difference Networks for Efficient Action Recognition

Face Recognition

A 3D GAN for Improved Large-pose Facial Recognition

MagFace: A Universal Representation for Face Recognition and Quality Assessment

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

Face Detection

CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement

Face Anti-Spoofing

Cross Modal Focal Loss for RGBD Face Anti-Spoofing

Deepfake Detection

Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain

Multi-attentional Deepfake Detection

Facial Age Estimation

PML: Progressive Margin Loss for Long-tailed Age Classification

Facial Expression Recognition

Affective Processes: stochastic modelling of temporal context for emotion and facial expression recognition

Deepfakes

MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes

Human Parsing

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

2D/3D Human Pose Estimation

2D Human Pose Estimation

DCPose: Deep Dual Consecutive Network for Human Pose Estimation

3D Human Pose Estimation

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

Scene Text Recognition

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

Model Compression / Pruning / Quantization

Teachers Do More Than Teach: Compressing Image-to-Image Models

Model Pruning

Dynamic Slimmable Network

Model Quantization

Learnable Companding Quantization for Accurate Low-bit Neural Networks

Knowledge Distillation

Distilling Object Detectors via Decoupled Features

Super-Resolution

ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

AdderSR: Towards Energy Efficient Image Super-Resolution

Video Super-Resolution

Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Image Restoration

Multi-Stage Progressive Image Restoration

Image Inpainting

PD-GAN: Probabilistic Diverse GAN for Image Inpainting

Image Editing

Anycost GANs for Interactive Image Synthesis and Editing

PISE: Person Image Synthesis and Editing with Decoupled GAN

DeFLOCNet: Deep Image Editing via Flexible Low-level Controls

Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

  • Paper: None
  • Code: None

Reflection Removal

Robust Reflection Removal with Reflection-free Flash-only Cues

3D Point Cloud Classification

Equivariant Point Network for 3D Point Cloud Analysis

PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

3D Object Detection

M3DSSD: Monocular 3D Single Stage Object Detector

SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud

Center-based 3D Object Detection and Tracking

Categorical Depth Distribution Network for Monocular 3D Object Detection

3D Semantic Segmentation

Bidirectional Projection Network for Cross Dimension Scene Understanding

Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

3D Object Tracking

Center-based 3D Object Detection and Tracking

3D Point Cloud Registration

PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency

PREDATOR: Registration of 3D Point Clouds with Low Overlap

3D Point Cloud Completion

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

6D Pose Estimation

FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation

Camera Pose Estimation

Back to the Feature: Learning Robust Camera Localization from Pixels to Pose

Depth Estimation

Beyond Image to Depth: Improving Depth Prediction using Echoes

S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation

Depth from Camera Motion and Object Detection

Adversarial Examples

Natural Adversarial Examples

Image Retrieval

QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

Video Retrieval

On Semantic Similarity in Video Retrieval

Cross-modal Retrieval

Revamping cross-modal recipe retrieval with hierarchical Transformers and self-supervised learning

Zero-Shot Learning

Counterfactual Zero-Shot and Open-Set Visual Recognition

Federated Learning

FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

Video Frame Interpolation

CDFI: Compression-Driven Network Design for Frame Interpolation

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation

Visual Reasoning

Transformation Driven Visual Reasoning

View Synthesis

NeX: Real-time View Synthesis with Neural Basis Expansion

Domain Generalization

FSDR: Frequency Space Domain Randomization for Domain Generalization

Human-Object Interaction (HOI) Detection

Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information

Reformulating HOI Detection as Adaptive Set Prediction

Detecting Human-Object Interaction via Fabricated Compositional Learning

End-to-End Human Object Interaction Detection with HOI Transformer

Shadow Removal

Auto-Exposure Fusion for Single-Image Shadow Removal

Virtual Try-On

Parser-Free Virtual Try-on via Distilling Appearance Flows

Datasets

Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

Depth from Camera Motion and Object Detection

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Others

Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution

ACRE: Abstract Causal REasoning Beyond Covariation

Confluent Vessel Trees with Accurate Bifurcations

Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling

Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks

Knowledge Evolution in Neural Networks

Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

SGP: Self-supervised Geometric Perception

Diffusion Probabilistic Models for 3D Point Cloud Generation

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

To Be Added (TODO)

Acceptance Unconfirmed (Not Sure)

CT Film Recovery via Disentangling Geometric Deformation and Photometric Degradation: Simulated Datasets and Deep Models

Toward Explainable Reflection Removal with Distilling and Model Uncertainty

DeepOIS: Gyroscope-Guided Deep Optical Image Stabilizer Compensation

Exploring Adversarial Fake Images on Face Manifold

Uncertainty-Aware Semi-Supervised Crowd Counting via Consistency-Regularized Surrogate Task

Temporal Contrastive Graph for Self-supervised Video Representation Learning

Boosting Monocular Depth Estimation Models to High-Resolution via Context-Aware Patching

Fast and Memory-Efficient Compact Bilinear Pooling

Identification of Empty Shelves in Supermarkets using Domain-inspired Features with Structural Support Vector Machine

Estimating A Child's Growth Potential From Cephalometric X-Ray Image via Morphology-Aware Interactive Keypoint Estimation

https://github.com/ShaoQiangShen/CVPR2021

https://github.com/gillesflash/CVPR2021

https://github.com/anonymous-submission1991/BaLeNAS

https://github.com/cvpr2021dcb/cvpr2021dcb

https://github.com/anonymousauthorCV/CVPR2021_PaperID_8578

https://github.com/AldrichZeng/FreqPrune

https://github.com/Anonymous-AdvCAM/Anonymous-AdvCAM

https://github.com/ddfss/datadrive-fss
