ICCV2021 Papers With Code

A collection of ICCV 2023 papers and open-source projects


ICCV2023-Papers-with-Code

A collection of ICCV 2023 papers and open-source projects (papers with code)!

2160 papers accepted!

ICCV 2023 accepted paper IDs: https://t.co/A0mCH8gbOi

Note 1: You are welcome to submit an issue to share ICCV 2023 papers and open-source projects!

Note 2: For papers from previous years' top CV conferences, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision

ICCV 2021

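If you want to keep up with the latest high-quality CV papers, open-source projects, and learning materials, scan the QR code to join the CVer academic discussion group! Learn from each other and improve together.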

[ICCV 2023 Open-Source Paper Index]

Avatars

AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control

Paper: https://arxiv.org/abs/2303.17606

Code: https://github.com/songrise/AvatarCraft

Backbone

Rethinking Mobile Block for Efficient Attention-based Models

CLIP

PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization

CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation

NeRF

IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis

AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control

FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis

Tri-MipRF: Tri-Mip Representation for Efficient Anti-Aliasing Neural Radiance Fields

Diffusion Models

PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

DIRE for Diffusion-Generated Image Detection

Prompt

Read-only Prompt Optimization for Vision-Language Few-shot Learning

Introducing Language Guidance in Prompt-based Continual Learning

Vision-Language

Read-only Prompt Optimization for Vision-Language Few-shot Learning

Object Detection

FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation

Visual Tracking

Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers

Semantic Segmentation

Segment Anything

MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation

FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation

Residual Pattern Learning for Pixel-wise Out-of-Distribution Detection in Semantic Segmentation

Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement

Video Object Segmentation

Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus

Video Instance Segmentation

DVIS: Decoupled Video Instance Segmentation Framework

Medical Image Classification

BoMD: Bag of Multi-label Descriptors for Noisy Chest X-ray Classification

Medical Image Segmentation

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

Low-level Vision

Self-supervised Learning to Bring Dual Reversed Rolling Shutter Images Alive

Super-Resolution

Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution

3D Point Cloud

Robo3D: Towards Robust and Reliable 3D Perception against Corruptions

Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models

Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos

3D Object Detection

PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection

SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection

StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

Cross Modal Transformer: Towards Fast and Robust 3D Object Detection

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling

SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection

3D Semantic Segmentation

Rethinking Range View Representation for LiDAR Segmentation

3D Object Tracking

MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors

Video Understanding

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Image Generation

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Video Generation

Simulating Fluids in Real-World Still Images

Image Editing

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Video Editing

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

Human Motion Generation

BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction

Low-light Image Enhancement

Implicit Neural Representation for Cooperative Low-light Image Enhancement

Scene Text Detection

Scene Text Recognition

Self-supervised Character-to-Character Distillation for Text Recognition

MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition

Image Retrieval

Zero-Shot Composed Image Retrieval with Textual Inversion

Image Fusion

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

Trajectory Prediction

EigenTrajectory: Low-Rank Descriptors for Multi-Modal Trajectory Forecasting

Crowd Counting

Point-Query Quadtree for Crowd Counting, Localization, and More

Video Quality Assessment

Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives

Others

MotionBERT: A Unified Perspective on Learning Human Motion Representations

Graph Matching with Bi-level Noisy Correspondence

LDL: Line Distance Functions for Panoramic Localization

Active Neural Mapping

Reconstructing Groups of People with Hypergraph Relational Reasoning

Open Source Agenda is not affiliated with "ICCV2021 Papers With Code" Project. README Source: amusi/ICCV2023-Papers-with-Code
