ICCV 2023 Papers and Open-Source Projects
A collection of ICCV 2023 papers and open-source projects (papers with code)!
2160 papers accepted!
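ICCV 2023 accepted paper IDs: https://t.co/A0mCH8gbOi
Note 1: Contributions are welcome. Please open an issue to share ICCV 2023 papers and open-source projects!
Note 2: For papers from previous top CV conferences, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision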
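If you want to keep up with the latest high-quality CV papers, open-source projects, and learning materials, you are welcome to scan the QR code and join the CVer academic discussion group (CVer学术交流群). Let's learn from each other and improve together.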
Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
Paper: https://arxiv.org/abs/2303.17606
Code: https://github.com/songrise/AvatarCraft
Rethinking Mobile Block for Efficient Attention-based Models
PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization
CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation
IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis
FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis
Tri-MipRF: Tri-Mip Representation for Efficient Anti-Aliasing Neural Radiance Fields
PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment
FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
DIRE for Diffusion-Generated Image Detection
Read-only Prompt Optimization for Vision-Language Few-shot Learning
Introducing Language Guidance in Prompt-based Continual Learning
FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection
ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation
Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers
Segment Anything
MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation
FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation
Residual Pattern Learning for Pixel-wise Out-of-Distribution Detection in Semantic Segmentation
Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement
Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus
DVIS: Decoupled Video Instance Segmentation Framework
BoMD: Bag of Multi-label Descriptors for Noisy Chest X-ray Classification
CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
Self-supervised Learning to Bring Dual Reversed Rolling Shutter Images Alive
Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models
Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images
DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection
SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection
StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
Cross Modal Transformer: Towards Fast and Robust 3D Object Detection
MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation
Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling
SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection
Rethinking Range View Representation for LiDAR Segmentation
MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Simulating Fluids in Real-World Still Images
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
Implicit Neural Representation for Cooperative Low-light Image Enhancement
Self-supervised Character-to-Character Distillation for Text Recognition
MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition
Zero-Shot Composed Image Retrieval with Textual Inversion
EigenTrajectory: Low-Rank Descriptors for Multi-Modal Trajectory Forecasting
Point-Query Quadtree for Crowd Counting, Localization, and More
Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives
MotionBERT: A Unified Perspective on Learning Human Motion Representations
Graph Matching with Bi-level Noisy Correspondence
LDL: Line Distance Functions for Panoramic Localization
Active Neural Mapping
Reconstructing Groups of People with Hypergraph Relational Reasoning