A Collection of Papers and Codes in ICCV2023 related to Low-Level Vision
[Under Construction] If you find any missing papers or typos, feel free to open an issue or submit a pull request.
SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device
DiffIR: Efficient Diffusion Model for Image Restoration
PIRNet: Privacy-Preserving Image Restoration Network via Wavelet Lifting
Focal Network for Image Restoration
Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration
Under-Display Camera Image Restoration with Scattering Effect
FSI: Frequency and Spatial Interactive Learning for Image Restoration in Under-Display Cameras
Multi-weather Image Restoration via Domain Translation
Adverse Weather Removal with Codebook Priors
Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond
Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery
High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net
Boundary-Aware Divide and Conquer: A Diffusion-Based Solution for Unsupervised Shadow Removal
Leveraging Inpainting for Single-Image Shadow Removal
Fine-grained Visible Watermark Removal
Physics-Driven Turbulence Image Restoration with Stochastic Refinement
Building Bridge Across the Time: Disruption and Restoration of Murals In the Wild
DDS2M: Self-Supervised Denoising Diffusion Spatio-Spectral Model for Hyperspectral Image Restoration
Fingerprinting Deep Image Restoration Models
Self-supervised Monocular Underwater Depth Recovery, Image Restoration, and a Real-sea Video Dataset
Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction
Snow Removal in Video: A New Dataset and A Novel Method
Video Adverse-Weather-Component Suppression Network via Weather Messenger and Adversarial Backpropagation
Fast Full-frame Video Stabilization with Iterative Optimization
Minimum Latency Deep Online Video Stabilization
Task Agnostic Restoration of Natural Video Dynamics
On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement
SRFormer: Permuted Self-Attention for Single Image Super-Resolution
DLGSANet: Lightweight Dynamic Local and Global Self-Attention Network for Image Super-Resolution
Dual Aggregation Transformer for Image Super-Resolution
MSRA-SR: Image Super-resolution Transformer with Multi-scale Shared Representation Acquisition
Content-Aware Local GAN for Photo-Realistic Super-Resolution
Boosting Single Image Super-Resolution via Partial Channel Shifting
Feature Modulation Transformer: Cross-Refinement of Global Representation via High-Frequency Prior for Image Super-Resolution
Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution
Lightweight Image Super-Resolution with Superpixel Token Interaction
Reconstructed Convolution Module Based Look-Up Tables for Efficient Image Super-Resolution
Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution
MetaF2N: Blind Image Super-Resolution by Learning Efficient Model Adaptation from Faces
Learning Correction Filter via Degradation-Adaptive Regression for Blind Single Image Super-Resolution
LMR: A Large-Scale Multi-Reference Dataset for Reference-Based Super-Resolution
Real-CE: A Benchmark for Chinese-English Scene Text Image Super-resolution
Learning Non-Local Spatial-Angular Correlation for Light Field Image Super-Resolution
Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution
HSR-Diff: Hyperspectral Image Super-Resolution via Conditional Diffusion Models
ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution
Rethinking Multi-Contrast MRI Super-Resolution: Rectangle-Window Cross-Attention Transformer and Arbitrary-Scale Upsampling
Decomposition-Based Variational Network for Multi-Contrast MRI Super-Resolution and Reconstruction
CuNeRF: Cube-Based Neural Radiance Field for Zero-Shot Medical Image Arbitrary-Scale Super Resolution
Towards Real-World Burst Image Super-Resolution: Benchmark and Method
Self-Supervised Burst Super-Resolution
Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution
Multi-Frequency Representation Enhancement with Privilege Information for Video Super-Resolution
MoTIF: Learning Motion Trajectories with Local Implicit Neural Functions for Continuous Space-Time Video Super-Resolution
Downscaled Representation Matters: Improving Image Rescaling with Collaborative Downscaled Images
Random Sub-Samples Generation for Self-Supervised Real Image Denoising
Score Priors Guided Deep Variational Inference for Unsupervised Real-World Single Image Denoising
Unsupervised Image Denoising in Real-World Scenarios via Self-Collaboration Parallel Generative Adversarial Branches
Self-supervised Image Denoising with Downsampled Invariance Loss and Conditional Blind-Spot Network
Multi-view Self-supervised Disentanglement for General Image Denoising
Iterative Denoiser and Noise Estimator for Self-Supervised Image Denoising
Noise2Info: Noisy Image to Information of Noise for Self-Supervised Image Denoising
The Devil is in the Upsampling: Architectural Decisions Made Simpler for Denoising with Deep Image Prior
Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising
ExposureDiffusion: Learning to Expose for Low-light Image Enhancement
Towards General Low-Light Raw Noise Synthesis and Modeling
Hybrid Spectral Denoising Transformer with Guided Attention
Multiscale Structure Guided Diffusion for Image Deblurring
Multi-Scale Residual Low-Pass Filter Network for Image Deblurring
Single Image Defocus Deblurring via Implicit Neural Inverse Kernels
Single Image Deblurring with Row-dependent Blur Magnitude
Non-Coaxial Event-Guided Motion Deblurring with Spatial Alignment
Generalizing Event-Based Motion Deblurring in Real-World Scenarios
Exploring Temporal Frequency Spectrum in Deep Video Deblurring
From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal
Learning Rain Location Prior for Nighttime Deraining
Sparse Sampling Transformer with Uncertainty-Driven Ranking for Unified Removal of Raindrops and Rain Streaks
Unsupervised Video Deraining with An Event Camera
Both Diverse and Realism Matter: Physical Attribute and Style Alignment for Rainy Image Generation
MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing
Efficient Unified Demosaicing for Bayer and Non-Bayer Patterned Image Sensors
Alignment-free HDR Deghosting with Semantics Consistent Transformer
MEFLUT: Unsupervised 1D Lookup Tables for Multi-exposure Image Fusion
RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image
Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction
LAN-HDR: Luminance-based Alignment Network for High Dynamic Range Video Reconstruction
Joint Demosaicing and Deghosting of Time-Varying Exposures for Single-Shot HDR Imaging
GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild
Beyond the Pixel: a Photometrically Calibrated HDR Dataset for Luminance and Color Prediction
Video Object Segmentation-aware Video Frame Interpolation
Rethinking Video Frame Interpolation from Shutter Mode Induced Degradation
Iterative Prompt Learning for Unsupervised Backlit Image Enhancement
Implicit Neural Representation for Cooperative Low-light Image Enhancement
Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network
Diff-Retinex: Rethinking Low-light Image Enhancement with A Generative Diffusion Model
Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement
Low-Light Image Enhancement with Multi-Stage Residue Quantization and Brightness-Aware Attention
Dancing in the Dark: A Benchmark towards General Low-light Video Enhancement
NIR-assisted Video Enhancement via Unpaired 24-hour Data
Coherent Event Guided Low-Light Video Enhancement
Deep Image Harmonization with Learnable Augmentation
Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation
TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
Diverse Inpainting and Editing with GAN Inversion
Rethinking Fast Fourier Convolution in Image Inpainting
Continuously Masked Transformer for Image Inpainting
MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices
PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting
ProPainter: Improving Propagation and Transformer for Video Inpainting
Semantic-Aware Dynamic Parameter for Video Inpainting Transformer
CIRI: Curricular Inactivation for Residue-aware One-shot Video Inpainting
Parallax-Tolerant Unsupervised Deep Image Stitching
RFD-ECNet: Extreme Underwater Image Compression with Reference to Feature Dictionary
COMPASS: High-Efficiency Deep Image Compression with Arbitrary-scale Spatial Scalability
Computationally-Efficient Neural Image Compression with Shallow Decoders
Dec-Adapter: Exploring Efficient Decoder-Side Adapter for Bridging Screen Content and Natural Image Compression
Semantically Structured Image Compression via Irregular Group-Based Decoupling
TransTIC: Transferring Transformer-based Image Compression from Human Perception to Machine Perception
AdaNIC: Towards Practical Neural Image Compression via Dynamic Transform Routing
COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec
Scene Matters: Model-based Deep Video Compression
Thinking Image Color Aesthetics Assessment: Models, Datasets and Benchmarks
Test Time Adaptation for Blind Image Quality Assessment
Troubleshooting Ethnic Quality Bias with Curriculum Domain Adaptation for Face Image Quality Assessment
SQAD: Automatic Smartphone Camera Quality Assessment and Benchmarking
Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives
AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks
Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers
All-to-key Attention for Arbitrary Style Transfer
StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model
Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
HairNeRF: Geometry-Aware Image Synthesis for Hairstyle Transfer
Adaptive Nonlinear Latent Transformation for Conditional Face Editing
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation
HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending
StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces
Effective Real Image Editing with Accelerated Iterative Diffusion Inversion
Conceptual and Hierarchical Latent Space Decomposition for Face Editing
Editing Implicit Assumptions in Text-to-Image Diffusion Models
Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
A Latent Space of Stochastic Diffusion Models for Zero-Shot Image Editing and Guidance
Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation
RIGID: Recurrent GAN Inversion and Editing of Real Face Videos
Pix2Video: Video Editing using Image Diffusion
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
StableVideo: Text-driven Consistency-aware Diffusion Video Editing
VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs
Adding Conditional Control to Text-to-Image Diffusion Models
MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation
Unleashing Text-to-Image Diffusion Models for Visual Perception
Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Ablating Concepts in Text-to-Image Diffusion Models
Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis
HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation
Story Visualization by Online Text Augmentation with Context Memory
DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment
Dense Text-to-Image Generation with Attention Modulation
ITI-GEN: Inclusive Text-to-Image Generation
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models
Human Preference Score: Better Aligning Text-to-Image Models with Human Preference
Zero-shot spatial layout conditioning for text-to-image diffusion models
A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis
Evaluating Data Attribution for Text-to-Image Models
Expressive Text-to-Image Generation with Rich Text
Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis
Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Anti-DreamBooth: Protecting Users from Personalized Text-to-image Synthesis
Discriminative Class Tokens for Text-to-Image Diffusion Models
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Reinforced Disentanglement for Face Swapping without Skip Connection
BlendFace: Re-designing Identity Encoders for Face-Swapping
General Image-to-Image Translation with One-Shot Image Guidance
GaFET: Learning Geometry-aware Facial Expression Translation from In-The-Wild Images
Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation
UGC: Unified GAN Compression for Efficient Image-to-Image Translation
Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis
Controllable Person Image Synthesis with Pose-Constrained Latent Diffusion
Conditional 360-degree Image Synthesis for Immersive Indoor Scene Decoration
Masked Diffusion Transformer is a Strong Image Synthesizer
Q-Diffusion: Quantizing Diffusion Models
The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation
LFS-GAN: Lifelong Few-Shot Image Generation
FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations
Smoothness Similarity Regularization for Few-Shot GAN Adaptation
UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation
Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation
Personalized Image Generation for Color Vision Deficiency Population
EGC: Image Generation and Classification via a Diffusion Energy-Based Model
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers
Neural Characteristic Function Learning for Conditional Image Generation
LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis
Perceptual Artifacts Localization for Image Synthesis Tasks
SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
Erasing Concepts from Diffusion Models
A Complete Recipe for Diffusion Generative Models
Efficient Diffusion Training via Min-SNR Weighting Strategy
Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption
AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration
Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer
MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation
The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion
SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Text2Performer: Text-Driven Human Video Generation
StyleLipSync: Style-based Personalized Lip-sync Video Generation
Mixed Neural Voxels for Fast Multi-view Video Synthesis
WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction
DreamPose: Fashion Video Synthesis with Stable Diffusion
Structure and Content-Guided Video Synthesis with Diffusion Models
DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders
DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
Name Your Colour For the Task: Artificially Discover Colour Naming via Colour Quantisation Transformer
Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging
Deep Optics for Video Snapshot Compressive Imaging
SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning
Single Image Reflection Separation via Component Synergy
Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion
Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation
Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation
Few shot font generation via transferring similarity guided global style and quantization local style