This repository contains the official PyTorch implementation of the paper:
DeltaEdit: Exploring Text-free Training for Text-driven Image Manipulation, CVPR 2023
To be continued...
We will release the training and inference code for the LSUN cat, church, and horse domains later :)
Install CLIP:
conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git
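After running the install commands above, a quick check that the packages are visible to the current Python environment can save a failed run later. The following is a minimal sketch, not part of the repository; the helper name `missing_modules` is hypothetical, and the module list simply mirrors the packages named in the commands above (the CLIP repository installs a module named `clip`).

```python
import importlib.util

# Modules provided by the install commands above.
# "clip" is the module installed from the openai/CLIP repository.
REQUIRED_MODULES = ("torch", "torchvision", "ftfy", "regex", "tqdm", "gdown", "clip")


def missing_modules(names):
    """Return the module names that cannot be found in this environment.

    Hypothetical helper: it only checks that each module can be located,
    it does not import it or verify its version.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```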
Download pre-trained models:
Place the downloaded models in ./models/pretrained_models.
Place the cat, church, and horse models in ./models/pretrained_models/stylegan2-{cat/church/horse}.
DeltaEdit is trained on latent vectors.
For the facial domain, 58,000 real images are randomly selected from the FFHQ dataset and 200,000 fake images are sampled from the z space of StyleGAN for training. Note that all real images are inverted by the e4e encoder.
Download the provided FFHQ latent vectors from here and then place all numpy files into the folder ./latent_code/ffhq.
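To confirm that the downloaded latent vectors ended up in the expected folder, one option is to list the numpy files found there. This is a minimal sketch under the assumption that the files use the `.npy` extension and sit directly inside `./latent_code/ffhq`; the helper name `list_latent_files` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_latent_files(folder="./latent_code/ffhq"):
    """Return the sorted .npy files found directly inside `folder`.

    Hypothetical helper: returns an empty list if the folder does not exist.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if p.suffix == ".npy")


if __name__ == "__main__":
    files = list_latent_files()
    print(f"Found {len(files)} .npy file(s) in ./latent_code/ffhq")
    for path in files:
        print(" ", path.name)
```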
Generate the 200,000 sampled latent vectors by running the following commands for each specific domain:
CUDA_VISIBLE_DEVICES=0 python generate_codes.py --classname ffhq --samples 200000
CUDA_VISIBLE_DEVICES=0 python generate_codes.py --classname cat --samples 200000
CUDA_VISIBLE_DEVICES=0 python generate_codes.py --classname church --samples 200000
CUDA_VISIBLE_DEVICES=0 python generate_codes.py --classname horse --samples 200000
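The four commands above differ only in the value of --classname, so they can also be produced in a loop. The sketch below only builds the command strings and does not run them; the helper name `build_generate_commands` and its parameters are hypothetical, while the script name and flags are taken from the commands above.

```python
import shlex

# Domain names as used with --classname in the commands above.
DOMAINS = ("ffhq", "cat", "church", "horse")


def build_generate_commands(domains=DOMAINS, samples=200000, gpu="0"):
    """Build one generate_codes.py command line per domain (hypothetical helper)."""
    commands = []
    for name in domains:
        args = ["python", "generate_codes.py",
                "--classname", name,
                "--samples", str(samples)]
        # Prefix the environment variable that selects the GPU.
        commands.append(f"CUDA_VISIBLE_DEVICES={gpu} " + shlex.join(args))
    return commands


if __name__ == "__main__":
    for command in build_generate_commands():
        print(command)
```

Each printed line can be pasted into a shell, or the list can be handed to a job scheduler if the domains are processed on different GPUs.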
Training script: ./scripts/train.py
Training options: ./options/train_options.py
For training, please run the following command:
CUDA_VISIBLE_DEVICES=0 python scripts/train.py
Inference script: ./scripts/inference.py
Inference options: ./options/test_options.py
Checkpoints folder: ./checkpoints
Example data: ./examples
To produce editing results, please run the following command:
CUDA_VISIBLE_DEVICES=1 python scripts/inference.py --target "chubby face","face with eyeglasses","face with smile","face with pale skin","face with tanned skin","face with big eyes","face with black clothes","face with blue suit","happy face","face with bangs","face with red hair","face with black hair","face with blond hair","face with curly hair","face with receding hairline","face with bowlcut hairstyle"
The produced results are shown below.
You can also specify your own target attributes with the --target flag.
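In the command above, the quoted descriptions are joined by commas with no spaces between them, so the shell passes them to the script as a single argument. The sketch below illustrates this with Python's `shlex` module and shows how a custom --target value could be assembled. The helper names are hypothetical, and splitting the value on commas is an assumption about how the script reads the flag, not something confirmed here.

```python
import shlex


def format_target_value(descriptions):
    """Join descriptions into the quoted, comma-separated form used above."""
    return ",".join(f'"{text}"' for text in descriptions)


def parse_target_argument(command):
    """Return the raw --target value a POSIX shell would pass, and its parts.

    Assumption: the receiving script splits the value on commas.
    """
    argv = shlex.split(command)
    value = argv[argv.index("--target") + 1]
    return value, value.split(",")


if __name__ == "__main__":
    targets = ["face with green hair", "face with a beard"]  # illustrative only
    command = "python scripts/inference.py --target " + format_target_value(targets)
    print(command)
    raw, parts = parse_target_argument(command)
    print(raw)
    print(parts)
```

Because commas separate the descriptions, a description that itself contains a comma would be split in two under this assumption.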
Inference script for real images: ./scripts/inference_real.py
Inference options: ./options/test_options.py
Checkpoints folder: ./checkpoints
Test images: ./test_imgs
To produce editing results, please run the following command:
CUDA_VISIBLE_DEVICES=1 python scripts/inference_real.py --target "chubby face","face with eyeglasses","face with smile","face with pale skin","face with tanned skin","face with big eyes","face with black clothes","face with blue suit","happy face","face with bangs","face with red hair","face with black hair","face with blond hair","face with curly hair","face with receding hairline","face with bowlcut hairstyle"
This code is developed based on the code of orpatashnik/StyleCLIP by Or Patashnik et al.
If you use this code for your research, please cite our paper:
@InProceedings{lyu2023deltaedit,
author = {Lyu, Yueming and Lin, Tianwei and Li, Fu and He, Dongliang and Dong, Jing and Tan, Tieniu},
title = {DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2023},
}