TensorFlow implementation for reproducing the main results in the paper StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.
python 2.7
[Optional] Torch is needed if you use the pre-trained char-CNN-RNN text encoder.
[Optional] skip-thought is needed if you use the skip-thought text encoder.
In addition, please add the project folder to PYTHONPATH and pip install the following packages:
prettytensor
progressbar
python-dateutil
easydict
pandas
torchfile
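As a quick sanity check after installing, the packages above can be probed from Python. This is a minimal sketch, not part of the project: `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and the only assumption beyond the list above is that python-dateutil is imported under the module name `dateutil`.

```python
import importlib

# Import names for the packages listed above.
# Assumption: python-dateutil is imported as "dateutil"; the others use their package name.
REQUIRED_MODULES = ["prettytensor", "progressbar", "dateutil",
                    "easydict", "pandas", "torchfile"]


def missing_modules(names):
    """Return the entries of `names` that cannot be imported (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when every package is importable; anything it returns still needs to be installed with pip.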
Data
Place the data under Data/, with the bird and flower images extracted to Data/birds/ and Data/flowers/, respectively.
Preprocess the images:
python misc/preprocess_birds.py
python misc/preprocess_flowers.py
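Before running the preprocessing scripts, it can help to confirm that the directories named in this section exist. The following is a rough sketch under the assumption that it is run against the project root; `missing_data_dirs` and `EXPECTED_DIRS` are hypothetical names and do not come from the repository.

```python
import os

# Directories named in this section, relative to the project root.
EXPECTED_DIRS = ["Data/birds", "Data/flowers"]


def missing_data_dirs(project_root, expected=EXPECTED_DIRS):
    """Return the expected directories that are absent under project_root."""
    return [d for d in expected
            if not os.path.isdir(os.path.join(project_root, *d.split("/")))]
```

For example, `missing_data_dirs(".")` run from the project folder lists whichever of the two directories still has to be created and populated.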
Training
python stageI/run_exp.py --cfg stageI/cfg/birds.yml --gpu 0
python stageII/run_exp.py --cfg stageII/cfg/birds.yml --gpu 1
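The two commands above differ only in the stage directory and the GPU index, so switching datasets comes down to changing the config file name. A small sketch of that pattern follows; `training_commands` is a hypothetical helper, and it assumes the config files follow the `<stage>/cfg/<dataset>.yml` layout seen in the commands above.

```python
def training_commands(dataset="birds", stage1_gpu=0, stage2_gpu=1):
    """Build the Stage-I and Stage-II command lines for one dataset.

    Hypothetical helper: assumes configs live at <stage>/cfg/<dataset>.yml.
    """
    template = "python {stage}/run_exp.py --cfg {stage}/cfg/{dataset}.yml --gpu {gpu}"
    return [
        template.format(stage="stageI", dataset=dataset, gpu=stage1_gpu),
        template.format(stage="stageII", dataset=dataset, gpu=stage2_gpu),
    ]
```

With the defaults this reproduces the two commands shown above; passing `dataset="flowers"` points both stages at `flowers.yml` instead.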
Change birds.yml to flowers.yml to train a StackGAN model on the Oxford-102 dataset using our preprocessed data for flowers.
The *.yml files are example configuration files for training/testing our models.
Pretrained Model
Save each downloaded model to models/.
(Just used the same setting as the char-CNN-RNN. We assume better results can be achieved by playing with the hyper-parameters.)
Run Demos
Run sh demo/flowers_demo.sh to generate flower samples from sentences. The results will be saved to Data/flowers/example_captions/. (You need to download the char-CNN-RNN text encoder for flowers to models/text_encoder/. Note: this text encoder is provided by reedscot/icml2016.)
Run sh demo/birds_demo.sh to generate bird samples from sentences. The results will be saved to Data/birds/example_captions/. (You need to download the char-CNN-RNN text encoder for birds to models/text_encoder/. Note: this text encoder is provided by reedscot/icml2016.)
Run python demo/birds_skip_thought_demo.py --cfg demo/cfg/birds-skip-thought-demo.yml --gpu 2 to generate bird samples from sentences. The results will be saved to Data/birds/example_captions-skip-thought/. (You need to download the vocabulary for skip-thought vectors to Data/skipthoughts/.)
Examples for birds (char-CNN-RNN embeddings), more on YouTube:
Examples for flowers (char-CNN-RNN embeddings), more on YouTube:
Save your favorite pictures generated by our models, since the randomness from the noise z and conditioning augmentation makes the models creative enough to generate objects with different poses and viewpoints from the same description :smiley:
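The two sources of randomness mentioned above can be illustrated with a toy sketch. It assumes that conditioning augmentation draws the conditioning vector c from a Gaussian whose mean and standard deviation are derived from the text embedding, written here as c = mu + sigma * eps with eps drawn from a standard normal. In the real model mu and sigma come from a learned layer; the function names and any numbers passed in are made up for illustration and use only the standard library.

```python
import random


def sample_conditioning(mu, sigma, rng):
    """Draw c = mu + sigma * eps per dimension, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def sample_noise(dim, rng):
    """Draw the noise vector z with independent N(0, 1) entries."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def generator_inputs(mu, sigma, noise_dim, seed):
    """Return (c, z) for one sample; mu and sigma stand in for the
    statistics a trained model would compute from a text embedding."""
    rng = random.Random(seed)
    c = sample_conditioning(mu, sigma, rng)
    z = sample_noise(noise_dim, rng)
    return c, z
```

Calling `generator_inputs` twice with the same `mu` and `sigma` but different seeds gives different (c, z) pairs, which is why one description can yield many distinct images; reusing a seed reproduces the same pair.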
If you find StackGAN useful in your research, please consider citing:
@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}
Our follow-up work
References