Awesome Text To Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.


𝓐𝔀𝓮𝓼𝓸𝓶𝓮 𝓣𝓮𝔁𝓽📝-𝓽𝓸-𝓘𝓶𝓪𝓰𝓮🌇

𝓐 𝓬𝓸𝓵𝓵𝓮𝓬𝓽𝓲𝓸𝓷 𝓸𝓯 𝓻𝓮𝓼𝓸𝓾𝓻𝓬𝓮𝓼 𝓸𝓷 𝓽𝓮𝔁𝓽-𝓽𝓸-𝓲𝓶𝓪𝓰𝓮 𝓼𝔂𝓷𝓽𝓱𝓮𝓼𝓲𝓼/𝓶𝓪𝓷𝓲𝓹𝓾𝓵𝓪𝓽𝓲𝓸𝓷 𝓽𝓪𝓼𝓴𝓼.

⭐ Citation

If you find this paper and repo helpful for your research, please cite it as follows:


@inproceedings{zhou2023vision+,
  title={Vision+ Language Applications: A Survey},
  author={Zhou, Yutong and Shimada, Nobutaka},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={826--842},
  year={2023}
}

🎑 News

[!TIP] Version 1.0 (All-in-one version) can be found here and will no longer be updated as of 24/02/29.

[24/02/29] Update "Awesome Text to Image" Version 2.0! Paper With Code and Other Related Works will also be gradually updated in March.
[23/05/26] 🔥 Add our survey paper "Vision + Language Applications: A Survey" and a special Best Collection list!
[23/04/04] "Vision + Language Applications: A Survey" was accepted by CVPRW2023.
[20/10/13] Awesome-Text-to-Image repo was created.

To Do

- Add Topic Order list and Chronological Order list
- Add Best Collection
- Create ⏳Recently Focused Papers

Content

- 1. Description
- 2. Quantitative Evaluation Metrics
- 3. Datasets
- 4. Project
- 5. Paper With Code
- 6. Other Related Works
- Contact Me
- Contributors

Description

In the last few decades, the fields of Computer Vision (CV) and Natural Language Processing (NLP) have seen several major technological breakthroughs driven by deep learning research. Recently, researchers have become interested in combining semantic information and visual information across these traditionally independent fields. A number of studies have been conducted on text-to-image synthesis techniques, which convert input textual descriptions (keywords or sentences) into realistic images.
Papers, code, and datasets for the text-to-image task are available here.

🐌 Markdown Format:

(Conference/Journal Year) Title, First Author et al. [Paper] [Code] [Project]
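
For anyone who wants to process the list programmatically, here is a minimal sketch of how an entry in the format above could be parsed. It is not part of this repo: the names `ENTRY_RE`, `Entry`, and `parse_entry` are hypothetical, and the pattern assumes the author follows the last comma of the title and that an optional tag such as `[💬 3D]` may come right after the venue.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pattern for "(Venue Year) Title, First Author et al. [Paper] [Code] ..."
ENTRY_RE = re.compile(
    r"^\s*(?:-\s*)?"                              # optional list bullet
    r"\((?P<venue>.+?)[ -](?P<year>\d{4})\)\s*"   # "(Venue Year)" or "(Venue-Year)"
    r"(?:\[(?P<tag>[^\]]*)\]\s*)?"                # optional tag, e.g. "[💬 3D]"
    r"(?P<body>.*?)"                              # "Title, First Author et al."
    r"(?P<links>(?:\s*\[[^\]]+\])*)\s*$"          # trailing link labels
)


@dataclass
class Entry:
    venue: str
    year: int
    tag: Optional[str]
    title: str
    first_author: str
    links: List[str]


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one list entry; return None if the line does not match the format."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    # Drop a trailing "et al." and stray punctuation before splitting.
    body = re.sub(r"\s+et al\.?$", "", m.group("body").strip()).rstrip(" ,.")
    # Assumption: the author comes after the LAST ", " (titles may contain commas).
    title, sep, author = body.rpartition(", ")
    if not sep:
        title, author = body, ""
    return Entry(
        venue=m.group("venue"),
        year=int(m.group("year")),
        tag=m.group("tag"),
        title=title,
        first_author=author,
        links=re.findall(r"\[([^\]]+)\]", m.group("links")),
    )


example = parse_entry(
    "- (CVPR 2023) Collaborative Diffusion for Multi-Modal Face Generation "
    "and Editing, Ziqi Huang et al. [Paper] [Code] [Project]"
)
```

Only the link labels are recovered here, since the list as shown carries no URLs. Splitting on the last comma is a heuristic and will misread entries whose author field itself contains a comma.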

Paper With Code

Text to Face👨🏻🧒👧🏼🧓🏽
- (arXiv preprint 2024) [💬 3D] Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior, Yiqian Wu et al. [Paper]
- (CVPR 2024) CosmicMan: A Text-to-Image Foundation Model for Humans, Shikai Li et al. [Paper] [Project]
- (arXiv preprint 2024) Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization, Jinlu Zhang et al. [Paper] [Code]
- (IJACSA 2023) Mukh-Oboyob: Stable Diffusion and BanglaBERT enhanced Bangla Text-to-Face Synthesis, Aloke Kumar Saha et al. [Paper] [Code]
- (SIGGRAPH 2023) [💬 3D] DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance, Longwen Zhang et al. [Paper] [Project] [HuggingFace]
- (CVPR 2023) [💬 3D] High-Fidelity 3D Face Generation from Natural Language Descriptions, Menghua Wu et al. [Paper] [Code] [Project]
- (CVPR 2023) Collaborative Diffusion for Multi-Modal Face Generation and Editing, Ziqi Huang et al. [Paper] [Code] [Project]
- (Pattern Recognition 2023) Where you edit is what you get: Text-guided image editing with region-based attention, Changming Xiao et al. [Paper] [Code]
- (arXiv preprint 2022) Bridging CLIP and StyleGAN through Latent Alignment for Image Editing, Wanfeng Zheng et al. [Paper]
- (ACMMM 2022) Learning Dynamic Prior Knowledge for Text-to-Face Pixel Synthesis, Jun Peng et al. [Paper]
- (ACMMM 2022) Towards Open-Ended Text-to-Face Generation, Combination and Manipulation, Jun Peng et al. [Paper]
- (BMVC 2022) clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP, Justin N. M. Pinkney et al. [Paper] [Code]
- (arXiv preprint 2022) ManiCLIP: Multi-Attribute Face Manipulation from Text, Hao Wang et al. [Paper]
- (arXiv preprint 2022) Generated Faces in the Wild: Quantitative Comparison of Stable Diffusion, Midjourney and DALL-E 2, Ali Borji. [Paper] [Code] [Data]
- (arXiv preprint 2022) Text-Free Learning of a Natural Language Interface for Pretrained Face Generators, Xiaodan Du et al. [Paper] [Code]
- (Knowledge-Based Systems 2022) CMAFGAN: A Cross-Modal Attention Fusion based Generative Adversarial Network for attribute word-to-face synthesis, Xiaodong Luo et al. [Paper]
- (Neural Networks 2022) DualG-GAN, a Dual-channel Generator based Generative Adversarial Network for text-to-face synthesis, Xiaodong Luo et al. [Paper]
- (arXiv preprint 2022) Text-to-Face Generation with StyleGAN2, D. M. A. Ayanthi et al. [Paper]
- (CVPR 2022) StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis, Zhiheng Li et al. [Paper] [Code]
- (arXiv preprint 2022) StyleT2F: Generating Human Faces from Textual Description Using StyleGAN2, Mohamed Shawky Sabae et al. [Paper] [Code]
- (CVPR 2022) AnyFace: Free-style Text-to-Face Synthesis and Manipulation, Jianxin Sun et al. [Paper]
- (IEEE Transactions on Network Science and Engineering 2022) TextFace: Text-to-Style Mapping based Face Generation and Manipulation, Xianxu Hou et al. [Paper]
- (CVPR 2021) TediGAN: Text-Guided Diverse Image Generation and Manipulation, Weihao Xia et al. [Paper] [Extended Version] [Code] [Dataset] [Colab] [Video]
- (FG 2021) Generative Adversarial Network for Text-to-Face Synthesis and Manipulation with Pretrained BERT Model, Yutong Zhou et al. [Paper]
- (ACMMM 2021) Multi-caption Text-to-Face Synthesis: Dataset and Algorithm, Jianxin Sun et al. [Paper] [Code]
- (ACMMM 2021) Generative Adversarial Network for Text-to-Face Synthesis and Manipulation, Yutong Zhou. [Paper]
- (WACV 2021) Faces a la Carte: Text-to-Face Generation via Attribute Disentanglement, Tianren Wang et al. [Paper]
- (arXiv preprint 2019) FTGAN: A Fully-trained Generative Adversarial Networks for Text to Face Generation, Xiang Chen et al. [Paper]
