Text to Image using Deep Learning: A Survey

Raghad Ahmed Gad; Salma Saad Abdelshakour; Ahmed Abdelhafeez

doi:10.61356/j.scin.2024.1518

Published: Dec 30, 2024

Keywords:

Text-to-Image Synthesis, Attention Mechanisms, Conditional GAN, Semantic Disentanglement, Computational Complexity

Raghad Ahmed Gad

Faculty of Information Systems and Computer Science, October 6th University 12585, Egypt

https://orcid.org/0009-0004-1034-180X

Salma Saad Abdelshakour

Faculty of Information Systems and Computer Science, October 6th University 12585, Egypt

https://orcid.org/0009-0006-2119-6315

Ahmed Abdelhafeez

Computer Science Department, Faculty of Information System and Computer Science, October 6 University, Giza, 12585, Egypt

https://orcid.org/0000-0001-6983-5645

Abstract

Text-to-image synthesis combines natural language processing and computer vision to generate images from textual descriptions. This survey reviews notable advances in this rapidly evolving field. Attention mechanisms, introduced in models such as AttnGAN, have been shown to improve fine-grained text-visual correspondence and thereby yield higher-quality outputs. Comprehensive reviews of text-to-image generation networks have laid the groundwork for identifying and examining the range of architectures and applications. Conditional GANs establish how an image can be generated to match a given piece of text, while work on reproducible human evaluation frameworks has established benchmarks for the qualitative assessment of model performance. Semantic disentanglement methods address the need for controlled generation, improving interpretability and diversity. Bringing these developments together, this survey discusses the challenges currently facing researchers, such as computational complexity and consistency of evaluation, and then outlines directions in which text-to-image generation may develop to broaden its academic and applied uses.

How to Cite

Gad, R. A., Abdelshakour, S. S., & Abdelhafeez, A. (2024). Text to Image using Deep Learning: A Survey. SciNexuses, 1, 184-202. https://doi.org/10.61356/j.scin.2024.1518

Issue

Vol. 1 (2024): SciNexuses

Section

Review Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.