DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis

Tao, Ming; Tang, Hao; Wu, Fei; Jing, Xiao‐Yuan; Bao, Bing‐Kun; Xu, Changsheng

doi:10.1109/cvpr52688.2022.01602

Article · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Jun 1, 2022 · Closed access

Nanjing University of Posts and Telecommunications · Wuhan University · +3 more institutions

Indexed in Crossref

Abstract

Synthesizing high-quality, realistic images from text descriptions is a challenging task. Existing text-to-image Generative Adversarial Networks generally employ a stacked architecture as the backbone, yet three flaws remain. First, the stacked architecture introduces entanglements between generators of different image scales. Second, existing studies prefer to apply and fix extra networks in adversarial learning for text-image semantic consistency, which limits the supervision capability of these networks. Third, the cross-modal attention-based text-image fusion that is widely adopted by previous works is limited to a few specific image scales because of its computational cost. To address these flaws, we propose…

Citation impact

Total citations: 299

FWCI: 16.93
Percentile: 100%
References: 81

Authors: 6

Topics & keywords

Keywords

Computer science
Discriminator
Consistency (knowledge bases)
Image (mathematics)
Code (set theory)
Block (permutation group theory)
Artificial intelligence
Matching (statistics)

UN Sustainable Development Goals

Reduced inequalities

Funding

Natural Science Foundation of Jiangsu Province
Awards: BK20200037, BK20210595