Scaling up GANs for Text-to-Image Synthesis
Pohang University of Science and Technology · Adobe Systems (United States) · +2 more institutions
Abstract
The recent success of text-to-image synthesis has taken the world by storm and captured the general public's imagination. From a technical standpoint, it also marked a drastic change in the favored architecture for designing generative image models. GANs used to be the de facto choice, with techniques like StyleGAN. With DALL·E 2, autoregressive and diffusion models became the new standard for large-scale generative models overnight. This rapid shift raises a fundamental question: can we scale up GANs to benefit from large datasets like LAION? We find that naïvely increasing the capacity of the StyleGAN architecture quickly becomes unstable. We introduce GigaGAN, a new GAN architecture that far exceeds this limit,…
Citation impact
- FWCI: 41.36
- Percentile: 100%
- References: 141
Authors (7)
- Minguk Kang (corresponding author): Pohang University of Science and Technology, Adobe Systems (United States), Korea Post
- Jun-Yan Zhu: Carnegie Mellon University
- Richard Zhang: Adobe Systems (United States)
- Jaesik Park: Pohang University of Science and Technology, Korea Post
- Eli Shechtman: Adobe Systems (United States)
Topics & keywords
- Computer science
- Image (mathematics)
- Autoregressive model
- Interpolation (computer graphics)
- Inference
- Generative model
- Architecture
- Generative grammar
- Sustainable cities and communities