Toward Multimodal Image-to-Image Translation
University of California, Berkeley · Adobe Systems (United States)
Abstract
Many image-to-image translation problems are ambiguous, as a single input image may correspond to multiple possible outputs. In this work, we aim to model a distribution of possible outputs in a conditional generative modeling setting. The ambiguity of the mapping is distilled in a low-dimensional latent vector, which can be randomly sampled at test time. A generator learns to map the given input, combined with this latent code, to the output. We explicitly encourage the connection between output and the latent code to be invertible. This helps prevent a many-to-one mapping from the latent code to the output during training, also known as the problem of mode collapse, and produces more diverse results.…
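The invertibility idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `generator`, `encoder`, `latent_recovery_error`, and the `use_latent` flag are hypothetical names, the "image" is a short list of floats, and latent recovery error is only one assumed way to measure whether the output-to-code connection is invertible. The sketch shows that when the generator uses the latent code, the code can be recovered from the output, whereas a generator that ignores the code (a many-to-one mapping) leaves nothing to recover.

```python
import random

LATENT_DIM = 2  # assumed size of the low-dimensional latent vector


def generator(x, z, use_latent=True):
    """Toy generator: copy the input and shift its first entries by z.

    With use_latent=False the latent code is ignored, so every z yields
    the same output (a stand-in for mode collapse).
    """
    scale = 1.0 if use_latent else 0.0
    y = list(x)
    for i, zi in enumerate(z):
        y[i] += scale * zi
    return y


def encoder(x, y):
    """Toy encoder: read the latent code back from the output.

    Assumption: this toy encoder also sees the input x.
    """
    return [y[i] - x[i] for i in range(LATENT_DIM)]


def latent_recovery_error(x, z, use_latent=True):
    """Mean absolute error between the sampled code and its reconstruction."""
    y = generator(x, z, use_latent)
    z_hat = encoder(x, y)
    return sum(abs(a - b) for a, b in zip(z, z_hat)) / len(z)


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]     # stand-in for an input image
z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]  # randomly sampled latent code

invertible_error = latent_recovery_error(x, z, use_latent=True)
collapsed_error = latent_recovery_error(x, z, use_latent=False)
print("code recoverable:", invertible_error < 1e-9)
print("collapsed generator loses the code:", collapsed_error > invertible_error)
```

In a trained model a quantity of this kind would be used as a loss term rather than a diagnostic; here it only makes the distinction between an invertible and a collapsed mapping concrete.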
Citation impact
- FWCI: —
- Percentile: —
- References: 57
Authors
- 7
Topics & keywords
- Computer science
- Image (mathematics)
- Code (set theory)
- Artificial intelligence
- Translation (biology)
- Encoding (memory)
- Generator (circuit theory)
- Consistency (knowledge bases)