Preprint · arXiv (Cornell University) · Nov 30, 2017 · Green OA

Toward Multimodal Image-to-Image Translation

University of California, Berkeley · Adobe Systems (United States)

Indexed in: arXiv, DataCite

Abstract

Many image-to-image translation problems are ambiguous, as a single input image may correspond to multiple possible outputs. In this work, we aim to model a distribution of possible outputs in a conditional generative modeling setting. The ambiguity of the mapping is distilled in a low-dimensional latent vector, which can be randomly sampled at test time. A generator learns to map the given input, combined with this latent code, to the output. We explicitly encourage the connection between output and the latent code to be invertible. This helps prevent a many-to-one mapping from the latent code to the output during training, also known as the problem of mode collapse, and produces more diverse results.…
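
The invertibility idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `generator`, `encoder`, `LATENT_DIM`, and the L1-style recovery loss are hypothetical stand-ins, not the networks or objective used in the paper. The sketch only shows why a generator that ignores the latent code (a many-to-one mapping) makes the code unrecoverable from the output, while a generator that uses it does not.

```python
import random

LATENT_DIM = 2  # hypothetical size of the low-dimensional latent code


def generator(x, z, use_latent=True):
    """Toy stand-in for G(x, z): copies the input and appends a part driven by z.

    With use_latent=False the latent code is ignored, which mimics a
    many-to-one mapping from latent code to output (mode collapse).
    """
    code_part = [2.0 * zi for zi in z] if use_latent else [0.0] * len(z)
    return list(x) + code_part


def encoder(y):
    """Toy stand-in for an encoder that tries to recover z from the output alone."""
    return [yi / 2.0 for yi in y[-LATENT_DIM:]]


def latent_recovery_loss(x, z, use_latent=True):
    """Mean absolute error between the sampled code and the code recovered from G(x, z)."""
    z_hat = encoder(generator(x, z, use_latent))
    return sum(abs(a - b) for a, b in zip(z, z_hat)) / len(z)


rng = random.Random(0)
x = [0.5, -1.0, 2.0]                                  # toy "input image"
z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]  # randomly sampled latent code

loss_invertible = latent_recovery_loss(x, z, use_latent=True)
loss_collapsed = latent_recovery_loss(x, z, use_latent=False)
```

In this toy setting the recovery loss is zero when the output depends on the latent code and strictly positive when the generator ignores it, so penalizing that loss during training would push a model away from the collapsed behavior.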

Citation impact

Total citations: 742
References: 57

Authors

7

Topics & keywords

Keywords
  • Computer science
  • Image (mathematics)
  • Code (set theory)
  • Artificial intelligence
  • Translation (biology)
  • Encoding (memory)
  • Generator (circuit theory)
  • Consistency (knowledge bases)