Preprint | arXiv (Cornell University) | Oct 8, 2019 | Green OA

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

Indexed in: arXiv, DataCite

Abstract

Previous works (Donahue et al., 2018a; Engel et al., 2019a) have found that generating coherent raw audio waveforms with GANs is challenging. In this paper, we show that it is possible to train GANs reliably to generate high-quality coherent waveforms by introducing a set of architectural changes and simple training techniques. A subjective evaluation metric (Mean Opinion Score, or MOS) shows the effectiveness of the proposed approach for high-quality mel-spectrogram inversion. To establish the generality of the proposed techniques, we show qualitative results of our model in speech synthesis, music domain translation, and unconditional music synthesis. We evaluate the various components of the model through…

Citation impact

598 total citations
References
0

Authors

9

Topics & keywords

Keywords
  • Computer science
  • Spectrogram
  • Autoregressive model
  • Generality
  • Generative grammar
  • Algorithm
  • Mean opinion score
  • Waveform
UN Sustainable Development Goals
  • Reduced inequalities