YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS. Our method builds upon the VITS model and adds several novel modifications for zero-shot multi-speaker and multilingual training. We achieved state-of-the-art (SOTA) results in zero-shot multi-speaker TTS and results comparable to SOTA in zero-shot voice conversion on the VCTK dataset.
2021: Edresson Casanova, Julian Weber, C. Shulby, Arnaldo Cândido Júnior, Eren Gölge, M. Ponti
https://arxiv.org/pdf/2112.02418v1.pdf