Wasserstein GAN and Waveform Loss-Based Acoustic Model Training for Multi-Speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder

IEEE Access, 2018, Vol 6, pp. 60478-60488

DOI: 10.1109/access.2018.2872060

Cited By ~ 17

Author(s): Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, ...

Keyword(s): Speech Synthesis, Acoustic Model, Text To Speech, Text To Speech Synthesis, Model Training

MAKEDONKA: Applied Deep Learning Model for Text-to-Speech Synthesis in Macedonian Language

Applied Sciences, 2020, Vol 10 (19), pp. 6882

DOI: 10.3390/app10196882

Author(s): Kostadin Mishev, Aleksandra Karovska Ristovska, Dimitar Trajanov, Tome Eftimov, Monika Simjanoska

Keyword(s): Machine Learning, Deep Learning, Speech Synthesis, Feature Engineering, Learning Approach, Acoustic Model, Text To Speech, Text To Speech Synthesis, Smooth Transitions, Deep Learning Model

This paper presents MAKEDONKA, the first open-source Macedonian language synthesizer based on the Deep Learning approach. The paper provides an overview of the numerous attempts to achieve human-like, reproducible speech, which have unfortunately proven unsuccessful because the work had low visibility and lacked examples of integration with real software tools. Recent advances in Machine Learning, in particular Deep Learning-based methodologies, provide novel methods for feature engineering that allow for smooth transitions in the synthesized speech, making it sound natural and human-like. This paper presents a methodology for end-to-end speech synthesis based on Deep Voice 3, a fully-convolutional sequence-to-sequence acoustic model with a position-augmented attention mechanism. Our model synthesizes Macedonian speech directly from characters. We created a dataset containing approximately 20 h of speech from a native Macedonian female speaker and used it to train the text-to-speech (TTS) model. The achieved mean opinion score (MOS) of 3.93 makes our model appropriate for any kind of software that needs a text-to-speech service in the Macedonian language. Our TTS platform is publicly available for use and ready for integration.

Acoustic model-based subword tokenization and prosodic-context extraction without language knowledge for text-to-speech synthesis

Speech Communication, 2020, Vol 125, pp. 53-60

DOI: 10.1016/j.specom.2020.09.003

Author(s): Masashi Aso, Shinnosuke Takamichi, Norihiro Takamune, Hiroshi Saruwatari

Keyword(s): Speech Synthesis, Acoustic Model, Text To Speech, Model Based, Language Knowledge, Text To Speech Synthesis, Context Extraction

Integrating Articulatory Information in Deep Learning-Based Text-to-Speech Synthesis

2017

DOI: 10.21437/interspeech.2017-1762

Cited By ~ 1

Author(s): Beiming Cao, Myungjong Kim, Jan van Santen, Ted Mau, Jun Wang

Keyword(s): Deep Learning, Speech Synthesis, Text To Speech, Text To Speech Synthesis

Subset Selection, Adaptation, Gemination and Prosody Prediction for Amharic Text-to-Speech Synthesis

2019

DOI: 10.21437/ssw.2019-37

Author(s): Elshadai Tesfaye Biru, Yishak Tofik Mohammed, David Tofu, Erica Cooper, Julia Hirschberg

Keyword(s): Speech Synthesis, Subset Selection, Text To Speech, Text To Speech Synthesis, Prosody Prediction

“I Can’t Talk Now”: Speaking with Voice Output Communication Aid Using Text-to-Speech Synthesis During Multiparty Video Conference

Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, 2021

DOI: 10.1145/3411763.3451745

Author(s): Wooseok Kim, Sangsu Lee

Keyword(s): Speech Synthesis, Video Conference, Text To Speech, Voice Output Communication Aid, Communication Aid, Text To Speech Synthesis, Voice Output

Comparative Study on Neural Vocoders for Multispeaker Text-To-Speech Synthesis

2020 IEEE Recent Advances in Intelligent Computational Systems (RAICS), 2020

DOI: 10.1109/raics51191.2020.9332514

Author(s): Rajeev Rajan, Ashish Roopan, Sachin Prakash, Elisa Jose, Sati P.

Keyword(s): Comparative Study, Speech Synthesis, Text To Speech, Text To Speech Synthesis

Comparison of Urdu text to speech synthesis using unit selection and HMM based techniques

2016 Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA), 2016

DOI: 10.1109/icsda.2016.7918988

Cited By ~ 1

Author(s): Farah Adeeba, Tania Habib, Sarmad Hussain, Ehsan-ul-haq, Kh. Shahzada Shahid

Keyword(s): Speech Synthesis, Text To Speech, Unit Selection, Text To Speech Synthesis

Comparative study of text-to-speech synthesis techniques for mobile linguistic translation process

2014 IEEE International Conference on Control System, Computing and Engineering (ICCSCE 2014), 2014

DOI: 10.1109/iccsce.2014.7072761

Author(s): Phanchita Chomwihoke, Manop Phankokkruad

Keyword(s): Comparative Study, Speech Synthesis, Text To Speech, Translation Process, Synthesis Techniques, Text To Speech Synthesis

The future role of text to speech synthesis in automated services

1997

DOI: 10.1049/ic:19970799

Author(s): A.P. Breen

Keyword(s): Speech Synthesis, Text To Speech, Future Role, The Future, Text To Speech Synthesis

An advanced NLP framework for high-quality Text-to-Speech synthesis

2011 6th Conference on Speech Technology and Human-Computer Dialogue (SpeD), 2011

DOI: 10.1109/sped.2011.5940733

Cited By ~ 6

Author(s): Catalin Ungurean, Dragos Burileanu

Keyword(s): Speech Synthesis, Text To Speech, High Quality, Text To Speech Synthesis
