Statistical Parametric Speech Synthesis: Latest Research Papers

Statistical parametric speech synthesis based on Hidden Markov Models (HMMs) has been an important technique for producing artificial voices because it delivers high intelligibility and supports sophisticated features such as voice conversion and accent modification with a small footprint, particularly for low-resource languages where deep learning-based techniques remain unexplored. Despite this progress, the quality of HMM-based results does not reach that of the predominant approaches, which are based on unit selection of speech segments or on deep learning. One proposal for improving the quality of HMM-based speech has been to incorporate postfiltering stages, which aim to increase quality while preserving the advantages of the process. In this paper, we present a new approach to postfiltering synthesized voices that applies discriminative postfilters built from several long short-term memory (LSTM) deep neural networks. Our motivation is to model a specific mapping from synthesized to natural speech for the segments corresponding to voiced and to unvoiced sounds, given the different qualities of those sounds and the distinct degradation that HMM-based voices can present on each. The paper analyses the discriminative postfilters obtained using five voices, evaluated using three objective measures, Mel cepstral distance and subjective tests. The results indicate the advantages of the discriminative postfilters in comparison with the HTS voice and the non-discriminative postfilters.

The Theory behind Controllable Expressive Speech Synthesis: A Cross-Disciplinary Approach

Human 4.0 - From Biology to Cybernetic ◽

10.5772/intechopen.89849 ◽

2021 ◽

Cited By ~ 1

Author(s):

Noé Tits ◽

Kevin El Haddad ◽

Thierry Dutoit

Keyword(s):

Recurrent Neural Networks ◽

Speech Synthesis ◽

Text To Speech ◽

Expressive Speech ◽

Rich Domain ◽

Interaction Field ◽

Audio Features ◽

History Of ◽

Text To Speech Synthesis ◽

Statistical Parametric Speech Synthesis

As part of the Human-Computer Interaction field, expressive speech synthesis is a very rich domain, as it requires knowledge in areas such as machine learning, signal processing, sociology, and psychology. In this chapter, we focus mostly on the technical side. From the recording of expressive speech to its modeling, the reader will get an overview of the main paradigms used in this field through some of the most prominent systems and methods. We explain how speech can be represented and encoded with audio features. We present a history of the main methods of Text-to-Speech synthesis: concatenative, parametric, and statistical parametric speech synthesis. Finally, we focus on the last of these, with the latest techniques modeling Text-to-Speech synthesis as a sequence-to-sequence problem. This enables the use of Deep Learning blocks such as Convolutional and Recurrent Neural Networks, as well as attention mechanisms. The last part of the chapter assembles the different aspects of the theory and summarizes the concepts.

A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems

2020 12th International Conference on Knowledge and Systems Engineering (KSE) ◽

10.1109/kse50997.2020.9287553 ◽

2020 ◽

Author(s):

Huy Kinh Phan ◽

Viet Lam Phung ◽

Anh Tuan Dinh ◽

Quoc Bao Nguyen

Keyword(s):

Speech Synthesis ◽

Statistical Parametric Speech Synthesis ◽

Parametric Speech Synthesis

Duration modelling and evaluation for Arabic statistical parametric speech synthesis

Multimedia Tools and Applications ◽

10.1007/s11042-020-09901-7 ◽

2020 ◽

Author(s):

Imene Zangar ◽

Zied Mnasri ◽

Vincent Colotte ◽

Denis Jouvet

Keyword(s):

Speech Synthesis ◽

Statistical Parametric Speech Synthesis ◽

Parametric Speech Synthesis

Continuous Noise Masking Based Vocoder for Statistical Parametric Speech Synthesis

IEICE Transactions on Information and Systems ◽

10.1587/transinf.2019edp7167 ◽

2020 ◽

Vol E103.D (5) ◽

pp. 1099-1107

Author(s):

Mohammed Salah AL-RADHI ◽

Tamás Gábor CSAPÓ ◽

Géza NÉMETH

Keyword(s):

Speech Synthesis ◽

Noise Masking ◽

Statistical Parametric Speech Synthesis ◽

Continuous Noise ◽

Parametric Speech Synthesis

WaveFFJORD: FFJORD-Based Vocoder for Statistical Parametric Speech Synthesis

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9053202 ◽

2020 ◽

Author(s):

Ning-Qian Wu ◽

Zhen-Hua Ling

Keyword(s):

Speech Synthesis ◽

Statistical Parametric Speech Synthesis ◽

Parametric Speech Synthesis

A continuous vocoder for statistical parametric speech synthesis and its evaluation using an audio-visual phonetically annotated Arabic corpus

Computer Speech & Language ◽

10.1016/j.csl.2019.101025 ◽

2020 ◽

Vol 60 ◽

pp. 101025 ◽

Cited By ~ 1

Author(s):

Mohammed Salah Al-Radhi ◽

Omnia Abdo ◽

Tamás Gábor Csapó ◽

Sherif Abdou ◽

Géza Németh ◽

...

Keyword(s):

Speech Synthesis ◽

Statistical Parametric Speech Synthesis ◽

Parametric Speech Synthesis

Excitation modelling using epoch features for statistical parametric speech synthesis

Computer Speech & Language ◽

10.1016/j.csl.2019.101029 ◽

2020 ◽

Vol 60 ◽

pp. 101029 ◽

Cited By ~ 2

Author(s):

M Kiran Reddy ◽

K Sreenivasa Rao

Keyword(s):

Speech Synthesis ◽

Statistical Parametric Speech Synthesis ◽

Parametric Speech Synthesis

Measuring the Effect of Reverberation on Statistical Parametric Speech Synthesis

Communications in Computer and Information Science - High Performance Computing ◽

10.1007/978-3-030-41005-6_25 ◽

2020 ◽

pp. 369-382

Author(s):

Marvin Coto-Jiménez

Keyword(s):

Speech Synthesis ◽

Statistical Parametric Speech Synthesis ◽

Parametric Speech Synthesis

statistical parametric speech synthesis
Recently Published Documents

Learning Deep and Wide Contextual Representations Using BERT for Statistical Parametric Speech Synthesis

Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis

The Theory behind Controllable Expressive Speech Synthesis: A Cross-Disciplinary Approach

A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems

Duration modelling and evaluation for Arabic statistical parametric speech synthesis

Continuous Noise Masking Based Vocoder for Statistical Parametric Speech Synthesis

WaveFFJORD: FFJORD-Based Vocoder for Statistical Parametric Speech Synthesis

A continuous vocoder for statistical parametric speech synthesis and its evaluation using an audio-visual phonetically annotated Arabic corpus

Excitation modelling using epoch features for statistical parametric speech synthesis

Measuring the Effect of Reverberation on Statistical Parametric Speech Synthesis
