Synthesized speech quality of Indonesian natural text-to-speech by using HTS and CLUSTERGEN

Author(s):  
Elok Cahyaningtyas ◽  
Dhany Arifianto
2011 ◽  
Vol 97 (5) ◽  
pp. 852-868 ◽  
Author(s):  
Peter Počta ◽  
Jan Holub

This paper investigates the impact of independent and dependent losses and coding on speech quality predictions provided by the PESQ (also known as ITU-T P.862) and P.563 models, when both naturally-produced and synthesized speech are used. Two synthesized speech samples generated with two different Text-to-Speech systems and one naturally-produced sample are investigated. In addition, we assess the variability of PESQ's and P.563's predictions with respect to the type of speech used (naturally-produced or synthesized) and loss conditions, as well as their accuracy, by comparing the predictions with subjective assessments. The results show that there is no difference between the impact of packet loss on naturally-produced speech and on synthesized speech. On the other hand, the impact of coding is different for the two types of stimuli. In addition, synthesized speech seems to be insensitive to degradations introduced by most of the codecs investigated here. The reasons for these findings are discussed. Finally, it is concluded that both models are capable of predicting the quality of transmitted synthesized speech under the investigated conditions to a certain degree. As expected, PESQ achieves the best performance over almost all of the investigated conditions.


2010 ◽  
Vol 44-47 ◽  
pp. 3672-3676
Author(s):  
Jian Lei Li ◽  
Zhen Ma ◽  
Ming Zhao Wu

Building on the all-pole model, this paper proposes a variable-order all-pole model that accounts for the instability of vocal tract complexity, and applies it to multi-pulse linear prediction speech coding. The method is simulated in Matlab and the quality of the synthesized speech is evaluated; the variable-order model is found to maintain better speech quality while lowering the coding rate.


2012 ◽  
Vol 3 (2) ◽  
pp. 218-222 ◽  
Author(s):  
Abdelkader Chabchoub ◽  
Salah Alahmadi ◽  
Adnan Cherif ◽  
Wahid Barkouti

This work describes a new Arabic text-to-speech (TTS) synthesis system. The system is based on di-diphone concatenation with a TD-PSOLA modifier synthesizer. The quality of the synthesized speech is improved by analyzing in detail the spectral features of the voice source across various F0 ranges and timbres, and by a new unit concatenation. The system generates synthesized speech based on formant analysis and estimation, classifying the voice source into different types. The developed model enhances the naturalness and intelligibility of the synthesized speech in various speaking environments.


1973 ◽  
Vol 53 (1) ◽  
pp. 322-322
Author(s):  
Herman R. Silbiger ◽  
Richard E. Cullingford ◽  
Linda Pierce

2013 ◽  
Author(s):  
Florian Hinterleitner ◽  
Christoph R. Norrenbrock ◽  
Sebastian Möller ◽  
Ulrich Heute

2002 ◽  
Vol 45 (4) ◽  
pp. 689-699 ◽  
Author(s):  
Donald G. Jamieson ◽  
Vijay Parsa ◽  
Moneca C. Price ◽  
James Till

We investigated how standard speech coders, currently used in modern communication systems, affect the quality of the speech of persons who have common speech and voice disorders. Three standardized speech coders (GSM 6.10 RPELTP, FS1016 CELP, and FS1015 LPC) and two speech coders based on subband processing were evaluated for their performance. Coder effects were assessed by measuring the quality of speech samples both before and after processing by the speech coders. Speech quality was rated by 10 listeners with normal hearing on 28 different scales representing pitch and loudness changes, speech rate, laryngeal and resonatory dysfunction, and coder-induced distortions. Results showed that (a) nine scale items were consistently and reliably rated by the listeners; (b) all coders degraded speech quality on these nine scales, with the GSM and CELP coders providing the better quality speech; and (c) interactions between coders and individual voices did occur on several voice quality scales.


1989 ◽  
Vol 25 (19) ◽  
pp. 1275 ◽  
Author(s):  
J.I. Lee ◽  
C.K. Un