Speech synthesis for text-to-speech alignment and prosodic feature extraction

Author(s):  
F. Malfrere ◽  
T. Dutoit
Author(s):  
Jagadish S Kallimani ◽  
V. K Ananthashayana ◽  
Debjani Goswami

Text-to-speech synthesis combines language processing, signal processing and computer science. Ubiquitous computing (ubicomp) is a post-desktop model of human-computer interaction in which information processing is thoroughly integrated into everyday objects and activities. Speech synthesis is the generation of synthesized speech from text. This chapter deals with the development of a Text-to-Speech (TTS) synthesis system for an Indian regional language, taking Bengali as the example. It highlights various methods that may be used for speech synthesis and gives an overview of the problems and difficulties in Bengali text-to-speech conversion. Variations in the prosody of the speech (parameters such as volume, pitch, intonation and amplitude) yield emotional aspects (anger, happiness, normal), which are applied to the developed TTS system.


2018 ◽  
Vol 18 (02) ◽  
pp. 1850010 ◽  
Author(s):  
T. Shreekanth ◽  
M. R. Deeksha ◽  
Karthikeya R. Kaushik

A communication gap exists between the sighted community and visually challenged people because they read and write in different scripts. Bridging this gap requires a system that automatically converts Braille script to text and speech in the corresponding language. An Optical Braille Recognition (OBR) system converts hand-punched Braille characters into their equivalent natural language characters. A Text-to-Speech (TTS) system converts the recognized characters into audible speech using speech synthesis techniques. Existing literature shows that OBR and TTS systems have been well established independently for English, which leaves scope for developing OBR and TTS systems for regional languages. Although Kannada is one of the most widely spoken regional languages in India, minimal work has been done on Kannada OBR and TTS. No existing system directly converts Braille script to speech, so this Kannada Braille to text and speech system is one of a kind. The acquired image is processed, and feature extraction is performed using the k-means algorithm and heuristics to convert the Braille characters to Kannada script. A concatenation-based speech synthesis technique, with the phoneme as the basic unit, converts the Kannada text to speech using the Festival TTS framework. The proposed system is evaluated on an independently developed Kannada Braille database, and the results are found to be satisfactory when compared to existing methods in the literature.
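The abstract does not say how the phoneme units are joined, so the following is only a rough illustration of what concatenation-based synthesis means, not the authors' or Festival's implementation. It joins pre-recorded units and blends a few samples at each boundary with a linear crossfade. The function name `crossfade_concat`, the `inventory` dictionary and the constant-valued "waveforms" are all hypothetical stand-ins.

```python
def crossfade_concat(units, overlap):
    """Join waveform units, blending up to `overlap` samples at each boundary."""
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            # Linear fade: the outgoing unit's weight falls as the incoming one rises.
            w = (i + 1) / (n + 1)
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        # Append the part of the incoming unit that was not blended.
        out.extend(unit[n:])
    return out


# Hypothetical phoneme inventory: constant-valued stand-ins for recorded waveforms.
inventory = {"k": [1.0] * 4, "a": [0.0] * 4}
speech = crossfade_concat([inventory[p] for p in ["k", "a"]], overlap=2)
```

With two four-sample units and an overlap of two, the result has six samples: the boundary samples are a weighted mix of both units, which softens the click that a hard join would produce.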


2020 ◽  
pp. 1-12
Author(s):  
Li Dongmei

English text-to-speech conversion is a key topic in modern computer technology research. Its difficulty is that feature recognition during conversion produces large errors, which makes English text-to-speech conversion algorithms hard to apply in a working system. To improve the efficiency of English text-to-speech conversion, this article takes a machine learning approach: after labeling the original voice waveform with pitch, it modifies the rhythm through PSOLA and uses the C4.5 algorithm to train a decision tree for judging the pronunciation of polyphones. To evaluate the pronunciation discrimination method based on part-of-speech rules and the HMM-based prosody hierarchy prediction in speech synthesis systems, the study constructs a system model. In addition, waveform stitching and PSOLA are used to synthesize the sound. For words whose main stress cannot be determined from morphological structure, labels can be learned with machine learning methods. Finally, the study evaluates and analyzes the performance of the algorithm through controlled experiments. The results show that the proposed algorithm performs well and has some practical value.
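C4.5 chooses each split in its decision tree by gain ratio: information gain divided by the split information of the candidate feature. The sketch below computes that score for categorical features. The toy data (the heteronym "record", the `pos` and `prev` features, and the stress labels) is hypothetical and only illustrates how a part-of-speech feature might be preferred; it is not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """Gain ratio of splitting on one categorical feature."""
    total = len(labels)
    # Group the labels by the value the feature takes in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes features with many distinct values.
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: pronunciation of the English heteronym "record".
rows = [
    {"pos": "noun", "prev": "the"},
    {"pos": "verb", "prev": "to"},
    {"pos": "noun", "prev": "a"},
    {"pos": "verb", "prev": "will"},
]
labels = ["REC-ord", "re-CORD", "REC-ord", "re-CORD"]

# Pick the feature with the highest gain ratio as the root split.
best = max(["pos", "prev"], key=lambda f: gain_ratio(rows, labels, f))
```

Both features separate the four examples perfectly, but `prev` does so by giving every row its own branch. Dividing by split information is what lets the score favor the more general `pos` feature here.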


Author(s):  
Beiming Cao ◽  
Myungjong Kim ◽  
Jan van Santen ◽  
Ted Mau ◽  
Jun Wang

2019 ◽  
Author(s):  
Elshadai Tesfaye Biru ◽  
Yishak Tofik Mohammed ◽  
David Tofu ◽  
Erica Cooper ◽  
Julia Hirschberg
