Speech synthesis for text-to-speech alignment and prosodic feature extraction

The Feature Extraction Algorithm for the Production of Emotions in Text-to-Speech (TTS) System for an Indian Regional Language

Pervasive Computing for Business ◽

10.4018/978-1-60566-996-0.ch003 ◽

2010 ◽

pp. 17-30

Author(s):

Jagadish S Kallimani ◽

V. K Ananthashayana ◽

Debjani Goswami

Keyword(s):

Feature Extraction ◽

Language Processing ◽

Speech Synthesis ◽

Text To Speech ◽

Everyday Objects ◽

Regional Language ◽

Extraction Algorithm ◽

Processing Signal ◽

Emotional Aspects ◽

Text To Speech Synthesis

Text-to-speech synthesis combines language processing, signal processing, and computer science. Ubiquitous computing (ubicomp) is a post-desktop model of human-computer interaction in which information processing is thoroughly integrated into everyday objects and activities. Speech synthesis is the generation of synthesized speech from text. This chapter describes the development of a Text-to-Speech (TTS) synthesis system for an Indian regional language, taking Bengali as the example. It surveys methods that may be used for speech synthesis and gives an overview of the problems and difficulties in Bengali text-to-speech conversion. Variations in prosody (speech parameters such as volume, pitch, intonation, and amplitude) yield the emotional aspects (anger, happy, normal) that are applied to our TTS system.
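The idea that emotion arises from varying prosodic parameters can be illustrated with a minimal sketch. It is not the chapter's algorithm: the `Prosody` type, the `apply_emotion` function, and every scale factor in `EMOTION_PRESETS` are hypothetical values chosen only to show the mechanism of scaling pitch and volume per emotion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prosody:
    pitch_hz: float  # fundamental frequency of one speech frame
    volume: float    # linear amplitude gain, 1.0 means unchanged


# Hypothetical (pitch scale, volume scale) pairs, for illustration only.
# The abstract names the emotions but does not give any numeric values.
EMOTION_PRESETS = {
    "normal": (1.0, 1.0),
    "anger": (1.25, 1.5),
    "happy": (1.5, 1.25),
}


def apply_emotion(frames, emotion):
    """Return new frames with pitch and volume scaled for the given emotion."""
    pitch_scale, volume_scale = EMOTION_PRESETS[emotion]
    return [
        Prosody(f.pitch_hz * pitch_scale, f.volume * volume_scale)
        for f in frames
    ]


neutral = [Prosody(pitch_hz=100.0, volume=1.0), Prosody(pitch_hz=120.0, volume=0.8)]
angry = apply_emotion(neutral, "anger")
```

A real system would also have to adjust intonation contours and timing; the sketch only scales two per-frame values.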

A Novel Data Independent Approach for Conversion of Hand Punched Kannada Braille Script to Text and Speech

International Journal of Image and Graphics ◽

10.1142/s0219467818500109 ◽

2018 ◽

Vol 18 (02) ◽

pp. 1850010 ◽

Cited By ~ 2

Author(s):

T. Shreekanth ◽

M. R. Deeksha ◽

Karthikeya R. Kaushik

Keyword(s):

Feature Extraction ◽

Performance Evaluation ◽

Natural Language ◽

Speech Synthesis ◽

Basic Unit ◽

Text To Speech ◽

Synthesis Technique ◽

Visually Challenged ◽

Minimal Work ◽

Regional Languages

In society, there is a communication gap between the sighted community and visually challenged people because they read and write in different scripts. Bridging this gap requires a system that automatically converts Braille script to text and speech in the corresponding language. An Optical Braille Recognition (OBR) system converts hand-punched Braille characters into their equivalent natural-language characters. A Text-to-Speech (TTS) system converts the recognized characters into audible speech using speech synthesis techniques. The existing literature shows that OBR and TTS systems are well established independently for English, but there is scope for developing them for regional languages. Although Kannada is one of the most widely spoken regional languages in India, minimal work has been done on Kannada OBR and TTS. No existing system converts Braille script directly to speech, so this Kannada Braille-to-text-and-speech system is one of a kind. The acquired image is processed, and feature extraction is performed using the k-means algorithm and heuristics to convert the Braille characters to Kannada script. A concatenation-based speech synthesis technique with the phoneme as the basic unit converts the Kannada text to speech using the Festival TTS framework. The proposed system is evaluated on an independently developed Kannada Braille database, and the results are found to be satisfactory compared with existing methods in the literature.
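The abstract does not say how the clustering is applied to the Braille image. As one plausible illustration, assumed here rather than taken from the paper, the sketch below runs plain one-dimensional k-means (Lloyd's algorithm) on the x-coordinates of detected dots to separate the two columns of a Braille cell. The function names and the sample coordinates are hypothetical.

```python
def assign(values, centroids):
    """Label each value with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
        for v in values
    ]


def kmeans_1d(values, k, iterations=20):
    """Lloyd's k-means on scalar values; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centroids evenly over the range.
    centroids = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    for _ in range(iterations):
        labels = assign(values, centroids)
        updated = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break  # converged: the centroids no longer move
        centroids = updated
    return centroids, assign(values, centroids)


# Hypothetical x-coordinates (in pixels) of six dots from one Braille cell.
dot_x = [10.0, 11.0, 9.0, 30.0, 31.0, 29.0]
centroids, labels = kmeans_1d(dot_x, k=2)
```

With the sample values, the first three dots fall into one column and the last three into the other. Mapping the resulting dot pattern to a Kannada character would need the heuristics the abstract mentions, which are not reproduced here.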

Design of English text-to-speech conversion algorithm based on machine learning

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189238 ◽

2020 ◽

pp. 1-12

Author(s):

Li Dongmei

Keyword(s):

Machine Learning ◽

Speech Synthesis ◽

Feature Recognition ◽

Learning Algorithm ◽

Morphological Structure ◽

English Text ◽

Text To Speech ◽

Part Of Speech ◽

Modern Computer ◽

Conversion Algorithm

English text-to-speech conversion is a key topic in modern computer technology research. The difficulty is that text-to-speech feature recognition introduces large errors during conversion, which makes it hard to apply an English text-to-speech conversion algorithm in a working system. To improve the efficiency of English text-to-speech conversion, this article takes a machine learning approach: after the original voice waveform is labeled with pitch, the rhythm is modified through PSOLA, and the C4.5 algorithm is used to train a decision tree that judges the pronunciation of polyphones. To evaluate the pronunciation discrimination method based on part-of-speech rules and the HMM-based prosody hierarchy prediction in speech synthesis systems, the study constructs a system model. In addition, the waveform stitching method and PSOLA are used to synthesize the sound. For words whose main stress cannot be determined from morphological structure, labels can be learned by machine learning methods. Finally, the study evaluates and analyzes the algorithm's performance through controlled experiments. The results show that the proposed algorithm performs well and has some practical value.
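C4.5 chooses each decision-tree split by gain ratio, that is, information gain divided by split information. The sketch below computes that criterion on a toy polyphone data set. The rows, the feature names (`pos`, `capitalized`), and the pronunciation labels are invented for illustration and do not come from the paper; only the split criterion itself is standard C4.5.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, feature, target):
    """C4.5 split criterion: information gain divided by split information."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    # Weighted entropy of the target after splitting on the feature.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information is the entropy of the feature's own value distribution.
    split_info = entropy([row[feature] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical training rows for the word "record".
rows = [
    {"pos": "noun", "capitalized": "no", "pron": "REC-ord"},
    {"pos": "noun", "capitalized": "yes", "pron": "REC-ord"},
    {"pos": "verb", "capitalized": "no", "pron": "re-CORD"},
    {"pos": "verb", "capitalized": "yes", "pron": "re-CORD"},
]
best_feature = max(["pos", "capitalized"], key=lambda f: gain_ratio(rows, f, "pron"))
```

In this toy set the part-of-speech feature separates the two pronunciations perfectly, so it is selected as the root split, which matches the abstract's emphasis on part-of-speech rules. A full C4.5 implementation would recurse on each branch and add pruning and handling for continuous attributes.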

Integrating Articulatory Information in Deep Learning-Based Text-to-Speech Synthesis

10.21437/interspeech.2017-1762 ◽

2017 ◽

Cited By ~ 1

Author(s):

Beiming Cao ◽

Myungjong Kim ◽

Jan van Santen ◽

Ted Mau ◽

Jun Wang

Keyword(s):

Deep Learning ◽

Speech Synthesis ◽

Text To Speech ◽

Text To Speech Synthesis

Subset Selection, Adaptation, Gemination and Prosody Prediction for Amharic Text-to-Speech Synthesis

10.21437/ssw.2019-37 ◽

2019 ◽

Author(s):

Elshadai Tesfaye Biru ◽

Yishak Tofik Mohammed ◽

David Tofu ◽

Erica Cooper ◽

Julia Hirschberg

Keyword(s):

Speech Synthesis ◽

Subset Selection ◽

Text To Speech ◽

Text To Speech Synthesis ◽

Prosody Prediction

“I Can’t Talk Now”: Speaking with Voice Output Communication Aid Using Text-to-Speech Synthesis During Multiparty Video Conference

Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems ◽

10.1145/3411763.3451745 ◽

2021 ◽

Author(s):

Wooseok Kim ◽

Sangsu Lee

Keyword(s):

Speech Synthesis ◽

Video Conference ◽

Text To Speech ◽

Voice Output Communication Aid ◽

Communication Aid ◽

Text To Speech Synthesis ◽

Voice Output

Comparative Study on Neural Vocoders for Multispeaker Text-To-Speech Synthesis

2020 IEEE Recent Advances in Intelligent Computational Systems (RAICS) ◽

10.1109/raics51191.2020.9332514 ◽

2020 ◽

Author(s):

Rajeev Rajan ◽

Ashish Roopan ◽

Sachin Prakash ◽

Elisa Jose ◽

Sati P.

Keyword(s):

Comparative Study ◽

Speech Synthesis ◽

Text To Speech ◽

Text To Speech Synthesis

Comparison of Urdu text to speech synthesis using unit selection and HMM based techniques

2016 Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA) ◽

10.1109/icsda.2016.7918988 ◽

2016 ◽

Cited By ~ 1

Author(s):

Farah Adeeba ◽

Tania Habib ◽

Sarmad Hussain ◽

Ehsan-ul-haq ◽

Kh. Shahzada Shahid

Keyword(s):

Speech Synthesis ◽

Text To Speech ◽

Unit Selection ◽

Text To Speech Synthesis

Comparative study of text-to-speech synthesis techniques for mobile linguistic translation process

2014 IEEE International Conference on Control System, Computing and Engineering (ICCSCE 2014) ◽

10.1109/iccsce.2014.7072761 ◽

2014 ◽

Author(s):

Phanchita Chomwihoke ◽

Manop Phankokkruad

Keyword(s):

Comparative Study ◽

Speech Synthesis ◽

Text To Speech ◽

Translation Process ◽

Synthesis Techniques ◽

Text To Speech Synthesis

The future role of text to speech synthesis in automated services

10.1049/ic:19970799 ◽

1997 ◽

Author(s):

A.P. Breen

Keyword(s):

Speech Synthesis ◽

Text To Speech ◽

Future Role ◽

The Future ◽

Text To Speech Synthesis
