Adaptive hosted text to speech processing

Jacob Christfort

doi:10.1121/1.2336693

Research and Design of Electronic Speech Reader Based on Text-to-Speech Technology

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.433-440.4883 ◽ 2012 ◽ Vol 433-440 ◽ pp. 4883-4887

Author(s): Hong Li Yang ◽ Yun Yang ◽ Zhu Yue

Keyword(s): Computer Technology ◽ Speech Processing ◽ Sound Signal ◽ Current Development ◽ Text To Speech ◽ Speech Technology ◽ Key Technology ◽ Text Information ◽ Voice Service ◽ Research And Design

TTS (text-to-speech) is a technology that converts text information into a sound signal according to speech processing rules. As a speech synthesis technology, TTS is a key technology in the current development of computer technology and one of the most advanced technologies in voice services, telephone banking, information home appliances, and mobile PDAs, and it has extensive applications. In this paper, TTS is applied to an electronic speech reader, which changes the traditional way of reading e-books and supports both listening to novels and learning English. The article describes how to use TTS technology to implement an electronic speech reader, programmed in the Visual Studio C# 2008 environment with its built-in API and the Microsoft SAPI interface.

Text-to-speech processing using African language as case study

Journal of Discrete Mathematical Sciences and Cryptography ◽ 10.1080/09720529.2006.10698085 ◽ 2006 ◽ Vol 9 (2) ◽ pp. 365-382 ◽ Cited By ~ 3

Author(s): E. J. Ogwu ◽ M. Talib ◽ O. A. Odejobi

Keyword(s): Speech Processing ◽ African Language ◽ Text To Speech

A Novel Text to Speech Technique for Tamil Language using Hidden Markov Models (HMM)

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.i8589.0881019 ◽ 2019 ◽ Vol 8 (10) ◽ pp. 38-47

Keyword(s): Signal Processing ◽ Markov Model ◽ Speech Processing ◽ Major Part ◽ Markov Models ◽ Hidden Markov ◽ Digital Signal ◽ Text To Speech ◽ Traditional System ◽ Local Languages

The application of digital signal processing to speech processing plays a major part in everyday life. A text-to-speech system lets people see text and hear it read aloud. Text-to-speech synthesizers use synthesis techniques that require good-quality speech. Text-to-speech (TTS) conversion applies to many applications, such as automation, audio recording, and audio-based assistance systems, and it can be applied to various international languages as well as to a number of local languages. This work proposes an efficient, high-accuracy text-to-speech conversion for the Tamil language. Multiple features, together with a Hidden Markov Model (HMM) predictor, are used to convert text to speech efficiently. With the proposed method, the precision of the framework improves by 6% compared with the traditional system.

Application of Feature Extraction in Text-to-Speech Processing

Artificial Neural Nets and Genetic Algorithms ◽ 10.1007/978-3-7091-6230-9_35 ◽ 2001 ◽ pp. 145-148

Author(s): Václav Šebesta ◽ Jana Tučková

Keyword(s): Feature Extraction ◽ Speech Processing ◽ Text To Speech

Deep Learning Based TTS-STT Model with Transliteration for Indic Languages

International Journal for Research in Applied Science and Engineering Technology ◽ 10.22214/ijraset.2021.39689 ◽ 2021 ◽ Vol 9 (12) ◽ pp. 2207-2213

Author(s): Kartik Tiwari

Keyword(s): Open Source ◽ Speech Processing ◽ High Performance ◽ Translation System ◽ Data Sets ◽ Text To Speech ◽ Strong Focus ◽ End To End ◽ Translation Methods ◽ Speech Presentation

Abstract: This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit called ESPnet-TTS, an open-source extension of the ESPnet speech processing toolkit. ESPnet-TTS covers several models, including Tacotron 2, Transformer TTS, and FastSpeech. It also provides recipes modeled on those of the Kaldi automatic speech recognition (ASR) toolkit. The recipes share their design with the ESPnet ASR recipes, which provides high performance. The toolkit also provides pre-trained models and samples for all recipes that users can use as a baseline. The work covers TTS-STT and translation features for various Indic languages, with a strong focus on English, Marathi, and Hindi. The paper also shows that neural sequence-to-sequence models achieve state-of-the-art or near state-of-the-art results on existing datasets. We also analyze some of the key design challenges in developing a multilingual business translation system, including processing bilingual business data sets and evaluating multiple translation methods. The test results, obtained using tokens, show that our models can achieve state-of-the-art performance comparable to the latest toolkits on the LJ Speech data. Index Terms: open source, end-to-end, text-to-speech

Bootstrapping Text-to-Speech for speech processing in languages without an orthography

2013 IEEE International Conference on Acoustics, Speech and Signal Processing ◽ 10.1109/icassp.2013.6639221 ◽ 2013 ◽ Cited By ~ 6

Author(s): Sunayana Sitaram ◽ Sukhada Palkar ◽ Yun-Nung Chen ◽ Alok Parlikar ◽ Alan W Black

Keyword(s): Speech Processing ◽ Text To Speech

A Review of 21 iPad Applications for Augmentative and Alternative Communication Purposes

Perspectives on Augmentative and Alternative Communication ◽ 10.1044/aac21.2.60 ◽ 2012 ◽ Vol 21 (2) ◽ pp. 60-71 ◽ Cited By ~ 24

Author(s): Ashley Alliano ◽ Kimberly Herriger ◽ Anthony D. Koutsoftas ◽ Theresa E. Bartolotta

Keyword(s): Augmentative And Alternative Communication ◽ Cost Effective ◽ Alternative Form ◽ Alternative Communication ◽ Text To Speech ◽ Reference Guide ◽ Expressive Communication ◽ Communication Needs ◽ User Friendly ◽ Ipad Applications

Abstract: Using the iPad tablet for Augmentative and Alternative Communication (AAC) purposes can facilitate many communicative needs, is cost-effective, and is socially acceptable. Many individuals with communication difficulties can use iPad applications (apps) to augment communication, provide an alternative form of communication, or target receptive and expressive language goals. In this paper, we will review a collection of iPad apps that can be used to address a variety of receptive and expressive communication needs. Based on recommendations from Gosnell, Costello, and Shane (2011), we describe the features of 21 apps that can serve as a reference guide for speech-language pathologists. We systematically identified 21 apps that use symbols only, symbols and text-to-speech, and text-to-speech only. We provide descriptions of the purpose of each app, along with the following feature descriptions: speech settings, representation, display, feedback features, rate enhancement, access, motor competencies, and cost. In this review, we describe these apps and how individuals with complex communication needs can use them for a variety of communication purposes and to target a variety of treatment goals. We present information in a user-friendly table format that clinicians can use as a reference guide.

Age-Related Speech Processing Decline May Account for Comprehension Problems

ASHA Leader ◽ 10.1044/leader.rib2.22012017.15 ◽ 2017 ◽ Vol 22 (1) ◽ pp. 15-15

Keyword(s): Speech Processing ◽ Age Related

What Is Left Is Right

European Psychologist ◽ 10.1027/1016-9040.14.1.78 ◽ 2009 ◽ Vol 14 (1) ◽ pp. 78-89 ◽ Cited By ~ 8

Author(s): Kenneth Hugdahl ◽ René Westerhausen

Keyword(s): Speech Processing ◽ Information Transfer ◽ Hemispheric Asymmetry ◽ Functional Neuroimaging ◽ Gray Matter Volume ◽ Functional Anatomy ◽ Posterior Part ◽ Functional Asymmetry ◽ Planum Temporale ◽ Evolution Of Speech

The present paper is based on a talk on hemispheric asymmetry given by Kenneth Hugdahl at the Xth European Congress of Psychology, Praha, July 2007. Here, we propose that hemispheric asymmetry evolved because of a left hemisphere speech processing specialization. The evolution of speech and the need for air-based communication necessitated division of labor between the hemispheres in order to avoid having duplicate copies in both hemispheres that would increase processing redundancy. It is argued that the neuronal basis of this labor division is the structural asymmetry observed in the peri-Sylvian region in the posterior part of the temporal lobe, with a left larger than right planum temporale area. This is the only example where a structural, or anatomical, asymmetry matches a corresponding functional asymmetry. The increase in gray matter volume in the left planum temporale area corresponds to a functional asymmetry of speech processing, as indexed by behavioral (dichotic listening) and functional neuroimaging studies. The functional anatomy of the corpus callosum also supports such a view, with regional specificity of information transfer between the hemispheres.

Rate-Dependent Speech Processing Can be Domain Specific

PsycEXTRA Dataset ◽ 10.1037/e502412013-967 ◽ 2012

Author(s): Christine M. Szostak ◽ Mark A. Pitt ◽ Laura C. Dilley

Keyword(s): Speech Processing ◽ Domain Specific ◽ Rate Dependent
