SINTETINĖS ŠNEKOS KOKYBĖS VERTINIMAS: KELIŲ KOMPIUTERINIŲ SINTEZATORIŲ LYGINAMASIS TYRIMAS

Psichologija ◽

10.15388/psichol.2002..4402 ◽

2002 ◽

Vol 25 ◽

pp. 72-96 ◽

Cited By ~ 1

Author(s):

Albinas Bagdonas ◽

Feliksas Laugalys

Keyword(s):

Comparative Study ◽

Speech Intelligibility ◽

Speech Quality ◽

Natural Speech ◽

Synthetic Speech ◽

Human Speech ◽

Previous Version ◽

Computer Based ◽

Improve Correlation

EVALUATION OF SYNTHETIC SPEECH QUALITY: A COMPARATIVE STUDY OF SEVERAL COMPUTER-BASED SPEECH SYNTHESIZERS

Albinas Bagdonas, Feliksas Laugalys

Summary

This paper presents intelligibility data for several versions of Lithuanian and Russian synthetic speech, and acceptability data for Lithuanian, Russian, Hungarian and Italian synthetic speech. The speech of Lithuanian and Russian human speakers is more intelligible than the corresponding synthetic speech. The previous version of Russian synthesis is worse than both the Lithuanian synthesis and the improved Russian synthesis (IRS). The characteristics of the synthesized sounds reveal two opposite tendencies in the IRS: judged by the overall reduction in recognition errors, it moves toward natural speech, but judged by the homogeneity of errors, it moves away from it. Because the first tendency dominates, the overall resultant shows that the IRS has improved. The correlation between the intelligibility and acceptability of the IRS also indicates that it is closer to natural speech. Subjects find the IRS more acceptable than the previous version of Russian synthesis: they perceive the previous version as closer to a robot's speech, and the IRS as a poor variant of speech that is nevertheless already human. The acceptability data show that Hungarian listeners rate natural speech most highly, while Italian listeners are the most critical of it. All versions of synthetic speech studied were judged less acceptable than natural speech, but the judgments became milder after the synthesis was improved.

Reexamining Synthetic Speech: Intelligibility and the Effects of Age, Task, and Speech Type on Recall

Proceedings of the Human Factors and Ergonomics Society Annual Meeting ◽

10.1177/154193120705101819 ◽

2007 ◽

Vol 51 (18) ◽

pp. 1143-1147 ◽

Cited By ~ 3

Author(s):

Jefferson B. Hardee ◽

Christopher B. Mayhorn

Keyword(s):

Older Adults ◽

Software Design ◽

Speech Intelligibility ◽

Natural Speech ◽

Synthetic Speech ◽

Future Research ◽

Limited Capacity ◽

Adult Group ◽

Younger Adults ◽

Difficult Time

Synthetic speech is a technology that can be utilized to convey information and aid people in their tasks. Older adults in particular are a population that may be able to benefit from synthetic speech, and they are a population that has been investigated in a limited capacity. The current researchers intended to elucidate lingering conflicts in previous research on the intelligibility and recall of words and stories in synthetic speech for older and younger adults and how that compared to similar conditions in natural speech. Twenty-four older and 24 younger adults completed intelligibility and recall tasks with word lists and stories. Results indicated that older adults had a more difficult time with all speech, natural speech was easier to understand and remember than synthetic speech, and stories were easier to recall than words. Results also indicated that older adults had a more difficult time than younger adults understanding synthetic words as compared to natural words. In addition, older adults improved differentially with the recall of stories as opposed to words when compared to the younger adult group. Potential directions for synthetic speech software design and future research are discussed.

A Robust Dual-Microphone Generalized Sidelobe Canceller Using a Bone-Conduction Sensor for Speech Enhancement

Sensors ◽

10.3390/s21051878 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1878

Author(s):

Yi Zhou ◽

Haiping Wang ◽

Yijing Chu ◽

Hongqing Liu

Keyword(s):

Speech Enhancement ◽

Speech Intelligibility ◽

Bone Conduction ◽

Speech Quality ◽

Generalized Sidelobe Canceller ◽

Spatially Distributed ◽

Interference Signals ◽

Adaptive Noise ◽

Adaptive Noise Canceller ◽

Sidelobe Canceller

The use of multiple spatially distributed microphones allows spatial filtering to be performed along with conventional temporal filtering, which can better reject interference signals and leads to an overall improvement in speech quality. In this paper, we propose a novel dual-microphone generalized sidelobe canceller (GSC) algorithm assisted by a bone-conduction (BC) sensor for speech enhancement, named the BC-assisted GSC (BCA-GSC) algorithm. The BC sensor is relatively insensitive to ambient noise compared to the conventional air-conduction (AC) microphone. Hence, BC speech can be analyzed to generate very accurate voice activity detection (VAD), even in a high-noise environment. The proposed algorithm incorporates the VAD information obtained from the BC speech into the adaptive blocking matrix (ABM) and adaptive noise canceller (ANC) of the GSC. By using VAD to control the ABM and combining VAD with the signal-to-interference ratio (SIR) to control the ANC, the proposed method can suppress interference and significantly improve the overall performance of the GSC. Experiments verify that the proposed GSC system not only improves speech quality remarkably but also boosts speech intelligibility.
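The idea of letting a VAD decision control when an adaptive noise canceller updates can be illustrated with a small sketch. This is not the authors' BCA-GSC: it is a single NLMS filter that adapts only while the VAD flags no speech, with no blocking matrix and no SIR term. The function names, the 2-tap noise path, the sine-wave "speech" stand-in, and the ideal VAD flags are all assumptions made for illustration.

```python
import math
import random


def vad_gated_nlms(primary, reference, vad, taps=4, mu=0.5, eps=1e-8):
    """Subtract filtered `reference` noise from `primary`.

    The NLMS weights are updated only at samples where vad[n] == 0
    (noise only), so speech does not disturb the adaptation.
    """
    w = [0.0] * taps          # adaptive filter weights
    buf = [0.0] * taps        # most recent reference samples, newest first
    out = []
    for d, x, speech_active in zip(primary, reference, vad):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # noise estimate
        e = d - y                                    # enhanced output sample
        if not speech_active:
            norm = sum(b * b for b in buf) + eps
            w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out, w


def demo(n=4000, seed=0):
    """Synthetic test: noise-only first half, speech plus noise second half."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Hypothetical 2-tap path by which the noise leaks into the primary channel.
    leaked = [0.8 * noise[i] + (0.3 * noise[i - 1] if i > 0 else 0.0)
              for i in range(n)]
    vad = [1 if i >= n // 2 else 0 for i in range(n)]   # ideal VAD flags
    speech = [math.sin(2 * math.pi * 0.01 * i) if v else 0.0
              for i, v in enumerate(vad)]
    primary = [s + l for s, l in zip(speech, leaked)]
    out, w = vad_gated_nlms(primary, noise, vad)
    return speech, primary, out, w
```

In `demo`, the filter learns the noise path during the noise-only half and is frozen once the VAD turns on, so the second half of `out` should track `speech` far more closely than `primary` does.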

A comparative study of scores on computer-based tests and paper-based tests

Behaviour and Information Technology ◽

10.1080/0144929x.2012.710647 ◽

2012 ◽

Vol 33 (4) ◽

pp. 410-422 ◽

Cited By ~ 16

Author(s):

Hanho Jeong

Keyword(s):

Comparative Study ◽

Computer Based

Limitations of osseous genioplasty in relation to the displacement distance: a computer-based comparative study

Oral Surgery Oral Medicine Oral Pathology and Oral Radiology ◽

10.1016/j.oooo.2015.06.040 ◽

2015 ◽

Vol 120 (6) ◽

pp. 670-678 ◽

Cited By ~ 7

Author(s):

Stephan Christian Möhlhenrich ◽

Nicole Heussen ◽

Mohammad Kamal ◽

Ulrike Fritz ◽

Frank Hölzle ◽

...

Keyword(s):

Comparative Study ◽

Computer Based

Subjective speech quality and speech intelligibility evaluation of single-channel dereverberation algorithms

2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC) ◽

10.1109/iwaenc.2014.6954313 ◽

2014 ◽

Cited By ~ 11

Author(s):

Anna Warzybok ◽

Ina Kodrasi ◽

Jan Ole Jungmann ◽

Emanuel Habets ◽

Timo Gerkmann ◽

...

Keyword(s):

Speech Intelligibility ◽

Single Channel ◽

Speech Quality

Quantifying the Relation Between Speech Quality and Speech Intelligibility

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3803.714 ◽

1995 ◽

Vol 38 (3) ◽

pp. 714-725 ◽

Cited By ~ 45

Author(s):

Jill E. Preminger ◽

Dianne J. Van Tasell

Keyword(s):

Speech Intelligibility ◽

Normal Hearing ◽

Speech Quality ◽

Category Rating ◽

Single Dimension ◽

Quality Dimensions ◽

Highly Correlated ◽

Quality Measurements

The purpose of the present research was to examine the relation between speech quality and speech intelligibility. Speech quality measurements were made using continuous discourse and a category rating procedure for the following dimensions: intelligibility, pleasantness, loudness, effort, and total impression. Measurements were made using a group of listeners with normal hearing for a set of stimulus conditions in which intelligibility varied, and for a set of stimulus conditions in which intelligibility was held constant near 100%. When ratings were made for a set of stimulus conditions in which intelligibility was allowed to vary (a) intersubject reliability was high (i.e., different listeners interpreted the dimensions in a similar manner); and (b) the speech quality dimensions of intelligibility, effort, and loudness were indistinguishable. When ratings were made for a set of stimulus conditions in which intelligibility was held constant (a) intersubject reliability was reduced, indicating that different listeners interpreted the dimensions in different ways; (b) most listeners rated each dimension differently, indicating that the dimensions were unique; and (c) across listeners, no single dimension was highly correlated with total impression. These results can be used in order to examine the relation between speech quality and speech intelligibility.
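The statement that no single dimension was highly correlated with total impression refers to a correlation computed across ratings. The abstract does not say which coefficient was used, so the following is only a sketch assuming a Pearson correlation; the function names and the toy ratings in the usage note are invented for illustration and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length rating lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return sxy / math.sqrt(sxx * syy)


def correlate_with_total(ratings, total_key="total impression"):
    """Correlate every rating dimension with the total-impression ratings."""
    total = ratings[total_key]
    return {name: pearson(values, total)
            for name, values in ratings.items() if name != total_key}
```

For example, with the made-up ratings `{"total impression": [1, 2, 3, 4, 5], "pleasantness": [2, 1, 4, 3, 5], "effort": [5, 4, 3, 2, 1]}`, `correlate_with_total` gives a coefficient of about 0.8 for pleasantness and about -1.0 for effort.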

Auditory event-related potentials index faster processing of natural speech but not synthetic speech over nonspeech analogs in children

Brain and Language ◽

10.1016/j.bandl.2020.104825 ◽

2020 ◽

Vol 207 ◽

pp. 104825 ◽

Cited By ~ 1

Author(s):

Allison Whitten ◽

Alexandra P. Key ◽

Antje S. Mefferd ◽

James W. Bodfish

Keyword(s):

Event Related Potentials ◽

Natural Speech ◽

Synthetic Speech ◽

Auditory Event ◽

Related Potentials ◽

Auditory Event Related Potentials

Gaining phonetic knowledge whilst improving synthetic speech quality?

Journal of Phonetics ◽

10.1016/s0095-4470(19)30308-0 ◽

1991 ◽

Vol 19 (1) ◽

pp. 139-146 ◽

Cited By ~ 2

Author(s):

Louis C.W. Pols ◽

Renée van Bezooijen

Keyword(s):

Speech Quality ◽

Synthetic Speech

Text-to-Speech Synthesis

Encyclopedia of Multimedia Technology and Networking ◽

10.4018/978-1-59140-561-0.ch135 ◽

2011 ◽

pp. 957-963

Author(s):

Mahbubur R. Syed ◽

Shuvro Chakrobartty ◽

Robert J. Bignall

Keyword(s):

Speech Production ◽

Speech Synthesis ◽

Synthetic Speech ◽

Practical Application ◽

Text To Speech ◽

Synthesis System ◽

System A ◽

Vocal System ◽

Text To Speech Synthesis ◽

Computer Based

Speech synthesis is the process of producing natural-sounding, highly intelligible synthetic speech simulated by a machine in such a way that it sounds as if it was produced by a human vocal system. A text-to-speech (TTS) synthesis system is a computer-based system where the input is text and the output is a simulated vocalization of that text. Before the 1970s, most speech synthesis was achieved with hardware, but this was costly and it proved impossible to properly simulate natural speech production. Since the 1970s, the use of computers has made the practical application of speech synthesis more feasible.

Synthetic Speech References for Automatic Pathological Speech Intelligibility Assessment

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9054765 ◽

2020 ◽

Author(s):

Parvaneh Janbakhshi ◽

Ina Kodrasi ◽

Herve Bourlard

Keyword(s):

Speech Intelligibility ◽

Synthetic Speech
