Restoration of Missing Voiced Speech Signal Segments

Author(s):  
Sarunas Paulikas
2007 · Vol 2007 · pp. 1-5
Author(s):  
Aïcha Bouzid ◽  
Noureddine Ellouze

This paper describes a multiscale product method (MPM) for measuring the open quotient in voiced speech. The method is based on determining the glottal closing and opening instants. The proposed approach multiplies the wavelet transforms of the speech signal at different scales in order to enhance edge detection and parameter estimation. We show that the proposed method is effective and robust for detecting speech singularities. Accurate estimation of glottal closing instants (GCIs) and glottal opening instants (GOIs) is important in a wide range of speech processing tasks. In this paper, accurate estimation of GCIs and GOIs is used to measure the local open quotient (Oq), the ratio of the open time to the pitch period. The multiscale product operates automatically on the speech signal; a reference electroglottogram (EGG) signal is used for performance evaluation. The rate of correct GCI detection is 95.5% and that of GOI detection is 76%. The pitch period relative error is 2.6% and the open phase relative error is 5.6%. The relative error measured on the open quotient reaches 3% over the whole Keele database.
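The core operation, multiplying wavelet-transform coefficients of the speech signal across a few scales so that true singularities (glottal closures) reinforce while noise cancels, can be sketched as follows. This is an illustrative stand-in, not the authors' implementation: it uses smoothed Gaussian first derivatives in place of the paper's wavelet, and a synthetic pulse train in place of real speech.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import lfilter

def multiscale_product(x, scales=(1, 2, 4)):
    # Multiply smoothed first derivatives of x across scales: edges present
    # at every scale (glottal closures) reinforce, incoherent noise cancels.
    p = np.ones_like(x)
    for s in scales:
        p = p * gaussian_filter1d(x, sigma=s, order=1)
    return p

# Synthetic "voiced" signal: a decaying exponential restarted every
# pitch period of 80 samples, plus additive noise.
period = 80
pulses = np.zeros(800)
pulses[::period] = 1.0
x = lfilter([1.0], [1.0, -0.9], pulses)
x += 0.05 * np.random.default_rng(0).standard_normal(x.size)

p = multiscale_product(x)
# Candidate glottal closing instants: samples where |p| is large.
cands = np.flatnonzero(np.abs(p) > 0.5 * np.abs(p).max())
```

On this toy signal the largest product magnitudes line up with the abrupt "closures" at multiples of the pitch period, which is the property the MPM exploits for GCI/GOI detection.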


2014 · Vol 8 (1) · pp. 508-511
Author(s):  
Zhongbao Chen ◽  
Zhigang Fang ◽  
Jie Xu ◽  
Pengying Du ◽  
Xiaoping Luo

Speech can be broadly categorized into voiceless, voiced, and mute signals, and voiced speech can be further classified into vowels and voiced consonants. With the ever-increasing demand for speech synthesis applications, an effective method for distinguishing vowel from voiced-consonant signals is urgently needed, since they are two distinct components that affect the naturalness of synthetic speech. State-of-the-art algorithms for speech signal classification are effective at separating voiceless, voiced, and mute speech, but not at further classifying the voiced signal. In view of this issue, a new speech classification algorithm based on the Gaussian mixture model (GMM) is proposed, which directly classifies speech into voiceless, voiced-consonant, vowel, and mute signals. Simulation results demonstrate that the proposed algorithm is effective even in noisy environments.


2012 · Vol 532-533 · pp. 1253-1257
Author(s):  
Li Hai Yao ◽  
Jie Xu ◽  
Hao Jiang

Speech can be broadly categorized into voiceless, voiced, and mute signals, and voiced speech can be further classified into vowels and voiced consonants. With the ever-increasing demand for speech synthesis applications, an effective method for distinguishing vowel from voiced-consonant signals is urgently needed, since they are two distinct components that affect the naturalness of synthetic speech. State-of-the-art algorithms for speech signal classification are effective at separating voiceless, voiced, and mute speech, but not at further classifying the voiced signal. In view of this issue, a new speech classification algorithm based on the Gaussian mixture model (GMM) is proposed, which directly classifies speech into voiceless, voiced-consonant, vowel, and mute signals. Specifically, a new speech feature is proposed, and the GMM is also modified for speech classification. Simulation results demonstrate that the proposed algorithm is effective even in noisy environments.
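The underlying classification scheme, fit one generative model per class and label a feature vector by maximum likelihood, can be sketched as below. This is a minimal stand-in under assumed details: the paper's actual feature and GMM modification are not reproduced, so the sketch uses synthetic 2-D features and a single diagonal Gaussian per class (the one-component case of a GMM).

```python
import numpy as np

# Toy 2-D frame features (stand-ins for e.g. log-energy and zero-crossing
# rate), drawn around a distinct mean for each of the four classes.
rng = np.random.default_rng(1)
class_means = {
    "mute": (0.0, 0.0),
    "voiceless": (0.0, 4.0),
    "voiced_consonant": (4.0, 2.0),
    "vowel": (6.0, 0.0),
}
train = {c: rng.normal(m, 0.5, size=(200, 2)) for c, m in class_means.items()}

def fit_gaussian(X):
    # One diagonal Gaussian per class: the single-component case of a GMM.
    return X.mean(axis=0), X.var(axis=0) + 1e-6

models = {c: fit_gaussian(X) for c, X in train.items()}

def log_likelihood(x, mu, var):
    # Diagonal-covariance Gaussian log-density.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def classify(x):
    # Maximum-likelihood decision over the four class models.
    return max(models, key=lambda c: log_likelihood(x, *models[c]))
```

A full GMM classifier would replace `fit_gaussian` with EM over several mixture components per class; the decision rule stays the same.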


Author(s):  
Martin Chavant ◽  
Alexis Hervais-Adelman ◽  
Olivier Macherey

Purpose: An increasing number of individuals with residual or even normal contralateral hearing are being considered for cochlear implantation. It remains unknown whether the presence of contralateral hearing is beneficial or detrimental to their perceptual learning of cochlear implant (CI)–processed speech. The aim of this experiment was to provide a first insight into this question using acoustic simulations of CI processing.

Method: Sixty normal-hearing listeners took part in an auditory perceptual learning experiment. Each subject was randomly assigned to one of three groups of 20, referred to as NORMAL, LOWPASS, and NOTHING. The experiment consisted of two test phases separated by a training phase. In the test phases, all subjects were tested on recognition of monosyllabic words passed through a six-channel "PSHC" vocoder presented to a single ear. In the training phase, which consisted of listening to a 25-min audio book, all subjects were also presented with the same vocoded speech in one ear, but the signal they received in their other ear differed across groups. The NORMAL group was presented with the unprocessed speech signal, the LOWPASS group with a low-pass filtered version of the speech signal, and the NOTHING group with no sound at all.

Results: The improvement in speech scores following training was significantly smaller for the NORMAL group than for the LOWPASS and NOTHING groups.

Conclusions: This study suggests that the presentation of normal speech in the contralateral ear reduces or slows down perceptual learning of vocoded speech, but that an unintelligible low-pass filtered contralateral signal does not have this effect. Potential implications for the rehabilitation of CI patients with partial or full contralateral hearing are discussed.
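The acoustic simulation in such experiments, passing speech through a multichannel vocoder, follows a standard recipe: band-pass analysis, temporal-envelope extraction, and resynthesis on carriers. The sketch below is a generic six-channel noise vocoder with assumed, illustrative details only; it is not the PSHC vocoder used in the study, and the band edges and filter orders are arbitrary choices.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=6, lo=100.0, hi=7000.0):
    # Generic noise vocoder: split x into log-spaced bands, extract each
    # band's envelope, and impose it on a noise carrier in the same band.
    edges = np.geomspace(lo, hi, n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f1, f2], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        envelope = np.abs(hilbert(band))  # temporal envelope of the band
        carrier = sosfiltfilt(sos, rng.standard_normal(x.size))
        out = out + envelope * carrier
    return out

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220 * t) * np.exp(-3.0 * t)  # toy one-second token
y = noise_vocode(x, fs)
```

The output preserves the per-band envelopes but discards fine temporal structure, which is what makes vocoded speech a useful simulation of CI processing for normal-hearing listeners.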


2011 · Vol 21 (2) · pp. 44-54
Author(s):  
Kerry Callahan Mandulak

Spectral moment analysis (SMA) is an acoustic analysis tool that shows promise for enhancing our understanding of normal and disordered speech production. It can augment auditory-perceptual analysis used to investigate differences across speakers and groups and can provide unique information regarding specific aspects of the speech signal. The purpose of this paper is to illustrate the utility of SMA as a clinical measure for both clinical speech production assessment and research applications documenting speech outcome measurements. Although acoustic analysis has become more readily available and accessible, clinicians need training with, and exposure to, acoustic analysis methods in order to integrate them into traditional methods used to assess speech production.
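Spectral moment analysis treats the magnitude spectrum of a windowed frame as a probability distribution over frequency and summarizes it with its first four moments: centroid, variance, skewness, and kurtosis. A minimal sketch, assuming a Hann window and power-spectrum weighting (details on which SMA implementations vary):

```python
import numpy as np

def spectral_moments(frame, fs):
    # Normalize the power spectrum of one windowed frame to a distribution
    # over frequency, then compute its first four moments.
    spec = np.abs(np.fft.rfft(frame * np.hanning(frame.size))) ** 2
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / fs)
    p = spec / spec.sum()
    m1 = np.sum(freqs * p)                             # centroid (Hz)
    m2 = np.sum((freqs - m1) ** 2 * p)                 # variance (Hz^2)
    m3 = np.sum((freqs - m1) ** 3 * p) / m2 ** 1.5     # skewness
    m4 = np.sum((freqs - m1) ** 4 * p) / m2 ** 2 - 3   # excess kurtosis
    return m1, m2, m3, m4

# Sanity check on a pure tone: the centroid should sit at the tone frequency.
fs = 16000
t = np.arange(512) / fs
m1, m2, m3, m4 = spectral_moments(np.sin(2 * np.pi * 3000.0 * t), fs)
```

For fricatives, which SMA is most often applied to, the centroid and skewness of the noise spectrum are the measures typically reported.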

