Open Quotient Measurements Based on Multiscale Product of Speech Signal Wavelet Transform

2007 ◽  
Vol 2007 ◽  
pp. 1-5 ◽  
Author(s):  
Aïcha Bouzid ◽  
Noureddine Ellouze

This paper describes a multiscale product method (MPM) for measuring the open quotient in voiced speech. The method is based on determining the glottal closing and opening instants. The proposed approach takes the product of the wavelet transforms of the speech signal at different scales in order to enhance edge detection and parameter estimation. We show that the proposed method is effective and robust for detecting speech singularities. Accurate estimation of glottal closing instants (GCIs) and glottal opening instants (GOIs) is important in a wide range of speech processing tasks. In this paper, accurate estimation of GCIs and GOIs is used to measure the local open quotient (Oq), which is the ratio of the open time to the pitch period. The multiscale product operates automatically on the speech signal; the reference electroglottogram (EGG) signal is used for performance evaluation. The rate of good GCI detection is 95.5% and that of GOI detection is 76%. The pitch period relative error is 2.6% and the open phase relative error is 5.6%. The relative error measured on the open quotient reaches 3% for the whole Keele database.
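
The definition of the local open quotient given in this abstract (open time divided by pitch period) can be sketched as follows. This is only an illustration, not the authors' implementation: the function name `local_open_quotient`, the convention that a cycle runs between two consecutive GCIs, and the choice of the first GOI inside a cycle are all assumptions.

```python
def local_open_quotient(gcis, gois):
    """Return one open-quotient value per glottal cycle (hypothetical helper).

    gcis, gois: sorted instants, in samples or seconds.
    Assumption: a cycle spans two consecutive GCIs, and the open phase
    runs from the GOI inside that cycle to the GCI that ends it.
    """
    quotients = []
    for start, end in zip(gcis, gcis[1:]):
        period = end - start                      # local pitch period
        inside = [t for t in gois if start < t < end]
        if period <= 0 or not inside:
            quotients.append(None)                # no usable GOI in this cycle
            continue
        open_time = end - inside[0]               # open phase duration
        quotients.append(open_time / period)
    return quotients


# Two cycles of 100 samples each, opening 40 and 50 samples after closure.
example = local_open_quotient([0, 100, 200], [40, 150])
```

Cycles without a detected GOI yield `None` rather than a guessed value, which seems a reasonable choice given that the abstract reports a lower detection rate for GOIs than for GCIs.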

Author(s):  
Aicha Bouzid ◽  
Noureddine Elouze

This paper deals with the estimation of glottal parameters, such as local pitch and open quotient, from the electroglottographic (EGG) signal. The estimation is based on glottal closing instants and glottal opening instants determined by a multi-scale product of this signal. The wavelet transform of the EGG signal is computed with a quadratic spline function. Wavelet coefficients calculated at different dyadic scales show modulus maxima at localized discontinuities of the EGG signal. The detected maxima and minima correspond to the glottal opening and closing instants, called GOIs and GCIs. To improve the precision of the estimate, we compute the multi-scale product of the wavelet transform coefficients at three successive dyadic scales. This processing enhances edge detection. The multi-scale product is a nonlinear combination of successive scales; it reduces noise and spurious peaks. We apply a cubic root to the amplitude of the product to improve the representation of weak amplitudes. The method represents GCIs well and improves the detection of GOIs. It was tested on the Keele University database; it is effective and robust in multiple cases, even for signals showing undetermined GOIs and multiple peaks at GCIs. Finally, precise measurement of these instants allows accurate estimation of prosodic parameters such as local pitch and open quotient.
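
The multi-scale product with cubic-root compression described in this abstract can be sketched as below. This is a rough illustration under stated assumptions: the abstract uses a quadratic spline wavelet, whereas this sketch substitutes a much simpler undecimated Haar-style difference as a stand-in, and the names `haar_detail` and `multiscale_product` as well as the scales 2, 4, and 8 are hypothetical.

```python
import math


def haar_detail(x, scale):
    """Haar-style detail at one scale (stand-in for the quadratic spline wavelet).

    Each output sample is the mean of the next `scale` samples minus the
    mean of the previous `scale` samples; borders are left at zero.
    """
    n = len(x)
    out = [0.0] * n
    for i in range(scale, n - scale + 1):
        after = sum(x[i:i + scale]) / scale
        before = sum(x[i - scale:i]) / scale
        out[i] = after - before
    return out


def multiscale_product(x, scales=(2, 4, 8)):
    """Pointwise product of three dyadic-scale details, then a signed cube root."""
    details = [haar_detail(x, s) for s in scales]
    product = [math.prod(values) for values in zip(*details)]
    # The signed cube root keeps the edge polarity and lifts weak amplitudes.
    return [math.copysign(abs(p) ** (1.0 / 3.0), p) for p in product]


# A single rising edge at sample 32 gives one dominant positive peak there.
step = [0.0] * 32 + [1.0] * 32
mp = multiscale_product(step)
peak_index = mp.index(max(mp))
```

An edge produces same-sign responses at every scale, so the product reinforces it, while noise peaks that do not line up across scales are attenuated.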


Author(s):  
Mourad Talbi ◽  
Med Salim Bouhlel

Background: In this paper, we propose a secure image watermarking technique that is applied to grayscale and color images. It applies the SVD (Singular Value Decomposition) in the Lifting Wavelet Transform domain to embed a speech image (the watermark) into the host image. Methods: It also uses a signature in the embedding and extraction steps. Its performance is assessed by computing the PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity), SNR (Signal-to-Noise Ratio), SegSNR (Segmental SNR) and PESQ (Perceptual Evaluation of Speech Quality). Results: The PSNR and SSIM are used to evaluate the perceptual quality of the watermarked image compared to the original image. The SNR, SegSNR and PESQ are used to evaluate the perceptual quality of the reconstructed or extracted speech signal compared to the original speech signal. Conclusion: The results obtained from the computation of PSNR, SSIM, SNR, SegSNR and PESQ show the performance of the proposed technique.
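
Of the metrics listed in this abstract, PSNR is the simplest to write down. The sketch below uses the standard definition, 10·log10(peak² / MSE), and is not taken from the paper: the function name `psnr`, the nested-list image layout, and the 8-bit peak value of 255 are assumptions.

```python
import math


def psnr(original, watermarked, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists of pixel values (rows of numbers);
    `peak` is the maximum possible pixel value (255 assumed for 8-bit data).
    """
    squared_error = 0.0
    count = 0
    for row_a, row_b in zip(original, watermarked):
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf          # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Every pixel differs by one gray level, so the MSE is exactly 1.
host = [[10, 20], [30, 40]]
marked = [[11, 21], [31, 41]]
quality_db = psnr(host, marked)
```

Higher values mean the watermarked image is closer to the host image; identical images give an infinite PSNR.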


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Bilal Elghadyry ◽  
Faissal Ouardi ◽  
Sébastien Verel

Weighted finite-state transducers have been shown to be a general and efficient representation in many applications such as text and speech processing, computational biology, and machine learning. The composition of weighted finite-state transducers constitutes a fundamental and common operation across these applications. The NP-hardness of the composition computation problem presents a challenge that leads us to devise efficient large-scale algorithms when considering more than two transducers. This paper describes a parallel computation of weighted finite-state transducer composition in the MapReduce framework. To the best of our knowledge, this paper is the first to tackle this task using MapReduce methods. First, we analyze the communication cost of this problem using the model of Afrati et al. Then, we propose three MapReduce methods based respectively on input alphabet mapping, state mapping, and hybrid mapping. Finally, intensive experiments on a wide range of weighted finite-state transducers are conducted to compare the proposed methods and show their efficiency for large-scale data.


2021 ◽  
Vol 22 (5) ◽  
pp. 481-508
Author(s):  
Robert P. Carlyon ◽  
Tobias Goehring

Cochlear implants (CIs) are the world’s most successful sensory prosthesis and have been the subject of intense research and development in recent decades. We critically review the progress in CI research, and its success in improving patient outcomes, from the turn of the century to the present day. The review focuses on the processing, stimulation, and audiological methods that have been used to try to improve speech perception by human CI listeners, and on fundamental new insights in the response of the auditory system to electrical stimulation. The introduction of directional microphones and of new noise reduction and pre-processing algorithms has produced robust and sometimes substantial improvements. Novel speech-processing algorithms, the use of current-focusing methods, and individualised (patient-by-patient) deactivation of subsets of electrodes have produced more modest improvements. We argue that incremental advances have been and will continue to be made, that collectively these may substantially improve patient outcomes, but that the modest size of each individual advance will require greater attention to experimental design and power. We also briefly discuss the potential and limitations of promising technologies that are currently being developed in animal models, and suggest strategies for researchers to collectively maximise the potential of CIs to improve hearing in a wide range of listening situations.


Author(s):  
Paul S. Addison

Redundancy: it is a word heavy with connotations of lacking usefulness. I often hear that the rationale for not using the continuous wavelet transform (CWT)—even when it appears most appropriate for the problem at hand—is that it is ‘redundant’. Sometimes the conversation ends there, as if self-explanatory. However, in the context of the CWT, ‘redundant’ is not a pejorative term, it simply refers to a less compact form used to represent the information within the signal. The benefit of this new form—the CWT—is that it allows for intricate structural characteristics of the signal information to be made manifest within the transform space, where it can be more amenable to study: resolution over redundancy. Once the signal information is in CWT form, a range of powerful analysis methods can then be employed for its extraction, interpretation and/or manipulation. This theme issue is intended to provide the reader with an overview of the current state of the art of CWT analysis methods from across a wide range of numerate disciplines, including fluid dynamics, structural mechanics, geophysics, medicine, astronomy and finance. This article is part of the theme issue ‘Redundancy rules: the continuous wavelet transform comes of age’.


Author(s):  
Chenyu Zhou ◽  
Liangyao Yu ◽  
Yong Li ◽  
Jian Song

Accurate estimation of sideslip angle is essential for vehicle stability control. For commercial vehicles, the estimation of sideslip angle is challenging due to severe load transfer and tire nonlinearity. This paper presents a robust sideslip angle observer for commercial vehicles based on identification of tire cornering stiffness. Since the tire cornering stiffness of commercial vehicles is greatly affected by tire force and road adhesion coefficient, it cannot be treated as a constant. To estimate the cornering stiffness in real time, a neural network model constructed with the Levenberg-Marquardt backpropagation (LMBP) algorithm is employed. LMBP is a fast-converging supervised learning algorithm that combines the steepest descent method and the Gauss-Newton method, and is widely used in system parameter estimation. LMBP does not rely on a mathematical model of the actual system when building the neural network, so it is well suited to cases where such a model is difficult to establish. Considering the complexity of tire modeling, this study adopts the LMBP algorithm to estimate tire cornering stiffness, which simplifies the tire model and improves the estimation accuracy. Combined with the neural network, a time-varying Kalman filter (TVKF) is designed to observe the sideslip angle of commercial vehicles. To validate the feasibility of the proposed estimation algorithm, multiple driving maneuvers under different road surface friction conditions have been carried out. The test results show that the proposed method has better accuracy than the existing algorithm, and that it is robust over a wide range of driving conditions.
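
The way Levenberg-Marquardt blends Gauss-Newton and steepest descent can be illustrated with a single damped update. This sketch is not the paper's neural network: it assumes a hypothetical one-parameter linear tire model, lateral force = stiffness × slip angle, and the name `lm_step` and the sample values are made up for illustration.

```python
def lm_step(stiffness, slip_angles, forces, damping):
    """One Levenberg-Marquardt update for the toy model F = stiffness * alpha.

    damping = 0 gives the Gauss-Newton step; a large damping value shrinks
    the update toward a short step along the steepest-descent direction.
    """
    # Residual r_i = F_i - stiffness * alpha_i; the model's derivative
    # with respect to stiffness is alpha_i.
    jt_r = sum(a * (f - stiffness * a) for a, f in zip(slip_angles, forces))
    jt_j = sum(a * a for a in slip_angles)
    return stiffness + jt_r / (jt_j + damping)


# Made-up data generated with a true stiffness of 2.0.
alphas = [1.0, 2.0, 3.0]
lateral_forces = [2.0, 4.0, 6.0]
undamped = lm_step(0.0, alphas, lateral_forces, damping=0.0)   # Gauss-Newton step
damped = lm_step(0.0, alphas, lateral_forces, damping=14.0)    # more cautious step
```

In practice the damping value is adjusted between iterations, decreasing when a step reduces the error and increasing when it does not.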


2017 ◽  
Author(s):  
Monica Yin Chen Li ◽  
David Braze ◽  
Anuenue Kukona ◽  
Clinton L. Johns ◽  
Whitney Tabor ◽  
...  

Many studies have established a link between phonological abilities (indexed by phonological awareness and phonological memory tasks) and typical and atypical reading development. Individuals who perform poorly on phonological assessments have been mostly assumed to have underspecified (or “fuzzy”) phonological representations, with typical phonemic categories, but with greater category overlap due to imprecise encoding. An alternative posits that poor readers have overspecified phonological representations, with speech sounds perceived allophonically (phonetically distinct variants of a single phonemic category). On both accounts, mismatch between phonological categories and orthography leads to reading difficulty. Here, we consider the implications of these accounts for online speech processing. We used eye tracking and an individual differences approach to assess sensitivity to subphonemic detail in a community sample of young adults with a wide range of reading-related skills. Subphonemic sensitivity inversely correlated with meta-phonological task performance, consistent with overspecification.


Author(s):  
M. Yasin Pir ◽  
Mohamad Idris Wani

Speech is a significant means of communication, and the variation in pitch of a speech signal is commonly used to classify the speaker's gender as male or female. In this study, we propose a system for gender classification from speech using a hybrid model that combines the 1-D Stationary Wavelet Transform (SWT) and an artificial neural network. Features such as power spectral density, frequency, and amplitude of human voice samples were used to classify the gender. We use the Daubechies wavelet transform at different levels for decomposition and reconstruction of the signal. The reconstructed signal is fed to a feed-forward artificial neural network for gender classification. This study uses 400 voice samples of both genders from the Michigan University database, sampled at 16000 Hz. The experimental results show that the proposed method has more than 94% classification efficiency for both training and testing datasets.


2021 ◽  
Vol 4 (3) ◽  
pp. 37-41
Author(s):  
Sayora Ibragimova ◽  

This work deals with the basic theory of the wavelet transform and multi-scale analysis of speech signals, and briefly reviews the main differences between the wavelet transform and the Fourier transform in the analysis of speech signals. It discusses the possibilities of using wavelet analysis in speech recognition systems and its main advantages. In most existing systems for speech recognition and analysis, sound is treated as a stream of vectors whose elements are frequency responses. Real-time speech processing with sequential algorithms therefore requires high-performance computing resources. Examples are given of how this method can be used to process speech signals and to build standards for recognition systems.

Key words: digital signal processing, Fourier transform, wavelet analysis, speech signal, wavelet transform

