Short‐Time Spectrum and “Cepstrum” Techniques for Vocal‐Pitch Detection

1964 ◽  
Vol 36 (2) ◽  
pp. 296-302 ◽  
Author(s):  
A. Michael Noll
Author(s):  
Christopher A. Lerch ◽  
Richard H. Lyon

Abstract: A method termed harmonic tracking is developed to recover time-dependent gear motion from machine casing vibration. The harmonic tracking method uses short-time spectral generation and a subsequent set of algorithms to locate and track gear meshing frequencies as functions of time. The meshing frequencies are then integrated with respect to time to obtain the rotation of individual gears. More specifically, spectral generation is performed using the discrete Fourier transform, and the locating and tracking algorithms locate tones in each short-time spectrum and track them through successive spectra to recover gear meshing harmonics. The harmonic tracking method is found to be more robust than demodulation-based methods in the presence of measurement noise and signal distortion from the structural transfer function between gears and the casing. The method is tested through both simulation and experiments involving motor-operated valves (MOVs), as part of the development of a diagnostic system for MOVs. In all cases, the harmonic tracking method is found to recover gear motion with sufficient accuracy to perform diagnostics. The method should be generally applicable to situations in which a non-invasive technique is required for determining the time-dependent angular speeds and displacements of gearbox input, intermediary, and output shafts.
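The integration step described in the abstract can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes the meshing-frequency track has already been recovered at a uniform time step, that the meshing frequency equals the tooth count times the shaft rotation frequency, and it uses a hypothetical function name (`rotation_from_mesh_track`) and parameters (`dt`, `n_teeth`) that do not come from the paper.

```python
def rotation_from_mesh_track(mesh_hz, dt, n_teeth):
    """Cumulative shaft rotation (in revolutions) from a meshing-frequency track.

    mesh_hz : meshing frequency samples in Hz, one per short-time spectrum
    dt      : time between successive samples in seconds (assumed uniform)
    n_teeth : tooth count of the gear on the shaft of interest (assumed known)
    """
    revs = [0.0]
    for f0, f1 in zip(mesh_hz, mesh_hz[1:]):
        # Trapezoidal rule: average meshing frequency over the step,
        # divided by the tooth count to get shaft revolutions.
        revs.append(revs[-1] + 0.5 * (f0 + f1) * dt / n_teeth)
    return revs


if __name__ == "__main__":
    # A constant 100 Hz meshing tone on a 20-tooth gear, tracked for 1 s.
    track = [100.0] * 11
    print(rotation_from_mesh_track(track, dt=0.1, n_teeth=20)[-1])
```

Multiplying the result by 2π gives the angular displacement in radians, and differencing it gives the angular speed.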


1964 ◽  
Vol 36 (5) ◽  
pp. 1030-1030 ◽  
Author(s):  
A. M. Noll ◽  
M. R. Schroeder

2018 ◽  
Vol 210 ◽  
pp. 05010
Author(s):  
Xiaodong Zhuang ◽  
Nikos Mastorakis

A statistical study is carried out on the short-time spectrum of one main category of random signals. For signals with massive and random micro-sources, a new statistical feature of the short-time amplitude spectrum is discovered, which reveals the relationship between the mean of the amplitude and its standard deviation for each frequency component. The association between the amplitude distributions of different frequency components is also studied, and a model representing this association is presented that accords well with the statistical feature discovered. The results have potential applications in signal classification and in the study of the system characteristics underlying the observed signal.
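The kind of per-frequency statistic discussed in this abstract can be computed with a short sketch such as the one below. It is an assumption-laden illustration rather than the authors' procedure: the frame length, hop size, rectangular window, and function names are all hypothetical, and a direct O(N²) DFT is used only to keep the example dependency-free.

```python
import cmath
import math
import statistics


def amplitude_spectrum(frame):
    """Magnitude of the DFT of one frame, for bins 0 .. N/2."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def per_bin_mean_std(signal, frame_len, hop):
    """Mean and standard deviation of the short-time amplitude, per frequency bin."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    spectra = [amplitude_spectrum(signal[s:s + frame_len]) for s in starts]
    bins = list(zip(*spectra))  # regroup: one tuple of amplitudes per bin
    means = [statistics.mean(b) for b in bins]
    stds = [statistics.pstdev(b) for b in bins]
    return means, stds
```

Plotting `stds` against `means` across bins is one way to inspect how the two quantities relate for a given signal.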


Speech is classified as voiced, unvoiced, or silence. Voiced speech is produced by the periodic vibration of the vocal folds. Background noise degrades speech signals. Pitch estimation plays a major role in most speech-related applications and is useful for speaker identification. One way to detect pitch is to use time-domain algorithms. This paper proposes a pitch detection algorithm based on the short-time average magnitude difference function (AMDF) and the short-time autocorrelation function (ACF), and discusses time-domain pitch estimation and detection for different voice samples.
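The two time-domain functions named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm from the paper: it handles a single voiced frame, searches a hypothetical pitch range of 60 to 400 Hz, sums over a fixed number of samples for every lag, and omits voicing decisions and noise handling. The function names are invented for the example.

```python
import math


def _lag_range(fs, fmin, fmax, frame_len):
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    if lag_max >= frame_len:
        raise ValueError("frame too short for the requested minimum pitch")
    return lag_min, lag_max


def acf_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch in Hz from the lag that maximizes the short-time autocorrelation."""
    lag_min, lag_max = _lag_range(fs, fmin, fmax, len(frame))
    n_terms = len(frame) - lag_max  # same number of products for every lag

    def acf(k):
        return sum(frame[n] * frame[n + k] for n in range(n_terms))

    best = max(range(lag_min, lag_max + 1), key=acf)
    return fs / best


def amdf_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch in Hz from the lag that minimizes the average magnitude difference."""
    lag_min, lag_max = _lag_range(fs, fmin, fmax, len(frame))
    n_terms = len(frame) - lag_max

    def amdf(k):
        return sum(abs(frame[n] - frame[n + k]) for n in range(n_terms)) / n_terms

    best = min(range(lag_min, lag_max + 1), key=amdf)
    return fs / best
```

The ACF peaks where the frame best matches a shifted copy of itself, while the AMDF dips there. The AMDF needs only subtractions and absolute values, which is the usual reason for preferring it on constrained hardware.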


2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
Lotfi Salhi ◽  
Adnane Cherif

This paper focuses on a robust feature extraction algorithm for automatic classification of pathological and normal voices in noisy environments. The proposed algorithm is based on human auditory processing and the nonlinear Teager-Kaiser energy operator. The robust features, labeled Teager Energy Cepstrum Coefficients (TECCs), are computed in three steps. First, each speech signal frame is passed through a Gammatone or Mel-scale triangular filter bank. Then, the absolute value of the Teager energy operator of the short-time spectrum is calculated. Finally, the discrete cosine transform of the log-filtered Teager energy spectrum is applied. This feature is proposed for identifying pathological voices using a multilayer perceptron (MLP) neural network. We evaluate the developed method using a mixed voice database composed of recorded voice samples from normophonic and dysphonic speakers. To show the robustness of the proposed feature in detecting pathological voices at different white Gaussian noise levels, we compare its performance with results for clean environments. The experimental results show that TECCs computed from the Gammatone filter bank are more robust in noisy environments than the other extracted features, while their performance remains practically similar to that obtained in clean environments.
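Two of the building blocks named in the three steps can be sketched as below. This is not the TECC pipeline from the paper: the filter bank is omitted, the exact order in which the authors combine the operations is not reproduced, and the function names, the unnormalized DCT-II convention, and the `eps` floor are assumptions made for the example. The discrete Teager-Kaiser operator is taken in its standard form, ψ[x(n)] = x(n)² − x(n−1)·x(n+1).

```python
import math


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def dct_ii(values):
    """Unnormalized DCT-II: X[k] = sum_n v[n] * cos(pi / N * (n + 0.5) * k)."""
    n_vals = len(values)
    return [
        sum(v * math.cos(math.pi / n_vals * (n + 0.5) * k) for n, v in enumerate(values))
        for k in range(n_vals)
    ]


def log_cepstrum(band_energies, n_coeffs, eps=1e-12):
    """DCT of the log of absolute band energies (hypothetical final step).

    band_energies is assumed to hold one Teager energy value per filter-bank band.
    """
    logs = [math.log(abs(e) + eps) for e in band_energies]
    return dct_ii(logs)[:n_coeffs]
```

For a pure tone A·cos(Ωn), the operator returns the constant A²·sin²(Ω), which is why it is often described as tracking both amplitude and frequency.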

