scholarly journals Mel-frequency cepstral coefficients derived using the zero-time windowing spectrum for classification of phonation types in singing

2019 ◽  
Vol 146 (5) ◽  
pp. EL418-EL423
Author(s):  
Sudarsana Reddy Kadiri ◽  
Paavo Alku
2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Ömer Eskidere ◽  
Ahmet Gürhanlı

The Mel Frequency Cepstral Coefficients (MFCCs) are widely used in order to extract essential information from a voice signal and became a popular feature extractor used in audio processing. However, MFCC features are usually calculated from a single window (taper) characterized by large variance. This study shows investigations on reducing variance for the classification of two different voice qualities (normal voice and disordered voice) using multitaper MFCC features. We also compare their performance by newly proposed windowing techniques and conventional single-taper technique. The results demonstrate that adapted weighted Thomson multitaper method could distinguish between normal voice and disordered voice better than the results done by the conventional single-taper (Hamming window) technique and two newly proposed windowing methods. The multitaper MFCC features may be helpful in identifying voices at risk for a real pathology that has to be proven later.


Information ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 23
Author(s):  
Marzena Mięsikowska

The aim of this study was to perform discriminant analysis of voice commands in the presence of an unmanned aerial vehicle equipped with four rotating propellers, as well as to obtain background sound levels and speech intelligibility. The measurements were taken in laboratory conditions in the absence of the unmanned aerial vehicle and the presence of the unmanned aerial vehicle. Discriminant analysis of speech commands (left, right, up, down, forward, backward, start, and stop) was performed based on mel-frequency cepstral coefficients. Ten male speakers took part in this experiment. The unmanned aerial vehicle hovered at a height of 1.8 m during the recordings at a distance of 2 m from the speaker and 0.3 m above the measuring equipment. Discriminant analysis based on mel-frequency cepstral coefficients showed promising classification of speech commands equal to 76.2% for male speakers. Evaluated speech intelligibility during recordings and obtained sound levels in the presence of the unmanned aerial vehicle during recordings did not exclude verbal communication with the unmanned aerial vehicle for male speakers.


Sign in / Sign up

Export Citation Format

Share Document