Vowel Classification Based on Fundamental Frequency and Formant Frequencies

James Hillenbrand; Robert T. Gayvert

doi:10.1044/jshr.3604.694

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3604.694 ◽

1993 ◽

Vol 36 (4) ◽

pp. 694-700 ◽

Cited By ~ 37

Author(s):

James Hillenbrand ◽

Robert T. Gayvert

Keyword(s):

Fundamental Frequency ◽

Spectral Measurements ◽

Linear Discriminant ◽

Formant Frequencies ◽

Linear Frequency ◽

Frequency Scale ◽

Classification Technique ◽

Women And Children ◽

Linear Discriminant Classifier

A quadratic discriminant classification technique was used to classify spectral measurements from vowels spoken by men, women, and children. The parameters used to train the discriminant classifier consisted of various combinations of fundamental frequency and the three lowest formant frequencies. Several nonlinear auditory transforms were evaluated. Unlike previous studies using a linear discriminant classifier, there was no advantage in category separability for any of the nonlinear auditory transforms over a linear frequency scale, and no advantage for spectral distances over absolute frequencies. However, it was found that parameter sets using nonlinear transforms and spectral differences reduced the differences between phonetically equivalent tokens produced by different groups of talkers.
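
The classification step described in this abstract can be illustrated with a minimal sketch, assuming a two-feature case (F1 and F2 only), equal class priors, and a Bark-scale transform as one possible nonlinear auditory transform. The Bark approximation, the helper names, and the toy formant values are all assumptions made for illustration; they are not the study's data, feature sets, or implementation.

```python
import math

def hz_to_bark(f_hz):
    # Traunmüller-style Bark approximation; an assumed choice of auditory transform.
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def to_bark(point):
    return tuple(hz_to_bark(f) for f in point)

def fit_gaussian(rows):
    # Per-class mean and sample covariance (2x2, stored as s11, s12, s22).
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    s11 = sum((r[0] - m1) ** 2 for r in rows) / (n - 1)
    s22 = sum((r[1] - m2) ** 2 for r in rows) / (n - 1)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows) / (n - 1)
    return (m1, m2), (s11, s12, s22)

def qda_score(x, mean, cov):
    # Gaussian log-likelihood up to an additive constant. Each class keeps
    # its own covariance, which is what makes the decision boundary quadratic.
    s11, s12, s22 = cov
    det = s11 * s22 - s12 * s12
    d1, d2 = x[0] - mean[0], x[1] - mean[1]
    mahalanobis = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return -0.5 * math.log(det) - 0.5 * mahalanobis

def classify(x, models):
    # Pick the vowel category with the highest score (equal priors assumed).
    return max(models, key=lambda label: qda_score(x, *models[label]))

# Made-up (F1, F2) values in Hz, for illustration only; not data from the study.
TOY_HZ = {
    "i": [(270, 2290), (310, 2790), (370, 3200), (300, 2500)],
    "a": [(730, 1090), (850, 1220), (1030, 1370), (780, 1150)],
}

models = {
    label: fit_gaussian([to_bark(p) for p in points])
    for label, points in TOY_HZ.items()
}

print(classify(to_bark((290, 2400)), models))
print(classify(to_bark((900, 1260)), models))
```

Dropping the `to_bark` calls gives the linear-frequency variant of the same sketch, so the two representations can be compared on identical tokens.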

Quantitative and Descriptive Comparison of Four Acoustic Analysis Systems: Vowel Measurements

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2013/12-0103) ◽

2014 ◽

Vol 57 (1) ◽

pp. 26-45 ◽

Cited By ~ 24

Author(s):

Carlyn Burris ◽

Houri K. Vorperian ◽

Marios Fourakis ◽

Ray D. Kent ◽

Daniel M. Bolt

Keyword(s):

Fundamental Frequency ◽

Adult Male ◽

Acoustic Analysis ◽

Published Data ◽

Analysis Software ◽

Formant Frequencies ◽

Acoustic Measures ◽

Software Packages ◽

Women And Children

Purpose: This study examines accuracy and comparability of 4 trademarked acoustic analysis software packages (AASPs): Praat, WaveSurfer, TF32, and CSL by using synthesized and natural vowels. Features of AASPs are also described. Method: Synthesized and natural vowels were analyzed using each of the AASP's default settings to secure 9 acoustic measures: fundamental frequency (F0), formant frequencies (F1–F4), and formant bandwidths (B1–B4). The discrepancy between the software measured values and the input values (synthesized, previously reported, and manual measurements) was used to assess comparability and accuracy. Basic AASP features are described. Results: Results indicate that Praat, WaveSurfer, and TF32 generate accurate and comparable F0 and F1–F4 data for synthesized vowels and adult male natural vowels. Results varied by vowel for women and children, with some serious errors. Bandwidth measurements by AASPs were highly inaccurate as compared with manual measurements and published data on formant bandwidths. Conclusions: Values of F0 and F1–F4 are generally consistent and fairly accurate for adult vowels and for some child vowels using the default settings in Praat, WaveSurfer, and TF32. Manipulation of default settings yields improved output values in TF32 and CSL. Caution is recommended especially before accepting F1–F4 results for children and B1–B4 results for all speakers.

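Makhraj Pronunciation of the Smallest Sound Units of Hijaiyah Letters Based on Fundamental Frequency and Formant Frequencies for Quran Reading Instructional Media [original title: PENGUCAPAN MAKHRAJ DARI UNIT BUNYI TERKECIL HURUF HIJAIYAH BERDASARKAN FREKUENSI DASAR DAN FREKUENSI FORMANT UNTUK MEDIA PEMBELAJARAN MEMBACA ALQURAN]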

ALQALAM ◽

10.32678/alqalam.v32i2.552 ◽

2015 ◽

Vol 32 (2) ◽

pp. 284

Author(s):

Muhammad Subali ◽

Miftah Andriansyah ◽

Christanto Sinambela

Keyword(s):

College Student ◽

Fundamental Frequency ◽

Autocorrelation Function ◽

Adult Male ◽

Formant Frequency ◽

Formant Frequencies ◽

Male Speaker ◽

Similarities And Differences

This article examines the similarities and differences in fundamental frequency and formant frequencies between beginner and expert adult male speakers pronouncing hijaiyah letters according to makhraj. The frequencies were obtained with the autocorrelation function and the LPC function in a MATLAB 2012b GUI, and the matching distance between the two groups' sounds was analyzed using the DTW method on the cepstrum. The beginner speakers were college students from Universitas Gunadarma and SITC, aged 22, and their speech was recorded using a MATLAB algorithm in the GUI. The expert speech was taken from previous research; those speakers were 20 to 30 years old when the data were collected. Each sound was processed to extract its fundamental frequency and formant frequencies, which were then compared between beginner and expert speech, together with the matching distance between the two. The results show that beginner and expert speech differ in fundamental frequency and formant frequency for all sounds. The DTW matching distance between beginner and expert speech ranged from 28.9746 to 136.4. Keywords: fundamental frequency, formant frequency, hijaiyah letters, makhraj
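
Two of the techniques named in this abstract, autocorrelation-based fundamental frequency estimation and DTW matching distance, can be sketched briefly. This is a minimal illustration in Python rather than the MATLAB GUI the article describes; the function names, the 75 to 400 Hz search range, and the use of scalar sequences instead of cepstral vectors are all assumptions made for brevity.

```python
import math

def estimate_f0(signal, sample_rate, f_min=75.0, f_max=400.0):
    # Search the lags that correspond to a plausible pitch range and keep
    # the lag with the largest (unnormalized) autocorrelation.
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag

def dtw_distance(a, b):
    # Classic dynamic time warping with an absolute-difference local cost.
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

# Synthetic 125 Hz tone sampled at 8 kHz (0.1 s); a stand-in for a recorded vowel.
SAMPLE_RATE = 8000
tone = [math.sin(2.0 * math.pi * 125.0 * n / SAMPLE_RATE) for n in range(800)]

print(estimate_f0(tone, SAMPLE_RATE))
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

For cepstral features, the local cost in `dtw_distance` would be replaced by a distance between frame vectors (for example Euclidean), with the recursion left unchanged.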

Field Propagation Experiments of Male African Savanna Elephant Rumbles: A Focus on the Transmission of Formant Frequencies

Animals ◽

10.3390/ani8100167 ◽

2018 ◽

Vol 8 (10) ◽

pp. 167 ◽

Cited By ~ 2

Author(s):

Anton Baotic ◽

Maxime Garcia ◽

Markus Boeckle ◽

Angela Stoeger

Keyword(s):

Fundamental Frequency ◽

Vocal Communication ◽

Vocal Tract ◽

Natural Habitat ◽

Ecological Factors ◽

Transmission Efficiency ◽

Long Distance ◽

African Savanna ◽

Formant Frequencies ◽

Resonance Frequencies

African savanna elephants live in dynamic fission–fusion societies and exhibit a sophisticated vocal communication system. Their most frequent call-type is the ‘rumble’, with a fundamental frequency (which refers to the lowest vocal fold vibration rate when producing a vocalization) near or in the infrasonic range. Rumbles are used in a wide variety of behavioral contexts, for short- and long-distance communication, and convey contextual and physical information. For example, maturity (age and size) is encoded in male rumbles by formant frequencies (the resonance frequencies of the vocal tract), having the most informative power. As sound propagates, however, its spectral and temporal structures degrade progressively. Our study used manipulated and resynthesized male social rumbles to simulate large and small individuals (based on different formant values) to quantify whether this phenotypic information efficiently transmits over long distances. To examine transmission efficiency and the potential influences of ecological factors, we broadcasted and re-recorded rumbles at distances of up to 1.5 km in two different habitats at the Addo Elephant National Park, South Africa. Our results show that rumbles were affected by spectral–temporal degradation over distance. Interestingly and unlike previous findings, the transmission of formants was better than that of the fundamental frequency. Our findings demonstrate the importance of formant frequencies for the efficiency of rumble propagation and the transmission of information content in a savanna elephant’s natural habitat.

Associations Between Speaking Fundamental Frequency, Vowel Formant Frequencies, and Listener Perceptions of Speaker Gender and Vocal Femininity–Masculinity

Journal of Speech Language and Hearing Research ◽

10.1044/2021_jslhr-20-00747 ◽

2021 ◽

pp. 1-23

Author(s):

Yeptain Leung ◽

Jennifer Oates ◽

Siew-Pang Chan ◽

Viktória Papp

Keyword(s):

Fundamental Frequency ◽

Structural Equation ◽

Model Building ◽

Principal Component ◽

Equation Modeling ◽

Formant Frequencies ◽

Vowel Formant ◽

Vowel Space ◽

Australian English ◽

Speaking Fundamental Frequency

Purpose: The aim of the study was to examine associations between speaking fundamental frequency (fos), vowel formant frequencies (F), listener perceptions of speaker gender, and vocal femininity–masculinity. Method: An exploratory study was undertaken to examine associations between fos, F1–F3, listener perceptions of speaker gender (nominal scale), and vocal femininity–masculinity (visual analog scale). For 379 speakers of Australian English aged 18–60 years, fos mode and F1–F3 (12 monophthongs; total of 36 Fs) were analyzed on a standard reading passage. Seventeen listeners rated speaker gender and vocal femininity–masculinity on randomized audio recordings of these speakers. Results: Model building using principal component analysis suggested the 36 Fs could be succinctly reduced to seven principal components (PCs). Generalized structural equation modeling (with the seven PCs of F and fos as predictors) suggested that only F2 and fos predicted listener perceptions of speaker gender (male, female, unable to decide). However, listener perceptions of vocal femininity–masculinity behaved differently and were predicted by F1, F3, and the contrast between monophthongs at the extremities of the F1 acoustic vowel space, in addition to F2 and fos. Furthermore, listeners' perceptions of speaker gender also influenced ratings of vocal femininity–masculinity substantially. Conclusion: Adjusted odds ratios highlighted the substantially larger contribution of F to listener perceptions of speaker gender and vocal femininity–masculinity relative to fos than has previously been reported.

Implementation of Linear Discriminant Classifier in 130nm Silicon Process

2018 IEEE International Symposium on Circuits and Systems (ISCAS) ◽

10.1109/iscas.2018.8351829 ◽

2018 ◽

Cited By ~ 1

Author(s):

M. Munir Hasan ◽

Jeremy Holleman

Keyword(s):

Linear Discriminant ◽

Linear Discriminant Classifier

Non-Intrusive Load Disaggregation by Linear Classifier Group Considering Multi-Feature Integration

Applied Sciences ◽

10.3390/app9173558 ◽

2019 ◽

Vol 9 (17) ◽

pp. 3558 ◽

Cited By ~ 3

Author(s):

Jinying Yu ◽

Yuchen Gao ◽

Yuxin Wu ◽

Dian Jiao ◽

Chang Su ◽

...

Keyword(s):

Identification Accuracy ◽

Data Set ◽

Linear Discriminant ◽

Practical Applications ◽

Core Technology ◽

Open Source Data ◽

Source Data ◽

Load Monitoring ◽

Global Similarity ◽

Linear Discriminant Classifier

Non-intrusive load monitoring (NILM) is a core technology for demand response (DR) and energy conservation services. Traditional NILM methods are rarely combined with practical applications, and most studies aim to disaggregate all of the loads in a household, which leads to low identification accuracy. In the proposed method, event detection is used to obtain the switching event sets of all loads, and the power consumption curves of independent unknown electrical appliances in a period are disaggregated by utilizing comprehensive features. A linear discriminant classifier group based on multi-feature global similarity is used for load identification. The novelty of the algorithm lies in its event detector based on steady-state segmentation and its linear discriminant classifier group based on multi-feature global similarity. The simulation is carried out on an open source data set. The results demonstrate the effectiveness and high accuracy of the multi-feature integrated classification (MFIC) algorithm, using state-of-the-art NILM methods as benchmarks.

Acoustic Integrity of Speech Production in Children With Moderate and Severe Hearing Impairment

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3501.88 ◽

1992 ◽

Vol 35 (1) ◽

pp. 88-95 ◽

Cited By ~ 23

Author(s):

John Ryalls ◽

Annie Larouche

Keyword(s):

Hearing Impairment ◽

Speech Production ◽

Fundamental Frequency ◽

Voice Onset Time ◽

Total Duration ◽

Onset Time ◽

Hearing Impaired ◽

Formant Frequencies ◽

Standard Deviations

Ten normally hearing and 10 age-matched subjects with moderate-to-severe hearing impairment were recorded producing a protocol of 18 basic syllables [/pi/, /pa/, /pu/; /bi/, /ba/, /bu/; /ti/, /ta/, /tu/; /di/, /da/, /du/; /ki/, /ka/, /ku/; /gi/, /ga/, /gu/] repeated five times. The resulting 90 syllables were digitized and measured for (a) total duration; (b) voice-onset time (VOT) of the initial consonant; (c) fundamental frequency (F0) at midpoint of vowel; and (d) formant frequencies (F1, F2, F3), also measured at midpoint of vowel. Statistical comparisons were conducted on (a) average values for each syllable, and (b) standard deviations. Although there were numerical differences between normally hearing and hearing-impaired groups, few differences were statistically significant.

Stress Emotion Recognition Based on RSP and EMG Signals

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.709.827 ◽

2013 ◽

Vol 709 ◽

pp. 827-831 ◽

Cited By ~ 9

Author(s):

Chang Zhi Wei

Keyword(s):

Emotion Recognition ◽

Stress Level ◽

Detection Rate ◽

Physiological Signals ◽

Correct Detection ◽

Fisher Linear Discriminant ◽

Linear Discriminant ◽

Emg Signal ◽

Detection Of Stress ◽

Linear Discriminant Classifier

To recognize stress, a subject was placed alternately in periods of high and low stress by adjusting the speed and difficulty of the game Tetris. The respiration (RSP) and electromyogram (EMG) signals at different stress levels were then acquired. After preprocessing, mathematical features were calculated, and automatic detection of stress level was implemented with a Fisher linear discriminant classifier. The results show that the average correct detection rate of stress level based on the EMG signal can reach 97.8%, whereas that based on the RSP signal is only 86.7%. The EMG signal is therefore more effective than the RSP signal for detecting stress level. Combining multiple physiological signals can effectively improve the correct detection rate.
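
The Fisher linear discriminant step mentioned in this abstract can be sketched for a two-class, two-feature case. This is only an illustration under stated assumptions: the two features and their values are hypothetical placeholders, not the features or measurements used in the paper, and the threshold is simply placed midway between the projected class means.

```python
def class_mean(rows):
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

def scatter(rows, mean):
    # Within-class scatter entries (s11, s12, s22) for one class.
    s11 = sum((r[0] - mean[0]) ** 2 for r in rows)
    s22 = sum((r[1] - mean[1]) ** 2 for r in rows)
    s12 = sum((r[0] - mean[0]) * (r[1] - mean[1]) for r in rows)
    return s11, s12, s22

def fit_fisher(class0, class1):
    # Fisher direction w = Sw^-1 (m1 - m0), with Sw the pooled within-class scatter.
    m0, m1 = class_mean(class0), class_mean(class1)
    a = scatter(class0, m0)
    b = scatter(class1, m1)
    s11, s12, s22 = a[0] + b[0], a[1] + b[1], a[2] + b[2]
    det = s11 * s22 - s12 * s12
    d0, d1 = m1[0] - m0[0], m1[1] - m0[1]
    w = ((s22 * d0 - s12 * d1) / det, (-s12 * d0 + s11 * d1) / det)
    # Assumed decision rule: threshold halfway between the projected means.
    threshold = 0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
    return w, threshold

def predict(x, w, threshold):
    # Returns 1 (high stress) if the projection exceeds the threshold, else 0.
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0

# Hypothetical (feature_a, feature_b) pairs, for illustration only.
LOW_STRESS = [(0.20, 1.0), (0.30, 1.4), (0.25, 0.9), (0.35, 1.2)]
HIGH_STRESS = [(0.80, 2.0), (0.90, 2.6), (0.70, 2.1), (1.00, 2.4)]

w, threshold = fit_fisher(LOW_STRESS, HIGH_STRESS)
print(predict((0.85, 2.3), w, threshold))
print(predict((0.30, 1.1), w, threshold))
```

Because the pooled scatter matrix is positive definite for non-degenerate data, the high-stress class always projects above the low-stress class along `w`, which is why the rule compares against a single threshold.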

The Identification of a Speaker's Sex from Synthesized Vowels

Perceptual and Motor Skills ◽

10.2466/pms.1998.87.2.595 ◽

1998 ◽

Vol 87 (2) ◽

pp. 595-600 ◽

Cited By ~ 22

Author(s):

S. P. Whiteside

Keyword(s):

Fundamental Frequency ◽

Test Scores ◽

Test Analysis ◽

Listening Test ◽

Formant Frequencies ◽

Perceptual Salience ◽

Fundamental Frequencies

This experiment assessed whether fundamental frequency or formant frequencies have more perceptual salience in the identification of the sex of the speaker from synthesized vowels. Four sets of ten vowels were synthesized by combining fundamental frequencies and formant frequencies in different permutations. Fifty listeners took part in a listening test. Analysis of the listening test scores suggested that for 36 vowels, the fundamental frequency (F0) was probably the most salient perceptual cue. For the remaining four vowels, however, this was not the case, as either the formant frequencies or the onset-offset patterns of the F0 appeared to have some perceptual salience.

Improving the Performance of Two-state Mental Task Brain-Computer Interface Design Using Linear Discriminant Classifier

EUROCON 2005 - The International Conference on "Computer as a Tool" ◽

10.1109/eurcon.2005.1629949 ◽

2005 ◽

Author(s):

R. Palaniappan ◽

Nai-Jen Huan

Keyword(s):

Interface Design ◽

Brain Computer Interface ◽

Computer Interface ◽

Mental Task ◽

Linear Discriminant ◽

State Mental ◽

Linear Discriminant Classifier
