Experiment on investigating vocal cord functioning via the speech signal

2002 ◽  
Vol 48 (4) ◽  
pp. 497-501
Author(s):  
A. V. Nikolaev
2021 ◽  
Author(s):  
Olga A. Loskutova ◽  
Anastasia V. Nenko ◽  
Yana A. Berg ◽  
Daria V. Borovikova ◽  
Anton V. Yupashevsky

2003 ◽  
Vol 60 (2) ◽  
pp. 155-159 ◽  
Author(s):  
Jovisa Obrenovic ◽  
Milkica Nesic ◽  
Vladimir Nesic ◽  
Snezana Cekic

The influence of intensive acute hypoxia on the frequency and amplitude characteristics of the formants of the vowel O was investigated in this study. Examinees were exposed to simulated altitudes of 5,500 m and 6,700 m in a barochamber and performed Lotig's test as under normoxia, i.e. pronounced three-digit numbers counting down from 900. Frequency and intensity values of the vowel O formants (F1, F2, F3 and F4), extracted from pronunciations of the word eight (osam in Serbian), were measured by spectral analysis of the speech signal, and changes in the frequency values and intensities of the formants were examined. The results showed no significant changes in formant frequencies under hypoxia compared to normoxia; however, significant changes in formant intensities were found at both altitudes. At 5,500 m the formant intensities rose. At 6,700 m, hypoxia caused a significant fall of the intensities in the initial period compared to normoxia, while prolonged exposure raised the formant intensities relative to the 5,500 m condition. It may be concluded that hypoxia at different altitudes has different effects on formant structure compared to normoxia.
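The formant measurements described above can be illustrated with a minimal spectral-peak sketch. This is not the authors' analysis pipeline (real formant tracking normally uses LPC on recorded vowels); here a synthetic signal with assumed resonance frequencies stands in for the vowel O.

```python
import numpy as np

def formant_peaks(signal, fs, n_peaks=4):
    """Crude formant-like estimate: the n_peaks strongest local maxima
    of the windowed magnitude spectrum, returned in ascending frequency."""
    spec = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    logspec = np.log(spec + 1e-12)
    # indices of local maxima of the log spectrum
    peaks = [i for i in range(1, len(logspec) - 1)
             if logspec[i] > logspec[i - 1] and logspec[i] > logspec[i + 1]]
    peaks.sort(key=lambda i: logspec[i], reverse=True)
    return sorted(freqs[i] for i in peaks[:n_peaks])

fs = 16000
t = np.arange(0, 0.05, 1 / fs)
# synthetic "vowel" with assumed resonances near 700, 1200, 2500, 3400 Hz
sig = sum(np.sin(2 * np.pi * f * t) for f in (700, 1200, 2500, 3400))
print([round(f) for f in formant_peaks(sig, fs)])
```

The corresponding formant intensities are simply the spectral magnitudes at those peak frequencies, which is what the study compares between normoxia and hypoxia.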


Author(s):  
Shibanee Dash ◽  
Mihir Narayan Mohanty

Modern wireless communication has advanced considerably compared with earlier systems, and speech communication is a major focus of research across its applications. In this work we choose an OFDM-based communication system, since OFDM is important on both licensed and unlicensed wireless platforms. The voice signal is passed through the proposed model and recovered at the receiver end. Under various channel conditions the signal may be partially corrupted at the user end, so the authors aim to achieve better reception using a radial basis function network (RBFN). The parameters chosen for the RBFN model are energy, zero-crossing rate (ZCR), autocorrelation (ACF), and the fundamental frequency of the speech signal. On the one hand these parameters can partially eliminate noise; on the other hand the RBFN model with these parameters proves effective for noisy speech transmitted over a noisy (Gaussian) channel. The efficiency of the OFDM model is verified in terms of symbol error rate, and the transmitted speech signal is evaluated in terms of SNR, which shows the reduction of noise. For visual inspection, samples of the clean, noisy, and received signals are also shown. The experiment is performed at 5 dB, 10 dB, and 15 dB noise levels. The results demonstrate the performance of the RBFN model as a filter, measured by the listener's perception in each condition, and show that in noisy environments the proposed technique improves intelligibility and speech quality.
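The four per-frame parameters the abstract feeds into the RBFN can be sketched as follows. This is an illustrative feature extractor, not the paper's implementation; the frame length and pitch-search range are assumptions.

```python
import numpy as np

def frame_features(frame, fs):
    """Energy, zero-crossing rate, autocorrelation peak, and a crude
    autocorrelation-based fundamental-frequency estimate for one frame."""
    energy = np.sum(frame ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2   # crossings per sample
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    acf = acf / acf[0]                                   # normalize so acf[0] = 1
    lag_min, lag_max = int(fs / 400), int(fs / 50)       # search 50-400 Hz
    lag = lag_min + np.argmax(acf[lag_min:lag_max])
    return energy, zcr, acf[lag], fs / lag

fs = 8000
t = np.arange(0, 0.064, 1 / fs)          # one 512-sample frame
frame = np.sin(2 * np.pi * 100 * t)      # 100 Hz tone as a stand-in for speech
e, z, a, f0 = frame_features(frame, fs)
print(round(f0))
```

Stacking these four numbers per frame gives the input vector the RBFN would be trained on.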


2013 ◽  
Vol 278-280 ◽  
pp. 1124-1128
Author(s):  
Yi Long You ◽  
Fei Zhang ◽  
Bu Lei Zuo ◽  
Feng Xiang You

Although traditional algorithms can suppress noise in the voice signal, distortion of the voice is inevitable. This paper introduces speech signal enhancement with an improved thresholding method. By comparing MATLAB simulations of the proposed method against a traditional enhancement algorithm, the paper verifies that the method can effectively remove noise from the signal, enhance voice quality, improve speech intelligibility, and achieve the intended speech-enhancement effect.
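To make the thresholding idea concrete, here is a minimal soft-thresholding denoiser in the spectral domain. The abstract does not specify the authors' improved method, so this sketch only shows the general technique; the median-based threshold rule is an assumed heuristic.

```python
import numpy as np

def soft_threshold_denoise(x, thresh):
    """Shrink every FFT magnitude toward zero by `thresh`; components
    that fall below the threshold (mostly noise) are removed entirely."""
    X = np.fft.rfft(x)
    shrunk = np.maximum(np.abs(X) - thresh, 0.0)
    return np.fft.irfft(shrunk * np.exp(1j * np.angle(X)), n=len(x))

rng = np.random.default_rng(0)
fs = 8000
t = np.arange(0, 0.25, 1 / fs)
clean = np.sin(2 * np.pi * 440 * t)                  # stand-in for clean speech
noisy = clean + 0.3 * rng.standard_normal(len(t))
# assumed threshold: a multiple of the median coefficient magnitude
denoised = soft_threshold_denoise(noisy, 3 * np.median(np.abs(np.fft.rfft(noisy))))
```

The trade-off the abstract mentions is visible here: shrinking all coefficients removes noise but also slightly attenuates the speech components, which is the distortion an improved threshold rule tries to reduce.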


Author(s):  
M. S. Heetha ◽  
M. Shenbagapriya ◽  
M. Bharanidharan

Visually impaired people face many challenges in society; in particular, students with visual impairments face unique challenges in the educational environment. They struggle to access information, so to overcome this obstacle to reading and to allow visually impaired students to fully access and participate in the curriculum with the greatest possible independence, a Braille transliteration system using VLSI is designed. Braille input is given to an FPGA Virtex-4 kit via a Braille keyboard. The Braille is converted into English by decoding logic written in VHDL/Verilog, and the corresponding letter is then converted into a speech signal by the algorithm; a speaker provides the voice output. This project helps visually impaired people become literate, and because the user receives a confirmation of what is being typed each time a character is pressed, it prevents the occurrence of mistakes.
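The decoding logic described above is implemented in VHDL/Verilog on the FPGA; the same table-lookup idea can be sketched in Python. A Braille cell has six dots (two columns of three, numbered 1-3 left, 4-6 right), which maps naturally to a 6-bit pattern, here with dot 1 as the least significant bit. Only the first ten letters are shown.

```python
# Braille cell -> letter lookup (dot 1 = bit 0, ..., dot 6 = bit 5).
BRAILLE = {
    0b000001: "a", 0b000011: "b", 0b001001: "c", 0b011001: "d",
    0b010001: "e", 0b001011: "f", 0b011011: "g", 0b010011: "h",
    0b001010: "i", 0b011010: "j",
}

def decode(cells):
    """Map a sequence of 6-bit Braille cells to English letters;
    unknown patterns become '?'."""
    return "".join(BRAILLE.get(c, "?") for c in cells)

print(decode([0b000011, 0b000001, 0b011001]))  # "bad"
```

In the actual system this lookup is a combinational decoder in hardware, and the resulting letter is handed to the text-to-speech stage rather than printed.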


2011 ◽  
Vol 267 ◽  
pp. 762-767
Author(s):  
Ji Xiang Lu ◽  
Ping Wang ◽  
Hong Zhong Shi ◽  
Xin Wang

As a primary research area of multimodal human-computer interaction, voice interaction mainly involves extraction and identification of the natural speech signal, where the former provides the reliable signal sources that are analyzed by the latter. This paper studies multichannel speech enhancement technology aimed at voice interaction. Simulation results show the effectiveness and superiority of the improved algorithm proposed in the paper.
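The abstract does not describe the improved algorithm, so as background here is the simplest multichannel enhancement scheme, delay-and-sum: align the microphone signals and average them, which keeps the coherent speech and reduces uncorrelated noise power by roughly a factor of the channel count. The delays are assumed known for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
fs, n = 8000, 4000
t = np.arange(n) / fs
speech = np.sin(2 * np.pi * 300 * t)       # stand-in for the speech source

delays = [0, 3, 5, 8]                      # per-microphone delays in samples
mics = [np.roll(speech, d) + 0.5 * rng.standard_normal(n) for d in delays]

# delay-and-sum: undo each known delay, then average across channels
aligned = [np.roll(m, -d) for m, d in zip(mics, delays)]
enhanced = np.mean(aligned, axis=0)
```

With four channels the residual noise power drops to about a quarter of a single microphone's; practical systems must additionally estimate the delays (e.g. by cross-correlation), which is where improved algorithms differ.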


2012 ◽  
Vol 58 (2) ◽  
pp. 165-170 ◽  
Author(s):  
Dorota Kamińska ◽  
Adam Pelikant

Recognition of Human Emotion from a Speech Signal Based on Plutchik's Model

Machine recognition of human emotional states is an essential part of improving man-machine interaction. During expressive speech the voice conveys the semantic message as well as information about the emotional state of the speaker. The pitch contour is one of the most significant properties of speech affected by the emotional state, so pitch features have commonly been used in systems for automatic emotion detection. In this work, different intensities of emotions and their influence on pitch features have been studied; this understanding is important for developing such a system. Intensities of emotions are represented on Plutchik's cone-shaped 3D model. The k-Nearest Neighbor algorithm has been used for classification, divided into two stages: first the primary emotion is detected, then its intensity is specified. The results show that the recognition accuracy of the system is over 50% for primary emotions and over 70% for their intensities.
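A toy version of the kNN stage can be sketched over pitch-contour statistics of the kind the abstract attributes emotion information to. The feature values and labels below are invented purely for illustration, and the feature set (mean F0, F0 range, F0 standard deviation) is an assumption.

```python
import numpy as np

def knn_predict(x, X, y, k=3):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    dist = np.linalg.norm(X - x, axis=1)
    nearest = y[np.argsort(dist)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

# fabricated toy features: [mean F0, F0 range, F0 std] in Hz
X = np.array([[120, 30, 8], [115, 25, 7],        # low-arousal examples
              [220, 120, 40], [240, 140, 45]])   # high-arousal examples
y = np.array(["calm", "calm", "excited", "excited"])
print(knn_predict(np.array([230, 130, 42]), X, y))
```

The paper's two-stage scheme would run such a classifier twice: once over primary-emotion classes, then again over intensity levels within the detected emotion.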


Author(s):  
Dea Sifana Ramadhina ◽  
Rita Magdalena ◽  
Sofia Saidah

Voice is one of the parameters in the process of identifying a person. From the voice, information such as gender, age, and even the identity of the speaker can be obtained. Speaker recognition is a method for narrowing down crimes and frauds committed by voice, minimizing the chance of someone faking an identity. The Mel Frequency Cepstral Coefficient (MFCC) method can be used in the speech recognition system: feature extraction of the speech signal using MFCC produces acoustic features of the speech signal. For classification, Hidden Markov Models (HMM) are used to match an unidentified speaker's voice with the voices in the database. In this research, the system is used to verify speakers on 15 text-dependent utterances in Indonesian. When testing speakers against the same database, the highest accuracy is 99.16%.
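The MFCC front end follows a standard pipeline: windowed FFT, mel filterbank, log, then a DCT to decorrelate. Below is a bare-bones single-frame version for illustration (not the authors' implementation); frame size, filterbank size, and coefficient count are typical assumed defaults.

```python
import numpy as np

def mfcc(signal, fs, n_fft=512, n_mels=26, n_ceps=13):
    """Single-frame MFCC sketch: window -> power spectrum -> mel
    filterbank -> log -> DCT-II, returning n_ceps coefficients."""
    frame = signal[:n_fft] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frame)) ** 2 / n_fft

    # triangular mel filterbank
    def hz2mel(f): return 2595 * np.log10(1 + f / 700)
    def mel2hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz2mel(0), hz2mel(fs / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel2hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)

    logmel = np.log(fbank @ power + 1e-10)
    # DCT-II of the log mel energies gives the cepstral coefficients
    k = np.arange(n_ceps)[:, None] * (np.arange(n_mels) + 0.5)[None, :]
    return np.cos(np.pi * k / n_mels) @ logmel

fs = 16000
t = np.arange(0, 0.1, 1 / fs)
coeffs = mfcc(np.sin(2 * np.pi * 440 * t), fs)
print(coeffs.shape)  # (13,)
```

In the full system, sequences of such vectors (one per frame) form the observation stream that the HMM scores against each enrolled speaker's model.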


Author(s):  
Eydis Huld Magnusdottir ◽  
Michal Borsky ◽  
Manuela Meier ◽  
Kamilla Johannsdottir ◽  
Jon Gudnason

Monitoring cognitive workload from speech signals has received much attention from researchers in the past few years, as it has the potential to improve performance and fidelity in human decision making. The bulk of the research has focused on classifying speech from talkers participating in cognitive workload experiments using simple reading tasks, memory span tests, and the Stroop test, typically into three levels of low, medium, and high cognitive workload. This study focuses on using parameters extracted from the vocal tract and voice source components of the speech signal for cognitive workload monitoring. The experiment used in this study contains 98 participants; the workload levels were induced by a reading task and three Stroop tasks, randomly ordered for each participant, with adequate rest time between tasks to keep the cognitive workload of one task from affecting the next. Vocal tract features were obtained from the first three formants, and voice source features were extracted by signal analysis of the inverse-filtered speech signal. The results show that on their own, the vocal tract features outperform the voice source features: a misclassification rate (MCR) of 33.92% ± 1.05 was achieved with an SVM classifier. A weighted combination of vocal tract and voice source features, classified with SVM classifiers fused at the output level, achieved the lowest MCR of 32.5%.
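Output-level fusion of the two classifiers can be sketched as a weighted combination of their per-class scores. The abstract does not give the exact weighting rule, so the fixed weight below is an assumption; the score values are invented for illustration.

```python
import numpy as np

def fuse(scores_vt, scores_vs, w=0.7):
    """Weighted output-level fusion of vocal-tract and voice-source
    classifier scores (w is an assumed weight favoring the vocal tract,
    which performed better on its own)."""
    return w * scores_vt + (1 - w) * scores_vs

vt = np.array([0.2, 0.5, 0.3])   # toy class posteriors, vocal-tract SVM
vs = np.array([0.4, 0.3, 0.3])   # toy class posteriors, voice-source SVM
fused = fuse(vt, vs)
print(int(np.argmax(fused)))     # predicted workload level (0=low, 1=med, 2=high)
```

The fused decision is the argmax of the combined scores; in practice the weight would be tuned on held-out data.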


2021 ◽  
Vol 58 (2) ◽  
pp. 6497-6501
Author(s):  
N. Mekebayev ◽  
O. Mamyrbayev ◽  
M. Turdalyuly ◽  
D. Oralbekova ◽  
M. Tasbolatov

Digital processing of the speech signal and the voice recognition algorithm are very important for fast and accurate automatic scoring in recognition technology. A voice is a signal of effectively infinite information content, and the direct analysis and synthesis of the complex speech signal is difficult because the information is contained within the signal itself. Speech is the most natural way for people to communicate, and the task of speech recognition is to convert speech into a sequence of words using a computer program. This article presents an algorithm for extracting MFCCs for speech recognition; the proposed MFCC algorithm reduces the required processing power by 53% compared to the conventional algorithm. Automatic speech recognition was carried out using MATLAB.
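Before MFCC extraction, the speech signal is conventionally pre-emphasized and cut into short overlapping frames. The sketch below shows these standard front-end steps (they are not the paper's specific code, and the 25 ms / 10 ms framing at 16 kHz is an assumed convention).

```python
import numpy as np

def preemphasize(x, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1],
    boosting high frequencies before spectral analysis."""
    return np.append(x[0], x[1:] - alpha * x[:-1])

def frames(x, size=400, step=160):
    """Slice the signal into overlapping frames: 25 ms windows
    every 10 ms at a 16 kHz sampling rate."""
    n = 1 + (len(x) - size) // step
    return np.stack([x[i * step : i * step + size] for i in range(n)])

fs = 16000
x = np.sin(2 * np.pi * 200 * np.arange(fs) / fs)  # 1 s tone as a stand-in
f = frames(preemphasize(x))
print(f.shape)  # (98, 400)
```

Each row of the resulting matrix is one analysis frame, from which MFCC vectors are then computed.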

