Experiment on investigating vocal cord functioning via the speech signal

2002 ◽  
Vol 48 (4) ◽  
pp. 497-501
Author(s):  
A. V. Nikolaev
2021 ◽  
Author(s):  
Olga A. Loskutova ◽  
Anastasia V. Nenko ◽  
Yana A. Berg ◽  
Daria V. Borovikova ◽  
Anton V. Yupashevsky

2003 ◽  
Vol 60 (2) ◽  
pp. 155-159 ◽  
Author(s):  
Jovisa Obrenovic ◽  
Milkica Nesic ◽  
Vladimir Nesic ◽  
Snezana Cekic

The influence of intensive acute hypoxia on the frequency and amplitude characteristics of the formants of the vowel O was investigated in this study. Examinees were exposed to simulated altitudes of 5,500 m and 6,700 m in a barochamber and performed Lotig's test as under normoxia, i.e. pronounced three-digit numbers counting down from 900. Frequency and intensity values of the vowel O formants (F1, F2, F3 and F4), extracted from pronunciations of the word eight (osam in Serbian), were measured by spectral analysis of the speech signal, and changes in the frequency values and intensities of the formants were examined. The results showed no significant changes in formant frequencies under hypoxia compared to normoxia; however, significant changes in formant intensities were found at both altitudes. At 5,500 m the formant intensities rose. At 6,700 m, hypoxia caused a significant fall of the intensities in the initial period compared to normoxia, while prolonged exposure raised the formant intensities relative to the 5,500 m condition. It may be concluded that hypoxia at different altitudes has different effects on formant structure compared to normoxia.
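The formant measurements described above can be illustrated with a minimal spectral-peak sketch. This is not the authors' analysis pipeline (real formant tracking normally uses LPC on recorded vowels); here a synthetic signal with assumed resonance frequencies stands in for the vowel O.

```python
import numpy as np

def formant_peaks(signal, fs, n_peaks=4):
    """Crude formant-like estimate: the n_peaks strongest local maxima
    of the windowed magnitude spectrum, returned in ascending frequency."""
    spec = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    logspec = np.log(spec + 1e-12)
    # indices of local maxima of the log spectrum
    peaks = [i for i in range(1, len(logspec) - 1)
             if logspec[i] > logspec[i - 1] and logspec[i] > logspec[i + 1]]
    peaks.sort(key=lambda i: logspec[i], reverse=True)
    return sorted(freqs[i] for i in peaks[:n_peaks])

fs = 16000
t = np.arange(0, 0.05, 1 / fs)
# synthetic "vowel" with assumed resonances near 700, 1200, 2500, 3400 Hz
sig = sum(np.sin(2 * np.pi * f * t) for f in (700, 1200, 2500, 3400))
print([round(f) for f in formant_peaks(sig, fs)])
```

The corresponding formant intensities are simply the spectral magnitudes at those peak frequencies, which is what the study compares between normoxia and hypoxia.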


Author(s):  
Shibanee Dash ◽  
Mihir Narayan Mohanty

Modern wireless communication has advanced considerably compared with earlier systems, and speech communication is a major focus of research across its applications. In this work we choose an OFDM-based communication system, since OFDM is important on both licensed and unlicensed wireless platforms. The voice signal is passed through the proposed model and recovered at the receiver end. Under various channel conditions the signal may be partially corrupted at the user end, so the authors aim to achieve better reception using a radial basis function network (RBFN). The parameters chosen for the RBFN model are energy, zero-crossing rate (ZCR), autocorrelation (ACF), and the fundamental frequency of the speech signal. On the one hand these parameters can partially eliminate noise; on the other hand the RBFN model with these parameters proves effective for noisy speech transmitted over a noisy (Gaussian) channel. The efficiency of the OFDM model is verified in terms of symbol error rate, and the transmitted speech signal is evaluated in terms of SNR, which shows the reduction of noise. For visual inspection, samples of the clean, noisy, and received signals are also shown. The experiment is performed at 5 dB, 10 dB, and 15 dB noise levels. The results demonstrate the performance of the RBFN model as a filter, measured by the listener's perception in each condition, and show that in noisy environments the proposed technique improves intelligibility and speech quality.
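The four per-frame parameters the abstract feeds into the RBFN can be sketched as follows. This is an illustrative feature extractor, not the paper's implementation; the frame length and pitch-search range are assumptions.

```python
import numpy as np

def frame_features(frame, fs):
    """Energy, zero-crossing rate, autocorrelation peak, and a crude
    autocorrelation-based fundamental-frequency estimate for one frame."""
    energy = np.sum(frame ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2   # crossings per sample
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    acf = acf / acf[0]                                   # normalize so acf[0] = 1
    lag_min, lag_max = int(fs / 400), int(fs / 50)       # search 50-400 Hz
    lag = lag_min + np.argmax(acf[lag_min:lag_max])
    return energy, zcr, acf[lag], fs / lag

fs = 8000
t = np.arange(0, 0.064, 1 / fs)          # one 512-sample frame
frame = np.sin(2 * np.pi * 100 * t)      # 100 Hz tone as a stand-in for speech
e, z, a, f0 = frame_features(frame, fs)
print(round(f0))
```

Stacking these four numbers per frame gives the input vector the RBFN would be trained on.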


2013 ◽  
Vol 278-280 ◽  
pp. 1124-1128
Author(s):  
Yi Long You ◽  
Fei Zhang ◽  
Bu Lei Zuo ◽  
Feng Xiang You

Although traditional algorithms can suppress noise in the voice signal, distortion of the voice is inevitable. This paper introduces speech signal enhancement with an improved thresholding method. By comparing MATLAB simulations of the proposed method against a traditional enhancement algorithm, the paper verifies that the method can effectively remove noise from the signal, enhance voice quality, improve speech intelligibility, and achieve the intended speech-enhancement effect.
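To make the thresholding idea concrete, here is a minimal soft-thresholding denoiser in the spectral domain. The abstract does not specify the authors' improved method, so this sketch only shows the general technique; the median-based threshold rule is an assumed heuristic.

```python
import numpy as np

def soft_threshold_denoise(x, thresh):
    """Shrink every FFT magnitude toward zero by `thresh`; components
    that fall below the threshold (mostly noise) are removed entirely."""
    X = np.fft.rfft(x)
    shrunk = np.maximum(np.abs(X) - thresh, 0.0)
    return np.fft.irfft(shrunk * np.exp(1j * np.angle(X)), n=len(x))

rng = np.random.default_rng(0)
fs = 8000
t = np.arange(0, 0.25, 1 / fs)
clean = np.sin(2 * np.pi * 440 * t)                  # stand-in for clean speech
noisy = clean + 0.3 * rng.standard_normal(len(t))
# assumed threshold: a multiple of the median coefficient magnitude
denoised = soft_threshold_denoise(noisy, 3 * np.median(np.abs(np.fft.rfft(noisy))))
```

The trade-off the abstract mentions is visible here: shrinking all coefficients removes noise but also slightly attenuates the speech components, which is the distortion an improved threshold rule tries to reduce.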


Author(s):  
M. S. Heetha ◽  
M. Shenbagapriya ◽  
M. Bharanidharan

Visually impaired people face many challenges in society; in particular, students with visual impairments face unique challenges in the educational environment. They struggle to access information, so to overcome this obstacle to reading and to allow visually impaired students to fully access and participate in the curriculum with the greatest possible independence, a Braille transliteration system using VLSI is designed. Braille input is given to an FPGA Virtex-4 kit via a Braille keyboard. The Braille is converted into English by decoding logic written in VHDL/Verilog, and the corresponding letter is then converted into a speech signal by the algorithm; a speaker provides the voice output. This project helps visually impaired people become literate, and because the user receives a confirmation of what is being typed each time a character is pressed, it prevents the occurrence of mistakes.
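The decoding logic described above is implemented in VHDL/Verilog on the FPGA; the same table-lookup idea can be sketched in Python. A Braille cell has six dots (two columns of three, numbered 1-3 left, 4-6 right), which maps naturally to a 6-bit pattern, here with dot 1 as the least significant bit. Only the first ten letters are shown.

```python
# Braille cell -> letter lookup (dot 1 = bit 0, ..., dot 6 = bit 5).
BRAILLE = {
    0b000001: "a", 0b000011: "b", 0b001001: "c", 0b011001: "d",
    0b010001: "e", 0b001011: "f", 0b011011: "g", 0b010011: "h",
    0b001010: "i", 0b011010: "j",
}

def decode(cells):
    """Map a sequence of 6-bit Braille cells to English letters;
    unknown patterns become '?'."""
    return "".join(BRAILLE.get(c, "?") for c in cells)

print(decode([0b000011, 0b000001, 0b011001]))  # "bad"
```

In the actual system this lookup is a combinational decoder in hardware, and the resulting letter is handed to the text-to-speech stage rather than printed.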


2011 ◽  
Vol 267 ◽  
pp. 762-767
Author(s):  
Ji Xiang Lu ◽  
Ping Wang ◽  
Hong Zhong Shi ◽  
Xin Wang

As a primary research area of multimodal human-computer interaction, voice interaction mainly involves extraction and identification of the natural speech signal, where the former provides the reliable signal sources that are analyzed by the latter. This paper studies multichannel speech enhancement technology aimed at voice interaction. Simulation results show the effectiveness and superiority of the improved algorithm proposed in the paper.
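The abstract does not describe the improved algorithm, so as background here is the simplest multichannel enhancement scheme, delay-and-sum: align the microphone signals and average them, which keeps the coherent speech and reduces uncorrelated noise power by roughly a factor of the channel count. The delays are assumed known for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
fs, n = 8000, 4000
t = np.arange(n) / fs
speech = np.sin(2 * np.pi * 300 * t)       # stand-in for the speech source

delays = [0, 3, 5, 8]                      # per-microphone delays in samples
mics = [np.roll(speech, d) + 0.5 * rng.standard_normal(n) for d in delays]

# delay-and-sum: undo each known delay, then average across channels
aligned = [np.roll(m, -d) for m, d in zip(mics, delays)]
enhanced = np.mean(aligned, axis=0)
```

With four channels the residual noise power drops to about a quarter of a single microphone's; practical systems must additionally estimate the delays (e.g. by cross-correlation), which is where improved algorithms differ.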


2012 ◽  
Vol 58 (2) ◽  
pp. 165-170 ◽  
Author(s):  
Dorota Kamińska ◽  
Adam Pelikant

Recognition of Human Emotion from a Speech Signal Based on Plutchik's Model

Machine recognition of human emotional states is an essential part of improving man-machine interaction. During expressive speech the voice conveys the semantic message as well as information about the emotional state of the speaker. The pitch contour is one of the most significant properties of speech affected by the emotional state, so pitch features have commonly been used in systems for automatic emotion detection. In this work, different intensities of emotions and their influence on pitch features have been studied; this understanding is important for developing such a system. Intensities of emotions are represented on Plutchik's cone-shaped 3D model. The k-Nearest Neighbor algorithm has been used for classification, divided into two stages: first the primary emotion is detected, then its intensity is specified. The results show that the recognition accuracy of the system is over 50% for primary emotions and over 70% for their intensities.
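A toy version of the kNN stage can be sketched over pitch-contour statistics of the kind the abstract attributes emotion information to. The feature values and labels below are invented purely for illustration, and the feature set (mean F0, F0 range, F0 standard deviation) is an assumption.

```python
import numpy as np

def knn_predict(x, X, y, k=3):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    dist = np.linalg.norm(X - x, axis=1)
    nearest = y[np.argsort(dist)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

# fabricated toy features: [mean F0, F0 range, F0 std] in Hz
X = np.array([[120, 30, 8], [115, 25, 7],        # low-arousal examples
              [220, 120, 40], [240, 140, 45]])   # high-arousal examples
y = np.array(["calm", "calm", "excited", "excited"])
print(knn_predict(np.array([230, 130, 42]), X, y))
```

The paper's two-stage scheme would run such a classifier twice: once over primary-emotion classes, then again over intensity levels within the detected emotion.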


Author(s):  
Dea Sifana Ramadhina ◽  
Rita Magdalena ◽  
Sofia Saidah

Voice is one of the parameters in the process of identifying a person. From the voice, information such as gender, age, and even the identity of the speaker can be obtained. Speaker recognition is a method for narrowing down crimes and frauds committed by voice, minimizing the chance of someone faking an identity. The Mel Frequency Cepstral Coefficient (MFCC) method can be used in the speech recognition system: feature extraction of the speech signal using MFCC produces acoustic features of the speech signal. For classification, Hidden Markov Models (HMM) are used to match an unidentified speaker's voice with the voices in the database. In this research, the system is used to verify speakers on 15 text-dependent utterances in Indonesian. When testing speakers against the same database, the highest accuracy is 99.16%.
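The MFCC front end follows a standard pipeline: windowed FFT, mel filterbank, log, then a DCT to decorrelate. Below is a bare-bones single-frame version for illustration (not the authors' implementation); frame size, filterbank size, and coefficient count are typical assumed defaults.

```python
import numpy as np

def mfcc(signal, fs, n_fft=512, n_mels=26, n_ceps=13):
    """Single-frame MFCC sketch: window -> power spectrum -> mel
    filterbank -> log -> DCT-II, returning n_ceps coefficients."""
    frame = signal[:n_fft] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frame)) ** 2 / n_fft

    # triangular mel filterbank
    def hz2mel(f): return 2595 * np.log10(1 + f / 700)
    def mel2hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz2mel(0), hz2mel(fs / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel2hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)

    logmel = np.log(fbank @ power + 1e-10)
    # DCT-II of the log mel energies gives the cepstral coefficients
    k = np.arange(n_ceps)[:, None] * (np.arange(n_mels) + 0.5)[None, :]
    return np.cos(np.pi * k / n_mels) @ logmel

fs = 16000
t = np.arange(0, 0.1, 1 / fs)
coeffs = mfcc(np.sin(2 * np.pi * 440 * t), fs)
print(coeffs.shape)  # (13,)
```

In the full system, sequences of such vectors (one per frame) form the observation stream that the HMM scores against each enrolled speaker's model.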


Author(s):  
Eydis Huld Magnusdottir ◽  
Michal Borsky ◽  
Manuela Meier ◽  
Kamilla Johannsdottir ◽  
Jon Gudnason

Monitoring cognitive workload from speech signals has received much attention from researchers in the past few years, as it has the potential to improve performance and fidelity in human decision making. The bulk of the research has focused on classifying speech from talkers participating in cognitive workload experiments using simple reading tasks, memory span tests, and the Stroop test, typically into three levels of low, medium, and high cognitive workload. This study focuses on using parameters extracted from the vocal tract and voice source components of the speech signal for cognitive workload monitoring. The experiment used in this study contains 98 participants; the workload levels were induced by a reading task and three Stroop tasks, randomly ordered for each participant, with adequate rest time between tasks to keep the cognitive workload of one task from affecting the next. Vocal tract features were obtained from the first three formants, and voice source features were extracted by signal analysis of the inverse-filtered speech signal. The results show that on their own, the vocal tract features outperform the voice source features: a misclassification rate (MCR) of 33.92% ± 1.05 was achieved with an SVM classifier. A weighted combination of vocal tract and voice source features, classified with SVM classifiers fused at the output level, achieved the lowest MCR of 32.5%.
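Output-level fusion of the two classifiers can be sketched as a weighted combination of their per-class scores. The abstract does not give the exact weighting rule, so the fixed weight below is an assumption; the score values are invented for illustration.

```python
import numpy as np

def fuse(scores_vt, scores_vs, w=0.7):
    """Weighted output-level fusion of vocal-tract and voice-source
    classifier scores (w is an assumed weight favoring the vocal tract,
    which performed better on its own)."""
    return w * scores_vt + (1 - w) * scores_vs

vt = np.array([0.2, 0.5, 0.3])   # toy class posteriors, vocal-tract SVM
vs = np.array([0.4, 0.3, 0.3])   # toy class posteriors, voice-source SVM
fused = fuse(vt, vs)
print(int(np.argmax(fused)))     # predicted workload level (0=low, 1=med, 2=high)
```

The fused decision is the argmax of the combined scores; in practice the weight would be tuned on held-out data.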


2021 ◽  
Vol 58 (2) ◽  
pp. 6497-6501
Author(s):  
N. Mekebayev ◽  
O. Mamyrbayev ◽  
M. Turdalyuly ◽  
D. Oralbekova ◽  
M. Tasbolatov

Digital processing of the speech signal and the voice recognition algorithm are very important for fast and accurate automatic scoring in recognition technology. A voice is a signal of effectively infinite information content, and the direct analysis and synthesis of the complex speech signal is difficult because the information is contained within the signal itself. Speech is the most natural way for people to communicate, and the task of speech recognition is to convert speech into a sequence of words using a computer program. This article presents an algorithm for extracting MFCCs for speech recognition; the proposed MFCC algorithm reduces the required processing power by 53% compared to the conventional algorithm. Automatic speech recognition was carried out using MATLAB.
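Before MFCC extraction, the speech signal is conventionally pre-emphasized and cut into short overlapping frames. The sketch below shows these standard front-end steps (they are not the paper's specific code, and the 25 ms / 10 ms framing at 16 kHz is an assumed convention).

```python
import numpy as np

def preemphasize(x, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1],
    boosting high frequencies before spectral analysis."""
    return np.append(x[0], x[1:] - alpha * x[:-1])

def frames(x, size=400, step=160):
    """Slice the signal into overlapping frames: 25 ms windows
    every 10 ms at a 16 kHz sampling rate."""
    n = 1 + (len(x) - size) // step
    return np.stack([x[i * step : i * step + size] for i in range(n)])

fs = 16000
x = np.sin(2 * np.pi * 200 * np.arange(fs) / fs)  # 1 s tone as a stand-in
f = frames(preemphasize(x))
print(f.shape)  # (98, 400)
```

Each row of the resulting matrix is one analysis frame, from which MFCC vectors are then computed.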

