Review on Human Speech Recognition Techniques

A MULTILINGUAL APPROACH TO TASK-ORIENTED MAN-MACHINE DIALOGUE BY VOICE

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001488000339 ◽

1988 ◽

Vol 02 (03) ◽

pp. 573-588

Author(s):

PHILIPPE MORIN ◽

JEAN-PAUL HATON ◽

JEAN-MARIE PIERREL ◽

GUENTHER RUSKE ◽

WALTER WEIGEL

Keyword(s):

Speech Recognition ◽

Electronic Mail ◽

Dialogue Systems ◽

Human Speech ◽

Artificial Languages ◽

Man Machine Dialogue ◽

Machine Communication ◽

Task Oriented ◽

Multilingual Approach ◽

Multimedia Interfaces

In the framework of man-machine communication, oral dialogue has a particular place, since human speech presents several advantages when used either alone or in multimedia interfaces. The last decade has witnessed a proliferation of research into speech recognition and understanding, but few systems have been designed to manage and understand an actual man-machine dialogue. The PARTNER system described in this paper proposes a solution for task-oriented dialogue using artificial languages. A description of the essential characteristics of dialogue systems is followed by a presentation of the architecture and principles of the PARTNER system. Finally, we present the most recent results obtained in the oral management of electronic mail in French and German.


Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L)

The Journal of the Acoustical Society of America ◽

10.1121/1.1624065 ◽

2003 ◽

Vol 114 (6) ◽

pp. 3032-3035 ◽

Cited By ~ 9

Author(s):

Odette Scharenborg ◽

Louis ten Bosch ◽

Lou Boves ◽

Dennis Norris

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Human Speech ◽

End To End


Flexible human speech recognition

1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings ◽

10.1109/asru.1997.659015 ◽

2002 ◽

Cited By ~ 7

Author(s):

L.C.W. Pols

Keyword(s):

Speech Recognition ◽

Human Speech


Structure in talker variability: How much is there and how much can it help?

10.31234/osf.io/a4tkn ◽

2018 ◽

Author(s):

Dave F Kleinschmidt

Keyword(s):

Speech Recognition ◽

Speech Perception ◽

R Package ◽

Ideal Observer ◽

Linguistic Variation ◽

Robust Speech Recognition ◽

Talker Variability ◽

New Techniques ◽

Human Speech ◽

The Face

One of the persistent puzzles in understanding human speech perception is how listeners cope with talker variability. One thing that might help listeners is structure in talker variability: rather than varying randomly, talkers of the same gender, dialect, age, etc. tend to produce language in similar ways. Sociolinguistic research has shown that listeners are sensitive to this covariation between linguistic variation and socio-indexical variables. In this paper I present new techniques based on ideal observer models to quantify (1) the amount and type of structure in talker variation, and (2) how useful such structure can be for robust speech recognition in the face of talker variability. I demonstrate these techniques in two phonetic domains (word-initial stop voicing and vowel identity) and show that these domains have different amounts and types of talker variability, consistent with previous, impressionistic findings. An R package accompanies this paper, enabling researchers to apply these techniques to their own data.
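The ideal observer idea in this abstract can be illustrated with a minimal sketch. It is not the paper's method or its R package. It assumes each talker's voiced and voiceless stops follow Gaussian distributions over voice onset time (VOT), and applies Bayes' rule to a single cue value. The function names, the two talkers, and all means and standard deviations are hypothetical values chosen for illustration.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_voiced(vot_ms, voiced, voiceless, prior_voiced=0.5):
    """Posterior probability of the voiced category given one VOT cue.

    `voiced` and `voiceless` are (mean, sd) pairs in milliseconds.
    """
    joint_voiced = gaussian_pdf(vot_ms, *voiced) * prior_voiced
    joint_voiceless = gaussian_pdf(vot_ms, *voiceless) * (1.0 - prior_voiced)
    return joint_voiced / (joint_voiced + joint_voiceless)


# Hypothetical talker-specific cue distributions (mean, sd) in ms.
# These numbers are assumptions for illustration, not data from the paper.
TALKER_A = {"voiced": (10.0, 8.0), "voiceless": (60.0, 15.0)}
TALKER_B = {"voiced": (20.0, 8.0), "voiceless": (80.0, 15.0)}

# The same 35 ms token is categorized differently depending on the talker.
p_a = posterior_voiced(35.0, TALKER_A["voiced"], TALKER_A["voiceless"])
p_b = posterior_voiced(35.0, TALKER_B["voiced"], TALKER_B["voiceless"])
print(f"P(voiced | 35 ms), talker A: {p_a:.3f}")
print(f"P(voiced | 35 ms), talker B: {p_b:.3f}")
```

In this toy setup the same acoustic token is most likely voiceless for one talker and voiced for the other, which is the sense in which knowing the structure of talker variability could help recognition.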


On integrating insights from human speech perception into automatic speech recognition

10.21437/interspeech.2005-475 ◽

2005 ◽

Author(s):

Sorin Dusan ◽

Larry R. Rabiner

Keyword(s):

Speech Recognition ◽

Speech Perception ◽

Automatic Speech Recognition ◽

Human Speech


Automatic and human speech recognition in null grammar

The Journal of the Acoustical Society of America ◽

10.1121/1.3654648 ◽

2011 ◽

Vol 130 (4) ◽

pp. 2407-2407

Author(s):

Amit Juneja

Keyword(s):

Speech Recognition ◽

Human Speech


EARSHOT: A Minimal Neural Network Model of Incremental Human Speech Recognition

Cognitive Science ◽

10.1111/cogs.12823 ◽

2020 ◽

Vol 44 (4) ◽

Cited By ~ 1

Author(s):

James S. Magnuson ◽

Heejo You ◽

Sahil Luthra ◽

Monica Li ◽

Hosung Nam ◽

...

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Network Model ◽

Neural Network Model ◽

Human Speech


Speech Recognition Using Elman Artificial Neural Network and Linear Predictive Coding

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666190411113728 ◽

2020 ◽

Vol 13 (4) ◽

pp. 650-656

Author(s):

Somayeh Khajehasani ◽

Louiza Dehyadegari

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Intelligent System ◽

Recognition Accuracy ◽

Predictive Coding ◽

Recognition System ◽

Visual Methods ◽

Linear Predictive Coding ◽

Elman Neural Network ◽

Human Speech

Background: The growing demand for automatic intelligent systems has increased interest in modern techniques for interaction between humans and machines. These techniques are generally of two types: audio and visual. Developing algorithms that enable machines to recognize human speech is therefore of high importance and is frequently studied. Objective: Artificial intelligence methods have led to better results in human speech recognition, but the basic problem is the lack of an appropriate strategy for selecting the recognition data from the huge amount of speech information, which makes it practically impossible for the available algorithms to work. Method: To address this problem, linear predictive coding (LPC) coefficient extraction is used to summarize the data for the pronunciation of English digits. The extracted database is then fed to an Elman neural network, which learns the relation between the LPC coefficients of an audio file and the pronounced digit. Results: The results show that this method performs well compared to other methods. According to the experiments, the network training results (99% recognition accuracy) indicate that the network still performs better than RBF despite many errors. Conclusion: The experiments showed that the Elman memory neural network performs acceptably in recognizing the speech signal compared to the other algorithms. Using linear predictive coding coefficients together with the Elman neural network led to higher recognition accuracy and improved the speech recognition system.
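The two building blocks named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the coefficients were computed, so the sketch uses the common autocorrelation method with the Levinson-Durbin recursion, and the Elman step uses untrained, hypothetical weights. The names `lpc`, `elman_step`, `W_IN`, `W_REC`, and `BIAS` are invented for this example.

```python
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation of a frame for lags 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """LPC predictor coefficients via Levinson-Durbin.

    Returns (coeffs, error) where x[n] is predicted as
    sum(coeffs[k - 1] * x[n - k] for k in 1..order).
    """
    r = autocorrelation(frame, order)
    if r[0] == 0.0:
        return [0.0] * order, 0.0  # silent frame: nothing to predict
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err


def elman_step(x, h_prev, w_in, w_rec, bias):
    """One Elman recurrence: h_t = tanh(W_in x_t + W_rec h_{t-1} + b)."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_in[j], x))
            + sum(w * hi for w, hi in zip(w_rec[j], h_prev))
            + bias[j]
        )
        for j in range(len(bias))
    ]


# Toy frame: a decaying exponential, which a first-order predictor fits well.
frame = [0.9 ** n for n in range(200)]
coeffs, err = lpc(frame, 2)

# Hypothetical, untrained weights for a network with two hidden units.
W_IN = [[0.5, -0.5], [0.25, 0.75]]
W_REC = [[0.1, 0.0], [0.0, 0.1]]
BIAS = [0.0, 0.0]
hidden = elman_step(coeffs, [0.0, 0.0], W_IN, W_REC, BIAS)
print(coeffs, hidden)
```

In a digit recognizer of the kind described, one such coefficient vector per audio frame would be passed through the recurrent step in sequence, with a trained output layer mapping the hidden state to a digit label.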
