Automatic alignment of phonetic events with x‐ray microbeam articulatory data and the acoustic speech signal

J. H. Greenwald; A. K. Krishnamurthy; O. Fujimura

doi:10.1121/1.2027318


The Journal of the Acoustical Society of America ◽

10.1121/1.2027318 ◽

1989 ◽

Vol 86 (S1) ◽

pp. S116-S116 ◽

Cited By ~ 1

Author(s):

J. H. Greenwald ◽

A. K. Krishnamurthy ◽

O. Fujimura

Keyword(s):

Speech Signal ◽

X Ray ◽

Automatic Alignment ◽

Acoustic Speech Signal


Visual speech signal representations according to an articulatory model based on X-ray motion films

Proceedings., 11th IAPR International Conference on Pattern Recognition. Vol. IV. Conference D: Architectures for Vision and Pattern Recognition ◽

10.1109/icpr.1992.202062 ◽

2003 ◽

Author(s):

Wen Chengyi ◽

Chen Nan ◽

Zhang Yufu

Keyword(s):

Speech Signal ◽

Visual Speech ◽

X Ray ◽

Model Based ◽

Signal Representations ◽

Articulatory Model


Erratum: “Manifestations of Task‐Induced Stress in the Acoustic Speech Signal” [J. Acoust. Soc. Amer. 44, 993–1001 (1968)]

The Journal of the Acoustical Society of America ◽

10.1121/1.1911411 ◽

1969 ◽

Vol 45 (2) ◽

pp. 519-519

Author(s):

Michael H. L. Hecker

Keyword(s):

Speech Signal ◽

Induced Stress ◽

Acoustic Speech Signal


Automatic alignment of a Kirkpatrick-Baez active optic by use of a soft-x-ray Hartmann wavefront sensor

Optics Letters ◽

10.1364/ol.31.000199 ◽

2006 ◽

Vol 31 (2) ◽

pp. 199 ◽

Cited By ~ 24

Author(s):

Pascal Mercère ◽

Mourad Idir ◽

Thierry Moreno ◽

Gilles Cauchon ◽

Guillaume Dovillaire ◽

...

Keyword(s):

Wavefront Sensor ◽

X Ray ◽

Automatic Alignment


Automatic alignment of a phonetic transcription with articulatory events from x‐ray data of continuous speech utterances

The Journal of the Acoustical Society of America ◽

10.1121/1.2017157 ◽

1979 ◽

Vol 65 (S1) ◽

pp. S22-S22 ◽

Cited By ~ 2

Author(s):

W. L. Nelson

Keyword(s):

Continuous Speech ◽

X Ray ◽

Phonetic Transcription ◽

Automatic Alignment


Audio-Visual and Visual-Only Speech and Speaker Recognition

Visual Speech Recognition ◽

10.4018/978-1-60566-186-5.ch001 ◽

2009 ◽

pp. 1-38 ◽

Cited By ~ 5

Author(s):

Derek J. Shiell ◽

Louis H. Terry ◽

Petar S. Aleksic ◽

Aggelos K. Katsaggelos

Keyword(s):

Speaker Recognition ◽

Speech Signal ◽

Low Cost ◽

Visual Speech ◽

Future Research ◽

User Cooperation ◽

Acoustic Speech Signal ◽

Recognition Systems ◽

Acoustic Environments ◽

Main Components

The information embedded in the visual dynamics of speech has the potential to improve the performance of speech and speaker recognition systems. The information carried in the visual speech signal complements the information in the acoustic speech signal, which is particularly beneficial in adverse acoustic environments. Non-invasive methods using low-cost sensors can be used to obtain acoustic and visual biometric signals, such as a person’s voice and lip movement, with little user cooperation. These types of unobtrusive biometric systems are warranted to promote widespread adoption of biometric technology in today’s society. In this chapter, the authors describe the main components and theory of audio-visual and visual-only speech and speaker recognition systems. Audio-visual corpora are described and a number of speech and speaker recognition systems are reviewed. Finally, various open issues in system design and implementation are discussed, and future research and development directions in this area are presented.


Optimal estimation of vocal tract area functions from speech signal constrained by X-ray microbeam data

IEEE International Conference on Acoustics Speech and Signal Processing ◽

10.1109/icassp.1993.319361 ◽

1993 ◽

Author(s):

Q. Guo ◽

P. Milenkovic

Keyword(s):

Speech Signal ◽

Vocal Tract ◽

Optimal Estimation ◽

X Ray


Manifestations of Task‐Induced Stress in the Acoustic Speech Signal

The Journal of the Acoustical Society of America ◽

10.1121/1.1911241 ◽

1968 ◽

Vol 44 (4) ◽

pp. 993-1001 ◽

Cited By ~ 52

Author(s):

Michael H. L. Hecker ◽

Kenneth N. Stevens ◽

Gottfried von Bismarck ◽

Carl E. Williams

Keyword(s):

Speech Signal ◽

Induced Stress ◽

Acoustic Speech Signal


Automatic alignment and reconstruction of images for soft X-ray tomography

Journal of Structural Biology ◽

10.1016/j.jsb.2011.11.027 ◽

2012 ◽

Vol 177 (2) ◽

pp. 259-266 ◽

Cited By ~ 43

Author(s):

Dilworth Y. Parkinson ◽

Christian Knoechel ◽

Chao Yang ◽

Carolyn A. Larabell ◽

Mark A. Le Gros

Keyword(s):

X Ray ◽

Automatic Alignment


Individual Identification Through Voice Using Mel-Frequency Cepstrum Coefficient (MFCC) and Hidden Markov Models (HMM) Method

Journal of Measurements Electronics Communications and Systems ◽

10.25124/jmecs.v7i1.3553 ◽

2020 ◽

Vol 7 (1) ◽

pp. 26

Author(s):

Dea Sifana Ramadhina ◽

Rita Magdalena ◽

Sofia Saidah

Keyword(s):

Hidden Markov Models ◽

Speaker Recognition ◽

Speech Signal ◽

Markov Models ◽

Hidden Markov ◽

Recognition System ◽

Individual Identification ◽

Acoustic Speech Signal ◽

Mel Frequency Cepstrum Coefficient ◽

The Voice

Voice is one of the parameters used to identify a person. From the voice, information such as gender, age, and even the identity of the speaker can be obtained. Speaker recognition is a method for reducing crimes and frauds committed by voice, minimizing the occurrence of identity faking. The Mel Frequency Cepstrum Coefficient (MFCC) method can be used in a speech recognition system. Feature extraction of the speech signal using MFCC produces the acoustic speech signal features. For classification, Hidden Markov Models (HMM) are used to match an unidentified speaker’s voice with the voices in the database. In this research, the system is used to verify the speaker, namely 15 text-dependent in Indonesian. When testing with speakers who are in the database, the highest accuracy is 99.16%.


Bimodal classification of English allophones employing acoustic speech signal and facial motion capture

The Journal of the Acoustical Society of America ◽

10.1121/1.5067951 ◽

2018 ◽

Vol 144 (3) ◽

pp. 1801-1802

Author(s):

Andrzej Czyzewski ◽

Szymon Zaporowski ◽

Bozena Kostek

Keyword(s):

Motion Capture ◽

Speech Signal ◽

Facial Motion ◽

Acoustic Speech Signal
