Speaker diarization using eye-gaze information in multi-party conversations

Author(s):  
Koji Inoue ◽  
Yukoh Wakabayashi ◽  
Hiromasa Yoshimoto ◽  
Tatsuya Kawahara
Keyword(s):  
Eye Gaze ◽  
Author(s):  
Tatsuya Kawahara ◽  
Takuma Iwatate ◽  
Koji Inoue ◽  
Soichiro Hayashi ◽  
Hiromasa Yoshimoto ◽  
...  

Conversations in poster sessions at academic events, referred to as poster conversations, pose interesting and challenging problems for multi-modal signal and information processing. We have developed a smart posterboard for multi-modal recording and analysis of poster conversations. The smart posterboard has multiple sensing devices to record poster conversations, so we can review who came to the poster and what kinds of questions or comments they made. The conversation analysis incorporates face and eye-gaze tracking for effective speaker diarization. We demonstrate that eye-gaze information is useful for predicting turn-taking and for improving speaker diarization. Moreover, high-level indexing of the audience's interest and comprehension level is explored based on multi-modal behaviors during the conversation. This is realized by predicting the audience's speech acts, such as questions and reactive tokens.


2015 ◽  
Author(s):  
Koji Inoue ◽  
Yukoh Wakabayashi ◽  
Hiromasa Yoshimoto ◽  
Katsuya Takanashi ◽  
Tatsuya Kawahara
Keyword(s):  
Eye Gaze ◽  

2014 ◽  
Vol 23 (1) ◽  
pp. 42-54 ◽  
Author(s):  
Tanya Rose Curtis

As the field of telepractice grows, perceived barriers to service delivery must be anticipated and addressed in order to provide appropriate service to individuals who will benefit from this model. When telepractice is applied to the field of AAC, additional barriers arise: clients with complex communication needs are unable to speak, often present with severe quadriplegia and are unable to position themselves or access the computer independently, and may have cognitive impairments and limited computer experience. Some access methods, such as eye gaze, can also present technological challenges in the telepractice environment. These barriers can be overcome, and telepractice is not only practical and effective but often a preferred means of service delivery for persons with complex communication needs.


2014 ◽  
Vol 23 (3) ◽  
pp. 132-139 ◽  
Author(s):  
Lauren Zubow ◽  
Richard Hurtig

Children with Rett Syndrome (RS) are reported to use multiple modalities to communicate, although their intentionality is often questioned (Bartolotta, Zipp, Simpkins, & Glazewski, 2011; Hetzroni & Rubin, 2006; Sigafoos et al., 2000; Sigafoos, Woodyatt, Tucker, Roberts-Pennell, & Pittendreigh, 2000). This paper presents the results of a study analyzing the unconventional vocalizations of a child with RS. The primary research question addresses the ability of familiar and unfamiliar listeners to interpret unconventional vocalizations as “yes” or “no” responses. This paper also addresses the acoustic analysis and perceptual judgments of these vocalizations. Pre-recorded isolated vocalizations of “yes” and “no” were presented to 5 listeners (mother, father, 1 unfamiliar, and 2 familiar clinicians), and the listeners were asked to rate the vocalizations as either “yes” or “no.” The ratings were compared to the original identification made by the child's mother during the face-to-face interaction from which the samples were drawn. The findings suggest that, in this case, the child's vocalizations were intentional and could be interpreted by familiar and unfamiliar listeners as either “yes” or “no” without contextual or visual cues. The results suggest that communication partners should be trained to attend to eye gaze and vocalizations to ensure the child's intended choice is accurately understood.


2006 ◽  
Author(s):  
Christopher R. Jones ◽  
Russell H. Fazio ◽  
Michael Olson
