multimodal dialogue
Recently Published Documents

Total documents: 117 (five years: 7)
H-index: 10 (five years: 0)

2021 ◽  
Vol 29 ◽  
pp. 100491
Author(s):  
Rotem Abdu ◽  
Gitte van Helden ◽  
Rosa Alberto ◽  
Arthur Bakker

2021 ◽  
Author(s):  
Hardik Kothare ◽  
Vikram Ramanarayanan ◽  
Oliver Roesler ◽  
Michael Neumann ◽  
Jackson Liscombe ◽  
...  

We explore the utility of an on-demand multimodal conversational platform for extracting speech and facial metrics in children with Autism Spectrum Disorder (ASD). We investigate the extent to which these metrics correlate with objective clinical measures, particularly as they pertain to the interplay between the affective, phonatory, and motoric subsystems. Twenty-two participants diagnosed with ASD engaged with a virtual agent in conversational affect-production tasks designed to elicit facial and vocal affect. We found significant correlations between the vocal pitch and loudness extracted by our platform during these tasks and accuracy in the recognition of facial and vocal affect, assessed via the Diagnostic Analysis of Nonverbal Accuracy-2 (DANVA-2) neuropsychological task. We also found significant correlations between jaw kinematic metrics extracted using our platform and the motor speed of the dominant hand, assessed via a standardised neuropsychological finger-tapping task. These findings offer preliminary evidence for the usefulness of these audiovisual analytic metrics and could help us better model the interplay between different physiological subsystems in individuals with ASD.
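The core analysis described above is a correlation between platform-extracted metrics and clinical scores. A minimal sketch of that kind of computation, using a hand-rolled Pearson coefficient and entirely made-up per-participant values (the study's real metrics, scores, and sample differ):

```python
# Illustrative only: correlating a platform-extracted vocal metric with a
# clinical score. All numbers below are fabricated for demonstration.

import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-participant values: mean vocal pitch (Hz) during the
# affect-production task, and DANVA-2 vocal-affect recognition accuracy (%).
pitch = [210.0, 195.5, 230.2, 188.7, 205.1, 220.9]
danva = [62.0, 55.0, 71.0, 50.0, 60.0, 68.0]

print(round(pearson_r(pitch, danva), 3))
```

In practice one would also report a p-value (e.g. via `scipy.stats.pearsonr`) and correct for multiple comparisons across metrics.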


2021 ◽  
Vol 12 (2) ◽  
pp. 1-33
Author(s):  
Mauajama Firdaus ◽  
Nidhi Thakur ◽  
Asif Ekbal

Multimodality in dialogue systems has opened up new frontiers for the creation of robust conversational agents. Any multimodal system aims to bridge the gap between language and vision by leveraging diverse and often complementary information from image, audio, and video, as well as text. For every task-oriented dialogue system, different aspects of the product or service are crucial for satisfying the user's demands, and the user selects a product or service based on these aspects. The ability to generate responses with the specified aspects in a goal-oriented dialogue setup facilitates user satisfaction by fulfilling the user's goals. Therefore, in our current work, we propose the task of aspect-controlled response generation in a multimodal task-oriented dialogue system. We employ a multimodal hierarchical memory network for generating responses that utilise information from both text and images. As there was no readily available data for building such multimodal systems, we create a Multi-Domain Multi-Modal Dialog (MDMMD++) dataset. The dataset comprises conversations containing both text and images across four domains: hotels, restaurants, electronics, and furniture. Quantitative and qualitative analysis on the newly created MDMMD++ dataset shows that the proposed methodology outperforms the baseline models on the proposed task of aspect-controlled response generation.
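The control idea in the abstract, restricting generation to a requested aspect, can be illustrated with a deliberately tiny retrieval-style sketch. This is not the paper's hierarchical memory network; the candidate pool, aspects, and overlap scorer are all hypothetical stand-ins for a learned model:

```python
# Toy sketch of aspect-controlled response selection: candidates are first
# filtered by the requested aspect, then scored for relevance to the context.
# Everything here (candidates, aspects, scorer) is illustrative.

from collections import Counter

CANDIDATES = [
    {"aspect": "price",    "text": "Rooms at this hotel start from $80 per night."},
    {"aspect": "location", "text": "The hotel is two blocks from the city centre."},
    {"aspect": "price",    "text": "The restaurant has a fixed-price lunch menu."},
]

def overlap_score(context, response):
    """Crude relevance score: bag-of-words overlap between context and candidate."""
    ctx = Counter(context.lower().split())
    resp = Counter(response.lower().split())
    return sum((ctx & resp).values())

def respond(context, aspect):
    """Return the best-scoring candidate restricted to the requested aspect."""
    pool = [c for c in CANDIDATES if c["aspect"] == aspect]
    return max(pool, key=lambda c: overlap_score(context, c["text"]))["text"]

print(respond("How much do rooms at the hotel cost", "price"))
```

A neural version replaces the hard filter with an aspect embedding that conditions the decoder, but the contract is the same: the aspect constrains which responses are admissible.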


Author(s):  
Ana Abril Hernández

The comic form of art has witnessed a dramatic increase in the number of readers who choose this medium to engage with meaning-making processes in multimodal texts. Little has been said so far, however, about the adaptation of certain literary genres to the comic form. This is the case of poetry, narrative poetry in particular, illustrated in this study by the celebrated ballad "La belle dame sans merci" (1819) by John Keats and the modern sonnet "The singing-woman from the wood's edge" (1920) by the American feminist poet Edna St. Vincent Millay. Drawing on two recent comic adaptations of these poems, the present investigation takes a comparative approach to the semiotic processes at stake in representing women, both from the poets' own points of view and from those of their respective graphic artists. The aim is to trace changes in the depiction of women in poetry, from the Romantic image of women, through the view of women in the early twentieth century, to the present day.


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0241271
Author(s):  
Mauajama Firdaus ◽  
Arunav Pratap Shandeelya ◽  
Asif Ekbal

Multimodal dialogue systems, due to their many-fold applications, have gained much attention from researchers and developers in recent times. With the release of the large-scale multimodal dialogue dataset of Saha et al. (2018) in the fashion domain, it has become possible to investigate dialogue systems having both textual and visual modalities. Response generation is an essential aspect of every dialogue system, and making the responses diverse is an important problem. For any goal-oriented conversational agent, the system's responses must be informative, diverse, and polite, which may lead to better user experiences. In this paper, we propose an end-to-end neural framework for generating varied responses in a multimodal dialogue setup, capturing information from both the text and the image. A multimodal encoder with co-attention between the text and image is used to focus on the different modalities and obtain better contextual information. For effective information sharing across the modalities, we combine the information of text and images using the BLOCK fusion technique, which helps in learning an improved multimodal representation. We employ stochastic beam search with the Gumbel-Top-k trick to achieve diversified responses while preserving the content and politeness of the responses. Experimental results show that our proposed approach performs significantly better than the existing and baseline methods in terms of distinct metrics, and thereby generates more diverse responses that are informative, interesting, and polite without any loss of information. Empirical evaluation also reveals that images, when used along with the text, improve the efficiency of the model in generating diversified responses.
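The Gumbel-Top-k trick mentioned above has a compact core: perturbing log-probabilities with i.i.d. Gumbel(0, 1) noise and taking the k largest indices is equivalent to sampling k items without replacement from the softmax distribution. A self-contained sketch on a toy vocabulary (the vocabulary and probabilities are illustrative, not from the paper):

```python
# Gumbel-Top-k sampling: add Gumbel noise to log-probs, take the top k.
# The result is a sample of k distinct indices without replacement.

import math
import random

def gumbel_top_k(log_probs, k, rng=random):
    """Sample k distinct indices without replacement via perturbed log-probs."""
    perturbed = [
        lp - math.log(-math.log(rng.random()))  # lp + Gumbel(0, 1) noise
        for lp in log_probs
    ]
    # Indices of the k largest perturbed scores.
    return sorted(range(len(log_probs)), key=lambda i: -perturbed[i])[:k]

rng = random.Random(0)
vocab = ["yes", "sure", "certainly", "maybe", "no"]
log_probs = [math.log(p) for p in [0.4, 0.3, 0.15, 0.1, 0.05]]
picks = gumbel_top_k(log_probs, 3, rng)
print([vocab[i] for i in picks])
```

In stochastic beam search this perturbation is applied consistently along partial hypotheses, so the k beams form a sample of whole sequences without replacement rather than k copies of the mode.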

