Vision–Language–Knowledge Co-Embedding for Visual Commonsense Reasoning

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2911
Author(s):  
JaeYun Lee ◽  
Incheol Kim

Visual commonsense reasoning is the task of selecting the most appropriate answer to a question, together with the rationale for that answer, given an image, a natural language question, and a set of candidate responses. Effective visual commonsense reasoning requires solving both the knowledge acquisition problem and the multimodal alignment problem. We therefore propose a novel Vision–Language–Knowledge Co-embedding (ViLaKC) model that extracts knowledge graphs relevant to the question from an external knowledge base, ConceptNet, and uses them together with the input image to answer the question. The proposed model uses a pretrained vision–language–knowledge embedding module, which co-embeds multimodal data including images, natural language texts, and knowledge graphs into a single feature vector. To reflect the structural information of the knowledge graph, the model first embeds the knowledge graph with a graph convolutional network (GCN) layer and then uses multi-head self-attention layers to co-embed it with the image and the natural language question. The effectiveness and performance of the proposed model are experimentally validated on the VCR v1.0 benchmark dataset.
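A minimal PyTorch-style sketch of the co-embedding idea described above is given below; the module names, dimensions, and the single GCN layer are illustrative assumptions, not the authors' implementation.

# Illustrative sketch (not the authors' code): one GCN layer embeds knowledge-graph
# nodes, then image, text, and graph tokens are concatenated and co-embedded with
# multi-head self-attention, mirroring the ViLaKC pipeline described above.
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, node_feats, adj):
        # adj: (N, N) normalized adjacency; node_feats: (N, dim)
        return torch.relu(self.linear(adj @ node_feats))

class CoEmbedder(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.gcn = SimpleGCNLayer(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, img_tokens, txt_tokens, kg_feats, kg_adj):
        kg_tokens = self.gcn(kg_feats, kg_adj).unsqueeze(0)       # (1, N, dim)
        tokens = torch.cat([img_tokens, txt_tokens, kg_tokens], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)              # joint co-embedding
        return fused.mean(dim=1)                                  # single feature vector

# Toy usage with random features standing in for real image/text/KG encoders.
model = CoEmbedder()
img = torch.randn(1, 36, 256)   # e.g. region features
txt = torch.randn(1, 20, 256)   # question token features
kg = torch.randn(12, 256)       # ConceptNet subgraph node features
adj = torch.eye(12)             # placeholder normalized adjacency
vec = model(img, txt, kg, adj)  # (1, 256) co-embedded feature vector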

2019 ◽  
Vol 8 (10) ◽  
pp. 428 ◽  
Author(s):  
Bingchuan Jiang ◽  
Liheng Tan ◽  
Yan Ren ◽  
Feng Li

The core of intelligent virtual geographical environments (VGEs) is the formal expression of geographic knowledge. Its purpose is to transform the data, information, and scenes of a virtual geographic environment into "knowledge" that a computer can recognize, so that the computer can understand the virtual geographic environment more easily. A geographic knowledge graph (GeoKG) is a large-scale semantic web that stores geographical knowledge in a structured form. Based on a geographic knowledge base and a geospatial database, intelligent interaction with virtual geographical environments can be realized through natural language question answering, entity linking, and so on. In this paper, a knowledge-enhanced VGE service framework is proposed. We construct a multi-level semantic parsing model and an enhanced GeoKG for structured geographic information data, such as digital maps and 3D virtual scenes, as well as for unstructured information data. Based on the GeoKG, we propose a bidirectional LSTM-CRF (long short-term memory–conditional random field) model to achieve natural language question answering for VGEs and conduct experiments on the method. The results show that intelligent interaction based on the knowledge graph can bridge the gap between people and virtual environments.
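As a rough illustration of how a sequence tagger could parse VGE questions before querying the GeoKG, the sketch below shows a bidirectional LSTM producing per-token tag scores; the tag set and sizes are hypothetical, and a CRF decoding layer (for example, from the pytorch-crf package) would normally sit on top of these emission scores.

# Hedged sketch of a BiLSTM tagger for semantic parsing of VGE questions;
# the entity/relation tag set and sizes are illustrative assumptions, and a
# CRF layer would normally replace the greedy decode shown at the end.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.emit = nn.Linear(2 * hidden, num_tags)   # per-token tag scores

    def forward(self, token_ids):
        h, _ = self.lstm(self.emb(token_ids))
        return self.emit(h)                           # (batch, seq_len, num_tags)

# Toy run: tag a tokenized question such as "where is the highest peak near X".
tagger = BiLSTMTagger(vocab_size=5000, num_tags=7)
scores = tagger(torch.randint(0, 5000, (1, 8)))
pred_tags = scores.argmax(dim=-1)                     # greedy decode; a CRF would replace this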


2019 ◽  
Vol 481 ◽  
pp. 141-159 ◽  
Author(s):  
Weiguo Zheng ◽  
Hong Cheng ◽  
Jeffrey Xu Yu ◽  
Lei Zou ◽  
Kangfei Zhao

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Yu Zhao ◽  
Jiayue Hou ◽  
Zongjian Yu ◽  
Yun Zhang ◽  
Qing Li

Knowledge graph (KG) entity typing aims to predict the potential types of an entity, that is, to complete tuples of the form (entity, entity type = ?). Recently, several embedding models have been proposed for KG entity type prediction based on the existing typing information of the (entity, entity type) tuples in KGs. However, most of them unreasonably assume that all existing entity typing instances in KGs are completely correct, ignoring non-negligible entity type noise that may introduce errors into downstream tasks. To address this problem, we propose ConfE, a novel confidence-aware embedding approach for modeling (entity, entity type) tuples, which takes tuple confidence into consideration to learn better embeddings. Specifically, we learn the embeddings of entities and entity types in separate entity and entity-type spaces, since they are different objects in KGs. We use an asymmetric matrix to specify the interaction of their embeddings and incorporate the tuple confidence as well. To make the tuple confidence more universal, we consider only the internal structural information in existing KGs. We evaluate our model on two tasks, entity type noise detection and entity type prediction. Extensive experimental results on two public benchmark datasets (FB15kET and YAGO43kET) demonstrate that our proposed model outperforms all baselines on all tasks, verifying the effectiveness of ConfE in learning better embeddings on noisy KGs. The source code and data of this work can be obtained from https://github.com/swufenlp/ConfE.
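The following sketch, written as an assumption in the spirit of ConfE rather than as its exact formulation, shows how entities and types can be embedded in separate spaces, coupled through an asymmetric matrix, and trained with a confidence-weighted margin loss.

# Illustrative confidence-aware (entity, type) scoring model: entities and types
# live in separate embedding spaces, an asymmetric matrix M couples them, and the
# margin loss of each training tuple is weighted by its confidence. Dimensions and
# the loss form are assumptions, not the authors' exact formulation.
import torch
import torch.nn as nn

class ConfETupleScorer(nn.Module):
    def __init__(self, n_entities, n_types, ent_dim=100, type_dim=50):
        super().__init__()
        self.ent = nn.Embedding(n_entities, ent_dim)
        self.typ = nn.Embedding(n_types, type_dim)
        self.M = nn.Parameter(torch.randn(ent_dim, type_dim) * 0.01)  # asymmetric interaction

    def score(self, e_idx, t_idx):
        e, t = self.ent(e_idx), self.typ(t_idx)
        return (e @ self.M * t).sum(dim=-1)            # higher = more plausible tuple

def confidence_weighted_loss(model, pos_e, pos_t, neg_t, conf, margin=1.0):
    pos = model.score(pos_e, pos_t)
    neg = model.score(pos_e, neg_t)                    # corrupted type as negative sample
    return (conf * torch.relu(margin - pos + neg)).mean()

# Toy usage with random indices and confidences derived from KG structure.
model = ConfETupleScorer(n_entities=1000, n_types=50)
loss = confidence_weighted_loss(model,
                                torch.tensor([3, 7]), torch.tensor([1, 4]),
                                torch.tensor([9, 2]), torch.tensor([0.9, 0.4]))
loss.backward()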


Author(s):  
Xinmeng Li ◽  
Mamoun Alazab ◽  
Qian Li ◽  
Keping Yu ◽  
Quanjun Yin

Knowledge graph question answering is an important technology in intelligent human–robot interaction; it aims to automatically answer a natural language question posed by a human over a given knowledge graph. For multi-relation questions, which are more varied and complex, the tokens of the question carry different priorities for triple selection at each reasoning step. Most existing models treat the question as a whole and ignore this priority information. To solve this problem, we propose a question-aware memory network for multi-hop question answering, named QA2MN, which updates the attention over the question at each step of the reasoning process. In addition, we incorporate graph context information into a knowledge graph embedding model to strengthen its ability to represent entities and relations; we use it to initialize the QA2MN model and fine-tune it during training. We evaluate QA2MN on PathQuestion and WorldCup2014, two representative datasets for complex multi-hop question answering. The results demonstrate that QA2MN achieves state-of-the-art Hits@1 accuracy on both datasets, which validates the effectiveness of our model.
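Below is a hedged sketch, not the released QA2MN code, of one question-aware memory hop: the model attends over triple memories using the current question state, reads a summary, and then re-weights attention over the question tokens so later hops can focus on different parts of the question.

# Minimal illustration of a question-aware memory hop; tensor shapes and the
# update rule are assumptions in the spirit of the abstract above.
import torch
import torch.nn as nn

class QuestionAwareHop(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.read = nn.Linear(2 * dim, dim)

    def forward(self, q_state, q_tokens, memories):
        # q_state: (B, d)  q_tokens: (B, Lq, d)  memories: (B, Lm, d) triple embeddings
        mem_att = torch.softmax(memories @ q_state.unsqueeze(-1), dim=1)   # (B, Lm, 1)
        read = (mem_att * memories).sum(dim=1)                             # memory readout
        q_att = torch.softmax(q_tokens @ read.unsqueeze(-1), dim=1)        # re-weight question tokens
        q_focus = (q_att * q_tokens).sum(dim=1)
        return self.read(torch.cat([q_focus, read], dim=-1))               # next question state

# Toy two-hop reasoning over random, KG-embedding-initialized memories.
hop = QuestionAwareHop()
q_tokens, mems = torch.randn(1, 12, 128), torch.randn(1, 40, 128)
q = q_tokens.mean(dim=1)
for _ in range(2):
    q = hop(q, q_tokens, mems)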

