RDMMFET: Representation of Dense Multimodality Fusion Encoder Based on Transformer

2021, Vol. 2021, pp. 1-9
Author(s): Xu Zhang, DeZhi Han, Chin-Chen Chang

Visual question answering (VQA) is natural language question answering over visual images. A VQA model must produce answers to specific questions based on its understanding of an image; most importantly, it must understand the relationship between images and language. This paper therefore proposes a new model, the Representation of Dense Multimodality Fusion Encoder Based on Transformer (RDMMFET for short), which can learn the knowledge relating vision and language. The RDMMFET model consists of three parts: a dense language encoder, an image encoder, and a multimodality fusion encoder. In addition, we designed three types of pretraining tasks: a masked language model, a masked image model, and a multimodality fusion task. These pretraining tasks help the model learn fine-grained alignments between text and image regions. Simulation results on the VQA v2.0 data set show that the RDMMFET model outperforms previous models. Finally, we conducted detailed ablation studies on the RDMMFET model and provide attention visualizations, which show that the RDMMFET model significantly improves VQA performance.
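As a rough illustration of the kind of multimodality fusion this abstract describes, the sketch below implements single-head cross-attention in which question tokens attend over image-region features. All names, dimensions, and random features here are hypothetical; this is a generic transformer-style fusion step, not the RDMMFET paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text, image):
    """Single-head cross-attention: text tokens attend over image regions.

    text:  (n_tokens, d) language-encoder outputs
    image: (n_regions, d) image-encoder outputs
    Returns fused text representations of shape (n_tokens, d).
    """
    d = text.shape[-1]
    scores = text @ image.T / np.sqrt(d)   # (n_tokens, n_regions)
    weights = softmax(scores, axis=-1)     # attention over regions, rows sum to 1
    return weights @ image                 # region-weighted fusion

rng = np.random.default_rng(0)
text = rng.normal(size=(5, 16))    # 5 question tokens
image = rng.normal(size=(36, 16))  # 36 detected image regions
fused = cross_attention(text, image)
print(fused.shape)  # (5, 16)
```

In a full model, such a layer would sit inside the multimodality fusion encoder, stacked with self-attention and feed-forward sublayers.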

AI Magazine, 2014, Vol. 35 (1), pp. 38
Author(s): Ulli Waltinger, Dan Tecuci, Mihaela Olteanu, Vlad Mocanu, Sean Sullivan

This paper describes USI Answers, a natural language question-answering system for enterprise data. We report progress toward the goal of offering easy access to enterprise data to a large number of business users, most of whom are not familiar with the specific syntax or semantics of the underlying data sources. Additional complications come from the nature of the data, which is both structured and unstructured. The proposed solution allows users to express questions in natural language, makes the system's interpretation of the query apparent, and allows easy query adjustment and reformulation. The application is in use by more than 1,500 users at Siemens Energy. We evaluate our approach on a data set consisting of fleet data.


Author(s): Xinmeng Li, Mamoun Alazab, Qian Li, Keping Yu, Quanjun Yin

Knowledge graph question answering is an important technology in intelligent human-robot interaction; it aims to automatically answer a human's natural language question over a given knowledge graph. For multi-relation questions, with their higher variety and complexity, the tokens of the question carry different priorities for triple selection at each reasoning step. Most existing models take the question as a whole and ignore this priority information. To solve this problem, we propose a question-aware memory network for multi-hop question answering, named QA2MN, which updates the attention on the question dynamically during the reasoning process. In addition, we incorporate graph context information into the knowledge graph embedding model to increase its ability to represent entities and relations; we use it to initialize QA2MN and fine-tune it during training. We evaluate QA2MN on PathQuestion and WorldCup2014, two representative datasets for complex multi-hop question answering. The results demonstrate that QA2MN achieves state-of-the-art Hits@1 accuracy on both datasets, which validates the effectiveness of our model.
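The hop-wise, question-aware reasoning loop described above can be caricatured as follows. This is an illustrative sketch with invented names and dimensions, not the authors' QA2MN implementation: at each hop, token-level attention re-weights the question before it is matched against a memory of triple embeddings.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_scores(question_tokens, triples, n_hops=2):
    """Toy question-aware multi-hop reasoning over a triple memory.

    question_tokens: (n_q, d) token embeddings of the question
    triples:         (n_t, d) embeddings of candidate KG triples
    Returns a score per triple after n_hops reasoning steps.
    """
    control = question_tokens.mean(axis=0)            # initial question summary
    for _ in range(n_hops):
        tok_att = softmax(question_tokens @ control)  # which tokens matter now
        control = tok_att @ question_tokens           # updated question focus
        mem_att = softmax(triples @ control)          # attention over triples
        control = control + mem_att @ triples         # read from memory
    return triples @ control                          # final triple scores

rng = np.random.default_rng(1)
q = rng.normal(size=(6, 8))    # 6 question tokens
t = rng.normal(size=(10, 8))   # 10 candidate triples
scores = multi_hop_scores(q, t)
best = int(scores.argmax())    # index of the highest-scoring triple
```

The point of the token-attention step is that different question words dominate at different hops, which is exactly the priority information the abstract says whole-question models discard.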


Author(s): Lianli Gao, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, ...

To date, visual question answering (VQA), covering both image QA and video QA, remains a holy grail of vision and language understanding, especially video QA. While image QA focuses primarily on understanding the associations between image region-level details and the corresponding questions, video QA requires a model to reason jointly across both the spatial and the long-range temporal structure of a video, as well as the text, to provide an accurate answer. In this paper, we tackle the problem of video QA by proposing a Structured Two-stream Attention network, STA, to answer a free-form or open-ended natural language question about the content of a given video. First, we infer rich long-range temporal structure in videos using our structured segment component and encode text features. Then, our structured two-stream attention component simultaneously localizes important visual instances, reduces the influence of background video, and focuses on the relevant text. Finally, the structured two-stream fusion component incorporates different segments of the query- and video-aware context representations and infers the answer. Experiments on the large-scale video QA dataset TGIF-QA show that our proposed method significantly surpasses the best counterpart (i.e., with one representation for the video input) by 13.0%, 13.5%, and 11.0% on the Action, Trans., and FrameQA tasks and by 0.3 on the Count task. It also outperforms the best competitor (i.e., with two representations) on the Action, Trans., and FrameQA tasks by 4.1%, 4.7%, and 5.1%.
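A minimal sketch of the two-stream idea, under invented names and shapes (this is not the STA paper's code): one stream attends over temporal video segments conditioned on the question, the other attends over question words conditioned on the video, and the two contexts are fused.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def two_stream_attention(segments, words):
    """Toy two-stream attention for video QA.

    segments: (n_seg, d) features of temporal video segments
    words:    (n_w, d)  question word features
    Returns a fused context vector of shape (2 * d,).
    """
    q = words.mean(axis=0)                 # question summary
    v = segments.mean(axis=0)              # video summary
    seg_att = softmax(segments @ q)        # question-guided visual attention
    word_att = softmax(words @ v)          # video-guided text attention
    visual_ctx = seg_att @ segments        # weighted video context
    text_ctx = word_att @ words            # weighted text context
    return np.concatenate([visual_ctx, text_ctx])

rng = np.random.default_rng(2)
ctx = two_stream_attention(rng.normal(size=(8, 12)),  # 8 video segments
                           rng.normal(size=(5, 12)))  # 5 question words
print(ctx.shape)  # (24,)
```

In the full model, a classifier over such a fused context would produce the answer; the segment features would come from the structured segment component rather than random vectors.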


2019, Vol. 29 (11n12), pp. 1801-1818
Author(s): Yixiao Yang, Xiang Chen, Jiaguang Sun

In the last few years, applying language models to source code has been the state-of-the-art method for code completion. However, compared with natural language, code exhibits much more obvious repetition. For example, a variable may be used many times in the code that follows its declaration; variables in source code have a high chance of being repeated. Cloned code and templates also have this token-repetition property, so capturing the token repetition of source code is important. In different projects, variables and types are usually named differently, which means that a model trained on a finite data set will encounter many unseen variables and types in another data set. How to model the semantics of unseen data, and how to predict unseen tokens from patterns of token repetition, are two challenges in code completion. Hence, in this paper token repetition is modelled as a graph, and we propose a novel REP model based on a deep graph neural network to learn code token repetition. The REP model identifies the edge connections of the graph in order to recognize token repetition. To predict the repetition of token [Formula: see text], the information of all previous tokens must be considered. We use a memory neural network (MNN) to model the semantics of each distinct token, making the REP framework more targeted. The experiments indicate that the REP model performs better than an LSTM model. Comparing against the Attention-Pointer network, we also discover that the attention mechanism does not work in all situations. The proposed REP model achieves similar or slightly better prediction accuracy than the Attention-Pointer network while consuming less training time. We also identify another attention mechanism that could further improve prediction accuracy.
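The "token repetition as a graph" idea can be illustrated concretely. The sketch below (an invented helper, not the paper's REP implementation) links each token occurrence back to the most recent earlier occurrence of the same token, producing the kind of repetition edges a graph model would learn to predict.

```python
def repetition_edges(tokens):
    """Build repetition edges: each reuse of a token is connected to its
    most recent previous occurrence. Illustrative sketch only."""
    last_seen = {}   # token text -> index of its latest occurrence
    edges = []
    for i, tok in enumerate(tokens):
        if tok in last_seen:
            edges.append((last_seen[tok], i))  # edge: previous use -> reuse
        last_seen[tok] = i
    return edges

code = ["x", "=", "f", "(", "x", ")", ";", "x"]
print(repetition_edges(code))  # [(0, 4), (4, 7)]
```

Note how the edges are independent of the variable's name: a model that predicts "the next token repeats position 4" generalizes to projects whose identifiers it has never seen, which is exactly the unseen-token challenge the abstract raises.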


2007, Vol. 13 (2), pp. 185-189
Author(s): Robert Dale

“Powerset Hype to Boiling Point”, said a February headline on TechCrunch. In the last installment of this column, I asked whether 2007 would be the year of question answering. My query was occasioned by a number of new attempts at natural language question answering that were being promoted in the marketplace as the next advance upon search, and particularly by the buzz around the stealth-mode natural language search company Powerset. That buzz continued with a major news item in the first quarter of this year: in February, Xerox PARC and Powerset struck a much-anticipated deal whereby Powerset won exclusive rights to use PARC's natural language technology, as announced in a VentureBeat posting. Following the scoop, other news sources drew the battle lines with titles like “Can natural language search bring down Google?”, “Xerox vs. Google?”, and “Powerset and Xerox PARC team up to beat Google”. An April posting on Barron's Online noted that an analyst at Global Equities Research had cited Powerset in his downgrading of Google from Buy to Neutral. And all this on the basis of a product which, at the time of writing, very few people have actually seen. Indications are that the search engine is expected to go live by the end of the year, so we have a few more months to wait to see whether this really is a Google-killer. Meanwhile, another question that remains unanswered is what happened to the Powerset engineer who seemed less sure about the technology's capabilities: see the segment at the end of D7TV's PartyCrasher video from the Powerset launch party. For a more confident appraisal of natural language search, check out the podcast of Barney Pell, CEO of Powerset, giving a lecture at the University of California–Berkeley.


2010, Vol. 23 (2-3), pp. 241-265
Author(s): Ulrich Furbach, Ingo Glöckner, Björn Pelzer

Entropy, 2020, Vol. 22 (5), pp. 533
Author(s): Qin Zhao, Chenguang Hou, Changjian Liu, Peng Zhang, Ruifeng Xu

Quantum-inspired language models have been introduced to information retrieval for their transparency and interpretability. While exciting progress has been made, current studies mainly investigate the relationships between density matrices of different sentence subspaces of a semantic Hilbert space; the Hilbert space as a whole, which has a unique density matrix, remains underexplored. In this paper, we propose a novel Quantum Expectation Value based Language Model (QEV-LM). A single shared density matrix is constructed for the semantic Hilbert space, and words and sentences are viewed as different observables in this quantum model. Under this framework, the matching score describing the similarity between a question-answer pair is naturally explained as the quantum expectation value of a joint question-answer observable. In addition to its theoretical soundness, experimental results on the TREC-QA and WIKIQA datasets demonstrate the computational efficiency of our model, with excellent performance and low time consumption.
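The core quantity here is the quantum expectation value Tr(ρO) of an observable O under a density matrix ρ. The sketch below builds a toy shared density matrix from unit word vectors and scores a question-answer pair against it; the construction of the joint observable is an invented simplification for illustration, not the QEV-LM paper's formulation.

```python
import numpy as np

def density_matrix(vectors):
    """Shared density matrix: a uniform mixture of rank-one projectors
    built from unit-normalized word vectors (illustrative sketch).
    The result is Hermitian, positive semidefinite, with trace 1."""
    vs = [v / np.linalg.norm(v) for v in vectors]
    return sum(np.outer(v, v) for v in vs) / len(vs)

def expectation(rho, observable):
    """Quantum expectation value <O> = Tr(rho @ O)."""
    return np.trace(rho @ observable)

rng = np.random.default_rng(3)
words = rng.normal(size=(20, 6))      # 20 toy word vectors, dimension 6
rho = density_matrix(words)

# A toy joint question-answer observable: the projector onto the
# normalized sum of a question vector and an answer vector.
q, a = rng.normal(size=6), rng.normal(size=6)
qa = q + a
qa /= np.linalg.norm(qa)
score = expectation(rho, np.outer(qa, qa))  # matching score in [0, 1]
print(round(np.trace(rho), 6))  # 1.0
```

Because ρ is a trace-one mixture of projectors and the observable here is itself a projector, the resulting score is guaranteed to lie in [0, 1], which is what makes it usable directly as a matching score.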

