Break It Down: A Question Understanding Benchmark

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00309 ◽

2020 ◽

Vol 8 ◽

pp. 183-198

Author(s):

Tomer Wolfson ◽

Mor Geva ◽

Ankit Gupta ◽

Matt Gardner ◽

Yoav Goldberg ◽

...

Keyword(s):

Natural Language ◽

Formal Language ◽

Question Answering ◽

Semantic Parsing ◽

Open Domain ◽

Meaning Representation

Understanding natural language questions entails the ability to break down a question into the requisite steps for computing its answer. In this work, we introduce a Question Decomposition Meaning Representation (QDMR) for questions. QDMR constitutes the ordered list of steps, expressed through natural language, that are necessary for answering a question. We develop a crowdsourcing pipeline, showing that quality QDMRs can be annotated at scale, and release the Break dataset, containing over 83K pairs of questions and their QDMRs. We demonstrate the utility of QDMR by showing that (a) it can be used to improve open-domain question answering on the HotpotQA dataset, and (b) it can be deterministically converted to a pseudo-SQL formal language, which can alleviate annotation in semantic parsing applications. Last, we use Break to train a sequence-to-sequence model with copying that parses questions into QDMR structures, and show that it substantially outperforms several natural baselines.
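To make the idea of "an ordered list of steps, expressed through natural language" concrete, here is a minimal Python sketch of how such a decomposition might be held in memory. It is an illustration only, not the authors' code or the Break data format: the `Step` class, the `build_decomposition` helper, the `#k` notation for pointing at the result of an earlier step, and the example question are all assumptions made for this sketch.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed notation: "#k" inside a step refers to the result of step k.
REF = re.compile(r"#(\d+)")


@dataclass
class Step:
    index: int  # 1-based position in the ordered list
    text: str   # natural-language description of the step

    def references(self) -> List[int]:
        """Return the indices of the earlier steps this step cites."""
        return [int(m) for m in REF.findall(self.text)]


def build_decomposition(step_texts: List[str]) -> List[Step]:
    """Build an ordered list of steps, rejecting references that do not
    point to a strictly earlier step (hypothetical validity rule)."""
    steps: List[Step] = []
    for i, text in enumerate(step_texts, start=1):
        step = Step(i, text)
        for ref in step.references():
            if not 1 <= ref < i:
                raise ValueError(f"step {i} refers to unavailable step #{ref}")
        steps.append(step)
    return steps


# Hypothetical question: "How many papers were published in 2020?"
example = build_decomposition([
    "return papers",
    "return #1 published in 2020",
    "return number of #2",
])

for step in example:
    print(step.index, step.text, step.references())
```

Because each step may only point backwards, the list can be evaluated in order, which is what makes a deterministic mapping to a formal query language plausible under these assumptions.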


Textual Question Answering for Semantic Parsing in Natural Language Processing

2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT) ◽

10.1109/icasert.2019.8934734 ◽

2019 ◽

Author(s):

Jaydeb Sarker ◽

Mustain Billah ◽

Md. Al Mamun

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Semantic Parsing


Geographic Question Answering: Challenges, Uniqueness, Classification, and Future Directions

AGILE: GIScience Series ◽

10.5194/agile-giss-2-8-2021 ◽

2021 ◽

Vol 2 ◽

pp. 1-21

Author(s):

Gengchen Mai ◽

Krzysztof Janowicz ◽

Rui Zhu ◽

Ling Cai ◽

Ni Lao

Keyword(s):

Artificial Intelligence ◽

Natural Language ◽

Question Answering ◽

Future Research ◽

Open Domain ◽

Future Directions ◽

Research Directions ◽

Classification Framework ◽

Substantial Progress ◽

Future Research Directions

As an important part of Artificial Intelligence (AI), Question Answering (QA) aims at generating answers to questions phrased in natural language. While there has been substantial progress in open-domain question answering, QA systems are still struggling to answer questions that involve geographic entities or concepts and require spatial operations. In this paper, we discuss the problem of geographic question answering (GeoQA). We first investigate the reasons why geographic questions are difficult to answer by analyzing challenges of geographic questions. We discuss the uniqueness of geographic questions compared to general QA. Then we review existing work on GeoQA and classify it by the types of questions it can address. Based on this survey, we provide a generic classification framework for geographic questions. Finally, we conclude our work by pointing out unique future research directions for GeoQA.


A New Corpus and Imitation Learning Framework for Context-Dependent Semantic Parsing

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00202 ◽

2014 ◽

Vol 2 ◽

pp. 547-560 ◽

Cited By ~ 4

Author(s):

Andreas Vlachos ◽

Stephen Clark

Keyword(s):

Information System ◽

Natural Language ◽

Learning Algorithm ◽

Imitation Learning ◽

Semantic Parsing ◽

Learning Framework ◽

Test Sets ◽

Context Dependent ◽

Meaning Representation ◽

Tourist Information

Semantic parsing is the task of translating natural language utterances into a machine-interpretable meaning representation. Most approaches to this task have been evaluated on a small number of existing corpora which assume that all utterances must be interpreted according to a database and typically ignore context. In this paper we present a new, publicly available corpus for context-dependent semantic parsing. The MRL used for the annotation was designed to support a portable, interactive tourist information system. We develop a semantic parser for this corpus by adapting the imitation learning algorithm DAgger without requiring alignment information during training. DAgger improves upon independently trained classifiers by 9.0 and 4.8 points in F-score on the development and test sets respectively.


Semantic Parsing of Ambiguous Input through Paraphrasing and Verification

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00159 ◽

2015 ◽

Vol 3 ◽

pp. 571-584

Author(s):

Philip Arthur ◽

Graham Neubig ◽

Sakriani Sakti ◽

Tomoki Toda ◽

Satoshi Nakamura

Keyword(s):

Natural Language ◽

Language Model ◽

Input String ◽

New Method ◽

Semantic Parsing ◽

Search Queries ◽

Input Sentence ◽

Meaning Representation ◽

Context Free ◽

Do So

We propose a new method for semantic parsing of ambiguous and ungrammatical input, such as search queries. We do so by building on an existing semantic parsing framework that uses synchronous context free grammars (SCFG) to jointly model the input sentence and output meaning representation. We generalize this SCFG framework to allow not one, but multiple outputs. Using this formalism, we construct a grammar that takes an ambiguous input string and jointly maps it into both a meaning representation and a natural language paraphrase that is less ambiguous than the original input. This paraphrase can be used to disambiguate the meaning representation via verification using a language model that calculates the probability of each paraphrase.


Finding Answers to Questions, in Text Collections or Web, in Open Domain or Specialty Domains

Next Generation Search Engines ◽

10.4018/978-1-4666-0330-1.ch015 ◽

2012 ◽

pp. 344-370

Author(s):

Brigitte Grau

Keyword(s):

Natural Language ◽

Information Management ◽

Question Answering ◽

Target Language ◽

Open Domain ◽

Text Collections ◽

Factual Question ◽

Structured Knowledge ◽

Answering Questions ◽

The Web

This chapter is dedicated to factual question answering, i.e., extracting precise and exact answers to questions given in natural language from texts. A question in natural language gives more information than a bag-of-words query (i.e., a query made of a list of words), and provides clues for finding precise answers. The author first presents the underlying problems, which are mainly due to linguistic variations between questions and the passages that answer them, and which affect both the selection of relevant passages and the extraction of reliable answers. The author then presents how to answer factual questions in the open domain. The author also presents question answering in specialty domains, as it requires dealing with semi-structured knowledge and specialized terminologies and can lead to different applications, such as information management in corporations. Searching for answers on the Web constitutes another application frame and introduces specificities linked to Web redundancy or collaborative usage. Moreover, the Web is multilingual, and a challenging problem consists in searching for answers in target-language documents written in a language other than the source language of the question. For all these topics, this chapter presents the main approaches and the remaining problems.


Learning to Map Frequent Phrases to Sub-Structures of Meaning Representation for Neural Semantic Parsing

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6253 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7546-7553

Author(s):

Bo Chen ◽

Xianpei Han ◽

Ben He ◽

Le Sun

Keyword(s):

Natural Language ◽

Performance Improvement ◽

Search Space ◽

Great Benefit ◽

Semantic Parsing ◽

Mismatch Problem ◽

Weakly Supervised ◽

Meaning Representation

Neural semantic parsers usually generate meaning representation tokens from natural language tokens via an encoder-decoder model. However, there is often a vocabulary-mismatch problem between natural language utterances and logical forms. That is, one word maps to several atomic logical tokens, which need to be handled as a whole, rather than as individual logical tokens at multiple steps. In this paper, we propose that the vocabulary-mismatch problem can be effectively resolved by leveraging appropriate logical tokens. Specifically, we exploit macro actions, which are of the same granularity as words/phrases, and allow the model to learn mappings from frequent phrases to corresponding sub-structures of meaning representation. Furthermore, macro actions are compact, and therefore utilizing them can significantly reduce the search space, which brings a great benefit to weakly supervised semantic parsing. Experiments show that our method leads to substantial performance improvement on three benchmarks, in both supervised and weakly supervised settings.


Natural Language Generation at Scale: A Case Study for Open Domain Question Answering

10.18653/v1/w19-8657 ◽

2019 ◽

Cited By ~ 1

Author(s):

Alessandra Cervone ◽

Chandra Khatri ◽

Rahul Goel ◽

Behnam Hedayatnia ◽

Anu Venkatesh ◽

...

Keyword(s):

Natural Language ◽

Question Answering ◽

Natural Language Generation ◽

Open Domain ◽

Language Generation


A Brief Survey of Question Answering Systems

International Journal of Artificial Intelligence & Applications ◽

10.5121/ijaia.2021.12501 ◽

2021 ◽

Vol 12 (5) ◽

pp. 01-07

Author(s):

Michael Caballero

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Open Domain ◽

Knowledge Based ◽

Current State ◽

Introductory Overview ◽

Building Systems ◽

Question Answering Systems

Question Answering (QA) is a subfield of Natural Language Processing (NLP) and computer science focused on building systems that automatically answer questions from humans in natural language. This survey summarizes the history and current state of the field and is intended as an introductory overview of QA systems. After discussing QA history, this paper summarizes the different approaches to the architecture of QA systems -- whether they are closed or open-domain and whether they are text-based, knowledge-based, or hybrid systems. Lastly, some common datasets in this field are introduced and different evaluation metrics are discussed.
