Conversations with Search Engines: SERP-based Conversational Response Generation

Pengjie Ren; Zhumin Chen; Zhaochun Ren; Evangelos Kanoulas; Christof Monz; Maarten De Rijke

doi:10.1145/3432726

Conversations with Search Engines: SERP-based Conversational Response Generation

ACM Transactions on Information Systems ◽

10.1145/3432726 ◽

2021 ◽

Vol 39 (4) ◽

pp. 1-29

Author(s):

Pengjie Ren ◽

Zhumin Chen ◽

Zhaochun Ren ◽

Evangelos Kanoulas ◽

Christof Monz ◽

...

Keyword(s):

Natural Language ◽

Search Engines ◽

Information Needs ◽

State Of The Art ◽

User Studies ◽

Future Research ◽

System Response ◽

List Type ◽

Conversational Agent ◽

Complex Information

In this article, we address the problem of answering complex information needs by conducting conversations with search engines , in the sense that users can express their queries in natural language and directly receive the information they need from a short system response in a conversational manner. Recently, there have been some attempts towards a similar goal, e.g., studies on Conversational Agent s (CAs) and Conversational Search (CS). However, they either do not address complex information needs in search scenarios or they are limited to the development of conceptual frameworks and/or laboratory-based user studies. We pursue two goals in this article: (1) the creation of a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines, and (2) the development of a state-of-the-art pipeline for conversations with search engines, Conversations with Search Engines (CaSE), using this dataset. SaaC is built based on a multi-turn conversational search dataset, where we further employ workers from a crowdsourcing platform to summarize each relevant passage into a short, conversational response. CaSE enhances the state-of-the-art by introducing a supporting token identification module and a prior-aware pointer generator, which enables us to generate more accurate responses. We carry out experiments to show that CaSE is able to outperform strong baselines. We also conduct extensive analyses on the SaaC dataset to show where there is room for further improvement beyond CaSE. Finally, we release the SaaC dataset and the code for CaSE and all models used for comparison to facilitate future research on this topic.

Download Full-text

TIPS: Time-aware Personalised Semantic-based query auto-completion

Journal of Information Science ◽

10.1177/0165551520968690 ◽

2020 ◽

pp. 016555152096869

Author(s):

Saedeh Tahery ◽

Saeed Farzi

Keyword(s):

Search Engines ◽

Information Needs ◽

Semantic Information ◽

State Of The Art ◽

Language Model ◽

Experimental Studies ◽

Short Length ◽

Context Aware ◽

Ranked List ◽

Time Aware

With the rapid growth of the Internet, search engines play vital roles in meeting the users’ information needs. However, formulating information needs to simple queries for canonical users is a problem yet. Therefore, query auto-completion, which is one of the most important characteristics of the search engines, is leveraged to provide a ranked list of queries matching the user’s entered prefix. Although query auto-completion utilises useful information provided by search engine logs, time-, semantic- and context-aware features are still important resources of extra knowledge. Specifically, in this study, a hybrid query auto-completion system called TIPS ( Time-aware Personalised Semantic-based query auto-completion) is introduced to combine the well-known systems performing based on popularity and neural language model. Furthermore, this system is supplemented by time-aware features that blend both context and semantic information in a collaborative manner. Experimental studies on the standard AOL dataset are conducted to compare our proposed system with state-of-the-art methods, that is, FactorCell, ConcatCell and Unadapted. The results illustrate the significant superiorities of TIPS in terms of mean reciprocal rank (MRR), especially for short-length prefixes.

Download Full-text

Complementary Auxiliary Classifiers for Label-Conditional Text Generation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6346 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8303-8310

Author(s):

Yuan Li ◽

Chunyuan Li ◽

Yizhe Zhang ◽

Xiujun Li ◽

Guoqing Zheng ◽

...

Keyword(s):

Natural Language ◽

State Of The Art ◽

Language Models ◽

Future Research ◽

Text Generation ◽

Trade Off

Learning to generate text with a given label is a challenging task because natural language sentences are highly variable and ambiguous. It renders difficulties in trade-off between sentence quality and label fidelity. In this paper, we present CARA to alleviate the issue, where two auxiliary classifiers work simultaneously to ensure that (1) the encoder learns disentangled features and (2) the generator produces label-related sentences. Two practical techniques are further proposed to improve the performance, including annealing the learning signal from the auxiliary classifier, and enhancing the encoder with pre-trained language models. To establish a comprehensive benchmark fostering future research, we consider a suite of four datasets, and systematically reproduce three representative methods. CARA shows consistent improvement over the previous methods on the task of label-conditional text generation, and achieves state-of-the-art on the task of attribute transfer.

Download Full-text

Search evolution for ease and speed: A call to action for what’s been lost

Journal of Librarianship and Information Science ◽

10.1177/0961000620980827 ◽

2021 ◽

pp. 096100062098082

Author(s):

Virginia M. Tucker ◽

Sylvia L. Edwards

Keyword(s):

Grounded Theory ◽

Search Engines ◽

Interface Design ◽

Information Needs ◽

Web Search ◽

Information Science ◽

Web Searching ◽

Call To Action ◽

Complex Information ◽

Holistic Understanding

In recent years, leading website search engines have abandoned vital search features supporting complex information needs, evolving instead for the marketplace and for users seeking speedy answers to easy questions. The consequences are troubling, for researchers and for information science educators, with concerns ranging from the very relevance of search results and the unknowing of what is missing, to the novice searcher’s waning ability to frame potent queries and to learn ways to refine results. We report on a grounded theory study of search experiences of information professionals and graduate students (n = 20) that contributes a holistic understanding of web searching, using its findings both to frame what is lacking in the design evolution of search engines for complex information needs and to outline a way forward. One goal of the study was to evaluate an established model of web searching, called Net Lenses, a theoretical framework shown to be highly relevant during the study’s grounded theory secondary literature review. The original Net Lenses research used phenomenography to identify variation in the web search experiences of university students (n = 41), evidencing four categories according to the characteristics of searcher awareness, approach to learning, response to obstacles and search outcomes. This study validated the model and led to an expanded version, Net Lenses 2.0, with five categories of search experience, reflecting the complex information needs of more advanced searchers. This resultant Net Lenses 2.0 model is discussed with its implications for search engine design, for advanced searchers and also for learning-to-search modes, much needed by searchers seeking to develop their abilities. The study’s implications coalesce in a call to action for more inclusive search interface design, and an agenda is put forth for how information researchers, educators and literacy advocates can move forward in their intersecting domains.

Download Full-text

Deep Learning for Video Captioning: A Review

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/877 ◽

2019 ◽

Cited By ~ 1

Author(s):

Shaoxiang Chen ◽

Ting Yao ◽

Yu-Gang Jiang

Keyword(s):

Deep Learning ◽

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Future Research ◽

Research Directions ◽

Video Captioning ◽

Future Research Directions ◽

Review State ◽

Vision And Language

Deep learning has achieved great successes in solving specific artificial intelligence problems recently. Substantial progresses are made on Computer Vision (CV) and Natural Language Processing (NLP). As a connection between the two worlds of vision and language, video captioning is the task of producing a natural-language utterance (usually a sentence) that describes the visual content of a video. The task is naturally decomposed into two sub-tasks. One is to encode a video via a thorough understanding and learn visual representation. The other is caption generation, which decodes the learned representation into a sequential sentence, word by word. In this survey, we first formulate the problem of video captioning, then review state-of-the-art methods categorized by their emphasis on vision or language, and followed by a summary of standard datasets and representative approaches. Finally, we highlight the challenges which are not yet fully understood in this task and present future research directions.

Download Full-text

Information Grounds Theory (1999, 2004)

Information Seeking Behavior and Technology Adoption - Advances in Knowledge Acquisition, Transfer, and Management ◽

10.4018/978-1-4666-8156-9.ch010 ◽

2015 ◽

pp. 149-161

Author(s):

Ali Saif Al-Aufi

Keyword(s):

Social Context ◽

Information Needs ◽

Information Science ◽

Information Behavior ◽

Research Topic ◽

User Studies ◽

Future Research ◽

Library And Information Science ◽

Social Settings ◽

And Behavior

As a study and research topic, information behavior has always remained central to the field of library and information science. Theorization in this area has, likewise, constantly attracted the interest of many researchers and so outbound empirical research has generated an enormous body of literature. Information grounds as a theory is one of the latest evolving phenomena in the area of information behavior. It seeks to interpret peoples' interaction with information in a social context. This chapter attempts to elaborate on the emergence and development of information grounds and its capability to delineate everyday information behavior. The chapter also reviews the literature that has used information grounds as a basis for interpreting information behavior in different social settings and it identifies opportunities for future research in user studies that can build upon information grounds to explore and clarify the information needs and behavior of certain groups in different socio-cultural environments.

Download Full-text

The 2019 National Natural language processing (NLP) Clinical Challenges (n2c2)/Open Health NLP (OHNLP) shared task on clinical concept normalization for clinical records

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocaa106 ◽

2020 ◽

Vol 27 (10) ◽

pp. 1529-1537 ◽

Cited By ~ 1

Author(s):

Sam Henry ◽

Yanshan Wang ◽

Feichen Shen ◽

Ozlem Uzuner

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Controlled Vocabulary ◽

Future Research ◽

Shared Task ◽

Data Set ◽

Clinical Records ◽

Medical Concepts

Abstract Objective The 2019 National Natural language processing (NLP) Clinical Challenges (n2c2)/Open Health NLP (OHNLP) shared task track 3, focused on medical concept normalization (MCN) in clinical records. This track aimed to assess the state of the art in identifying and matching salient medical concepts to a controlled vocabulary. In this paper, we describe the task, describe the data set used, compare the participating systems, present results, identify the strengths and limitations of the current state of the art, and identify directions for future research. Materials and Methods Participating teams were provided with narrative discharge summaries in which text spans corresponding to medical concepts were identified. This paper refers to these text spans as mentions. Teams were tasked with normalizing these mentions to concepts, represented by concept unique identifiers, within the Unified Medical Language System. Submitted systems represented 4 broad categories of approaches: cascading dictionary matching, cosine distance, deep learning, and retrieve-and-rank systems. Disambiguation modules were common across all approaches. Results A total of 33 teams participated in the MCN task. The best-performing team achieved an accuracy of 0.8526. The median and mean performances among all teams were 0.7733 and 0.7426, respectively. Conclusions Overall performance among the top 10 teams was high. However, several mention types were challenging for all teams. These included mentions requiring disambiguation of misspelled words, acronyms, abbreviations, and mentions with more than 1 possible semantic type. Also challenging were complex mentions of long, multi-word terms that may require new ways of extracting and representing mention meaning, the use of domain knowledge, parse trees, or hand-crafted rules.

Download Full-text

Expert Finding Systems: A Systematic Review

Applied Sciences ◽

10.3390/app9204250 ◽

2019 ◽

Vol 9 (20) ◽

pp. 4250 ◽

Cited By ~ 2

Author(s):

Omayma Husain ◽

Naomie Salim ◽

Rose Alinda Alias ◽

Samah Abdelsalam ◽

Alzubair Hassan

Keyword(s):

Systematic Review ◽

Conference Proceedings ◽

Information Needs ◽

State Of The Art ◽

Future Research ◽

Expert Finding ◽

Specific Nature ◽

Systems Research ◽

Data Overload ◽

Expertise Retrieval

The data overload problem and the specific nature of the experts’ knowledge can hinder many users from finding experts with the expertise they required. There are several expert finding systems, which aim to solve the data overload problem and often recommend experts who can fulfil the users’ information needs. This study conducted a Systematic Literature Review on the state-of-the-art expert finding systems and expertise seeking studies published between 2010 and 2019. We used a systematic process to select ninety-six articles, consisting of 57 journals, 34 conference proceedings, three book chapters, and one thesis. This study analyses the domains of expert finding systems, expertise sources, methods, and datasets. It also discusses the differences between expertise retrieval and seeking. Moreover, it identifies the contextual factors that have been combined into expert finding systems. Finally, it identifies five gaps in expert finding systems for future research. This review indicated that ≈65% of expert finding systems are used in the academic domain. This review forms a basis for future expert finding systems research.

Download Full-text

A Systematic Literature Review of the EFQM Excellence Model from 1991 to 2019

International Journal of Applied Research in Management and Economics ◽

10.33422/ijarme.v2i2.211 ◽

1970 ◽

Vol 2 (2) ◽

pp. 11-15

Author(s):

Muhammad Yousaf ◽

Petr Bris

Keyword(s):

Quality Management ◽

Data Collection ◽

Literature Review ◽

Systematic Literature Review ◽

Quantitative Research ◽

Web Of Science ◽

State Of The Art ◽

Future Research ◽

European Foundation ◽

The Subject

A systematic literature review (SLR) from 1991 to 2019 is carried out about EFQM (European Foundation for Quality Management) excellence model in this paper. The aim of the paper is to present state of the art in quantitative research on the EFQM excellence model that will guide future research lines in this field. The articles were searched with the help of six strings and these six strings were executed in three popular databases i.e. Scopus, Web of Science, and Science Direct. Around 584 peer-reviewed articles examined, which are directly linked with the subject of quantitative research on the EFQM excellence model. About 108 papers were chosen finally, then the purpose, data collection, conclusion, contributions, and type of quantitative of the selected papers are discussed and analyzed briefly in this study. Thus, this study identifies the focus areas of the researchers and knowledge gaps in empirical quantitative literature on the EFQM excellence model. This article also presents the lines of future research.

Download Full-text

Classification of means and methods of the Web semantic retrieval

PROBLEMS IN PROGRAMMING ◽

10.15407/pp2017.01.030 ◽

2017 ◽

pp. 030-050

Author(s):

J.V. Rogushina ◽

Keyword(s):

Search Engines ◽

Domain Knowledge ◽

Information Needs ◽

Web Search ◽

User Interaction ◽

Query Languages ◽

Semantic Search ◽

Semantic Retrieval ◽

The Web

Problems associated with the improve ment of information retrieval for open environment are considered and the need for it’s semantization is grounded. Thecurrent state and prospects of development of semantic search engines that are focused on the Web information resources processing are analysed, the criteria for the classification of such systems are reviewed. In this analysis the significant attention is paid to the semantic search use of ontologies that contain knowledge about the subject area and the search users. The sources of ontological knowledge and methods of their processing for the improvement of the search procedures are considered. Examples of semantic search systems that use structured query languages (eg, SPARQL), lists of keywords and queries in natural language are proposed. Such criteria for the classification of semantic search engines like architecture, coupling, transparency, user context, modification requests, ontology structure, etc. are considered. Different ways of support of semantic and otology based modification of user queries that improve the completeness and accuracy of the search are analyzed. On base of analysis of the properties of existing semantic search engines in terms of these criteria, the areas for further improvement of these systems are selected: the development of metasearch systems, semantic modification of user requests, the determination of an user-acceptable transparency level of the search procedures, flexibility of domain knowledge management tools, increasing productivity and scalability. In addition, the development of means of semantic Web search needs in use of some external knowledge base which contains knowledge about the domain of user information needs, and in providing the users with the ability to independent selection of knowledge that is used in the search process. There is necessary to take into account the history of user interaction with the retrieval system and the search context for personalization of the query results and their ordering in accordance with the user information needs. All these aspects were taken into account in the design and implementation of semantic search engine "MAIPS" that is based on an ontological model of users and resources cooperation into the Web.

Download Full-text