“Clickable real world” information retrieval application based on geo-visual clustering

This article describes how semantic web data sources follow linked data principles to facilitate efficient information retrieval and knowledge sharing. These data sources may provide complementary, overlapping, or contradictory information. To integrate them, the authors perform entity linking: the task of identifying and linking entities across data sources that refer to the same real-world entity. They propose a genetic fuzzy approach to learning linkage rules for entity linking. The method is domain-independent, automatic, and scalable. It uses fuzzy logic to adapt the mutation and crossover rates of genetic programming to ensure guided convergence. The authors' experimental evaluation demonstrates that their approach is competitive and makes significant improvements over state-of-the-art methods.
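The idea of adapting mutation and crossover rates with fuzzy logic can be illustrated with a minimal sketch. The abstract does not say which inputs or rules the authors use, so everything below is an assumption made for illustration: the single input (a normalized population diversity in [0, 1]), the two fuzzy sets, the two rules, the rate ranges, and the name `adapt_rates` are all hypothetical.

```python
def clamp01(x: float) -> float:
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def adapt_rates(diversity: float,
                mut_range: tuple = (0.01, 0.30),
                cx_range: tuple = (0.60, 0.95)) -> tuple:
    """Return (mutation_rate, crossover_rate) for one generation.

    Assumed rules (not taken from the article):
      IF diversity is LOW  THEN mutation is high, crossover is low
      IF diversity is HIGH THEN mutation is low,  crossover is high
    """
    d = clamp01(diversity)
    # Complementary linear membership degrees for the LOW and HIGH sets.
    low, high = 1.0 - d, d
    # Weighted-average defuzzification; the degrees already sum to 1.
    mutation = low * mut_range[1] + high * mut_range[0]
    crossover = low * cx_range[0] + high * cx_range[1]
    return mutation, crossover


if __name__ == "__main__":
    for diversity in (0.0, 0.5, 1.0):
        print(diversity, adapt_rates(diversity))
```

With these assumed rules, a population that has collapsed toward similar linkage rules gets more mutation to restore exploration, while a diverse population gets more crossover to recombine what it already has.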

A Database Project in a Small Company (or How the Real World Doesn't Always Follow the Book)

Database Technologies ◽

10.4018/978-1-60566-058-5.ch030 ◽

2009 ◽

pp. 468-483

Author(s):

Efrem Mallach

Keyword(s):

Information Retrieval ◽

Real World ◽

Retrieval System ◽

Information Retrieval System ◽

Small Company ◽

The Real ◽

Design And Implementation ◽

Development Methods

The case study describes a small consulting company’s experience in the design and implementation of a database and associated information retrieval system. Their choices are explained within the context of the firm’s needs and constraints. Issues associated with development methods are discussed, along with problems that arose from not following proper development disciplines.

SEMANTIC FIELD: A THEORETICAL PERSPECTIVE OF MODELING INFORMATION RETRIEVAL

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213009000421 ◽

2009 ◽

Vol 18 (06) ◽

pp. 825-851

Author(s):

KUN YUE ◽

WEI-YI LIU

Keyword(s):

Information Retrieval ◽

Theoretical Model ◽

Electrostatic Field ◽

Real World ◽

Field Model ◽

Theoretical Perspective ◽

Keyword Extraction ◽

Semantic Field ◽

Novel Method ◽

Interpretable Model

Information retrieval has received much attention and is widely studied and applied in real-world settings. Many approaches have been proposed, from different perspectives, for its various aspects. It is necessary to provide a formally unified and physically interpretable model for classical problems in information retrieval (e.g., document classification, authority-page selection, and keyword extraction). In this paper we propose a theoretical model, called semantic field, inspired by the theories of lexical semantics and the electrostatic field. Based on this physical model, information retrieval can be viewed from a theoretical perspective and interpreted through people's physical intuitions and natural heuristics. Centered on the concept of the semantic field, we give some relevant properties, including semantic affinity, semantic coacervation degree, and the radiation of a semantic source. As a representative application of the proposed semantic field model, a novel method for automatic keyword extraction is discussed, and its feasibility is verified by corresponding experiments.

Hashtag2Vec: Learning Hashtag Representation with Relational Hierarchical Embedding Model

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/480 ◽

2018 ◽

Cited By ~ 3

Author(s):

Jie Liu ◽

Zhicheng He ◽

Yalou Huang

Keyword(s):

Social Networks ◽

Information Retrieval ◽

Social Network ◽

Real World ◽

Heterogeneous Network ◽

Event Analysis ◽

Short Text ◽

Theme Discovery ◽

Real World Datasets ◽

Text Content

Hashtags have always been important elements in many social network platforms and micro-blog services. Semantic understanding of hashtags is a critical and fundamental task for many applications on social networks, such as event analysis, theme discovery, and information retrieval. However, this task is challenging due to the sparsity, polysemy, and synonymy of hashtags. In this paper, we investigate the problem of hashtag embedding by combining the short text content with the various heterogeneous relations in social networks. Specifically, we first establish a network with hashtags as its nodes. Hierarchically, each of the hashtag nodes is associated with a set of tweets and each tweet contains a set of words. Then we devise an embedding model, called Hashtag2Vec, which exploits the hashtag-hashtag, hashtag-tweet, tweet-word, and word-word relations in the hierarchical heterogeneous network. In addition to embedding the hashtags, our proposed framework is capable of embedding the short social texts as well. Extensive experiments are conducted on two real-world datasets, and the results demonstrate the effectiveness of the proposed method.

Report on the SIGIR 2019 Workshop on eCommerce (ECOM19)

ACM SIGIR Forum ◽

10.1145/3458553.3458555 ◽

2019 ◽

Vol 53 (2) ◽

pp. 11-19

Author(s):

Jon Degenhardt ◽

Surya Kallumadi ◽

Utkarsh Porwal ◽

Andrew Trotman

Keyword(s):

Information Retrieval ◽

Real World ◽

Poster Session ◽

Panel Discussion ◽

Product Search ◽

Full Day

The SIGIR 2019 Workshop on eCommerce (ECOM19) was a full-day workshop that took place on Thursday, July 25, 2019 in Paris, France. The purpose of the workshop was to serve as a platform for publication and discussion of Information Retrieval and NLP research and their applications in the domain of eCommerce. The workshop program was designed to bring together practitioners and researchers from academia and industry to discuss the challenges and approaches to product search and recommendation in the eCommerce domain. A second goal was to run a data challenge on real-world eCommerce data. The workshop drew contributions from both industry and academia; in total it received 38 submissions and accepted 24 (63%). There were two keynotes by invited speakers, a poster session where all the accepted submissions were presented, a panel discussion, and three short talks by invited speakers.

Reference by Name vs. Location in a Computer Filing System

Proceedings of the Human Factors Society Annual Meeting ◽

10.1177/154193128603000821 ◽

1986 ◽

Vol 30 (8) ◽

pp. 824-828 ◽

Cited By ~ 6

Author(s):

Susan T. Dumais ◽

Annette L. Wright

Keyword(s):

Information Retrieval ◽

Real World ◽

The Real ◽

Retrieval Accuracy ◽

Filing System ◽

Spatial Metaphors ◽

Paper And Pencil

The traditional name-based approach to storing and retrieving information in computers is now being supplemented on some systems by a spatial alternative – often driven by an office or desktop metaphor. These systems attempt to take advantage of the important role that location plays in retrieving objects in the real world (i.e., we must know where things are in order to retrieve them). This paper extends recent research by Jones and Dumais (1986), which used paper and pencil simulations to compare reference by name versus location. A computer filing system was developed in which folders could be stored and retrieved using combinations of location and name cues. Accuracy of location reference in a Location-only condition was initially comparable to that in a Name-only condition, but declined much more rapidly with increases in the number of objects. Adding location to name information did not improve retrieval accuracy, but was costly in terms of initial specification time. These results call into question the generality of spatial metaphors for information retrieval applications.

Real-World Fuzzy Logic Applications in Data Mining and Information Retrieval

Fuzzy Logic - Studies in Fuzziness and Soft Computing ◽

10.1007/978-3-540-71258-9_11 ◽

2007 ◽

pp. 219-247 ◽

Cited By ~ 7

Author(s):

Bernadette Bouchon-Meunier ◽

Marcin Detyniecki ◽

Marie-Jeanne Lesot ◽

Christophe Marsala ◽

Maria Rifqi

Keyword(s):

Data Mining ◽

Fuzzy Logic ◽

Information Retrieval ◽

Real World

Modelling search and stopping in interactive information retrieval

ACM SIGIR Forum ◽

10.1145/3458537.3458543 ◽

2019 ◽

Vol 53 (1) ◽

pp. 40-41

Author(s):

David Maxwell

Keyword(s):

Information Retrieval ◽

Real World ◽

Search Process ◽

Decision Point ◽

Search Tasks ◽

Complex Picture ◽

And Performance ◽

High Level ◽

The University ◽

Relevant Material

Searching for information when using a computerised retrieval system is a complex and inherently interactive process. Individuals during a search session may issue multiple queries, and examine a varying number of result summaries and documents per query. Searchers must also decide when to stop assessing content for relevance - or decide when to stop their search session altogether. Despite being such a fundamental activity, only a limited number of studies have explored stopping behaviours in detail, with a majority reporting that searchers stop because they decide that what they have found feels "good enough". Notwithstanding the limited exploration of stopping during search, the phenomenon is central to the study of Information Retrieval, playing a role in the models and measures that we employ. However, the current de facto assumption considers that searchers will examine k documents - examining up to a fixed depth. In this thesis, we examine searcher stopping behaviours under a number of different search contexts. We conduct and report on two user studies, examining how result summary lengths and a variation of search tasks and goals affect such behaviours. Interaction data from these studies are then used to ground extensive simulations of interaction, exploring a number of different stopping heuristics (operationalised as twelve stopping strategies). We consider how well the proposed strategies perform and match up with real-world stopping behaviours. As part of our contribution, we also propose the Complex Searcher Model, a high-level conceptual searcher model that encodes stopping behaviours at different points throughout the search process (see Figure 1 below). Within the Complex Searcher Model, we also propose a new results page stopping decision point. From this new stopping decision point, searchers can obtain an impression of the page before deciding to enter or abandon it.
Results presented and discussed demonstrate that searchers employ a range of different stopping strategies, with no strategy standing out in terms of performance and approximations offered. Stopping behaviours are clearly not fixed, but are rather adaptive in nature. This complex picture reinforces the idea that modelling stopping behaviour is difficult. However, simplistic stopping strategies do offer good performance and approximations, such as the frustration-based stopping strategy. This strategy considers a searcher's tolerance to non-relevance. We also find that combination strategies - such as those combining a searcher's satisfaction with finding relevant material, and their frustration towards observing non-relevant material - also consistently offer good approximations and performance. In addition, we demonstrate that the inclusion of the additional stopping decision point within the Complex Searcher Model provides significant improvements to performance over our baseline implementation. It also offers improvements to the approximations of real-world searcher stopping behaviours. This work motivates a revision of how we currently model the search process and demonstrates that different stopping heuristics need to be considered within the models and measures that we use in Information Retrieval. Measures should be reformed according to the stopping behaviours of searchers. A number of potential avenues for future exploration can also be considered, such as modelling the stopping behaviours of searchers individually (rather than as a population), and exploring a wider variety of different stopping heuristics under different search contexts.
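The fixed-depth, frustration-based, and combination heuristics mentioned above can be sketched as simple stopping rules over a ranked list. This is an illustrative sketch only: the abstract does not define the twelve strategies, so the function name `stop_rank`, its parameters, and the choice to count non-relevant items cumulatively (rather than, say, contiguously) are assumptions and not the thesis's operationalisation.

```python
from typing import Optional, Sequence


def stop_rank(judgements: Sequence[bool],
              max_depth: Optional[int] = None,
              satisfied_after: Optional[int] = None,
              frustrated_after: Optional[int] = None) -> int:
    """Return the 1-based rank at which a simulated searcher stops.

    judgements[i] is True if the searcher finds item i relevant.
    max_depth:        fixed-depth rule (stop after examining k items).
    satisfied_after:  stop once this many relevant items have been seen.
    frustrated_after: stop once this many non-relevant items have been seen.
    Supplying both of the last two gives a combination rule.
    """
    relevant = non_relevant = 0
    for rank, is_relevant in enumerate(judgements, start=1):
        if is_relevant:
            relevant += 1
        else:
            non_relevant += 1
        if max_depth is not None and rank >= max_depth:
            return rank
        if satisfied_after is not None and relevant >= satisfied_after:
            return rank
        if frustrated_after is not None and non_relevant >= frustrated_after:
            return rank
    # No rule fired: the searcher reads to the end of the list.
    return len(judgements)


if __name__ == "__main__":
    ranking = [True, False, False, True, False, True]
    print("fixed depth:", stop_rank(ranking, max_depth=4))
    print("frustration:", stop_rank(ranking, frustrated_after=2))
    print("combination:", stop_rank(ranking, satisfied_after=3, frustrated_after=3))
```

Unlike the fixed-depth rule, the frustration and combination rules stop at a rank that depends on what the searcher actually encounters, which is one way to read the abstract's point that stopping is adaptive.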
Despite the inherently difficult task that understanding and modelling the stopping behaviours of searchers represents, potential benefits of further exploration in this area will undoubtedly aid the searchers of future retrieval systems - with further work bringing about improved interfaces and experiences.
Doctoral Supervisor: Dr Leif Azzopardi (University of Strathclyde, Scotland)
Examination Committee: Professor Iadh Ounis (University of Glasgow, Scotland) and Dr Suzan Verberne (Leiden University, The Netherlands). Thanks to both of you for your insightful and fair questioning during the defence!
Availability: This thesis is available to download from http://www.dmax.org.uk/thesis/, or the University of Glasgow's Enlighten repository - see http://theses.gla.ac.uk/41132/.
A Quick Thank You: Five years of hard work has got me to the point at which I can now submit the abstract of my doctoral thesis to the SIGIR Forum. There have been plenty of ups and downs, but I'm super pleased with the result! Even though there is only a single name on the front cover of this thesis, there are many people who have helped me get to where I am today. You all know who you are - from my friends and family, those who granted me so many fantastic opportunities to travel and see the world - and of course, to Leif. Thanks to all of you for confiding your belief and trust in me, even when I may have momentarily lost that belief and trust in myself. This thesis is for you all.

Taking a Closed-Book Examination: Decoupling KB-Based Inference by Virtual Hypothesis for Answering Real-World Questions

Computational Intelligence and Neuroscience ◽

10.1155/2021/6689740 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Xiao Zhang ◽

Guorui Zhao

Keyword(s):

Information Retrieval ◽

Real World ◽

Question Answering ◽

Semantic Structure ◽

Long Distance ◽

Common Solution ◽

Semantic Associations ◽

High Level ◽

Inference Methods ◽

Complex Question

Complex question answering in the real world is a comprehensive and challenging task due to its demand for deeper question understanding and deeper inference. Information retrieval is a common solution and easy to implement, but it cannot answer questions that need long-distance dependencies across multiple documents. A knowledge base (KB) organizes information as a graph, and KB-based inference can employ logic formulas or knowledge embeddings to capture such long-distance semantic associations. However, KB-based inference has not been applied well to real-world question answering, because there are gaps among natural language, complex semantic structure, and appropriate hypotheses for inference. We propose decoupling KB-based inference by transforming a question into a high-level triplet in the KB, which makes it possible to apply KB-based inference methods to answer complex questions. In addition, we create a specialized question answering dataset only for inference, and our method is shown to be effective in experiments on both the AI2 Science Questions dataset and ours.
