Burstiness in Query Log: Web Search Analysis by Combining Global and Local Evidences

Author(s):  
Chen Zhang ◽  
Sen Zhang ◽  
Chen Lei ◽  
Peiguang Lin
Author(s):  
Ji-Rong Wen

A Web query log is a type of file that keeps track of the activities of users of a search engine. Compared to the traditional information retrieval setting, in which documents are the only information source available, query logs are an additional information source in the Web search setting. Based on query logs, a set of Web mining techniques, such as log-based query clustering, log-based query expansion, collaborative filtering and personalized search, can be employed to improve the performance of Web search.


2019 ◽  
Vol 9 (6) ◽  
pp. 1181-1190 ◽  
Author(s):  
Mohib Ullah ◽  
Muhammad Arshad Islam ◽  
Rafiullah Khan ◽  
Muhammad Aleem ◽  
Muhammad Azhar Iqbal

Users around the world send queries to a Web Search Engine (WSE) to retrieve data from the Internet, and many turn to the WSE first for medical information. Search queries about diseases and treatments are considered to be among the most personal facts about a user. Such queries often contain identifiable information that can be linked back to the originator, which can compromise the user's privacy. In this work, we propose a distributed privacy-preserving protocol (OSLo) that eliminates limitations of existing distributed privacy-preserving protocols, together with a framework that evaluates the privacy of a user. The OSLo framework assesses local privacy relative to the group of users involved in forwarding a query to the WSE, and profile privacy against profiling by the WSE. The privacy analysis shows that the local privacy of a user depends directly on the size of the group and inversely on the number of compromised users. We performed experiments to evaluate the profile privacy of a user using a privacy metric, Profile Exposure Level. OSLo is simulated with a subset of 1000 users of the AOL query log. The results show that OSLo performs better than the benchmark privacy-preserving protocol in terms of privacy and delay. Additionally, the results show that the privacy of a user depends on the size of the group.


2017 ◽  
Vol 2017 ◽  
pp. 1-8 ◽  
Author(s):  
Danyang Jiang ◽  
Honghui Chen ◽  
Fei Cai

Query autocompletion (QAC) is a common interactive feature of web search engines. It aims to help users formulate queries and avoid spelling mistakes by presenting them with a list of query completions as soon as they start typing in the search box. Existing QAC models mostly rank query completions by their past popularity as recorded in query logs. The popularity of some queries is relatively stable or periodic, while that of others may rise suddenly. Current time-sensitive QAC models focus on either periodicity or recency and are unable to respond swiftly to such sudden rises, resulting in suboptimal QAC performance. In this paper, we propose a hybrid QAC model that considers two temporal patterns of a query’s popularity, namely periodicity and burst trend. In detail, we first employ the Discrete Fourier Transform (DFT) to identify the periodicity of a query’s popularity, from which we forecast its future popularity. Then the burst trend of the query’s popularity is detected and incorporated into the hybrid model together with its cyclic behavior. Extensive experiments on a large, real-world query log dataset show that modeling the temporal patterns of query popularity in the form of its periodicity and its burst trend can significantly improve the effectiveness of ranking query completions.
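
To make the two temporal signals concrete, the following is a minimal sketch and not the authors' model. It assumes a query's popularity is available as a list of daily counts. The function names `dominant_period` and `burst_score`, and the simple recent-to-earlier ratio used as a burst indicator, are illustrative assumptions; the abstract does not specify how the burst trend is computed.

```python
import cmath
import math


def dominant_period(counts):
    """Return the period (in samples) of the strongest non-constant DFT
    component of a popularity series, or None for a constant series."""
    n = len(counts)
    mean = sum(counts) / n
    centered = [c - mean for c in counts]  # remove the zero-frequency term
    best_k, best_power = None, 0.0
    for k in range(1, n // 2 + 1):
        # k-th DFT coefficient, computed directly from the definition
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return None if best_k is None else n / best_k


def burst_score(counts, window=3):
    """Hypothetical burst indicator (an assumption, not from the paper):
    mean of the last `window` samples divided by the mean of the rest."""
    recent, earlier = counts[-window:], counts[:-window]
    base = sum(earlier) / len(earlier)
    if base == 0:
        return float("inf")
    return (sum(recent) / len(recent)) / base


# Synthetic data: four weeks of counts with a weekly cycle, and a flat
# series that jumps in its last three days.
weekly = [10 + 5 * math.cos(2 * math.pi * t / 7) for t in range(28)]
bursty = [10] * 10 + [30, 40, 50]
```

On the synthetic series, `dominant_period(weekly)` recovers a 7-day cycle, and `burst_score(bursty)` is well above 1. A hybrid ranker could combine a periodic forecast with such a burst signal, but the weighting is left open here because the abstract does not describe it.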


Author(s):  
Suruchi Chawla

This chapter explains a multi-agent system for effective information retrieval that uses information scent in query log mining. The precision of search results is low because it is difficult to infer the information need behind a short search query, and the user's information need is therefore not satisfied effectively. Information Scent is used to model the information need of a user's web search session, and clustering is performed to identify sessions with similar information needs. Hyperlink-Induced Topic Search (HITS) is executed on the clusters to generate the hubs and authorities used for web page recommendations to users who search with similar intents. This multi-agent system, based on clustered query sessions, uses query operations such as expansion and recommendation to infer the information need of user search queries and recommends hubs and authorities for effective web search.
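
As a rough illustration of the HITS step mentioned above, the sketch below runs the standard hub/authority power iteration on a small link graph. It is not the chapter's implementation: the graph, the page names, and the `hits` function are hypothetical, and how pages from a query-session cluster would be turned into such a graph is not described in the abstract.

```python
import math


def hits(links, iterations=50):
    """Standard HITS power iteration.

    `links` maps each page to the list of pages it links to.
    Returns (hub, authority) score dictionaries, each L2-normalized.
    """
    nodes = set(links) | {dst for targets in links.values() for dst in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum of hub scores of pages linking to the page.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub update: sum of authority scores of the pages it links to.
        hub = {n: sum(auth[dst] for dst in links.get(n, ())) for n in nodes}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


# Hypothetical pages associated with one cluster of similar sessions.
cluster_links = {"a": ["c", "d"], "b": ["c"], "c": [], "d": []}
hub_scores, auth_scores = hits(cluster_links)
top_hub = max(hub_scores, key=hub_scores.get)
top_authority = max(auth_scores, key=auth_scores.get)
```

In this toy graph, page `c` is linked from both `a` and `b` and ends up as the top authority, while `a` links to the most authoritative pages and ends up as the top hub.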


1999 ◽  
Vol 33 (1) ◽  
pp. 6-12 ◽  
Author(s):  
Craig Silverstein ◽  
Hannes Marais ◽  
Monika Henzinger ◽  
Michael Moricz

Author(s):  
Yuta Hayakawa ◽  
Masafumi Hagiwara

Systems capable of autonomous thinking are sometimes required to cope with unanticipated situations. An important issue in this context is the acquisition of knowledge, especially common sense. In this paper, we propose novel quantitative common sense estimation methods and apply them to an automatic membership function generation system. Our proposed system estimates threshold values corresponding to "large" and "small" for various kinds of object-attribute sets to form membership functions, where it attempts to relate each object to its corresponding impression. Two methods are proposed in this paper. The first, Method-1, obtains data from the top 1,000 snippets through a web search and estimates the global and local tendencies by clustering them. The second, Method-2, uses the number of hits from a web search together with parts of the results obtained through Method-1. In addition, we devise several techniques to eliminate unnecessary information in the retrieved web pages. We also carried out experiments that verified the effectiveness of our proposed methods and of the method combining the two.

