Analysis of image search queries on the web: Query modification patterns and semantic attributes

2013 ◽  
Vol 64 (7) ◽  
pp. 1423-1441 ◽  
Author(s):  
Youngok Choi
2015 ◽  
Vol 2015 ◽  
pp. 1-6 ◽  
Author(s):  
Irene Amerini ◽  
Rudy Becarelli ◽  
Roberto Caldelli ◽  
Matteo Casini

Determining whether an image has already appeared on the web or in a magazine, and whether it is authentic, has become crucial. Feature-based image forensics methods have so far proved very effective at detecting forgeries in which a portion of an image is cloned elsewhere within the same image. However, such techniques cannot handle a splicing attack, in which the pasted portion comes from another picture that is usually no longer available for feature matching. This paper presents a procedure that allows these techniques to be applied to splicing attacks as well, by drawing on image repositories available on the Internet, such as Google Images or TinEye Reverse Image Search. Experimental results on real images retrieved from the Internet demonstrate the capability of the proposed procedure.


Author(s):  
Anselm Spoerri

This paper analyzes which pages and topics are the most popular on Wikipedia and why. For the period of September 2006 to January 2007, the 100 most visited Wikipedia pages in each month are identified and categorized in terms of the major topics of interest. The observed topics are compared with search behavior on the Web. Search queries identical to the titles of the most popular Wikipedia pages are submitted to major search engines, and the positions of popular Wikipedia pages in the top 10 search results are determined. The presented data help to explain how search engines, and Google in particular, fuel the growth of Wikipedia and shape what is popular on it.


2019 ◽  
Vol 9 (6) ◽  
pp. 1181-1190 ◽  
Author(s):  
Mohib Ullah ◽  
Muhammad Arshad Islam ◽  
Rafiullah Khan ◽  
Muhammad Aleem ◽  
Muhammad Azhar Iqbal

Users around the world send queries to a Web Search Engine (WSE) to retrieve data from the Internet, and they often turn to a WSE first for medical information. Search queries relating to diseases and treatments are considered among the most personal facts about a user. Such queries often contain identifiable information that can be linked back to the originator, which can compromise the user's privacy. In this work, we propose a distributed privacy-preserving protocol (OSLo) that eliminates limitations of the existing distributed privacy-preserving protocols, together with a framework that evaluates the privacy of a user. The OSLo framework assesses the local privacy relative to the group of users involved in forwarding a query to the WSE, and the profile privacy against profiling by the WSE. The privacy analysis shows that the local privacy of a user depends directly on the size of the group and inversely on the number of compromised users. We performed experiments to evaluate the profile privacy of a user using the privacy metric Profile Exposure Level. OSLo is simulated with a subset of 1000 users of the AOL query log. The results show that OSLo performs better than the benchmark privacy-preserving protocol in terms of privacy and delay. Additionally, the results show that the privacy of a user depends on the size of the group.


2014 ◽  
Vol 38 (2) ◽  
pp. 209-231 ◽  
Author(s):  
Darja Groselj

Purpose – This study aims to map the information landscape as it unfolds to users when they search for health topics on general search engines. Website sponsorship, platform type and linking patterns were analysed in order to advance the understanding of the provision of health information online. Design/methodology/approach – The landscape was sampled by ten very different search queries and crawled with VOSON software. Drawing on Rogers's framework of information politics on the web, the landscape is described on two levels. The front-end is examined qualitatively by assessing website sponsorship and platform type. On the back-end, linking patterns are analysed using hyperlink network analysis. Findings – A vast majority of the websites have commercial and organisational sponsorship. The analysis of the platform type shows that health information is provided mainly on static homepages, informational portals and general news sites. A comparison of ten different health domains revealed substantial differences in their landscapes, related to domain-specific characteristics. Research limitations/implications – The size and properties of the web crawl were shaped by the use of third-party software, and the generalisability of the results is limited by the selected search queries. Further research exploring how specific characteristics of different health domains shape the provision of information online is suggested. Practical implications – The demonstrated method can be used by organisations to discern the characteristics of the online information landscape in which they operate and to inform their business strategies. Originality/value – The study examines health information landscapes on a large scale and makes an original contribution by comparing them across ten different health domains.
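
As an illustration of the back-end step, one common measure in hyperlink network analysis is in-degree: the number of distinct sites that link to a given site. The sketch below is a minimal example under stated assumptions, not the study's method or the VOSON software's interface; the edge list, the site names and the `in_degree` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical hyperlink edges (source site -> target site).
# These are illustrative values, not data from the study.
links = [
    ("news.example", "portal.example"),
    ("blog.example", "portal.example"),
    ("portal.example", "clinic.example"),
    ("blog.example", "clinic.example"),
    ("news.example", "clinic.example"),
]


def in_degree(edges):
    """Count distinct inbound links per site, ignoring self-links and duplicates."""
    counts = Counter()
    for source, target in set(edges):
        if source != target:
            counts[target] += 1
    return counts


inbound = in_degree(links)

# Rank sites by inbound links (descending), breaking ties alphabetically.
ranked = sorted(inbound.items(), key=lambda kv: (-kv[1], kv[0]))
for site, count in ranked:
    print(site, count)
```

Sites with many inbound links sit at the centre of such a landscape; a fuller analysis would normally also look at outbound links and clustering.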


2015 ◽  
Vol 10 (S318) ◽  
pp. 270-273
Author(s):  
Stephen D. J. Gwyn ◽  
Norman Hill ◽  
JJ Kavelaars

While regular astronomical image archive searches can find images at a fixed location, they cannot find images of moving targets such as asteroids or comets. The Solar System Object Image Search (SSOIS) at the Canadian Astronomy Data Centre allows users to search for images of moving objects, allowing precoveries. SSOIS accepts as input either an object designation, a list of observations, a set of orbital elements, or a user-generated ephemeris for an object. It then searches for observations of that object over a range of dates. The user is then presented with a list of images containing that object from a variety of archives. Initially created to search the CFHT MegaCam archive, SSOIS has been extended to other telescopes including Gemini, Subaru/SuprimeCam, WISE, HST, the SDSS, AAT, the ING telescopes, the ESO telescopes, and the NOAO telescopes (KPNO/CTIO/WIYN), for a total of 24.5 million images. As the Pan-STARRS and Hyper Suprime-Cam archives become available, they will be incorporated as well. The SSOIS tool is located on the web at http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/en/ssois/.
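
The idea of matching a moving object against archived images can be sketched as follows. This is only a toy model under stated assumptions and not the SSOIS implementation: the ephemeris is interpolated linearly, image footprints are treated as squares, RA wrap-around at 0/360 degrees is ignored, and every name and value below is hypothetical.

```python
from bisect import bisect_left
from math import cos, radians


def interpolate(ephemeris, t):
    """Linearly interpolate (ra, dec) in degrees at time t.

    ephemeris: list of (time, ra_deg, dec_deg) tuples sorted by time.
    Returns None when t lies outside the ephemeris time span.
    """
    times = [row[0] for row in ephemeris]
    if not times or t < times[0] or t > times[-1]:
        return None
    i = bisect_left(times, t)
    if times[i] == t:
        return ephemeris[i][1], ephemeris[i][2]
    t0, ra0, dec0 = ephemeris[i - 1]
    t1, ra1, dec1 = ephemeris[i]
    f = (t - t0) / (t1 - t0)
    return ra0 + f * (ra1 - ra0), dec0 + f * (dec1 - dec0)


def find_images(ephemeris, images):
    """Return names of images whose square footprint contains the object
    at the image's exposure time.

    images: list of (name, time, ra_center, dec_center, half_width_deg).
    """
    hits = []
    for name, t, ra_c, dec_c, half_width in images:
        pos = interpolate(ephemeris, t)
        if pos is None:
            continue
        ra, dec = pos
        # Scale the RA offset by cos(dec) so both offsets are in sky degrees.
        d_ra = (ra - ra_c) * cos(radians(dec_c))
        d_dec = dec - dec_c
        if abs(d_ra) <= half_width and abs(d_dec) <= half_width:
            hits.append(name)
    return hits


# Hypothetical ephemeris and image list (times in arbitrary units).
ephemeris = [(0.0, 10.0, 5.0), (10.0, 12.0, 6.0)]
images = [
    ("img_a", 5.0, 11.0, 5.5, 0.5),   # object is at the image centre: hit
    ("img_b", 5.0, 10.0, 5.0, 0.5),   # object is about 1 degree off in RA: miss
    ("img_c", 20.0, 12.0, 6.0, 0.5),  # outside the ephemeris time span: skipped
]
hits = find_images(ephemeris, images)
print(hits)
```

The key difference from a fixed-position search is that the target coordinates are recomputed for each image's exposure time before the footprint test.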


Data ◽  
2019 ◽  
Vol 4 (3) ◽  
pp. 125 ◽  
Author(s):  
Artur Strzelecki

This data descriptor describes Google search engine visibility data. The visibility of a domain name in a search engine comes from search engine optimization and can be evaluated based on four data metrics and five data dimensions. The data metrics are the following: Clicks volume (1), impressions volume (2), click-through ratio (3), and ranking position (4). Data dimensions are as follows: queries that are entered into search engines that trigger results with the researched domain name (1), page URLs from research domains which are available in the search engine results page (2), country of origin of search engine visitors (3), type of device used for the search (4), and date of the search (5). Search engine visibility data were obtained from the Google search console for the international online store, which is visible in 240 countries and territories for a period of 15 months. The data contain 123 K clicks and 4.86 M impressions for the web search and 22 K clicks and 9.07 M impressions for the image search. The proposed method for obtaining data can be applied in any other area, not only in the e-commerce industry.
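
The click-through ratio is conventionally the number of clicks divided by the number of impressions, so the totals quoted above give a rough ratio for each search type. The snippet below is a small worked example, assuming that conventional definition; the function name is hypothetical, and because the abstract reports rounded totals the results are approximate.

```python
def click_through_ratio(clicks, impressions):
    """Return clicks divided by impressions (the conventional CTR definition)."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


# Rounded totals quoted in the abstract.
web_ctr = click_through_ratio(123_000, 4_860_000)
image_ctr = click_through_ratio(22_000, 9_070_000)

print(f"web search:   {web_ctr:.2%}")    # about 2.53%
print(f"image search: {image_ctr:.2%}")  # about 0.24%
```

On these figures, image search produced roughly twice the impressions of web search but about a tenth of the click-through ratio.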


Author(s):  
Michael Chau ◽  
Yan Lu ◽  
Xiao Fang ◽  
Christopher C. Yang

More non-English content is now available on the World Wide Web, and the number of non-English-speaking users on the Web is increasing. While it is important to understand the Web searching behavior of these users, many previous studies on Web query logs have focused on analyzing English search logs, and their results may not apply directly to other languages. In this chapter we discuss some methods and techniques that can be used to analyze search queries in Chinese. We also show an example of applying our methods to a Chinese Web search engine. Some interesting findings are reported.


2017 ◽  
Vol 9 (1) ◽  
pp. 252-259
Author(s):  
VajjaNarendra Nath ◽  
Sasidhar Vegi ◽  

Tradterm ◽  
2021 ◽  
Vol 37 (2) ◽  
pp. 460-487
Author(s):  
Adauri Brezolin

Although it might appear contradictory to investigate noncanonical phraseological combinations in corpora, corpus linguistics research has revealed that they exceed canonical forms in number (Philip 2008). This paper discusses the idea of fixedness by analyzing variant forms of idioms and whether they qualify as wordplay. The Web, our data source, is used to collect such noncanonical occurrences in both English and Portuguese using keywords on the Google Search Engine. Our discussion mainly draws on studies relating to fixed phrases (Kjellmer 1991; Granger & Paquot 2008; Tagnin 2013), phraseological skeletons (Renouf & Sinclair 1991; Philip 2008), and idiom transformations (Veisbergs 1997; Barta 2005). Due attention is also given to search queries of nonstandard forms of fixed expressions in corpora (Philip 2008) and the translation of idiom-based wordplay (Veisbergs 1997; Brezolin 2020).

