Harnessing search engine technology to raise awareness and availability of digital libraries

Author(s):  
Nir Yom Tov ◽  
Ariel J. Frank
2004 ◽  
Vol 10 (9) ◽  
Author(s):  
Friedrich Summann ◽  
Norbert Lossau

AI Magazine ◽  
2015 ◽  
Vol 36 (3) ◽  
pp. 35-48 ◽  
Author(s):  
Jian Wu ◽  
Kyle Mark Williams ◽  
Hung-Hsuan Chen ◽  
Madian Khabsa ◽  
Cornelia Caragea ◽  
...  

CiteSeerX is a digital library search engine providing access to more than five million scholarly documents with nearly a million users and millions of hits per day. We present key AI technologies used in the following components: document classification and de-duplication, document and citation clustering, automatic metadata extraction and indexing, and author disambiguation. These AI technologies have been developed by CiteSeerX group members over the past 5–6 years. We show the usage status, payoff, development challenges, main design concepts, and deployment and maintenance requirements. We also present AI technologies implemented in table and algorithm search, which are special search modes in CiteSeerX. While it is challenging to rebuild a system like CiteSeerX from scratch, many of these AI technologies are transferable to other digital libraries and/or search engines.


Author(s):  
Piotr Malak

The increasing amount of digitised content is very promising: it offers wide and easy access to digitised images of valuable content and artefacts. In practice, however, the opposite may hold, because we encounter problems when searching digitised content. Metadata aggregating services address this problem. A splendid example is Europeana, which provides universal access to European digital cultural heritage resources. At the local level there are digital library metadata aggregators, such as the Polish FBC (Federation of Digital Libraries). Although very functional, they have a limitation: they provide access to sources that are limited in terms of formats. Organisations aiming at digital transformation face a new challenge, namely combining the variety of available digital content formats. The Leopoldina online platform is an attempt to address the problem of access to different digital resources. Developed by the University of Wroclaw (UWr) in cooperation with PSNC, the Polish national metadata aggregator for Europeana, it aims to aggregate and deliver the digital resources of various UWr units. In this paper we present the goals of the project, a short description of the different digital resources available, metadata handling, and the design of a common, universal search engine.


2000 ◽  
Vol 09 (03) ◽  
pp. 229-254
Author(s):  
A. N. ZINCIR-HEYWOOD ◽  
M. I. HEYWOOD ◽  
C. R. CHARTWIN ◽  
T. TUNALI

A platform for performing multi-agent searches in heterogeneous digital libraries is proposed. This differs significantly from previous approaches by completely removing the concept of a centralized search engine. Specifically, the organization of information held on domain index servers is constrained to conform to a virtual tree representation based on facets and global keyword concept schema particular to the set of information providers associated with the domain of interest (e.g. preparatory intranet). Simulation studies are used to compare this platform against a digital library platform presently in use, which employs the traditional central server scheme. Improvements in terms of query service time and robustness are demonstrated.


Author(s):  
George Buchanan ◽  
Annika Hinze

Information seeking is a complex task, and many models of the basic, individual seeking process have been proposed. Similarly, many tools now exist to support “sit-forward” information seeking by single users, where the solitary seeker interacts intensively with a search engine or classification scheme. However, in many situations, there is a clear interaction between social contexts beyond the immediate interaction between the user and the retrieval system.


2020 ◽  
Author(s):  
Muhaemin Sidiq ◽  
Ivan Hanafi ◽  
Fajar J. Ekaputra

Naturally, not all researchers can develop their own software to search for academic publications in digital libraries. Nevertheless, at several stages of their research, they need to search digital libraries for relevant scientific publications and bibliometric information. Researchers typically use one of two approaches to search for scientific publications: (i) Google Scholar search, or (ii) publication metadata available from several sources, such as CrossRef and publishers. However, in developing countries like Indonesia, neither option provides users with complete information, since (i) Google Scholar does not provide bibliometric details, and (ii) complete bibliometric information from other sources is often unavailable because of incomplete data (e.g., CrossRef) or the need to pay a subscription fee (e.g., Springer and Elsevier). The Search Engine for Research Articles (SEforRA) was developed as a solution to this issue; it provides researchers with bibliometric-ready publication metadata. SEforRA extracts and processes data from CrossRef, publishers, and other sources to provide an integrated platform on which researchers can search and retrieve publication metadata that is ready for further use in their research.
Keywords: search engine for research articles, academic search engines, text data mining, bibliometrics
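
The integration of several metadata sources mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each source yields plain dictionaries that share a DOI; the field names, the priority order, and the helpers `normalize_doi` and `merge_records` are hypothetical and are not taken from SEforRA itself.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, Optional[str]]


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip a leading resolver prefix, if any."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def merge_records(sources: Iterable[List[Record]]) -> Dict[str, Record]:
    """Merge records from several sources, keyed by normalized DOI.

    Earlier sources take priority; later sources only fill in fields
    that are still missing or empty.
    """
    merged: Dict[str, Record] = {}
    for records in sources:
        for record in records:
            doi = record.get("doi")
            if not doi:
                continue  # records without a DOI cannot be matched here
            key = normalize_doi(doi)
            target = merged.setdefault(key, {"doi": key})
            for field, value in record.items():
                if field == "doi":
                    continue
                if value and not target.get(field):
                    target[field] = value
    return merged


# Hypothetical sample data (not real API responses).
crossref_like = [
    {"doi": "10.1000/XYZ123", "title": "An Example Paper",
     "year": "2019", "cited_by": None},
]
publisher_like = [
    {"doi": "https://doi.org/10.1000/xyz123", "title": "", "cited_by": "12"},
]

combined = merge_records([crossref_like, publisher_like])
```

Matching on a normalized DOI is only one possible design choice; records that lack a DOI would need a fuzzier match (for example on title and year), which this sketch does not attempt.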


The goal of search engines is to return accurate and complete results. Satisfying concrete user information needs is becoming more and more difficult, because such needs cannot be specified completely and explicitly and because of the shortcomings of keyword-based searching and indexing. General search engines have indexed millions of web resources and often return thousands of results for a user query, most of them inadequate. To increase the precision of results, users sometimes choose search engines that specialize in a concrete domain, or personalized or semantic search. A great variety of specialized search engines can be found (and used) on the internet, but none can guarantee finding the resources that exist on the web and that a particular user needs. In this paper we present our research on building a meta-search engine that uses domain and user profile ontologies, as well as information (or metadata) extracted directly from web sites, to improve search result quality. We state the main requirements for a search engine for students, PhD students, and scientists, propose a conceptual model, and discuss approaches to its practical realization. Our prototype meta-search engine first performs interactive semantic query refinement; then, using the refined query, it automatically generates several search queries, sends them to different digital libraries and web search engines, and augments and ranks the returned results using ontologically represented domain and user metadata. To test our model, we develop domain ontologies in the electronic domain. We will use the ontological representation of terminology to propose recommendations for query disambiguation and to supply knowledge for reranking the returned results. We also present some partial initial implementations of query disambiguation strategies, along with testing results.
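
The pipeline outlined in this abstract (refine the query, fan it out to several engines, then merge and rerank the results) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' prototype: the ontology is a toy dictionary of related terms, the refinement is automatic rather than interactive, the backends are in-memory stubs, and all names (`expand_query`, `metasearch`, `make_backend`) are hypothetical.

```python
from typing import Callable, Dict, List, Set

# A backend is any callable that maps a query string to result titles.
Backend = Callable[[str], List[str]]


def expand_query(query: str, ontology: Dict[str, List[str]]) -> List[str]:
    """Generate query variants by swapping tokens for related ontology terms."""
    tokens = query.split()
    variants = [query]
    for i, token in enumerate(tokens):
        for alt in ontology.get(token, []):
            variants.append(" ".join(tokens[:i] + [alt] + tokens[i + 1:]))
    return variants


def metasearch(query: str, backends: List[Backend],
               ontology: Dict[str, List[str]],
               profile_terms: Set[str]) -> List[str]:
    """Send every query variant to every backend, deduplicate, and rerank."""
    seen: Set[str] = set()
    results: List[str] = []
    for variant in expand_query(query, ontology):
        for backend in backends:
            for title in backend(variant):
                if title not in seen:
                    seen.add(title)
                    results.append(title)

    def score(title: str) -> int:
        # Rank by the number of user-profile terms that occur in the title.
        return len(profile_terms & set(title.lower().split()))

    # sorted() is stable, so ties keep their retrieval order.
    return sorted(results, key=score, reverse=True)


def make_backend(corpus: List[str]) -> Backend:
    """Build a stub backend that matches on any shared word (assumption)."""
    def search(q: str) -> List[str]:
        words = set(q.lower().split())
        return [t for t in corpus if words & set(t.lower().split())]
    return search


# Hypothetical toy data standing in for digital libraries and web engines.
corpus_a = ["Transistor amplifier design", "Cooking with gas"]
corpus_b = ["BJT biasing for amplifier circuits", "Transistor amplifier design"]
ontology = {"transistor": ["bjt"]}
profile = {"amplifier", "circuits"}

ranked = metasearch("transistor",
                    [make_backend(corpus_a), make_backend(corpus_b)],
                    ontology, profile)
```

In this toy run the ontology-derived variant retrieves a document that the original keyword misses, and the profile-based score moves it to the top of the merged list.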

