A Supplementary Tool for Web-archiving Using Blockchain Technology

Author(s):  
John E. De Villiers ◽  
André P. Calitz

The usefulness of a uniform resource locator (URL) on the World Wide Web is reliant on the resource being hosted at the same URL in perpetuity. When URLs are altered or removed, this results in the resource, such as an image or document, being inaccessible. While web-archiving projects seek to prevent such a loss of online resources, providing complete backups of the web remains a formidable challenge. This article outlines the initial development and testing of a decentralised application (DApp), provisionally named Repudiation Chain, as a potential tool to help address these challenges presented by shifting URLs and uncertain web-archiving. Repudiation Chain seeks to make use of a blockchain smart contract mechanism in order to allow individual users to contribute to web-archiving. Repudiation Chain aims to offer unalterable assurance that a specific file and its URL existed at a given point in time—by generating a compact, non-reversible representation of the file at the time of its non-repudiation. If widely adopted, such a tool could contribute to decentralisation and democratisation of web-archiving.
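The "compact, non-reversible representation" described above can be illustrated with a cryptographic hash. The following is a minimal sketch, not the authors' implementation: the abstract does not name a hash function, so SHA-256 is an assumption, and `ArchiveRecord`, `fingerprint`, `make_record` and `matches` are hypothetical names. Submitting the record to a smart contract is out of scope here.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ArchiveRecord:
    """Hypothetical record a user might submit for timestamping."""
    url: str
    sha256_hex: str
    recorded_at: str  # ISO 8601 timestamp in UTC


def fingerprint(content: bytes) -> str:
    # SHA-256 is an assumed choice; the digest is fixed-length (64 hex
    # characters) and cannot feasibly be reversed to recover the file.
    return hashlib.sha256(content).hexdigest()


def make_record(url: str, content: bytes,
                now: Optional[datetime] = None) -> ArchiveRecord:
    # Bind the URL, the file digest and the observation time together.
    moment = now if now is not None else datetime.now(timezone.utc)
    return ArchiveRecord(url=url,
                         sha256_hex=fingerprint(content),
                         recorded_at=moment.isoformat())


def matches(record: ArchiveRecord, content: bytes) -> bool:
    # Anyone holding the file can later check it against the record.
    return fingerprint(content) == record.sha256_hex
```

Under these assumptions, a file that later changes by even one byte no longer matches the stored digest, which is what lets such a record attest that a specific file existed at a URL at a given time.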

Author(s):  
N. C. Rowe

The World Wide Web quickly evolved as a valuable resource for organizations to provide information and services to users. Much initial development of Web pages was done haphazardly. This resulted in many information gaps and inconsistencies between pages. Departments with more available time created more and better-designed Web pages even when they were no more important. Personnel who created Web pages would move to other jobs and their pages would become obsolete, but no one would bother to fix them. Two copies of the same information on the Web would become inconsistent when only one was updated, leaving the public wondering which was correct. Solutions were needed. We survey here the principal solution methods that have been developed.


2016 ◽  
Vol 35 (3) ◽  
pp. 64-72 ◽  
Author(s):  
Liladhar R. Pendse

Purpose The purpose of this paper is to highlight web-archiving as a tool for possible collection development in a research-level academic library. The paper highlights a web-archiving project that dealt with the contemporary Ukraine conflict. Currently, as the conflict in Ukraine drags on, the need for collecting and preserving information from various web-based resources with different ideological orientations acquires a special importance. The demise of the Soviet Union in 1991 and the emergence of independent republics were heralded by some as a peaceful transition to “free-market” style economies. This transition was nevertheless nuanced and not seamless. Besides incomplete market liberalization and rent-seeking behaviors of different sorts, it was also accompanied by the almost ubiquitous use of and access to the internet and internet communication technologies. Now, 24 years later, the ongoing conflict in Ukraine also appears to be unfolding on the World Wide Web. With the Russian annexation of Crimea and its unification with the Russian Federation, the governmental and non-governmental websites of the Ukrainian Crimea suddenly came to represent a sort of “endangered archive”. Design/methodology/approach The main purpose of this project was to make the information contained in Ukrainian and Russian websites available to a wider body of scholars and students over a longer period of time in a web archive. The author does not take any ideological stance on the legal status of Crimea or on the ongoing conflict in Ukraine. There are currently several projects devoted to the preservation of these websites. This article also surveys the landscape of these projects and highlights the ongoing web-archiving project entitled “the Ukraine Crisis: 2014-2015” at the UC Berkeley Library.
Findings The UC Berkeley Ukraine Conflict Archive was made available to the public in March 2015, after enough materials had been archived. The initial purpose of the archive was to selectively harvest and archive those websites that were bound either to disappear or to change significantly during the evolution of Crimea’s accession to Russia. However, in the aftermath of the Crimean conflict, the ensuing military conflict in Ukraine forced a reevaluation of the web-archiving strategy. The project was never envisioned as a competitor to the Ukraine Conflict project. Instead, it was supposed to capture complementary data that could have been missed by other similar projects. This web archive has been made public to provide a glimpse of what was happening and what is happening in Ukraine. Research limitations/implications The impetus for archiving the selected Ukrainian websites came as a result of the changing geopolitical realities of Crimea. The daily changes to the websites, and the loss of the information contained within them, are among the many problems faced by the users of these websites. In some cases, the likelihood of these websites disappearing is relatively high. This in turn was followed by the author’s desire to preserve information about daily life in Ukraine’s east in light of the unfolding violent armed conflict. Originality/value A close survey of currently published Library and Information Sciences articles on the Ukraine conflict found no articles dedicated to archiving the Crimean and Ukrainian situations.


2017 ◽  
Author(s):  
Sumitra Duncan ◽  
Karl-Rainer Blumenthal

The vast expanse and volatility of art ephemera based on the World Wide Web pose significant threats to the completeness of the art historical record as sustained by art libraries. Towards its mission to enhance the resources available for current and future research through collaboration among leading museum libraries, the New York Art Resources Consortium (NYARC) collects, preserves, and provides access to art ephemera born in digital formats native to the web. It leverages its member institutions’ traditional collecting strengths and combined resources to establish an initial, and model a permanently sustainable, web archiving programme. This article introduces NYARC’s web archiving practices as they manifest at the principal stages in a typical web archive’s lifecycle, describes how each directly benefits from collaboration among its member libraries and external programme partners, and identifies opportunities for further art libraries and their consortia to participate in this important effort to serve and preserve at-risk art historical resources.


2013 ◽  
Vol 5 (3) ◽  
pp. 598-603
Author(s):  
Adoghe Anthony ◽  
Kayode Onasoga ◽  
Dike Ike ◽  
Olujimi Ajayi

Web archiving is the process of collecting valuable content from the World Wide Web in an archival format, to ensure the information can be managed independently and preserved for the general public, historians, researchers, and future generations. If the Web is not preserved, eventually valuable content will be lost forever. The Web is a very valuable source of information and several government and private institutions are involved in archiving parts of it for various purposes. This paper gives an overview of web archiving, describes the techniques used in web archiving, discusses some challenges encountered during web archiving and gives possible solutions to these challenges.


2016 ◽  
Vol 41 (2) ◽  
pp. 116-126 ◽  
Author(s):  
Sumitra Duncan ◽  
Karl-Rainer Blumenthal

The vast expanse and volatility of art ephemera based on the World Wide Web pose significant threats to the completeness of the art historical record. Towards its mission to enhance the resources available for current and future research through collaboration, the New York Art Resources Consortium (NYARC) collects, preserves, and provides access to art ephemera born in digital formats native to the web. It leverages its member institutions’ collecting strengths and resources to establish a permanently sustainable web archiving programme. This article introduces NYARC's web archiving practices at the principal stages in a typical web archive's lifecycle, describes how each benefits from collaboration among its member libraries and external programme partners, and identifies opportunities for further art libraries and consortia to participate in this important effort to preserve at-risk art historical resources.


2009 ◽  
Vol 28 (2) ◽  
pp. 81 ◽  
Author(s):  
John Carlo Bertot

Public libraries were early adopters of Internet-based technologies and have provided public access to the Internet and computers since the early 1990s. The landscape of public-access Internet and computing was substantially different in the 1990s as the World Wide Web was only in its initial development. At that time, public libraries essentially experimented with public-access Internet and computer services, largely absorbing this service into existing service and resource provision without substantial consideration of the management, facilities, staffing, and other implications of public-access technology (PAT) services and resources. This article explores the implications for public libraries of the provision of PAT and seeks to look further to review issues and practices associated with PAT provision resources. While much research focuses on the amount of public access that public libraries provide, little offers a view of the effect of public access on libraries. This article provides insights into some of the costs, issues, and challenges associated with public access and concludes with recommendations that require continued exploration.


Author(s):  
Anthony D. Andre

This paper provides an overview of the various human factors and ergonomics (HF/E) resources on the World Wide Web (WWW). A list of the most popular and useful HF/E sites will be provided, along with several critical guidelines relevant to using the WWW. The reader will gain a clear understanding of how to find HF/E information on the Web and how to successfully use the Web towards various HF/E professional consulting activities. Finally, we consider the ergonomic implications of surfing the Web.


2016 ◽  
Vol 28 (2) ◽  
pp. 241-251 ◽  
Author(s):  
Luciane Lena Pessanha Monteiro ◽  
Mark Douglas de Azevedo Jacyntho

The study addresses the use of the Semantic Web and Linked Data principles proposed by the World Wide Web Consortium for the development of a Web application for semantic management of scanned documents. The main goal is to record scanned documents, describing them in a way that a machine is able to understand and process, filtering content and assisting us in searching for such documents when a decision-making process is under way. To this end, machine-understandable metadata, created through the use of reference Linked Data ontologies, are associated with documents, creating a knowledge base. To further enrich the process, a (semi)automatic mashup of these metadata with data from the new Web of Linked Data is carried out, considerably increasing the scope of the knowledge base and making it possible to extract new data related to the content of stored documents from the Web and combine them, without the user making any effort or perceiving the complexity of the whole process.


2018 ◽  
Vol 31 (5) ◽  
pp. 154-182
Author(s):  
Cadence Kinsey

This article analyses Camille Henrot’s 2013 film Grosse Fatigue in relation to the histories of hypermedia and modes of interaction with the World Wide Web. It considers the development of non-hierarchical systems for the organisation of information, and uses Grosse Fatigue to draw comparisons between the Web, the natural history museum and the archive. At stake in focusing on the way in which information is organised through hypermedia is the question of subjectivity, and this article argues that such systems are made ‘user-friendly’ by appearing to accommodate intuitive processes of information retrieval, reflecting the subject back to itself as autonomous. This produces an ideology of individualism which belies the forms of heteronomy that in fact shape and structure access to information online in significant ways. At the heart of this argument is an attention to the visual, and the significance of art as an immanent mode of analysis. Through the themes of transparency and opacity, and order and chaos, the article thus proposes a defining dynamic between autonomy and automation as a model for understanding the contemporary subject.


2017 ◽  
Vol 4 (1) ◽  
pp. 95-110 ◽  
Author(s):  
Deepika Punj ◽  
Ashutosh Dixit

A crawler plays a significant role in managing the vast information available on the web. The working of the crawler should be optimized to get maximum and unique information from the World Wide Web. In this paper, an architecture for a migrating crawler is proposed, based on URL ordering, URL scheduling and a document redundancy elimination mechanism. The proposed ordering technique is based on URL structure, which plays a crucial role in utilizing the web efficiently. Scheduling ensures that each URL goes to the optimum agent for downloading; to this end, the characteristics of both agents and URLs are taken into consideration. Duplicate documents are also removed so that the database holds only unique documents. To reduce matching time, documents are matched on the basis of their meta information only. By ordering and scheduling URLs, the agents of the proposed migrating crawler work more efficiently than a traditional single crawler.
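Two of the mechanisms named above, ordering by URL structure and duplicate elimination on meta information, can be sketched as follows. This is an illustrative sketch only: the abstract does not state the ordering rule or which meta fields are compared, so ranking by path depth and keying on `title` and `description` are assumptions, and all function names are hypothetical. Agent scheduling is not modelled.

```python
import hashlib
from urllib.parse import urlsplit


def url_depth(url):
    # Number of non-empty path segments, e.g. "/a/b" has depth 2.
    path = urlsplit(url).path
    return len([segment for segment in path.split("/") if segment])


def order_urls(urls):
    # Assumed ordering rule: shallower URLs first, ties broken alphabetically.
    return sorted(urls, key=lambda u: (url_depth(u), u))


def meta_key(meta):
    # Compare documents on meta information only (assumed fields),
    # normalising case and surrounding whitespace before hashing.
    parts = []
    for field in ("title", "description"):
        value = str(meta.get(field, "")).strip().lower()
        parts.append(field + "=" + value)
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def drop_duplicates(documents):
    # documents: iterable of (url, meta) pairs; keeps the first of each key.
    seen = set()
    unique = []
    for url, meta in documents:
        key = meta_key(meta)
        if key not in seen:
            seen.add(key)
            unique.append((url, meta))
    return unique
```

Matching on a digest of the meta fields avoids comparing full document bodies, which is the time saving the abstract attributes to meta-only matching; the trade-off is that distinct documents with identical meta information would be treated as duplicates.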

