Decision Guidance for Optimizing Web Data Quality - A Recommendation Model for Completing Information Extraction Results

Author(s):  
Christina Feilmayr

Author(s):  
Amrapali Zaveri ◽  
Andrea Maurino ◽  
Laure Berti-Équille

The standardization and adoption of Semantic Web technologies have resulted in an unprecedented volume of data being published as Linked Data (LD). However, the "publish first, refine later" philosophy leads to various quality problems in the underlying data, such as incompleteness, inconsistency and semantic ambiguity. In this article, we describe the current state of Data Quality in the Web of Data along with details of the three papers accepted for the International Journal on Semantic Web and Information Systems' (IJSWIS) Special Issue on Web Data Quality. Additionally, we identify new challenges that are specific to the Web of Data and provide insights into the current progress and future directions for each of those challenges.


Author(s):  
Hessah Albanwan ◽  
Rongjun Qin

Remote sensing images and techniques are powerful tools for investigating the Earth's surface. Data quality is key to enhancing remote sensing applications, and obtaining a clear, noise-free set of data is very difficult in most situations due to varying acquisition (e.g., atmosphere and season), sensor and platform (e.g., satellite angles and sensor characteristics) conditions. With the increasing development of satellites, terabytes of remote sensing images can now be acquired every day. Therefore, information and data fusion can be particularly important in the remote sensing community. Fusion integrates data from various sources acquired asynchronously for information extraction, analysis, and quality improvement. In this chapter, we aim to discuss the theory of spatiotemporal fusion by investigating previous works, in addition to describing the basic concepts and some of its applications by summarizing our prior and ongoing works.
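The fusion idea described above can be illustrated with a minimal sketch: combining a stack of co-registered images acquired at different times into one estimate for a target date, weighting each valid observation by its temporal distance. The function name, the Gaussian temporal weighting, and the cloud-mask convention are illustrative assumptions, not the chapter's actual method.

```python
import numpy as np

def spatiotemporal_fusion(images, timestamps, target_time, valid_masks, sigma=10.0):
    """Fuse co-registered images acquired at different times into a single
    per-pixel estimate for `target_time`.

    Each pixel is a weighted average of the valid observations, with weights
    decaying with temporal distance from the target date (Gaussian kernel).
    """
    images = np.asarray(images, dtype=float)        # shape (T, H, W)
    masks = np.asarray(valid_masks, dtype=float)    # shape (T, H, W), 1 = valid
    dt = np.asarray(timestamps, dtype=float) - target_time
    w = np.exp(-(dt ** 2) / (2.0 * sigma ** 2))     # temporal weights, shape (T,)
    w = w[:, None, None] * masks                    # zero out invalid pixels
    denom = w.sum(axis=0)
    # pixels with no valid observation at all become NaN
    fused = np.where(denom > 0,
                     (w * images).sum(axis=0) / np.maximum(denom, 1e-12),
                     np.nan)
    return fused
```

With symmetric acquisition dates around the target, the fused value reduces to the mean of the observations, which is a quick sanity check on the weighting.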


Author(s):  
Shilpa Deshmukh et al.

Deep Web contents are accessed by queries submitted to Web databases, and the returned data records are enwrapped in dynamically generated Web pages (called deep Web pages in this paper). Extracting structured data from deep Web pages is a challenging problem due to the underlying intricate structures of such pages. Until now, a large number of techniques have been proposed to address this problem, but all of them have inherent limitations because they are Web-page-programming-language dependent. As the popular two-dimensional media, the contents on Web pages are always displayed regularly for users to browse. This motivates us to seek a different way to perform deep Web data extraction that overcomes the limitations of previous works by utilizing some interesting common visual features on deep Web pages. In this paper, a novel vision-based approach, the Visual Based Deep Web Data Extraction (VBDWDE) algorithm, is proposed. This approach primarily utilizes the visual features on deep Web pages to implement deep Web data extraction, including data record extraction and data item extraction. We also propose a new evaluation measure, revision, to capture the amount of human effort needed to produce perfect extraction. Our experiments on a large set of Web databases show that the proposed vision-based approach is highly effective for deep Web data extraction.
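The core vision-based intuition, that repeated data records render as visually aligned, similarly sized blocks regardless of the page's markup language, can be sketched as follows. The box format `(x, y, width, height)`, the tolerances, and the function name are illustrative assumptions; this is a toy grouping step, not the VBDWDE algorithm itself.

```python
def group_data_records(boxes, x_tol=5, w_tol=20):
    """Group rendered page blocks (x, y, width, height) that share a left
    edge and have similar width -- the visual signature of repeated data
    records in a search-result list. Blocks are scanned top to bottom.
    """
    groups = []
    for box in sorted(boxes, key=lambda b: b[1]):   # sort by vertical position
        for g in groups:
            ref = g[0]
            # same left alignment and similar width => same record template
            if abs(box[0] - ref[0]) <= x_tol and abs(box[2] - ref[2]) <= w_tol:
                g.append(box)
                break
        else:
            groups.append([box])
    # candidate record regions are groups showing repetition
    return [g for g in groups if len(g) > 1]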


Designing intelligent expert systems capable of answering different human queries is a challenging and emerging area of research. A huge amount of web data is available online, the majority of which is in the form of unstructured documents covering articles, online news, corporate reports, medical records, social media communication data, etc. A user in need of certain information has to assess all the relevant documents to obtain the exact answer to their query, which is time-consuming and tedious work. Moreover, it is sometimes quite difficult to obtain the exact information from a list of documents quickly when required unless the whole document is read. This paper presents a rule-based information extraction system for unstructured web data that accesses document contents quickly and provides relevant answers to user queries in a structured format. A number of tests were conducted to determine the overall performance of the proposed model, and the results obtained in all the experiments show the effectiveness of the model in quickly providing the required answers to different user queries.
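A rule-based extractor of the kind described above can be sketched as a set of (field, pattern) rules applied to raw text, yielding a structured record. The specific rules, field names, and regular expressions below are illustrative assumptions, not the paper's actual rule set.

```python
import re

# Each rule maps a structured field to a pattern over unstructured text.
RULES = {
    "date":  re.compile(r"\b(\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\w* \d{4})\b"),
    "email": re.compile(r"\b([\w.+-]+@[\w-]+\.[\w.]+)\b"),
    "money": re.compile(r"(\$\d[\d,]*(?:\.\d+)?)"),
}

def extract(text):
    """Apply every rule to the document and return the structured matches,
    so a user query for e.g. 'date' can be answered without reading the
    whole document."""
    return {field: pat.findall(text) for field, pat in RULES.items()}
```

For example, `extract("Invoice of $1,200 sent on 3 March 2020 to billing@example.com.")` pulls out the date, email, and amount as separate fields.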


Author(s):  
Suranga C. H. Geekiyanage ◽  
Andrzej Tunkiel ◽  
Dan Sui

Abstract: Data analytics is a process of acquiring, transforming, interpreting, modelling, displaying and storing data with the aim of extracting useful information, so that decision making, action execution, event detection and incident management can be handled in an efficient and reliable manner. However, data analytics also faces challenges, for instance data corruption due to noise, time delays, missing values, external disturbances, etc. This paper focuses on data quality improvement to cleanse, improve and interpret post-well or real-time data so as to preserve and enhance data features such as accuracy, consistency, reliability and validity. In this study, laboratory data and field data are used to illustrate data issues and to show data quality improvements using different data processing methods. The case study clearly demonstrates that proper data quality management processes and information extraction methods are essential to carrying out intelligent digitalization in the oil and gas industry.
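Two of the data issues named above, missing values and noise spikes, have simple classical remedies that can be sketched in a few lines: linear interpolation over gaps followed by a rolling median to damp outliers. The function name, window size, and gap-handling policy are illustrative assumptions, not the paper's actual processing pipeline.

```python
import math

def cleanse(series, window=3):
    """Cleanse a sensor time series: fill missing values (None/NaN) by linear
    interpolation between neighbours, then suppress noise spikes with a
    rolling median of the given window size."""
    vals = [float("nan") if v is None else float(v) for v in series]
    n = len(vals)
    # 1) fill gaps by interpolating between the nearest valid neighbours;
    #    edge gaps fall back to the nearest single valid value
    for i, v in enumerate(vals):
        if math.isnan(v):
            lo = next((j for j in range(i - 1, -1, -1) if not math.isnan(vals[j])), None)
            hi = next((j for j in range(i + 1, n) if not math.isnan(vals[j])), None)
            if lo is not None and hi is not None:
                t = (i - lo) / (hi - lo)
                vals[i] = vals[lo] + t * (vals[hi] - vals[lo])
            elif lo is not None:
                vals[i] = vals[lo]
            elif hi is not None:
                vals[i] = vals[hi]
    # 2) rolling median to damp isolated outlier spikes
    half = window // 2
    out = []
    for i in range(n):
        w = sorted(vals[max(0, i - half):i + half + 1])
        out.append(w[len(w) // 2])
    return out
```

The median step removes single-sample spikes (e.g. a lone 100 in a run of 10s) without smearing them into neighbouring samples the way a mean filter would, which is why it is a common first pass on noisy drilling or logging data.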


2012 ◽  
Author(s):  
Nurul A. Emran ◽  
Noraswaliza Abdullah ◽  
Nuzaimah Mustafa
