Tracking Phantastic Objects: A Computer Algorithmic Investigation of Narrative Evolution in Unstructured Data Sources

Author(s):  
David Tuckett ◽  
Robert Elliot Smith ◽  
Rickard Nyman
2014 ◽  
Vol 38 ◽  
pp. 121-133

Author(s):  
Francesco Corcoglioniti ◽  
Marco Rospocher ◽  
Roldano Cattoni ◽  
Bernardo Magnini ◽  
Luciano Serafini

This chapter describes the KnowledgeStore, a scalable, fault-tolerant, Semantic Web-grounded open-source storage system for jointly storing, managing, retrieving, and querying interlinked structured and unstructured data, designed especially to manage all the data involved in Knowledge Extraction applications. The chapter presents the concept, design, function and implementation of the KnowledgeStore, and reports on its concrete usage in four application scenarios within the NewsReader EU project, where it has been successfully used to store, and support querying over, millions of news articles interlinked with billions of RDF triples, both extracted from text and imported from Linked Open Data sources.
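The core idea of the abstract above, documents and extracted RDF-style triples stored side by side and interlinked, can be illustrated with a toy in-memory store. The class and method names below are illustrative assumptions, not the KnowledgeStore's actual API, which exposes a much richer CRUD and SPARQL interface:

```python
class MiniKnowledgeStore:
    """Toy sketch of jointly storing unstructured documents and the
    structured triples extracted from them, with provenance links."""

    def __init__(self):
        self.documents = {}  # doc_id -> raw news text
        self.triples = []    # (subject, predicate, object, source_doc_id)

    def add_document(self, doc_id, text):
        self.documents[doc_id] = text

    def add_triple(self, s, p, o, source_doc_id):
        self.triples.append((s, p, o, source_doc_id))

    def query(self, s=None, p=None, o=None):
        """Pattern-match over triples; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

    def provenance(self, s, p, o):
        """Return the source texts a given fact was extracted from."""
        return [self.documents[t[3]] for t in self.query(s, p, o)]


ks = MiniKnowledgeStore()
ks.add_document("news:1", "Acme Corp acquired Widget Ltd on Monday.")
ks.add_triple("ex:Acme", "ex:acquired", "ex:Widget", "news:1")

facts = ks.query(p="ex:acquired")
texts = ks.provenance("ex:Acme", "ex:acquired", "ex:Widget")
```

The interlinking is what distinguishes this design from keeping a document store and a triple store separately: every structured fact can be traced back to the unstructured text it came from.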


Transformation is the second step of the ETL (extract, transform, load) process responsible for moving data into a data warehouse. Its role is to apply a series of operations that clean and format the data and unify the types and values coming from multiple, heterogeneous data sources. The goal is to make the data conform to the schema of the data warehouse, so as to avoid ambiguity during storage and analytical operations. Transforming data from structured, semi-structured and unstructured data sources requires two levels of treatment: the first is schema-to-schema transformation, which produces a unified schema for all selected data sources, and the second is data-to-data transformation, which unifies all the gathered types and values. To support these steps, we propose in this paper a process for switching from one database schema to another, as part of the schema-to-schema transformation, and a meta-model based on the MDA approach to describe the main data-to-data transformation operations. The result of our transformations is data ready to be loaded into one of the four NoSQL data models (key-value, document, column-family or graph) so as to best meet the constraints and requirements of Big Data.
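The two transformation levels described above can be sketched concretely: a field-name mapping per source (schema-to-schema) followed by type and format unification per unified field (data-to-data). The source names, field names and target document shape are illustrative assumptions, not the paper's actual meta-model:

```python
from datetime import datetime

# Schema-to-schema: map each source's field names onto one unified schema.
SCHEMA_MAPPINGS = {
    "crm_csv":  {"cust_name": "name", "sale_dt": "date", "amt": "amount"},
    "web_json": {"customer": "name", "timestamp": "date", "total": "amount"},
}

def clean_value(field, value):
    """Data-to-data: coerce each unified field to one canonical type."""
    if field == "date":
        # Accept either of two assumed source date formats; emit ISO 8601.
        for fmt in ("%d/%m/%Y", "%Y-%m-%dT%H:%M:%S"):
            try:
                return datetime.strptime(value, fmt).date().isoformat()
            except ValueError:
                pass
        raise ValueError(f"unrecognised date: {value!r}")
    if field == "amount":
        return float(value)
    return value.strip()

def transform(record, source):
    """Apply both levels, yielding a document ready for a NoSQL load."""
    mapping = SCHEMA_MAPPINGS[source]
    return {unified: clean_value(unified, record[src])
            for src, unified in mapping.items()}


doc = transform({"cust_name": " Alice ", "sale_dt": "05/03/2020",
                 "amt": "19.90"}, "crm_csv")
```

Here `doc` comes out as a uniform dictionary regardless of which source schema the record arrived in, which is exactly the property the warehouse (or NoSQL document collection) needs before loading.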


2019 ◽  
Vol 249 ◽  
pp. R47-R58 ◽  
Author(s):  
Alex Bishop ◽  
Juan Mateos-Garcia

Recent studies have shown a strong link between the complexity of economies and their economic development. Gaps remain in our understanding of the mechanisms underpinning these links, in part because they are difficult to analyse with highly aggregated official data sources that do not capture the emergence of new industrial activities, a potential benefit of complexity. We seek to address some of these gaps by calculating two indices of economic complexity for functional local economies (Travel to Work Areas) in Great Britain and exploring their link with these locations' economic performance. To gain a better understanding of the mechanism connecting economic complexity with economic performance, we create a measure of emergent technological activity in a location based on a combination of novel data sources, including text from UK business websites and CrunchBase, a technology company directory, and study its link with economic complexity. Our results highlight the potential value of novel, unstructured data sources for analysing the links between economic complexity and regional economic development.
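Economic complexity indices of the kind mentioned above are commonly computed with the method of reflections of Hidalgo and Hausmann, iterating between region diversity and activity ubiquity. A minimal sketch on a toy region-activity matrix follows; the matrix, iteration count and variable names are illustrative assumptions, not the paper's data or exact method:

```python
# M[r][a] = 1 if region r is specialised in activity a (toy nested matrix).
M = [
    [1, 1, 1, 1],  # region 0: most diversified
    [1, 1, 0, 0],  # region 1
    [1, 0, 0, 0],  # region 2: least diversified
]
R, A = len(M), len(M[0])

diversity = [sum(row) for row in M]                            # k_{r,0}
ubiquity = [sum(M[r][a] for r in range(R)) for a in range(A)]  # k_{a,0}

# Method of reflections: each step averages the other side's values.
kr, ka = diversity[:], ubiquity[:]
for _ in range(4):  # an even number of reflections
    kr_next = [sum(M[r][a] * ka[a] for a in range(A)) / diversity[r]
               for r in range(R)]
    ka_next = [sum(M[r][a] * kr[r] for r in range(R)) / ubiquity[a]
               for a in range(A)]
    kr, ka = kr_next, ka_next

# Standardise the region values to obtain a complexity index (ECI-style).
mean = sum(kr) / R
sd = (sum((x - mean) ** 2 for x in kr) / R) ** 0.5
eci = [(x - mean) / sd for x in kr]
```

On this nested toy matrix the most diversified region, which is also the only one hosting the rarest activities, ends up with the highest complexity score.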


2020 ◽  
pp. 1-18
Author(s):  
Sofie De Broe ◽  
Peter Struijs ◽  
Piet Daas ◽  
Arnout van Delden ◽  
Joep Burger ◽  
...  

This paper aims to elicit a discussion about whether a paradigm shift is taking place in official statistics through the emergence of new (unstructured) data sources and methods that may not adhere to established statistical practices and quality frameworks. The paper discusses the strengths and weaknesses of several data sources. Furthermore, it discusses the methodological, technical and cultural barriers to dealing with new data and methods in data science, "cultural" referring to the culture that prevails in a given area of expertise or approach. The paper concludes with suggestions for updating the existing quality frameworks. We take the position that there is no paradigm shift, but that the existing production processes should be adjusted and the existing quality frameworks updated in order for official statistics to benefit from the fusion of data, knowledge and skills among survey methodologists and data scientists.


2020 ◽  
Vol 8 (5) ◽  
pp. 1945-1949

Huge volumes of data are currently generated from a wide variety of data sources, and there is great demand for processing this data. Apache Hadoop is designed for batch processing; although it is widely used for that purpose, there is also strong demand for real-time stream processing and for querying unstructured data. Hadoop's data ingestion tools play a key role in processing streamed log data, but as data volume increases, their performance degrades. In this paper we discuss solutions for the performance issues of data ingestion tools, and for capturing and processing streamed multimedia data alongside real-time stream processing, with the help of a framework known as H-Stream.
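The H-Stream framework itself is not specified in this abstract, but the general technique behind relieving ingestion pressure, buffering streamed log records into micro-batches so the sink sees few large writes instead of many small ones, can be sketched as follows. The class, its parameters and the flush policy are hypothetical illustrations only:

```python
import time

class MicroBatcher:
    """Buffers streamed records and flushes them as batches, so a
    downstream sink (e.g. a distributed file system) receives a few
    large writes instead of many small ones."""

    def __init__(self, sink, max_records=100, max_latency_s=60.0):
        self.sink = sink                  # callable taking a list of records
        self.max_records = max_records
        self.max_latency_s = max_latency_s
        self.buffer = []
        self.last_flush = time.monotonic()

    def ingest(self, record):
        self.buffer.append(record)
        if (len(self.buffer) >= self.max_records or
                time.monotonic() - self.last_flush >= self.max_latency_s):
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []
        self.last_flush = time.monotonic()


batches = []
b = MicroBatcher(batches.append, max_records=3)
for line in ["log1", "log2", "log3", "log4"]:
    b.ingest(line)
b.flush()  # drain whatever remains below the size threshold
```

The size and latency thresholds trade throughput against freshness: larger batches amortise per-write overhead, while the latency cap bounds how stale a record can get before it reaches the sink.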


Author(s):  
Ronald Maier

An increasing share of work in businesses and organizations depends on information and knowledge rather than manual labor and physical goods (Wolf, 2005). Knowledge work contributes substantially to the long-term success of an organization. It is characterized by unstructured, creative, and learning-oriented tasks and involves access to a wide variety of structured and unstructured data sources such as Web sites, databases, data warehouses, document bases, or messaging systems. Knowledge work is often hampered by the fragmentation of resources across these numerous elements of information and communication technology (ICT) infrastructures. Consequently, concepts for the design and implementation of integrating technologies are required in order to improve ICT support for knowledge work.

