Profiling for Confidence: Debugging Relationships among Urban Spatio-Temporal Datasets

2020 ◽  
Author(s):  
Laís M. A. Rocha ◽  
Mirella M. Moro ◽  
Juliana Freire

We aim to help users identify potential issues in spatio-temporal data and thus gain trust in the results they derive from such data -- a crucial benefit in the era of data science and big data. We propose a framework for profiling spatio-temporal relationships that automatically identifies data slices deviating from what is expected, which can then be analyzed for quality issues and/or potential effects on analysis results. We describe the profiling methodology and present case studies using real urban datasets, emphasizing the need for spatio-temporal profiling to build trust in data analysis results.
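The slice-based deviation idea can be illustrated with a minimal sketch (all regions and counts below are hypothetical, and a robust median/MAD z-score stands in for the paper's actual profiling method):

```python
from collections import defaultdict
from statistics import median

# Hypothetical record counts per (region, hour) slice of an urban dataset.
slice_counts = {
    ("downtown", 8): 520, ("downtown", 9): 540, ("downtown", 10): 515,
    ("downtown", 11): 530, ("downtown", 12): 60,   # suspicious drop at noon
    ("airport", 8): 210, ("airport", 9): 205, ("airport", 10): 198,
}

def flag_deviating_slices(counts, threshold=3.5):
    """Flag slices whose count deviates strongly from their region's profile,
    using a robust (median/MAD) z-score."""
    by_region = defaultdict(dict)
    for (region, hour), value in counts.items():
        by_region[region][hour] = value
    flagged = []
    for region, hours in by_region.items():
        values = list(hours.values())
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # region is perfectly uniform; nothing to flag
        for hour, value in hours.items():
            if 0.6745 * abs(value - med) / mad > threshold:
                flagged.append((region, hour))
    return flagged

print(flag_deviating_slices(slice_counts))
```

Here the noon slice in the hypothetical downtown region is flagged because its count collapses relative to that region's own profile; a real profiler would compare slices against learned spatio-temporal expectations rather than a single robust statistic.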

2021 ◽  
pp. 107-132
Author(s):  
Magy Seif El-Nasr ◽  
Truong Huy Nguyen Dinh ◽  
Alessandro Canossa ◽  
Anders Drachen

This chapter discusses how visualization techniques can be used to analyze game data. Specifically, the chapter delves into the development of heatmaps to analyze spatio-temporal data. The chapter also discusses spatio-temporal visualizations and state-action transition visualizations. We also discuss two visualization systems that we have developed within the GUII lab: Stratmapper and Glyph. We provide you with a link that allows you to explore the use of these visualizations with real game data. This chapter was written in collaboration with Riddhi Padte and Varun Sriram, based on their work in Dr. Seif El-Nasr’s game data science class at Northeastern University; Erica Kleinman, PhD student at the University of California at Santa Cruz; and Andy Bryant, software engineer at the GUII Lab. The chapter also includes labs where you get to experience the analysis of game data through visualization.
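At its core, a heatmap of spatio-temporal game events is a grid of per-cell event counts; a minimal sketch (the event positions and cell size below are made up, and the chapter's tools are far richer than this):

```python
# Hypothetical player-death positions (x, y) on a 100x100 game map.
deaths = [(12, 8), (14, 9), (11, 7), (88, 90), (15, 10)]

def heatmap(events, width, height, cell):
    """Bin (x, y) event positions into a grid of per-cell counts."""
    cols, rows = width // cell, height // cell
    grid = [[0] * cols for _ in range(rows)]
    for x, y in events:
        grid[min(y // cell, rows - 1)][min(x // cell, cols - 1)] += 1
    return grid

grid = heatmap(deaths, 100, 100, 20)  # a 5x5 grid of death counts
```

Rendering each cell's count as a color intensity then yields the familiar heatmap overlay on the game map.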


2020 ◽  
Vol 9 (2) ◽  
pp. 88
Author(s):  
Damião Ribeiro de Almeida ◽  
Cláudio de Souza Baptista ◽  
Fabio Gomes de Andrade ◽  
Amilcar Soares

Trajectory data allow the study of the behavior of moving objects, from humans to animals. Wireless communication, mobile devices, and technologies such as the Global Positioning System (GPS) have contributed to the growth of the trajectory research field. With the considerable growth in the volume of trajectory data, storing such data in Spatial Database Management Systems (SDBMS) has become challenging. Hence, Spatial Big Data emerges as a data management technology for indexing, storing, and retrieving large volumes of spatio-temporal data. A Data Warehouse (DW) is one of the premier infrastructures for Big Data analysis and complex query processing, and Trajectory Data Warehouses (TDW) emerge as DWs dedicated to trajectory data analysis. The primary goals of this survey are to list and discuss problems addressed with TDWs and to point out directions for future work in this field. The article collects the state of the art in Big Data trajectory analytics: understanding how research on trajectory data is being conducted, which main techniques have been used, and how they can be embedded in an Online Analytical Processing (OLAP) architecture can enhance the efficiency and development of decision-making systems that deal with trajectory data.
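The OLAP-style aggregation a TDW supports can be sketched, at its simplest, as counting trajectory points per cube cell and rolling up along a dimension (the trajectory points below are hypothetical):

```python
from collections import Counter

# Hypothetical trajectory points: (object_id, region, day).
points = [
    (1, "harbor", "2020-05-01"), (1, "harbor", "2020-05-01"),
    (2, "harbor", "2020-05-02"), (2, "center", "2020-05-02"),
    (3, "center", "2020-05-01"),
]

# Finest cell of the cube: visits per (region, day).
by_region_day = Counter((region, day) for _, region, day in points)

# Roll-up along the time dimension: visits per region.
by_region = Counter()
for (region, day), n in by_region_day.items():
    by_region[region] += n
```

A real TDW executes such roll-ups and drill-downs over indexed spatio-temporal dimensions at scale; the in-memory `Counter` here only mirrors the cube's logical structure.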


Web Services ◽  
2019 ◽  
pp. 1301-1329
Author(s):  
Suren Behari ◽  
Aileen Cater-Steel ◽  
Jeffrey Soar

The chapter discusses how financial services organizations can take advantage of Big Data analysis for disruptive innovation, through examination of a case study in the financial services industry. Popular tools for Big Data analysis are discussed, and the challenges of big data are explored along with how these challenges can be met. The work of Hayes-Roth on Valued Information at the Right Time (VIRT), and how it applies to the case study, is examined. Boyd's model of Observe, Orient, Decide, and Act (OODA) is explained in relation to disruptive innovation in financial services. Future trends in big data analysis in the financial services domain are explored.


2019 ◽  
Vol 13 (01) ◽  
pp. 111-133
Author(s):  
Romita Banerjee ◽  
Karima Elgarroussi ◽  
Sujing Wang ◽  
Akhil Talari ◽  
Yongli Zhang ◽  
...  

Twitter is one of the most popular social media platforms, used by millions of users daily to post their opinions and emotions. Consequently, Twitter tweets have become a valuable knowledge source for emotion analysis. In this paper, we present a new framework, K2, for tweet emotion mapping and emotion change analysis. It introduces a novel, generic spatio-temporal data analysis and storytelling framework that can be used to understand the emotional evolution of a specific section of the population. The input for our framework is the location and time of where and when the tweets were posted, together with an emotion assessment score on a signed scale whose upper bound represents a very high positive emotion and whose lower bound represents a very high negative emotion. Our framework first segments the input dataset into a number of batches, with each batch representing a specific time interval, such as a day, a week, or a month. By generalizing existing kernel density estimation techniques in the next step, we transform each batch into a continuous function that takes positive and negative values. We then use contouring algorithms to find the contiguous regions with highly positive and highly negative emotions in each batch. Finally, we apply a generic change analysis framework that monitors how positive and negative emotion regions evolve over time. In particular, using this framework, unary and binary change predicates are defined and matched against the identified spatial clusters, and change relationships are then recorded for those spatial clusters for which a match occurs. We also propose animation techniques to facilitate spatio-temporal data storytelling based on the obtained analysis results. We demonstrate our approach using tweets collected in the state of New York in June 2014.
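The signed-density step can be sketched as a kernel density estimate whose kernel weights carry the emotion score's sign (the coordinates, scores, and bandwidth below are hypothetical; the paper generalizes standard KDE rather than using this exact form):

```python
import math

# Hypothetical tweets: (x, y, emotion_score), positive scores for positive
# emotions and negative scores for negative ones.
tweets = [(1.0, 1.0, 0.9), (1.2, 0.8, 0.8), (5.0, 5.0, -0.9), (5.1, 4.8, -0.7)]

def signed_density(px, py, points, bandwidth=1.0):
    """Gaussian kernel density estimate weighted by signed emotion scores,
    so the resulting surface takes positive and negative values."""
    total = 0.0
    for x, y, score in points:
        d2 = (px - x) ** 2 + (py - y) ** 2
        total += score * math.exp(-d2 / (2 * bandwidth ** 2))
    return total
```

The surface is positive near the happy cluster and negative near the sad one; contouring this function at fixed positive and negative thresholds yields the emotion regions whose evolution the change predicates then track.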


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Kehua Miao ◽  
Jie Li ◽  
Wenxing Hong ◽  
Mingtao Chen

The booming development of data science and big data technology stacks has inspired continuous, iterative updates to data science research and working methods. At present, the division of labor between data science and big data is increasingly refined. Traditional working methods, from building the work infrastructure environment to data modelling and analysis, can greatly reduce work and research efficiency. In this paper, we focus on enabling friendly collaboration within data science teams by building a data science and big data analysis application platform based on a microservices architecture for education and nonprofessional research fields. In this microservices-based environment, which makes it easy to update each component, the platform offers a personal code-experiment environment that integrates JupyterHub based on Spark and HDFS for multiuser use, as well as a visual modelling tool that follows the modular design of data science engineering and builds on Greenplum in-database analysis. The entire web service system is developed with Spring Boot.


Diversity ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 472
Author(s):  
Jorge Rubén Sánchez-González

The issue of hemi- and homonyms is an unsolved topic in the Big Data era, where informatics specialists and technicians, rather than biologists or taxonomists, analyze huge datasets. Nowadays, taxonomic nomenclature is ruled by four independent international codes, and according to them, the existence of hemihomonyms and homonyms is accepted under some conditions as an exception to the general rule. This situation entails confusion, disagreements, and a plethora of problems whose consequences could worsen in the near future within the framework of the big data era. Moreover, the increasing use of big databases and analyses, data science, bioinformatics, biological monitoring, and bioassessment has shown such exceptions to be inconvenient, since homonyms allowed by these exceptions are treated as duplicates by databases and statistical software handled by non-taxonomist experts. International Codes of Nomenclature must change within the new context of big data analysis. This work aims to propose the elimination of any exception allowing homonyms and to evaluate whether the Independence Principle makes sense within this new context. Increasing coordination between the several independent nomenclatural systems is essential and, perhaps, we must direct our efforts towards a universal species list, ending the historical schism between Codes.


Sensors ◽  
2019 ◽  
Vol 19 (12) ◽  
pp. 2772 ◽  
Author(s):  
Aguinaldo Bezerra ◽  
Ivanovitch Silva ◽  
Luiz Affonso Guedes ◽  
Diego Silva ◽  
Gustavo Leitão ◽  
...  

Alarm and event logs are an immense but latent source of knowledge, commonly undervalued in industry. However, the current landscape of massive data exchange, high efficiency, and strong competitiveness, boosted by the Industry 4.0 and IIoT (Industrial Internet of Things) paradigms, does not accommodate such data misuse and demands more incisive approaches to analyzing industrial data. Advances in Data Science and Big Data (or, more precisely, Industrial Big Data) have been enabling novel approaches to data analysis that can be great allies in extracting hitherto hidden information from plant operation data. To this end, this work proposes Exploratory Data Analysis (EDA) as a promising data-driven approach to pave the way for industrial alarm and event analysis. This approach proved fully able to increase industrial perception by extracting insights and valuable information from real-world industrial data without making prior assumptions.
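A typical first EDA step on alarm logs is a frequency ranking that surfaces the "bad actor" alarms dominating the log; a minimal sketch with hypothetical timestamps and tags:

```python
from collections import Counter

# Hypothetical alarm log entries: (timestamp, alarm tag).
log = [
    ("2019-06-01 10:00", "PI-101 HIGH"), ("2019-06-01 10:02", "TI-205 LOW"),
    ("2019-06-01 10:05", "PI-101 HIGH"), ("2019-06-01 10:07", "PI-101 HIGH"),
    ("2019-06-01 10:09", "FI-330 HIGH"),
]

# Rank alarms by frequency to surface the "bad actors" that dominate the log.
bad_actors = Counter(tag for _, tag in log).most_common()
top_tag, top_count = bad_actors[0]
print(top_tag, top_count)
```

On real plant data this kind of ranking, together with alarm-rate and flood analyses, is where the hidden structure of operation problems first becomes visible.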


Author(s):  
Naonori Ueda ◽  
Futoshi Naya

Machine learning is a promising technology for analyzing diverse types of big data. The Internet of Things era will feature the collection of real-world information linked to time and space (location) from all sorts of sensors. In this paper, we discuss spatio-temporal multidimensional collective data analysis to create innovative services from such spatio-temporal data, and we describe the core technologies for this analysis: smart data collection, spatio-temporal data analysis and prediction, and a novel approach for real-time, proactive navigation in crowded environments such as event spaces and urban areas. Our challenge is to develop a real-time navigation system that enables the movements of entire groups to be guided efficiently, without causing congestion, by making near-future predictions of people flow. We show the effectiveness of our navigation approach by computer simulation using artificial people-flow data.
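Near-future people-flow prediction can be caricatured as extrapolating recent per-cell counts one step ahead (the cells, counts, and congestion threshold below are invented for illustration; the actual system uses far richer spatio-temporal models):

```python
# Hypothetical people counts per grid cell over consecutive time steps.
history = {
    "plaza":   [40, 55, 70, 85],   # a steadily growing crowd
    "station": [30, 28, 31, 29],   # a stable crowd
}

def predict_next(counts, window=3):
    """Naive near-future prediction: extend the linear trend of the
    last `window` observations one step ahead."""
    recent = counts[-window:]
    trend = (recent[-1] - recent[0]) / (len(recent) - 1)
    return recent[-1] + trend

predictions = {cell: predict_next(series) for cell, series in history.items()}
congested = [cell for cell, p in predictions.items() if p > 80]
```

A navigation system would then route groups away from the cells predicted to exceed capacity, acting before the congestion materializes.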

