A New Big Data Benchmark for OLAP Cube Design Using Data Pre-Aggregation Techniques

2020 ◽  
Vol 10 (23) ◽  
pp. 8674
Author(s):  
Roberto Tardío ◽  
Alejandro Maté ◽  
Juan Trujillo

In recent years, several new technologies have enabled OLAP processing over Big Data sources. Among these technologies, we highlight those that allow data pre-aggregation because of their demonstrated performance in data querying. This is the case of Apache Kylin, a Hadoop-based technology that supports sub-second queries over fact tables with billions of rows combined with ultra-high-cardinality dimensions. However, taking advantage of data pre-aggregation techniques to design analytic models for Big Data OLAP is not a trivial task. It requires very advanced knowledge of the underlying technologies and of user querying patterns. A wrong design of the OLAP cube significantly alters several key performance metrics, including: (i) the analytic capabilities of the cube (the time and ability to provide an answer to a query), (ii) the size of the OLAP cube, and (iii) the time required to build the OLAP cube. Therefore, in this paper we (i) propose a benchmark to help Big Data OLAP designers choose the most suitable cube design for their goals, (ii) identify and describe the main requirements and trade-offs for effectively designing a Big Data OLAP cube that takes advantage of data pre-aggregation techniques, and (iii) validate our benchmark in a case study.
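
Conceptually, a pre-aggregated cube trades build time and storage for query latency: every materialized combination of dimensions (cuboid) is an aggregate table that can answer matching queries without scanning the fact table. As a minimal sketch of this idea only, not of Apache Kylin's actual cuboid build process, the PySpark snippet below materializes a cube over a hypothetical sales fact table; the paths and column names are assumptions for illustration.

```python
# Minimal sketch of data pre-aggregation for OLAP (illustrative only;
# Apache Kylin builds its cuboids internally, this just shows the idea).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cube-preaggregation-sketch").getOrCreate()

# Hypothetical fact table: one row per sale (billions of rows in practice).
fact = spark.read.parquet("hdfs:///warehouse/sales_fact")  # assumed path

# Pre-aggregate every combination of the chosen dimensions (2^3 = 8 cuboids).
# Adding a high-cardinality dimension multiplies cube size and build time,
# which is exactly the trade-off the benchmark measures.
cube = (
    fact.cube("store_id", "product_category", "sale_date")
        .agg(F.sum("amount").alias("total_amount"),
             F.count("*").alias("num_sales"))
)

cube.write.mode("overwrite").parquet("hdfs:///warehouse/sales_cube")  # assumed path

# At query time, an aggregate query is answered from the small pre-aggregated
# cube instead of scanning the full fact table.
```

Each additional dimension doubles the number of possible cuboids, which is why the design choices evaluated by the benchmark have such a strong effect on cube size and build time.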

Author(s):  
Panos Constantinides

This paper explores the strategic importance of information systems for managing such crises as the H1N1 outbreak and the Haiti earthquake in the healthcare service chain. The paper synthesizes the literature on crisis management and information systems for emergency response and draws some key lessons for healthcare service chains. The paper illustrates these lessons by using data from an empirical case study in the region of Crete in Greece. The author concludes by discussing some future directions in managing crises in the healthcare service chain, including the importance of distributive, adaptive crisis management through new technologies like mashups.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Tawfiq Hasanin ◽  
Taghi M. Khoshgoftaar ◽  
Joffrey L. Leevy ◽  
Richard A. Bauder

Severe class imbalance between majority and minority classes in Big Data can bias the predictive performance of Machine Learning algorithms toward the majority (negative) class. Where the minority (positive) class holds greater value than the majority (negative) class and the occurrence of false negatives incurs a greater penalty than false positives, the bias may lead to adverse consequences. Our paper incorporates two case studies, each utilizing three learners, six sampling approaches, two performance metrics, and five sampled distribution ratios, to uniquely investigate the effect of severe class imbalance on Big Data analytics. The learners (Gradient-Boosted Trees, Logistic Regression, Random Forest) were implemented within the Apache Spark framework. The first case study is based on a Medicare fraud detection dataset. The second case study, unlike the first, includes training data from one source (SlowlorisBig Dataset) and test data from a separate source (POST dataset). Results from the Medicare case study are not conclusive regarding the best sampling approach using Area Under the Receiver Operating Characteristic Curve and Geometric Mean performance metrics. However, it should be noted that the Random Undersampling approach performs adequately in the first case study. For the SlowlorisBig case study, Random Undersampling convincingly outperforms the other five sampling approaches (Random Oversampling, Synthetic Minority Over-sampling TEchnique, SMOTE-borderline1, SMOTE-borderline2, ADAptive SYNthetic) when measuring performance with Area Under the Receiver Operating Characteristic Curve and Geometric Mean metrics. Based on its classification performance in both case studies, Random Undersampling is the best choice as it results in models with a significantly smaller number of samples, thus reducing computational burden and training time.
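
As a minimal sketch of the Random Undersampling step under assumed column names (label 1 marking the minority, positive class), the PySpark snippet below balances the training data before fitting a Spark ML learner; it illustrates the technique, not the authors' exact pipeline.

```python
# Random Undersampling sketch in PySpark (illustrative; paths and column names assumed).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rus-sketch").getOrCreate()
df = spark.read.parquet("hdfs:///data/train")  # assumed path; contains a "label" column

pos = df.filter(df.label == 1)   # minority (positive) class
neg = df.filter(df.label == 0)   # majority (negative) class

# Downsample the majority class to a chosen sampled distribution ratio,
# here roughly 50:50; the study compared several such ratios.
n_pos, n_neg = pos.count(), neg.count()
fraction = min(1.0, n_pos / n_neg)
neg_sampled = neg.sample(withReplacement=False, fraction=fraction, seed=42)

balanced = pos.unionByName(neg_sampled)

# The balanced set is then fed to Spark ML learners such as
# GBTClassifier, LogisticRegression, or RandomForestClassifier.
```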


Author(s):  
Amine Rahmani ◽  
Abdelmalek Amine ◽  
Reda Mohamed Hamou

In recent years, with the emergence of new technologies such as big data, privacy concerns have grown considerably. Big data implies the dematerialization of data, and classical security solutions are no longer efficient in this setting. Nowadays, sharing data is almost as easy as saying hello, and the amount of data shared over the web keeps growing from day to day, which creates a wide gap between the purpose of sharing data and the fact that such data contain sensitive information. Researchers have therefore turned their attention to new issues and domains in order to minimize this gap; in other words, they aim to ensure good data utility by preserving the meaning of the data while hiding sensitive information to prevent identity disclosure. Many techniques have been used for this purpose, some mathematical and others based on data mining algorithms. This paper deals with the problem of hiding sensitive data in shared structured medical data using a new bio-inspired algorithm modeled on the natural phenomenon of cell apoptosis in the human body.
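
The abstract does not detail the apoptosis-inspired algorithm itself; purely as an illustrative baseline of the stated goal (preserving the utility of shared medical records while hiding sensitive information to prevent identity disclosure), the Python sketch below suppresses direct identifiers and generalizes a quasi-identifier. All field names are invented, and this is not the authors' method.

```python
# Baseline sketch of hiding sensitive fields in a structured medical record.
# This is NOT the apoptosis-inspired algorithm from the paper, only an
# illustration of the goal; all field names are invented.
from typing import Dict, Any

DIRECT_IDENTIFIERS = {"name", "ssn", "phone"}      # fields to suppress entirely
QUASI_IDENTIFIERS = {"age"}                        # fields to generalize

def generalize_age(age: int, bucket: int = 10) -> str:
    """Replace an exact age with a coarse range, e.g. 37 -> '30-39'."""
    low = (age // bucket) * bucket
    return f"{low}-{low + bucket - 1}"

def sanitize(record: Dict[str, Any]) -> Dict[str, Any]:
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            out[field] = "*"                       # suppress direct identifiers
        elif field in QUASI_IDENTIFIERS:
            out[field] = generalize_age(int(value))
        else:
            out[field] = value                     # keep useful clinical data
    return out

print(sanitize({"name": "J. Doe", "ssn": "123-45-6789",
                "age": 37, "diagnosis": "hypertension"}))
# {'name': '*', 'ssn': '*', 'age': '30-39', 'diagnosis': 'hypertension'}
```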


Author(s):  
Shigemi Kagawa ◽  
Daisuke Nishijima ◽  
Yuya Nakamoto

In order to achieve climate change mitigation goals, reducing greenhouse gas (GHG) emissions from Japan's household sector is critical. Accomplishing a transition to low-carbon and energy-efficient consumer goods is particularly valuable as a policy tool for reducing emissions in the residential sector. This case study presents an analysis of the lifetime of personal vehicles in Japan and considers the optimal scenario in terms of retention and disposal, specifically as it relates to GHG emissions. Using data from Japan, the case study shows the critical importance of including whole-of-life energy and carbon calculations when assessing the contributions that new technologies can make towards low-carbon mobility transitions. While energy-efficiency gains are important, policies that encourage replacing older technologies can overlook the energy and carbon embedded in the production phase. Without this perspective, policy designed to reduce GHG emissions may instead result in increased emissions and further exacerbate global climate change.
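
The whole-of-life argument can be made concrete with a simple keep-versus-replace comparison: replacement pays off only once the use-phase savings of the newer vehicle outweigh the emissions embedded in producing it. The figures in the Python sketch below are hypothetical and serve only to illustrate the arithmetic, not the study's results.

```python
# Whole-of-life GHG comparison sketch (all figures hypothetical).
production_emissions = 6.0      # t CO2e embedded in manufacturing the new car
old_use_rate = 2.4              # t CO2e per year driving the existing car
new_use_rate = 1.6              # t CO2e per year driving the new car
years = 8                       # remaining years the old car could be kept

keep_old = old_use_rate * years
replace_now = production_emissions + new_use_rate * years

print(f"Keep existing car : {keep_old:.1f} t CO2e over {years} years")
print(f"Replace today     : {replace_now:.1f} t CO2e over {years} years")

# Break-even horizon: years of use needed before the efficiency gain
# repays the production emissions of the new vehicle.
break_even = production_emissions / (old_use_rate - new_use_rate)
print(f"Break-even after  : {break_even:.1f} years")
```

With these illustrative numbers the replacement only breaks even after 7.5 years of use, so an early retirement of the existing vehicle would increase total emissions.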


2019 ◽  
Vol 12 (2) ◽  
pp. 131-156 ◽  
Author(s):  
Päivikki Kuoppakangas ◽  
Tony Kinder ◽  
Jari Stenvall ◽  
Ilpo Laitinen ◽  
Olli-Pekka Ruuskanen ◽  
...  

This study examines public organisations planning big data-driven transformations in their service provision. Without radical structural change or changes to managerial systems, leaders face dilemmas: simply bolting on big data makes little difference. This study is based on a qualitative empirical case study using data collected from the cities of Helsinki and Tampere in Finland. The three core dilemma pairs detected and connected to the big data-related organisational changes are: (1) repetitive continuity vs. visionary change, (2) risk-taking vs. security-seeking, and (3) technology-based development vs. human-based development. This study suggests that organisational readiness involves not only capabilities; rather, readiness involves absorbing knowledge, making decisions, handling ambiguities, and managing dilemmas. Thus, big data-related transformations in public organisations require embracing the world of dilemmas, since selected and cancelled experiments may each have valuable outcomes. The capability to act on intentions is a prerequisite for readiness; however, a preparedness to detect and address dilemmas is central to big data-related transformations, and the ability to make dilemma decisions is thus a more complicated characteristic of readiness. In conclusion, our data analysis suggests that traditional public organisation and change management approaches produce unsolved dilemmas in big data-related organisational changes.


2021 ◽  
Author(s):  
Christina Borowiec

Usage of big data with before-after methods of analysis makes it possible to evaluate the effect of major transport investments on system performance. In employing before-after methods to investigate the impact of lane closures on congestion and travel reliability, changes and trade-offs in performance indicators are quantified and policy action effectiveness is evaluated. This is illustrated through a case study of two separate lane closure interventions on the Gardiner Expressway in Toronto, Ontario. Models using a regression framework were developed for the pre-, peri-, and post-closure test periods of the first intervention and pre- and peri-closure periods of the second intervention. Results suggest the impacts of policy actions on system performance are strong, and that congestion and travel reliability counterintuitively move in different directions. Reduced demand effects are observed, prompting discussion on how highways and congestion should be managed and whether or not municipalities should add capacity to regional assets.
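
As a minimal sketch of such a before-after framework, the Python snippet below regresses a congestion indicator on dummy variables for the peri- and post-closure test periods; the file name, variables, and controls are assumptions for illustration, not the author's exact model specification.

```python
# Sketch of a before-after regression with period dummies
# (illustrative; data file, variables, and controls are assumed).
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical daily panel for an expressway segment, with a travel-time-based
# performance indicator and a "period" column relative to the first closure.
df = pd.read_csv("gardiner_segment_daily.csv")  # assumed file

# period: "pre", "peri", or "post" relative to the lane closure intervention.
df["peri"] = (df["period"] == "peri").astype(int)
df["post"] = (df["period"] == "post").astype(int)

# Congestion indicator regressed on period dummies plus simple controls;
# the coefficients on peri/post estimate the closure's effect.
model = smf.ols(
    "travel_time_index ~ peri + post + weekday + precipitation",
    data=df,
).fit()

print(model.summary())
```

A parallel model on a reliability indicator (for example a buffer-time index) would show whether congestion and reliability move in the different directions the study reports.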


Author(s):  
Jorge Lima de Magalhães ◽  
Juliana Satie Oliveira Igarashi ◽  
Zulmira Hartz ◽  
Adelaide Maria de Souza Antunes ◽  
Elizabeth Valverde Macedo

The informational and digital era of Big Data presents organizations with an unprecedented, non-trivial challenge for data and information management. To manage, protect, and ensure the validation of these data, it is imperative to develop new technologies for project management and to implement them in organizations. This chapter presents a case study in the pharmaceutical industry and proposes a methodology for validating emerging technologies in computerized systems. Data validation and security for project management are increasingly in demand, yet the time and human resources available in organizations are not infinite, so the activities and resources dedicated to maintaining the validated state of a system must be prioritized. The authors propose a risk analysis to help companies with validation and present a methodology for risk analysis from the point of view of computerized systems validation, applied to a Warehouse Management module in a validated SAP ERP.
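
As a minimal sketch of how a risk analysis can prioritize validation effort, the Python snippet below scores warehouse-management functions with a severity × probability × detectability product, a pattern common in GxP risk assessments; the functions and scores are invented, and this is not necessarily the exact methodology proposed in the chapter.

```python
# Risk-scoring sketch for prioritizing computerized-system validation effort.
# Function names and scores are invented; the severity x probability x
# detectability product is a common pattern, not necessarily the chapter's method.
from dataclasses import dataclass

@dataclass
class Function:
    name: str
    severity: int       # impact on product quality / data integrity (1-5)
    probability: int    # likelihood of failure (1-5)
    detectability: int  # 5 = hard to detect, 1 = easily detected

    @property
    def risk_priority(self) -> int:
        return self.severity * self.probability * self.detectability

functions = [
    Function("Goods receipt posting", 5, 2, 4),
    Function("Batch/lot traceability", 5, 3, 3),
    Function("Bin-to-bin transfer", 2, 3, 2),
]

# Validate (and re-validate after changes) the highest-risk functions first.
for f in sorted(functions, key=lambda f: f.risk_priority, reverse=True):
    print(f"{f.name:28s} RPN = {f.risk_priority}")
```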

