Using Data Analytics Results in Practice: Challenges and Solution Directions

Author(s):  
Sunil Choenni ◽  
Mortaza S. Bargh ◽  
Niels Netten ◽  
Susan Van Den Braak

Organizations collect a vast amount of data of different types, from various sources, and through different channels. Primarily, these data are used to facilitate the organizations' core business processes. However, today we witness a growing tendency to use these data for purposes other than those for which they were collected. To this end, the data from one information system are combined with those of other information systems, and the combined data are analyzed with advanced data analytics tools. Although there is a strong practical need to apply the findings of such data analytics to improve, among other things, organizations' (social) services, it is often not straightforward to apply these findings in practice. This is due to many challenges arising from legal, ethical, and data quality concerns. In this chapter, we discuss the main reasons that hamper the application of data analytics findings, particularly those pertaining to data collection and data analysis processes (like data mining and statistics). These reasons include inadequate transformations of statistical truths to individual cases, the risk of falling into the trap of system realities, and the effort required to deal with the evolving semantics of data over time, which arises because our (social) environment is subject to constant change. We discuss two strategies for harvesting data analytics findings in a responsible way, and by means of some real-life examples in the field of social services we illustrate their application in practice. Furthermore, we argue that findings from data-driven analytics may augment real-world ecosystems if they are applied with caution and responsibility.

2015 ◽  
Vol 28 (3) ◽  
pp. 1-14 ◽  
Author(s):  
Ehsan Saghehei ◽  
Azizollah Memariani

The approach used in this paper is an implementation of a data mining process on real-life debit card transactions, with the aim of detecting suspicious behavior. The framework designed for this purpose merges supervised and unsupervised models. First, because the data are unlabeled, the TwoStep and Self-Organizing Map (SOM) algorithms are used to cluster the transactions. A C5.0 classification algorithm is then applied to evaluate the supervised models and to detect suspicious behavior. An innovative plan has been designed to evaluate the hybrid models and select the most appropriate one for the fraud detection problem. The evaluation of the models and the final analysis of the data took place in four stages, and the appropriate hybrid model was selected from among 16 candidates. The results show a high ability of the selected model to detect suspicious behavior in debit card transactions.
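The two-stage idea described above can be sketched as follows. This is a minimal stand-in, not the paper's method: a tiny 1-D k-means replaces the TwoStep/SOM clustering stage, a midpoint decision stump replaces the C5.0 classifier, and all transaction amounts and thresholds are invented for illustration.

```python
# Hybrid sketch: cluster unlabeled transactions, treat the smaller
# (outlying) cluster as "suspicious" seed labels, then fit a simple
# supervised rule on those labels. Illustrative data only.
from statistics import mean

def kmeans_1d(values, k=2, iters=20):
    """Tiny 1-D k-means (stand-in for the TwoStep/SOM clustering stage)."""
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for v in values:
            i = min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            clusters[i].append(v)
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

def seed_labels(amounts):
    """Label members of the smaller cluster as suspicious."""
    _, clusters = kmeans_1d(amounts, k=2)
    small = min(range(2), key=lambda i: len(clusters[i]))
    suspicious = set(clusters[small])
    return [a in suspicious for a in amounts]

def fit_stump(amounts, labels):
    """Supervised stage: the amount threshold separating the seeded classes."""
    pos = [a for a, y in zip(amounts, labels) if y]
    neg = [a for a, y in zip(amounts, labels) if not y]
    return (min(pos) + max(neg)) / 2  # midpoint decision boundary

amounts = [12, 15, 9, 11, 14, 980, 1020, 13, 10, 995]
labels = seed_labels(amounts)
threshold = fit_stump(amounts, labels)
flag = lambda a: a > threshold  # flags a new transaction as suspicious
```

The point of the hybrid structure is that the unsupervised stage manufactures labels the supervised stage can then generalize from, which is why it suits unlabeled transaction data.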


2013 ◽  
Vol 22 (01) ◽  
pp. 1350005
Author(s):  
RICARDO PÉREZ-CASTILLO ◽  
MARIO PIATTINI ◽  
BARBARA WEBER

Concept location is a key activity during software modernization since it allows maintainers to determine exactly which pieces of source code support a specific concept. Real-world business processes and the information systems providing operational IT support for those processes can become misaligned as a consequence of uncontrolled maintenance over time. When the concepts supported by an information system become outdated or misaligned, concept location becomes a time-consuming and error-prone task. Moreover, enterprise information systems (which implement business processes) embed significant business knowledge over time that is neither present nor documented anywhere else. To support the evolution of existing information systems, the embedded knowledge must first be retrieved and depicted in up-to-date business process models and then be mapped to the source code. This paper addresses this issue through a concept location approach that considers business activities as the key concept to be located and discovers different partial business process views for each piece of source code. Thus, the concept location problem becomes the problem of extracting such views. The approach follows model-driven development principles, and an automatic model transformation is implemented to facilitate its adoption. Moreover, a case study involving two real-life information systems demonstrates its feasibility.
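The flavor of concept location can be conveyed with a toy sketch. Note this is not the paper's model-driven approach: it is a naive keyword-overlap locator, and the activity names and code snippet are invented for illustration.

```python
# Naive concept location: score a source fragment against the vocabulary
# of each business activity; an activity "locates" in the fragment when
# at least two of its terms appear among the code identifiers.
import re

ACTIVITIES = {
    "ApproveLoan": {"approve", "loan", "credit"},
    "RegisterCustomer": {"register", "customer", "account"},
}

def locate_concepts(source: str):
    """Return activities whose vocabulary overlaps the code in >= 2 terms."""
    tokens = {t.lower() for t in re.findall(r"[A-Za-z]+", source)}
    return sorted(a for a, vocab in ACTIVITIES.items()
                  if len(tokens & vocab) >= 2)

snippet = "def approve_loan(customer_id): check_credit(customer_id)"
located = locate_concepts(snippet)
```

Real approaches, including the one in the paper, work on richer artifacts (extracted process models rather than raw identifiers), but the mapping from business activities to code fragments is the same underlying problem.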


2012 ◽  
Vol 488-489 ◽  
pp. 1466-1472
Author(s):  
Ehsan Saghehei ◽  
Farshad Farahani Deljoo ◽  
Mehrdad Hamidi Hedayat ◽  
Yazdan Khoshjahan

Today, with the swift growth of the plastic card industry worldwide, the variety and volume of data stored in databases are growing strongly. This underlines the growing need of banks and financial institutions to apply knowledge discovery processes to value-creating services. The original contribution of this paper is a step-by-step implementation of a data mining process on real-life debit card transactions, with the aim of customer profiling. In this study, profiling is approached in two ways: explorative and predictive analysis. In the explorative model, SOM and TwoStep clustering techniques are used. In the predictive model, four decision tree techniques are applied: C5.0, Chi-squared Automatic Interaction Detection (CHAID), QUEST, and Classification and Regression Trees (C&RT). Finally, the details of the optimal models are analyzed further to discover knowledge in the transactions.
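The core step shared by the decision-tree families named above (C5.0, CHAID, QUEST, C&RT) is choosing the attribute that best separates the target segments. A minimal information-gain version of that step, on invented customer records, might look like this:

```python
# Toy decision-tree split selection in the spirit of C5.0-style profiling:
# pick the customer attribute with the highest information gain for a
# target segment. Attribute names and records are illustrative only.
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(records, attr, target="segment"):
    """Entropy reduction from splitting `records` on `attr`."""
    base = entropy([r[target] for r in records])
    remainder = 0.0
    for v in {r[attr] for r in records}:
        subset = [r[target] for r in records if r[attr] == v]
        remainder += len(subset) / len(records) * entropy(subset)
    return base - remainder

records = [
    {"channel": "atm", "freq": "high", "segment": "active"},
    {"channel": "atm", "freq": "low", "segment": "dormant"},
    {"channel": "pos", "freq": "high", "segment": "active"},
    {"channel": "pos", "freq": "low", "segment": "dormant"},
]
best = max(["channel", "freq"], key=lambda a: info_gain(records, a))
```

Here transaction frequency perfectly separates active from dormant customers while channel carries no information, so the split is made on frequency; the real algorithms differ mainly in the splitting criterion (gain ratio, chi-squared tests, Gini) and in how they prune.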


Author(s):  
Sharan Kumar Paratala Rajagopal

This research paper describes how to determine the various factors that impact the customer churn rate in the telecom industry, and which factors drive customers to move from one telecom provider to another. Data analytics and data mining are used to analyze the factors behind the churn rate.
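A first analytic step in such a study is usually descriptive: compare churn rates across the levels of each candidate factor. A minimal sketch, with field names and records invented for illustration rather than taken from the paper:

```python
# Churn rate per level of a candidate factor. Records carry a boolean
# 'churned' flag; a large gap between levels marks the factor as a
# candidate driver of churn. Illustrative data only.
from collections import defaultdict

def churn_rate_by(records, factor):
    """Fraction of churned customers for each value of `factor`."""
    totals, churned = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[factor]] += 1
        churned[r[factor]] += r["churned"]
    return {level: churned[level] / totals[level] for level in totals}

customers = [
    {"contract": "monthly", "churned": True},
    {"contract": "monthly", "churned": True},
    {"contract": "monthly", "churned": False},
    {"contract": "annual", "churned": False},
    {"contract": "annual", "churned": False},
]
rates = churn_rate_by(customers, "contract")
```

In this toy data, monthly contracts churn at 67% versus 0% for annual contracts, flagging contract type as a factor worth feeding into a downstream mining model.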


Author(s):  
Jorge Cardoso

Business process management systems (BPMSs) (Smith & Fingar, 2003) provide a fundamental infrastructure to define and manage business processes, Web processes, and workflows. When Web processes and workflows are installed and executed, the management system generates data describing the activities being carried out, and these data are stored in a log. This log can be used to discover and extract knowledge about the execution of processes. One important and useful piece of information that can be discovered relates to predicting the path that will be followed during the execution of a process. I call this type of discovery path mining. Path mining is vital to algorithms that estimate the quality of service of a process, because they require the prediction of paths. In this work, I present and describe how process path mining can be achieved by using data-mining techniques.
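The idea of path mining can be sketched in a few lines: learn from past log entries which execution path is most likely given a case attribute, then predict the path of a new case. This is a frequency-based toy, not the paper's technique, and the log entries and attribute names are invented.

```python
# Frequency-based path mining sketch: for each value of a case attribute,
# remember the most frequent execution path seen in the process log, and
# use it as the prediction for new cases with that value.
from collections import Counter, defaultdict

def learn_paths(log):
    """Map each attribute value to its most frequent observed path."""
    by_value = defaultdict(Counter)
    for entry in log:
        by_value[entry["amount_band"]][tuple(entry["path"])] += 1
    return {v: max(c, key=c.get) for v, c in by_value.items()}

log = [
    {"amount_band": "high", "path": ["review", "approve"]},
    {"amount_band": "high", "path": ["review", "approve"]},
    {"amount_band": "high", "path": ["review", "reject"]},
    {"amount_band": "low", "path": ["auto-approve"]},
]
model = learn_paths(log)
predicted = model["high"]  # most likely path for a new high-amount case
```

A quality-of-service estimator can then sum the expected durations or costs along the predicted path instead of averaging over all possible paths, which is why path prediction feeds directly into QoS estimation.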


2009 ◽  
pp. 2543-2563 ◽  
Author(s):  
Narasimhaiah Gorla ◽  
Pang Wing Yan Betty

A new approach to vertical fragmentation in relational databases is proposed using association rules, a data-mining technique. Vertical fragmentation can enhance the performance of database systems by reducing the number of disk accesses needed by transactions. By adapting the Apriori algorithm, a design methodology for vertical partitioning is proposed. The heuristic methodology is tested on two real-life databases for various minimum support and minimum confidence levels. In the smaller database, the partitioning solution obtained matched the optimal solution found by exhaustive enumeration. On the larger database, our method produced a partitioning solution with an improvement of 41.05% over the unpartitioned solution, and took less than a second to do so. We provide future research directions on extending the procedure to distributed and object-oriented database designs.
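The connection between association rules and vertical fragmentation can be illustrated compactly: attributes that are frequently accessed together by the same transactions should land in the same fragment. The sketch below does Apriori-style support counting for attribute pairs only and merges overlapping pairs greedily; the schema and workload are invented, and the paper's actual methodology is more elaborate.

```python
# Apriori-flavored vertical fragmentation sketch: count how often each
# attribute pair is co-accessed by a transaction, keep pairs meeting a
# minimum support, and merge overlapping pairs into candidate fragments.
from itertools import combinations
from collections import Counter

def frequent_pairs(workload, min_support):
    """Attribute pairs co-accessed in at least `min_support` transactions."""
    counts = Counter()
    for accessed in workload:
        for pair in combinations(sorted(accessed), 2):
            counts[pair] += 1
    return {p for p, c in counts.items() if c >= min_support}

def fragments(attrs, pairs):
    """Greedily merge overlapping frequent pairs into vertical fragments."""
    frags = []
    for pair in pairs:
        for f in frags:
            if f & set(pair):
                f |= set(pair)
                break
        else:
            frags.append(set(pair))
    leftover = set(attrs) - set().union(set(), *frags)
    return frags + ([leftover] if leftover else [])

workload = [  # each set = attributes touched by one transaction
    {"id", "name"}, {"id", "name"}, {"id", "name", "salary"},
    {"salary", "dept"}, {"salary", "dept"},
]
pairs = frequent_pairs(workload, min_support=2)
frags = fragments({"id", "name", "salary", "dept"}, pairs)
```

With this workload, {id, name} and {salary, dept} each exceed the support threshold, so the relation splits into two fragments and most transactions can be served from a single fragment, which is the disk-access saving the abstract refers to.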


2017 ◽  
Vol 10 (04) ◽  
pp. 788-792 ◽  
Author(s):  
D. Ramesh ◽  
Syed Pasha ◽  
G Roopa

Data mining has become one of the emerging fields in research because of its vast scope. Data mining is used to find hidden patterns in a database or any other information repository; this information is necessary to generate knowledge from the patterns, and the main task is to extract knowledge out of the information. In this paper we use a data mining technique called classification to determine the playing condition based on current temperature values. Classification is a powerful way to assign the records of a dataset to different classes. In our approach we use classification algorithms such as Decision Tree (J48), REP Tree, and Random Tree, and then compare the efficiencies of these algorithms. The tool we use for this approach is WEKA (Waikato Environment for Knowledge Analysis), a collection of open-source machine learning algorithms.
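The classification task itself is small enough to sketch outside WEKA. The following is a one-level decision tree (a decision stump) on a toy play/no-play table, a stand-in for the task only, not for the J48, REP Tree, or Random Tree algorithms; the rows are invented in the style of the classic weather dataset.

```python
# Decision stump for the play/no-play task: predict the majority class
# for each value of the splitting attribute. A full tree learner (like
# J48) would recurse on further attributes; this sketch stops at depth 1.
from collections import Counter, defaultdict

def fit_stump(rows, attr, target):
    """Majority target class for each observed value of `attr`."""
    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[attr]][row[target]] += 1
    return {v: c.most_common(1)[0][0] for v, c in by_value.items()}

weather = [
    {"temperature": "hot", "play": "no"},
    {"temperature": "hot", "play": "no"},
    {"temperature": "mild", "play": "yes"},
    {"temperature": "mild", "play": "yes"},
    {"temperature": "cool", "play": "yes"},
]
stump = fit_stump(weather, "temperature", "play")
```

Comparing algorithms, as the paper does, then amounts to fitting each learner on the same split of the data and comparing accuracy on held-out rows.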

