Scalable business intelligence with graph collections

2016 ◽  
Vol 58 (4) ◽  
Author(s):  
André Petermann ◽  
Martin Junghanns

Using graph data models for business intelligence applications is a novel and promising approach. In contrast to traditional data warehouse models, graph models enable the mining of relationship patterns. In our prior work, we introduced an approach to graph-based data integration and analytics called BIIIG (Business Intelligence with Integrated Instance Graphs). In this work, we compare state-of-the-art systems for graph data management and analytics with regard to their support for our approach in Big Data scenarios. To exemplify the analytical value of graph models for business intelligence, we propose an analytical workflow to extract knowledge from graph-integrated business data. Finally, we show how we use Gradoop, a novel framework for distributed graph analytics, to implement our approach.
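To illustrate the kind of relationship-pattern mining the abstract describes: Gradoop itself is a Java/Apache Flink framework, so the following minimal Python sketch using networkx is only a conceptual stand-in, and the labels (Customer, SalesOrder, sentTo, etc.) are invented examples, not BIIIG's actual schema.

```python
# Conceptual sketch only: business objects from different source systems
# become vertices of one integrated instance graph; their relationships
# become typed edges that patterns can be mined over.
import networkx as nx

g = nx.DiGraph()
g.add_node("c1", type="Customer")
g.add_node("q1", type="Quotation")
g.add_node("o1", type="SalesOrder")
g.add_edge("q1", "c1", type="sentTo")
g.add_edge("o1", "q1", type="createdFrom")
g.add_edge("o1", "c1", type="soldTo")

# A simple relationship pattern a warehouse star schema cannot express
# directly: orders whose quotation was sent to the same customer the
# order was eventually sold to.
for o in (n for n, d in g.nodes(data=True) if d["type"] == "SalesOrder"):
    sold_to = {v for _, v, d in g.out_edges(o, data=True) if d["type"] == "soldTo"}
    quotes = {v for _, v, d in g.out_edges(o, data=True) if d["type"] == "createdFrom"}
    for q in quotes:
        sent_to = {v for _, v, d in g.out_edges(q, data=True) if d["type"] == "sentTo"}
        for c in sold_to & sent_to:
            print(f"order {o} traces via quotation {q} to customer {c}")
```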

Transformation is the second step of the ETL (extract, transform, load) process that populates a data warehouse. Its role is to apply a series of operations that clean, format, and unify the types and values coming from multiple, heterogeneous data sources. The goal is to make the data conform to the data warehouse schema and thus avoid ambiguity during storage and analytical operations. Transforming data from structured, semi-structured, and unstructured sources requires two levels of treatment: a schema-to-schema transformation that produces a unified schema for all selected data sources, and a data-to-data transformation that unifies all gathered types and values. To realize these steps, this paper proposes a process for switching from one database schema to another as part of the schema-to-schema transformation, and a meta-model based on the MDA approach to describe the main data-to-data transformation operations. The output of these transformations is loaded into one of the four NoSQL schema types, chosen to best meet the constraints and requirements of Big Data.
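As a minimal sketch of the two treatment levels described above (the source names, field mappings, and target document format below are illustrative assumptions, not the paper's MDA meta-model):

```python
# Sketch: a schema-to-schema mapping followed by data-to-data type
# unification, targeting a document-oriented NoSQL store.
from datetime import date

# Schema-to-schema: map heterogeneous source fields onto one unified schema.
SCHEMA_MAP = {
    "crm":  {"cust_name": "customer", "created": "order_date", "total": "amount"},
    "shop": {"buyer":     "customer", "date":    "order_date", "sum":   "amount"},
}

# Data-to-data: unify types so every source yields the same value formats.
def unify(field, value):
    if field == "order_date" and isinstance(value, str):
        return date.fromisoformat(value)  # "2020-08-01" -> date object
    if field == "amount":
        return float(value)               # ints and numeric strings -> float
    return value

def transform(source, row):
    return {
        unified: unify(unified, row[src])
        for src, unified in SCHEMA_MAP[source].items()
    }

print(transform("crm",  {"cust_name": "Acme", "created": "2020-08-01", "total": "99.9"}))
print(transform("shop", {"buyer": "Acme", "date": "2020-08-02", "sum": 100}))
```

Both source rows come out with identical field names and value types, which is the precondition for loading them into a single NoSQL target schema.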


Author(s):  
Deepika Prakash

Three technologies—business intelligence, big data, and machine learning—developed independently and address different types of problems. Data warehouses have been used as systems for business intelligence, and NoSQL databases are used for big data. In this chapter, the authors explore the convergence of business intelligence and big data. Traditionally, a data warehouse is implemented on a ROLAP or MOLAP platform. Whereas MOLAP suffers from its proprietary architecture, ROLAP suffers from the inherent disadvantages of an RDBMS. To mitigate the drawbacks of ROLAP, the authors propose implementing a data warehouse on a NoSQL database, choosing Cassandra as their database. They start by identifying a generic information model that captures the requirements of the system to-be. They then propose mapping rules that map the components of the information model to the Cassandra data model, and finally show a small implementation using an example.
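A hedged sketch of what one such mapping rule could look like (the function, its parameter names, and the column types are assumptions for illustration, not the chapter's actual rules): a fact and its dimensions map onto a single Cassandra table, with frequently filtered dimensions forming the partition key and the remaining dimensions as clustering columns, so a typical query touches one partition.

```python
# Sketch: emit a CQL table definition from an information-model fact.
def fact_to_cql(fact, measures, partition_dims, clustering_dims):
    cols = (
        [f"{d} text" for d in partition_dims + clustering_dims]
        + [f"{m} double" for m in measures]
    )
    pk = f"(({', '.join(partition_dims)}), {', '.join(clustering_dims)})"
    return (
        f"CREATE TABLE {fact} (\n  "
        + ",\n  ".join(cols)
        + f",\n  PRIMARY KEY {pk}\n);"
    )

print(fact_to_cql(
    "sales",
    measures=["quantity", "revenue"],
    partition_dims=["region"],
    clustering_dims=["year", "product"],
))
```

The trade-off this encodes is typical for Cassandra: the table is denormalized around one query pattern, so each distinct analytical access path generally gets its own table rather than a shared normalized schema.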


Author(s):  
Jorge Bernardino ◽  
Joaquim Lapa ◽  
Ana Almeida

A big data warehouse enables the analysis of large amounts of information that typically comes from the organization's transactional systems (OLTP). However, today's data warehouse systems do not have the capacity to handle the massive amount of data that is currently produced. Business intelligence (BI) is a collection of decision support technologies that enable executives, managers, and analysts to make better and faster decisions. Organizations must make good use of business intelligence platforms to quickly acquire the desired information from huge volumes of data, reducing the time and increasing the efficiency of decision-making processes. In this chapter, the authors present a comparative analysis of the capabilities of commercial and open source BI tools, in order to aid organizations in selecting the most suitable BI platform. They evaluate and compare six major open source BI platforms: Actuate, Jaspersoft, Jedox/Palo, Pentaho, SpagoBI, and Vanilla; and six major commercial BI platforms: IBM Cognos, Microsoft BI, MicroStrategy, Oracle BI, SAP BI, and SAS BI & Analytics.


Entity Resolution (ER) is the process of identifying records that refer to the same real-world entity. It plays a key role in many applications, such as data warehousing, data integration, and business intelligence. Comparing every record with every other record is infeasible, especially for a big dataset; blocking techniques have been introduced to overcome this problem. In this paper, we propose a novel Efficient Multi-Phase Blocking Strategy (EMPBS) for resolving duplicates in big data. To our knowledge, some state-of-the-art blocking techniques (e.g., Q-grams) may produce overlapping blocks, which cause redundant comparisons and hence increase time complexity. Our proposed blocking strategy has disjoint blocks and a lower time complexity than the Q-grams and standard blocking techniques. In addition, EMPBS is general and places no restrictions on the type of blocking keys. EMPBS consists of three phases. The first generates three efficient single blocking keys. The second phase takes the output of the first phase as input and constructs compound keys, each the concatenation of two single blocking keys; its output is three compound blocking keys. The last phase generates the Efficient Multi-Phase Blocking Key (EMPBK) as the union of two compound blocking keys. The implementation of EMPBS shows promising results in terms of Reduction Ratio (RR): it achieves a higher RR than any single blocking key alone while maintaining nearly the same precision and recall, reducing the average number of comparisons performed with a single blocking key by about 84%. To evaluate EMPBS, we developed a duplicate-generation tool (DupGen) that accepts a clean semi-structured file as input and generates labeled duplicate records according to certain criteria.
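A minimal sketch of the three phases on toy data (the concrete key definitions k1-k3 and field choices below are assumptions for illustration; the paper defines its own efficient keys):

```python
# Sketch of EMPBS-style multi-phase blocking.
from itertools import combinations

records = [
    {"id": 1, "first": "John", "last": "Smith", "zip": "10001"},
    {"id": 2, "first": "Jon",  "last": "Smith", "zip": "10001"},
    {"id": 3, "first": "Mary", "last": "Jones", "zip": "94107"},
]

# Phase 1: three single blocking keys (here: simple field prefixes).
def k1(r): return r["last"][:3].lower()
def k2(r): return r["first"][:1].lower()
def k3(r): return r["zip"][:3]

# Phase 2: compound keys, each the concatenation of two single keys.
def c12(r): return k1(r) + "|" + k2(r)
def c13(r): return k1(r) + "|" + k3(r)

# Phase 3: the multi-phase key unions the candidate pairs of two
# compound keys; each single key function yields disjoint blocks.
def candidate_pairs(key_fn):
    blocks = {}
    for r in records:
        blocks.setdefault(key_fn(r), []).append(r["id"])
    return {p for ids in blocks.values() for p in combinations(sorted(ids), 2)}

pairs = candidate_pairs(c12) | candidate_pairs(c13)
total = len(records) * (len(records) - 1) // 2
print(f"candidates: {sorted(pairs)}, reduction ratio: {1 - len(pairs)/total:.2f}")
```

On this toy input only the (1, 2) pair survives blocking, so two of the three exhaustive comparisons are avoided; the reduction ratio reported by the paper measures exactly this saving at scale.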


Author(s):  
Harkiran Kaur ◽  
Kawaljeet Singh ◽  
Tejinder Kaur

Background: Numerous E-Migrants databases assist migrants in locating their peers in various countries, thereby contributing largely to the communication of migrants staying overseas. Presently, these traditional E-Migrants databases face the issues of non-scalability, difficult search mechanisms, and burdensome information-update routines. Furthermore, analysis of migrants' profiles in these databases has remained unaddressed to date and hence generates no knowledge.

Objective: To design and develop an efficient and multidimensional knowledge discovery framework for E-Migrants databases.

Method: In the proposed technique, the results of complex calculations related to the On-Line Analytical Processing (OLAP) operations most probably required by end users are stored in the form of decision trees at the pre-processing stage of data analysis. While browsing the cube, these pre-computed results are retrieved, offering a Dynamic Cubing feature to end users at runtime. This data-tuning step reduces query processing time and increases the efficiency of the required data warehouse operations.

Results: Experiments conducted with a data warehouse of around 1000 migrants' profiles confirm the knowledge discovery power of this proposal. Using the proposed methodology, the authors have designed a framework efficient enough to incorporate the amendments made to the E-Migrants data warehouse at regular intervals, a capability totally missing in the traditional E-Migrants databases.

Conclusion: The proposed methodology facilitates migrants in generating dynamic knowledge and visualizing it in the form of dynamic cubes. By applying business intelligence mechanisms blended with tuned OLAP operations, the authors have managed to transform traditional datasets into an intelligent migrants data warehouse.
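To make the pre-computation idea concrete, here is a minimal sketch (the paper stores pre-computed results as decision trees; this illustration substitutes a plain dictionary cache of materialized cuboids, and all field names are invented):

```python
# Sketch: materialize every cuboid up front so cube browsing is a lookup.
from itertools import combinations
from collections import defaultdict

profiles = [
    {"country": "Canada", "year": 2018, "gender": "F"},
    {"country": "Canada", "year": 2019, "gender": "M"},
    {"country": "UAE",    "year": 2018, "gender": "F"},
]
DIMS = ("country", "year", "gender")

# Pre-processing: count profiles for every dimension subset (cuboid).
cube = defaultdict(int)
for r in range(len(DIMS) + 1):
    for dims in combinations(DIMS, r):
        for p in profiles:
            cube[(dims, tuple(p[d] for d in dims))] += 1

# Browsing: roll-up / drill-down answers are cache hits, not re-scans.
def query(**coords):
    dims = tuple(sorted(coords, key=DIMS.index))
    return cube[(dims, tuple(coords[d] for d in dims))]

print(query(country="Canada"))             # roll-up over year and gender -> 2
print(query(country="Canada", year=2018))  # drill-down -> 1
```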


Author(s):  
Xabier Rodríguez-Martínez ◽  
Enrique Pascual-San-José ◽  
Mariano Campoy-Quiles

This review article presents the state of the art in high-throughput computational and experimental screening routines with applications in organic solar cells, including materials discovery, device optimization, and machine-learning algorithms.


Mathematics ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. 1303 ◽  
Author(s):  
Carl Leake ◽  
Hunter Johnston ◽  
Daniele Mortari

This article presents a reformulation of the Theory of Functional Connections: a general methodology for functional interpolation that can embed a set of user-specified linear constraints. The reformulation presented in this paper exploits the underlying functional structure presented in the seminal paper on the Theory of Functional Connections to ease the derivation of these interpolating functionals—called constrained expressions—and provides rigorous terminology that lends itself to straightforward derivations of mathematical proofs regarding the properties of these constrained expressions. Furthermore, the extension of the technique to n dimensions, along with the corresponding proofs, is immediate through a recursive application of the univariate formulation. In all, the results of this reformulation are compared to prior work to highlight the novelty and mathematical convenience of using this approach. Finally, the methodology presented in this paper is applied to two partial differential equations with different boundary conditions, and, where data are available, the results are compared to state-of-the-art methods.
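For readers unfamiliar with constrained expressions, the classic univariate two-point example from the original theory (shown here in its widely cited form, not in this paper's reformulated notation) illustrates the idea:

```latex
% Constrained expression embedding the two point constraints
% y(x_0) = y_0 and y(x_f) = y_f for ANY choice of the free function g:
\[
  y(x) \;=\; g(x)
  \;+\; \frac{x_f - x}{x_f - x_0}\,\bigl(y_0 - g(x_0)\bigr)
  \;+\; \frac{x - x_0}{x_f - x_0}\,\bigl(y_f - g(x_f)\bigr)
\]
```

Substituting x = x_0 or x = x_f recovers the constraints identically, so g can be optimized freely, for example to solve a boundary value problem while the constraints are satisfied by construction.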


Author(s):  
Marcus Paradies ◽  
Stefan Plantikow ◽  
Oskar van Rest
