Advancing Logistics 4.0 with the Implementation of a Big Data Warehouse: A Demonstration Case for the Automotive Industry

Electronics ◽  
2021 ◽  
Vol 10 (18) ◽  
pp. 2221
Author(s):  
Nuno Silva ◽  
Júlio Barros ◽  
Maribel Y. Santos ◽  
Carlos Costa ◽  
Paulo Cortez ◽  
...  

The constant advancements in Information Technology have been the main driver of the Big Data concept’s success, and with them, new concepts such as Industry 4.0 and Logistics 4.0 are arising. Due to the increase in data volume, velocity, and variety, organizations are now examining their data analytics infrastructures and searching for ways to improve their decision-making capabilities and enhance their results through Big Data and Machine Learning. Implementing a Big Data Warehouse can be the first step toward improving an organization’s data analysis infrastructure and deriving value from Big Data technologies. Moving to Big Data technologies offers organizations several opportunities, such as the capability to analyze enormous quantities of data from different data sources efficiently. At the same time, however, different challenges can arise, including data quality, data management, and a lack of knowledge within the organization, among others. In this work, we propose an approach that can be adopted in the logistics department of any organization to promote the Logistics 4.0 movement, while highlighting the main challenges and opportunities associated with the development and implementation of a Big Data Warehouse in a real demonstration case at a multinational automotive organization.

2019 ◽  
Vol 133 ◽  
pp. 40-50
Author(s):  
Chih-Hung Chang ◽  
Fuu-Cheng Jiang ◽  
Chao-Tung Yang ◽  
Sheng-Cang Chou

2015 ◽  
Vol 9 (2) ◽  
pp. 224-236 ◽  
Author(s):  
Huiju Wang ◽  
Xiongpai Qin ◽  
Xuan Zhou ◽  
Furong Li ◽  
Zuoyan Qin ◽  
...  

2016 ◽  
Vol 33 (6) ◽  
pp. 1680-1704 ◽  
Author(s):  
Bao-Rong Chang ◽  
Hsiu-Fen Tsai ◽  
Yun-Che Tsai ◽  
Chin-Fu Kuo ◽  
Chi-Chung Chen

Purpose – The purpose of this paper is to integrate and optimize a multiple big data processing platform with the features of high performance, high availability and high scalability in a big data environment. Design/methodology/approach – First, the integration of Apache Hive, Cloudera Impala and BDAS Shark makes the platform support SQL-like queries. Next, users access a single interface, and the proposed optimizer automatically selects the best-performing big data warehouse platform. Finally, the distributed memory storage system Memcached, incorporated into the distributed file system Apache HDFS, is employed for fast caching of query results. Therefore, if users issue the same SQL command, the same result is returned rapidly from the cache system instead of repeating the search in the big data warehouse and taking longer to retrieve it. Findings – As a result, the proposed approach significantly improves the overall performance and dramatically reduces the search time when querying a database, especially for highly repeated SQL commands in multi-user mode. Research limitations/implications – Currently, Shark’s latest stable version, 0.9.1, does not support the latest versions of Spark and Hive. In addition, this series of software only supports Oracle JDK7; using Oracle JDK8 or OpenJDK causes serious errors, and some software will be unable to run. Practical implications – One problem with this system is that some blocks go missing when too many blocks are stored in one result (about 100,000 records). Another is that sequential writing into the in-memory cache wastes time. Originality/value – When the remaining memory capacity is 2 GB or less on each server, Impala and Shark incur heavy page swapping, causing extremely low performance. When the data scale is larger, this may cause a JVM I/O exception and make the program crash.
However, when the remaining memory capacity is sufficient, Shark is faster than Hive and Impala. Impala’s consumption of memory resources lies between those of Shark and Hive, and this amount of remaining memory is sufficient for Impala’s maximum performance. In this study, each server allocates 20 GB of memory for cluster computing and sets remaining-memory critical points at Level 1: 3 percent (0.6 GB), Level 2: 15 percent (3 GB) and Level 3: 75 percent (15 GB). The program automatically selects Hive when remaining memory is below 15 percent, Impala at 15 to 75 percent, and Shark above 75 percent.
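The two mechanisms described in this abstract can be sketched compactly. The following is a minimal, hypothetical Python illustration (not the authors' implementation): a selector applying the stated thresholds (Hive below 15 percent free memory, Impala at 15 to 75 percent, Shark above 75 percent), and a query cache keyed by the SQL text so repeated commands bypass the warehouse. A plain dict stands in for the Memcached-over-HDFS layer; `run_query` is an assumed callable dispatching to the chosen engine.

```python
def select_engine(free_memory_fraction: float) -> str:
    """Pick an engine from the fraction of remaining cluster memory,
    using the thresholds reported in the paper."""
    if free_memory_fraction < 0.15:
        return "Hive"    # disk-oriented; safest when memory is scarce
    elif free_memory_fraction <= 0.75:
        return "Impala"  # memory footprint between Hive's and Shark's
    else:
        return "Shark"   # in-memory; fastest when memory is plentiful


class CachedWarehouse:
    """Cache query results by SQL text, as the Memcached layer does."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable that queries the warehouse
        self._cache = {}             # dict stand-in for Memcached on HDFS

    def query(self, sql: str):
        if sql in self._cache:           # repeated SQL: serve from cache
            return self._cache[sql]
        result = self._run_query(sql)    # first occurrence: hit the warehouse
        self._cache[sql] = result
        return result
```

Under this sketch, a second issue of the same SQL command never reaches the underlying engine, which is the source of the reported speedup for highly repeated queries in multi-user mode.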


2014 ◽  
Vol 1 (2) ◽  
pp. 1-17 ◽  
Author(s):  
Hoda Ahmed Abdelhafez

The internet era creates new types of large, real-time data, much of it non-standard, such as streaming and sensor-generated data. Advanced big data technologies enable organizations to extract insights from such sophisticated data. Volume, variety and velocity represent the challenges of big data, which cause difficulties in capture, storage, search, sharing, analysis and visualization. Therefore, technologies like NoSQL, Hadoop and cloud computing are used to extract value from large volumes and a wide variety of data and to discover business needs. This article's goal is to focus on the challenges of big data and how recent technologies can be used to address those issues, illustrated through real-world case studies. The article also presents the lessons learned from these case studies.


Author(s):  
Longzhi Yang ◽  
Jie Li ◽  
Noe Elisa ◽  
Tom Prickett ◽  
Fei Chao

Big data refers to large, complex, structured or unstructured data sets. Big data technologies enable organisations to generate, collect, manage, analyse, and visualise big data sets, and provide insights to inform diagnosis, prediction, or other decision-making tasks. One of the critical concerns in handling big data is the adoption of appropriate big data governance frameworks to (1) curate big data in the required manner to support quality data access for effective machine learning and (2) ensure the framework regulates the storage and processing of the data from providers and users in a trustworthy way within the related regulatory frameworks (both legally and ethically). This paper proposes a big data governance framework that guides organisations to make better data-informed business decisions within the related regulatory framework, with close attention paid to data security, privacy, and accessibility. To demonstrate this process, the work also presents an example implementation of the framework based on a case study of big data governance in cybersecurity. This framework has the potential to guide the management of big data in different organisations for information sharing and cooperative decision-making.

