Advancing Logistics 4.0 with the Implementation of a Big Data Warehouse: A Demonstration Case for the Automotive Industry

Electronics ◽  
2021 ◽  
Vol 10 (18) ◽  
pp. 2221
Author(s):  
Nuno Silva ◽  
Júlio Barros ◽  
Maribel Y. Santos ◽  
Carlos Costa ◽  
Paulo Cortez ◽  
...  

The constant advancements in Information Technology have been the main driver of the Big Data concept’s success, and with them, new concepts such as Industry 4.0 and Logistics 4.0 are arising. Due to the increase in data volume, velocity, and variety, organizations are now examining their data analytics infrastructures and searching for ways to improve their decision-making capabilities and enhance their results through Big Data and Machine Learning. Implementing a Big Data Warehouse can be the first step toward improving an organization’s data analysis infrastructure and deriving value from Big Data technologies. Moving to Big Data technologies offers organizations several opportunities, such as the capability to analyze enormous quantities of data from different data sources efficiently. At the same time, however, different challenges can arise, including data quality, data management, and a lack of knowledge within the organization, among others. In this work, we propose an approach that can be adopted in the logistics department of any organization to promote the Logistics 4.0 movement, while highlighting the main challenges and opportunities associated with the development and implementation of a Big Data Warehouse in a real demonstration case at a multinational automotive organization.

2019 ◽  
Vol 133 ◽  
pp. 40-50
Author(s):  
Chih-Hung Chang ◽  
Fuu-Cheng Jiang ◽  
Chao-Tung Yang ◽  
Sheng-Cang Chou

2015 ◽  
Vol 9 (2) ◽  
pp. 224-236 ◽  
Author(s):  
Huiju Wang ◽  
Xiongpai Qin ◽  
Xuan Zhou ◽  
Furong Li ◽  
Zuoyan Qin ◽  
...  

2016 ◽  
Vol 33 (6) ◽  
pp. 1680-1704 ◽  
Author(s):  
Bao-Rong Chang ◽  
Hsiu-Fen Tsai ◽  
Yun-Che Tsai ◽  
Chin-Fu Kuo ◽  
Chi-Chung Chen

Purpose – The purpose of this paper is to integrate and optimize a multiple big data processing platform with the features of high performance, high availability and high scalability in a big data environment. Design/methodology/approach – First, the integration of Apache Hive, Cloudera Impala and BDAS Shark makes the platform support SQL-like queries. Next, users access a single interface, and the proposed optimizer automatically selects the best-performing big data warehouse platform. Finally, the distributed memory storage system Memcached, incorporated into the distributed file system Apache HDFS, is employed for fast caching of query results. Therefore, if users issue the same SQL command, the same result is returned rapidly from the cache system instead of repeating the search in the big data warehouse and taking longer to retrieve it. Findings – As a result, the proposed approach significantly improves the overall performance and dramatically reduces the search time when querying a database, especially for highly repeated SQL commands in multi-user mode. Research limitations/implications – Currently, Shark’s latest stable version, 0.9.1, does not support the latest versions of Spark and Hive. In addition, this series of software only supports Oracle JDK7; using Oracle JDK8 or OpenJDK causes serious errors, and some software will be unable to run. Practical implications – One problem with this system is that some blocks go missing when too many blocks are stored in one result (about 100,000 records). Another is that sequential writing into the in-memory cache wastes time. Originality/value – When the remaining memory capacity is 2 GB or less on each server, Impala and Shark incur heavy page swapping, causing extremely low performance. When the data scale is larger, this may cause a JVM I/O exception and make the program crash.
However, when the remaining memory capacity is sufficient, Shark is faster than Hive and Impala. Impala’s consumption of memory resources lies between those of Shark and Hive, and this amount of remaining memory is sufficient for Impala’s maximum performance. In this study, each server allocates 20 GB of memory for cluster computing and sets remaining-memory critical points at Level 1: 3 percent (0.6 GB), Level 2: 15 percent (3 GB) and Level 3: 75 percent (15 GB). The program automatically selects Hive when remaining memory is below 15 percent, Impala at 15 to 75 percent, and Shark above 75 percent.
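The two mechanisms described in this abstract can be sketched compactly. The following is a minimal, hypothetical Python illustration (not the authors' implementation): a selector applying the stated thresholds (Hive below 15 percent free memory, Impala at 15 to 75 percent, Shark above 75 percent), and a query cache keyed by the SQL text so repeated commands bypass the warehouse. A plain dict stands in for the Memcached-over-HDFS layer; `run_query` is an assumed callable dispatching to the chosen engine.

```python
def select_engine(free_memory_fraction: float) -> str:
    """Pick an engine from the fraction of remaining cluster memory,
    using the thresholds reported in the paper."""
    if free_memory_fraction < 0.15:
        return "Hive"    # disk-oriented; safest when memory is scarce
    elif free_memory_fraction <= 0.75:
        return "Impala"  # memory footprint between Hive's and Shark's
    else:
        return "Shark"   # in-memory; fastest when memory is plentiful


class CachedWarehouse:
    """Cache query results by SQL text, as the Memcached layer does."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable that queries the warehouse
        self._cache = {}             # dict stand-in for Memcached on HDFS

    def query(self, sql: str):
        if sql in self._cache:           # repeated SQL: serve from cache
            return self._cache[sql]
        result = self._run_query(sql)    # first occurrence: hit the warehouse
        self._cache[sql] = result
        return result
```

Under this sketch, a second issue of the same SQL command never reaches the underlying engine, which is the source of the reported speedup for highly repeated queries in multi-user mode.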


2014 ◽  
Vol 1 (2) ◽  
pp. 1-17 ◽  
Author(s):  
Hoda Ahmed Abdelhafez

The internet era creates new types of large, real-time data, much of it non-standard, such as streaming and sensor-generated data. Advanced big data technologies enable organizations to extract insights from such sophisticated data. Volume, variety and velocity represent the challenges of big data, which cause difficulties in capture, storage, search, sharing, analysis and visualization. Therefore, technologies like NoSQL, Hadoop and cloud computing are used to extract value from large volumes and a wide variety of data and to discover business needs. This article's goal is to focus on the challenges of big data and how recent technologies can be used to address those issues, illustrated through real-world case studies. The article also presents the lessons learned from these case studies.


Author(s):  
Longzhi Yang ◽  
Jie Li ◽  
Noe Elisa ◽  
Tom Prickett ◽  
Fei Chao

Big data refers to large, complex, structured or unstructured data sets. Big data technologies enable organisations to generate, collect, manage, analyse, and visualise big data sets, and provide insights to inform diagnosis, prediction, or other decision-making tasks. One of the critical concerns in handling big data is the adoption of appropriate big data governance frameworks to (1) curate big data in the required manner to support quality data access for effective machine learning and (2) ensure the framework regulates the storage and processing of the data from providers and users in a trustworthy way within the related regulatory frameworks (both legally and ethically). This paper proposes a big data governance framework that guides organisations to make better data-informed business decisions within the related regulatory framework, with close attention paid to data security, privacy, and accessibility. To demonstrate this process, the work also presents an example implementation of the framework based on a case study of big data governance in cybersecurity. This framework has the potential to guide the management of big data in different organisations for information sharing and cooperative decision-making.

