Big Data: Generation Next

Big data quality framework: a holistic approach to continuous quality management

Journal Of Big Data ◽

10.1186/s40537-021-00468-0 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Ikbal Taleb ◽

Mohamed Adel Serhani ◽

Chafik Bouhaddioui ◽

Rachida Dssouli

Keyword(s):

Big Data ◽

Quality Management ◽

Data Quality ◽

Value Added ◽

Holistic Approach ◽

Research Area ◽

Heterogeneous Data ◽

Data Generation ◽

Continuous Quality ◽

Quality Profile

Big Data is an essential research area for governments, institutions, and private agencies to support their analytics decisions. Big Data covers how data is collected, processed, and analyzed to generate value-added, data-driven insights and decisions. Degradation in Data Quality may result in unpredictable consequences, in which case confidence in, and the worthiness of, the data and its source are lost. In the Big Data context, data characteristics such as volume, multiple heterogeneous data sources, and fast data generation increase the risk of quality degradation and require efficient mechanisms to check data worthiness. However, ensuring Big Data Quality (BDQ) is a very costly and time-consuming process, since excessive computing resources are required. Maintaining quality through the Big Data lifecycle requires quality profiling and verification before any processing decision. A BDQ Management Framework for enhancing the pre-processing activities while strengthening data control is proposed. The proposed framework uses a new concept called the Big Data Quality Profile. This concept captures the quality outline, requirements, attributes, dimensions, scores, and rules. Using the Big Data profiling and sampling components of the framework, a faster and more efficient data quality estimation is initiated before and after an intermediate pre-processing phase. The exploratory profiling component of the framework plays an initial role in quality profiling; it uses a set of predefined quality metrics to evaluate important data quality dimensions. It generates quality rules by applying various pre-processing activities and their related functions. These rules mainly target the Data Quality Profile and result in quality scores for the selected quality attributes. The framework implementation and dataflow management across various quality management processes are discussed. Ongoing work on framework evaluation and deployment to support quality evaluation decisions concludes the paper.


Big Data and Big Data Analytics for Improved Healthcare Service and Management

International Journal of Privacy and Health Information Management ◽

10.4018/ijphim.2020010102 ◽

2020 ◽

Vol 8 (1) ◽

pp. 13-51

Author(s):

Pijush Kanti Dutta Pramanik ◽

Saurabh Pal ◽

Moutan Mukhopadhyay

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Healthcare Services ◽

Healthcare Sector ◽

Future Market ◽

Data Generation ◽

Related Data ◽

Healthcare Data ◽

Different Types

Like other fields, the healthcare sector has been greatly impacted by big data. A huge volume of healthcare data and other related data is continually being generated from diverse sources. Tapping and suitably analysing these data would open up new avenues and opportunities for healthcare services. In view of that, this paper aims to present a systematic overview of big data and big data analytics applicable to modern-day healthcare. Acknowledging the massive upsurge in healthcare data generation, various ‘V's specific to healthcare big data are identified. Different types of data analytics applicable to healthcare are discussed. Along with presenting the technological backbone of healthcare big data and analytics, the advantages and challenges of healthcare big data are explained. A brief report on the present and future market of healthcare big data and analytics is also presented. In addition, several applications and use cases are discussed in sufficient detail.


IoT-Based Big Data

International Journal on Semantic Web and Information Systems ◽

10.4018/ijswis.2017010103 ◽

2017 ◽

Vol 13 (1) ◽

pp. 28-47 ◽

Cited By ~ 52

Author(s):

M. Mazhar Rathore ◽

Anand Paul ◽

Awais Ahmad ◽

Gwanggil Jeon

Keyword(s):

Decision Making ◽

Big Data ◽

City Planning ◽

Water System ◽

Planning System ◽

Data Generation ◽

Next Generation ◽

Smart Systems ◽

Tier 1 ◽

Iot Devices

Recently, rapid population growth in urban regions has increased the demand for services and infrastructure. These needs can be met with the use of Internet of Things (IoT) devices, such as sensors, actuators, smartphones, and smart systems. This leads to building Smart Cities on the way to next-generation Super City planning. However, as thousands of IoT devices interconnect and communicate with each other over the Internet to establish smart systems, a huge amount of data, termed Big Data, is being generated. It is a challenging task to integrate IoT services and to process Big Data efficiently when aiming at decision making for a future Super City. Therefore, to meet such requirements, this paper presents an IoT-based system for next-generation Super City planning using Big Data Analytics. The authors propose a complete system that includes various types of IoT-based smart systems, such as smart home, vehicular networking, weather and water systems, smart parking, and surveillance objects, for data generation. An architecture is proposed that includes four tiers/layers, i.e., 1) Bottom Tier-1, 2) Intermediate Tier-1, 3) Intermediate Tier 2, and 4) Top Tier, that handle data generation and collection, communication, data administration and processing, and data interpretation, respectively. The system implementation model is presented from the generation and collection of data to the decision making. The proposed system is implemented using the Hadoop ecosystem with MapReduce programming. The throughput and processing time results show that the proposed Super City planning system is more efficient and scalable.


Automated Design of Realistic Contingencies for Big Data Generation

10.1109/pesgm46819.2021.9637988 ◽

2021 ◽

Author(s):

Tetiana Bogodorova ◽

Denis Osipov ◽

Luigi Vanfretti

Keyword(s):

Big Data ◽

Automated Design ◽

Data Generation


Accessing Big Data in the Cloud Using Mobile Devices

Advances in Data Mining and Database Management - Handbook of Research on Cloud Infrastructures for Big Data Analytics ◽

10.4018/978-1-4666-5864-6.ch018 ◽

2014 ◽

pp. 444-470 ◽

Cited By ~ 24

Author(s):

Haoliang Wang ◽

Wei Liu ◽

Tolga Soyata

Keyword(s):

Big Data ◽

Mobile Devices ◽

Response Times ◽

Sensor Nodes ◽

Data Generation ◽

Ongoing Research ◽

Mobile Access ◽

Modern Computer ◽

Processing Power ◽

Require Response

The amount of data acquired, stored, and processed annually over the Internet has exceeded the processing capabilities of modern computer systems, including supercomputers with multiple-Petaflop processing power, giving rise to the term Big Data. Continuous research efforts to implement systems to cope with this insurmountable amount of data are underway. The authors introduce the ongoing research in three different facets: 1) in the Acquisition front, they introduce a concept that has come to the forefront in the past few years: Internet-of-Things (IoT), which will be one of the major sources for Big Data generation in the following decades. The authors provide a brief survey of IoT to understand the concept and the ongoing research in this field. 2) In the Cloud Storage and Processing front, they provide a survey of techniques to efficiently store the acquired Big Data in the cloud, index it, and get it ready for processing. While IoT relates primarily to sensor nodes and thin devices, the authors study this storage and processing aspect of Big Data within the framework of Cloud Computing. 3) In the Mobile Access front, they perform a survey of existing infrastructures to access the Big Data efficiently via mobile devices. This survey also includes intermediate devices, such as a Cloudlet, to accelerate the Big Data collection from IoT and access to Big Data for applications that require response times that are close to real-time.


The Compute Infrastructures for Big Data Analytics

Advances in Data Mining and Database Management - Handbook of Research on Cloud Infrastructures for Big Data Analytics ◽

10.4018/978-1-4666-5864-6.ch004 ◽

2014 ◽

pp. 74-109 ◽

Cited By ~ 1

Author(s):

Pethuru Raj

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Data Sources ◽

Steady Increase ◽

Data Generation ◽

Abnormal Growth ◽

The World ◽

Tremendous Amount ◽

And Storage

The implications of the digitization process, among a bevy of trends, are many and memorable. One is the abnormal growth in data generation, gathering, and storage due to a steady increase in the number of data sources, structures, scopes, sizes, and speeds. In this chapter, the author shows some of the impactful developments brewing in the IT space; how the tremendous amount of data being produced and processed all over the world impacts the IT and business domains; how next-generation IT infrastructures are accordingly being refactored, remedied, and readied for the impending big data-induced challenges; how likely the big data analytics discipline is to move towards fulfilling the digital universe requirements of extracting and extrapolating actionable insights for the knowledge-parched; and, finally, the establishment and sustenance of the dreamt smarter planet.


Affordances of Data Science in Agriculture, Manufacturing, and Education

Web Services ◽

10.4018/978-1-5225-7501-6.ch052 ◽

2019 ◽

pp. 953-978

Author(s):

Krishnan Umachandran ◽

Debra Sharon Ferdinand-James

Keyword(s):

Big Data ◽

Large Scale ◽

Data Science ◽

Data Generation ◽

Large Scale Data ◽

Big Data Applications ◽

Effective Decision ◽

Effective Decision Making ◽

Text Images ◽

Scale Data

Continued technological advancements of the 21st century afford massive data generation in sectors of our economy, including the domains of agriculture, manufacturing, and education. However, harnessing such large-scale data using modern technologies for effective decision-making appears to be an evolving science that requires knowledge of Big Data management and analytics. Big data in agriculture, manufacturing, and education are varied, such as voluminous text, images, and graphs. Applying Big Data science techniques (e.g., functional algorithms) for extracting intelligence data affords decision makers quick response to productivity, market resilience, and student enrollment challenges in today's unpredictable markets. This chapter serves to employ data science for potential solutions to Big Data applications in the sectors of agriculture, manufacturing, and, to a lesser extent, education, using modern technological tools such as Hadoop, Hive, Sqoop, and MongoDB.


Big Data Analytics in Bioinformatics

Biotechnology ◽

10.4018/978-1-5225-8903-7.ch080 ◽

2019 ◽

pp. 1967-1984

Author(s):

Dharmendra Trikamlal Patel

Keyword(s):

Big Data ◽

Internet Of Things ◽

Data Analytics ◽

Big Data Analytics ◽

Annual Growth ◽

Predictive Analysis ◽

The Internet ◽

Data Generation ◽

Intelligent Devices ◽

The Internet Of Things

Voluminous data are being generated by various means. The Internet of Things (IoT) has emerged recently to group all man-made things around us. Due to intelligent devices, the annual growth of data generation has increased rapidly, and it is expected that by 2020, it will reach more than 40 trillion GB. Data generated through devices are in unstructured form. Traditional techniques of descriptive and predictive analysis are not enough for such data. Big Data Analytics has emerged to perform descriptive and predictive analysis on such voluminous data. This chapter first introduces Big Data Analytics. Big Data Analytics is essential in the Bioinformatics field, as the size of a human genome sometimes reaches 200 GB. The chapter next deals with different types of big data in Bioinformatics. The chapter describes several problems and challenges based on big data in Bioinformatics. Finally, the chapter deals with techniques of Big Data Analytics in the Bioinformatics field.


Affordances of Data Science in Agriculture, Manufacturing, and Education

Privacy and Security Policies in Big Data - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-5225-2486-1.ch002 ◽

2017 ◽

pp. 14-40 ◽

Cited By ~ 2

Author(s):

Krishnan Umachandran ◽

Debra Sharon Ferdinand-James

Keyword(s):

Big Data ◽

Large Scale ◽

Data Science ◽

Data Generation ◽

Large Scale Data ◽

Big Data Applications ◽

Effective Decision ◽

Effective Decision Making ◽

Text Images ◽

Scale Data

Continued technological advancements of the 21st century afford massive data generation in sectors of our economy, including the domains of agriculture, manufacturing, and education. However, harnessing such large-scale data using modern technologies for effective decision-making appears to be an evolving science that requires knowledge of Big Data management and analytics. Big data in agriculture, manufacturing, and education are varied, such as voluminous text, images, and graphs. Applying Big Data science techniques (e.g., functional algorithms) for extracting intelligence data affords decision makers quick response to productivity, market resilience, and student enrollment challenges in today's unpredictable markets. This chapter serves to employ data science for potential solutions to Big Data applications in the sectors of agriculture, manufacturing, and, to a lesser extent, education, using modern technological tools such as Hadoop, Hive, Sqoop, and MongoDB.


Intuitive Web-Based Experimental Design for High-Throughput Biomedical Data

BioMed Research International ◽

10.1155/2015/958302 ◽

2015 ◽

Vol 2015 ◽

pp. 1-8 ◽

Cited By ~ 1

Author(s):

Andreas Friedrich ◽

Erhan Kenar ◽

Oliver Kohlbacher ◽

Sven Nahnsen

Keyword(s):

Big Data ◽

Experimental Design ◽

High Throughput ◽

Large Scale ◽

Integrated Design ◽

Added Value ◽

Biomedical Data ◽

Data Generation ◽

Web Based ◽

Data Annotation

Big data bioinformatics aims at drawing biological conclusions from huge and complex biological datasets. Added value from the analysis of big data, however, is only possible if the data is accompanied by accurate metadata annotation. Particularly in high-throughput experiments, intelligent approaches are needed to keep track of the experimental design, including the conditions that are studied as well as information that might be useful for failure analysis or further experiments in the future. In addition to the management of this information, researchers urgently need means for an integrated design and interfaces for structured data annotation. Here, we propose a factor-based experimental design approach that enables scientists to easily create large-scale experiments with the help of a web-based system. We present a novel implementation of a web-based interface allowing the collection of arbitrary metadata. To exchange and edit information, we provide a spreadsheet-based, human-readable format. Subsequently, sample sheets with identifiers and metainformation for data generation facilities can be created. Data files created after measurement of the samples can be uploaded to a datastore, where they are automatically linked to the previously created experimental design model.
