Coming Together of Big Data and Cloud Computing: A Review

Author(s):  
Muneeba Afzal Mukhdoomi ◽  
Ashish Oberoi ◽  
Ankur Gupta

Big data refers to a huge, heterogeneous volume of details, facts and data that is generated at a constantly rising rate. It demands high-powered, robust, reliable, fault-tolerant tools and techniques to process it, analyse it and extract new insights from it. The data sets are too bulky or extensive for classical data-handling application software to administer. Cloud computing, on the other hand, is a resourceful technology that provides high computing power, scalability and computing resources as and when required for the processing, storage, analytics and visualization of Big Data. Cloud computing can therefore be regarded as a feasible and applicable technology that promises to handle Big Data challenges and provides on-demand infrastructure with all the necessary resources. This paper mainly reviews the processing of big data in the cloud using Hadoop and Spark, the advantages of driving Big Data using cloud computing, and applications of Big Data in the cloud.
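
As a rough illustration of the processing style associated with Hadoop and Spark, the sketch below runs a word count through separate map, shuffle and reduce steps in plain Python. It is only a single-machine toy under stated assumptions: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and no Hadoop or Spark API is used or implied.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as a framework would do between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all counts collected for one word into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on an in-memory collection."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cloud"]))
```

In a real cluster the map and reduce steps would run in parallel on different nodes, which is where the cloud's elastic resources come in; here they run sequentially to keep the example self-contained.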

Author(s):  
Arpit Kumar Sharma ◽  
Arvind Dhaka ◽  
Amita Nandal ◽  
Kumar Swastik ◽  
Sunita Kumari

The meaning of the term “big data” can be inferred from the name itself (i.e., the collection of large structured or unstructured data sets). In addition to their huge quantity, these data sets are so complex that they cannot be analyzed using conventional data handling software and hardware tools. If processed judiciously, big data can prove to be a huge advantage for the industries using it. Because of its usefulness, studies are being conducted to create methods for handling big data. Knowledge extraction from big data is very important; without it, there is no purpose in accumulating such volumes of data. Cloud computing is a powerful tool which provides a platform for the storage and computation of massive amounts of data.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract: Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and of the uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU time. This issue has been overlooked in previous works. To overcome this problem, in the present work we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider two types of deadlines as our constraint. Before applying the DVFS technique to compute nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.
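
The step of estimating the frequency needed to meet a deadline can be sketched as follows. This is a minimal illustration and not the authors' DV-DVFS method: the function names, the list of frequency levels, and the energy model (voltage assumed proportional to frequency, so dynamic energy for a fixed amount of work scales with frequency squared) are all assumptions made for the example.

```python
from typing import Optional, Sequence


def lowest_frequency_meeting_deadline(
    cycles: float,
    deadline_s: float,
    available_hz: Sequence[float],
) -> Optional[float]:
    """Return the lowest available frequency that finishes the work in time.

    The estimated processing time at a frequency is cycles / frequency.
    Returns None if even the highest frequency would miss the deadline.
    """
    for freq in sorted(available_hz):
        if cycles / freq <= deadline_s:
            return freq
    return None


def relative_dynamic_energy(freq_hz: float, max_hz: float) -> float:
    """Dynamic energy for a fixed workload relative to running at max_hz.

    Assumption: voltage scales linearly with frequency, so energy ~ f^2.
    Static (leakage) power is ignored.
    """
    return (freq_hz / max_hz) ** 2


if __name__ == "__main__":
    # Hypothetical frequency levels of a compute node, in Hz.
    levels = [1.2e9, 1.6e9, 2.0e9, 2.4e9]
    chosen = lowest_frequency_meeting_deadline(3.0e9, 2.0, levels)
    print(chosen)
    if chosen is not None:
        print(relative_dynamic_energy(chosen, max(levels)))
```

Choosing the lowest frequency that still meets the deadline is the usual intuition behind DVFS savings: the job finishes later but within its constraint, and each cycle costs less energy.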


Author(s):  
Forest Jay Handford

The number of tools available for Big Data processing has grown exponentially as cloud providers have introduced solutions for businesses that have little or no money for capital expenditures. The chapter starts by discussing historic data tools and their evolution into those of today. With Cloud Computing, the need for upfront costs has been removed, costs continue to fall, and costs can be negotiated. This chapter reviews the current types of Big Data tools and how they evolved. To give readers an idea of costs, the chapter shows example costs (in today's market) for a sampling of the tools, along with relative cost comparisons for other tools such as the Grid tools used by government, scientific and academic communities. Readers will take away from this chapter an understanding of which tools work best for several scenarios and how to select cost-effective tools (even tools that are unknown today).


2018 ◽  
Vol 43 (4) ◽  
pp. 179-190
Author(s):  
Pritha Guha

Executive Summary: Very large or complex data sets, which are difficult to process or analyse using traditional data handling techniques, are usually referred to as big data. The idea of big data is characterized by the three ‘v’s, which are volume, velocity, and variety (Liu, McGree, Ge, & Xie, 2015), referring respectively to the volume of data, the velocity at which the data are processed, and the wide varieties in which big data are available. Every single day, different sectors such as credit risk management, healthcare, media, retail, retail banking, climate prediction, DNA analysis, and sports generate petabytes of data (1 petabyte = 2^50 bytes). Even basic handling of big data, therefore, poses significant challenges, one of them being organizing the data in such a way that it can give better insights for analysis and decision-making. With the explosion of data in our lives, it has become very important to use statistical tools to analyse them.


Author(s):  
Mounia Rahhali ◽  
Lahcen Oughdir ◽  
Youssef Jedidi

In educational institutions, E-learning has become known as a successful technology for enhancing performance and concentration, and thus for supporting higher academic success. Nevertheless, the conventional system for carrying out research work and selecting courses is a time-consuming and unexciting practice that directly affects not only students’ academic achievement but also their learning experience. In addition, the E-learning domain contains an enormous amount of data of various kinds, both structured and unstructured, and academic establishments struggle to manage and understand these big, complicated data sets. To address this problem, this paper proposes a model of an E-learning recommendation system that suggests courses to learners and encourages them to choose courses according to their needs. The system uses big data tools such as Hadoop and Spark to enhance data collection, storage, analysis, processing, optimization, and visualization, and is based on cloud computing infrastructure, in particular Google cloud services.


2021 ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract: Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and of the uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU time. This issue has been overlooked in previous work. To overcome this problem, in the present work we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider two types of deadlines as our constraint. Before applying the DVFS technique to compute nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.


Author(s):  
Adarsh Bhandari

Abstract: With the rapid escalation of data-driven solutions, companies are integrating huge amounts of data from multiple sources in order to gain fruitful results. Handling this tremendous volume of data requires a cloud-based architecture to store and manage it. Cloud computing has emerged as a significant infrastructure that promises to reduce the need for organizations to maintain costly computing facilities and to help them scale up their products. Even today, heavy applications are deployed and managed in the cloud, especially on AWS, eliminating the need for error-prone manual operations. This paper presents certain cloud computing tools and techniques available for handling big data, the processes involved from data extraction through to model deployment, and the distinctions among their uses. It also shows how big data analytics and cloud computing will change the methods that will later drive the industry. Additionally, a study is presented later in the paper on managing blockchain-generated big data in the cloud and making analytical decisions. Furthermore, the impact of blockchain on cloud computing and big data analytics is also examined in this paper. Keywords: Cloud Computing, Big Data, Amazon Web Services (AWS), Google Cloud Platform (GCP), SaaS, PaaS, IaaS.


Author(s):  
Abdul Razaque ◽  
Shaldanbayeva Nazerke ◽  
Bandar Alotaibi ◽  
Munif Alotaibi ◽  
Akhmetov Murat ◽  
...  

Nowadays, cloud computing is one of the most important and rapidly growing paradigms, extending its capabilities and applications into various areas of life. Cloud computing systems face many security-related issues, such as scalability, integrity, confidentiality, and unauthorized access. An illegitimate intruder may gain access to a sensitive cloud computing system and use the data for inappropriate purposes, which may lead to business losses or system damage. This paper proposes a hybrid unauthorized data handling (HUDH) scheme for Big data in cloud computing. HUDH aims to restrict illegitimate users from accessing the cloud and to provide data security. The proposed HUDH consists of three steps: data encryption, data access, and intrusion detection. HUDH involves three algorithms: Advanced Encryption Standard (AES) for encryption, Attribute-Based Access Control (ABAC) for data access control, and Hybrid Intrusion Detection (HID) for unauthorized access detection. The proposed scheme is implemented in Python and Java. Testing results demonstrate that HUDH can delegate computation overhead to powerful cloud servers. User confidentiality, access privilege, and user secret key accountability can be attained with more than 97% accuracy.
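
To make the access-control step more concrete, the following is a minimal sketch of an attribute-based access control (ABAC) check. It is not the HUDH implementation: the `Policy` class, the attribute names (`role`, `department`, `owner_department`) and the exact-match rule are hypothetical choices made for illustration, and encryption and intrusion detection are left out entirely.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List


@dataclass(frozen=True)
class Policy:
    """A rule that grants `actions` when subject and resource attributes match."""
    subject: Dict[str, Any]
    resource: Dict[str, Any]
    actions: FrozenSet[str]


def matches(required: Dict[str, Any], actual: Dict[str, Any]) -> bool:
    """True if every required attribute is present in `actual` with the same value."""
    return all(actual.get(key) == value for key, value in required.items())


def is_allowed(
    policies: List[Policy],
    subject: Dict[str, Any],
    resource: Dict[str, Any],
    action: str,
) -> bool:
    """Deny by default; allow only if at least one policy matches the request."""
    return any(
        matches(policy.subject, subject)
        and matches(policy.resource, resource)
        and action in policy.actions
        for policy in policies
    )


if __name__ == "__main__":
    # Hypothetical policy: finance analysts may read finance-owned data.
    policies = [
        Policy(
            subject={"role": "analyst", "department": "finance"},
            resource={"owner_department": "finance"},
            actions=frozenset({"read"}),
        )
    ]
    user = {"role": "analyst", "department": "finance", "name": "example-user"}
    dataset = {"owner_department": "finance", "sensitivity": "internal"}
    print(is_allowed(policies, user, dataset, "read"))
    print(is_allowed(policies, user, dataset, "write"))
```

The deny-by-default design means a request with no matching policy is rejected, which is the behaviour one would normally want when the goal is to keep illegitimate users out.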


Author(s):  
Haowei Lin ◽  
Xiaolong Xu ◽  
Juan Zhao ◽  
Xinheng Wang

Abstract: Multi-access edge computing (MEC) offers higher computing power than user equipment and lower latency than remote cloud computing, enabling the continuing emergence of new types of services and mobile applications. However, the movement of users can induce service migration or interruption in the MEC network. Highly mobile users in particular increase the frequency of service migration and handover, affecting the stability of the whole MEC network. In this paper, we propose a hierarchical multi-access edge computing architecture that sets up the infrastructure for dynamic service migration in ultra-dense MEC networks. Moreover, we propose a new mechanism for users with high mobility in the ultra-dense MEC network, which efficiently arranges service migrations for high-mobility users and ordinary users together. We then propose an algorithm for evaluating migrated services, which helps to choose suitable MEC servers for them. The results show that the proposed mechanism can efficiently arrange service migrations and restore services more quickly, even in the event of blockage. In addition, the proposed algorithm can supplement existing algorithms for selecting MEC servers because it better reflects the capability of migrated services.


Author(s):  
Vinod Kumar ◽  
Ramjeevan Singh Thakur

With every passing day, data generation is increasing exponentially, and its volume, variety and velocity make it quite challenging to analyze, interpret and visualize the available data to gain greater insights. Billions of networked sensors are being embedded in devices such as smart phones, automobiles, laptops, PCs and industrial machines, which, together with sources such as social media sites, generate and communicate data. As a result, the data obtained from these various sources exists in structured, semi-structured and unstructured forms. Traditional database systems are not suitable for handling these data formats, so new tools and techniques have been developed to work with such data. NoSQL is one of them. Many NoSQL databases are currently available on the market, each designed to solve a specific type of data handling problem, and most of them have been developed with special attention to the problems of business organizations and enterprises. The chapter focuses on various aspects of NoSQL as a tool for handling big data.

