Green Cloud Software Engineering for Big Data Processing

Sustainability ◽

10.3390/su12219255 ◽

2020 ◽

Vol 12 (21) ◽

pp. 9255

Author(s):

Madhubala Ganesan ◽

Ah-Lian Kor ◽

Colin Pattinson ◽

Eric Rondeau

Keyword(s):

Big Data ◽

High Performance ◽

Data Centers ◽

Research Work ◽

Service Level Agreement ◽

Big Data Analytics ◽

Service Level ◽

Cloud Infrastructure ◽

Communication Performance ◽

VM Consolidation

The Internet of Things (IoT), coupled with big data analytics, is emerging as the core of smart and sustainable systems that bolster economic, environmental and social sustainability. Cloud-based data centers provide the high-performance computing power needed to analyze voluminous IoT data and deliver insights that support decision making. However, the many servers in data centers appear to be a black hole of superfluous energy consumption that contributes to 23% of the global carbon dioxide (CO2) emissions of the ICT (Information and Communication Technology) industry. IoT-related energy research focuses on low-power sensors and enhanced machine-to-machine communication performance. To date, cloud-based data centers still face energy-related challenges that are detrimental to the environment. Virtual machine (VM) consolidation is a well-known approach to achieving energy-efficient cloud infrastructures. Although several research works demonstrate positive results for VM consolidation in simulated environments, there is a gap in investigations on real, physical cloud infrastructure for big data workloads. This research work addresses that gap by conducting experiments on a real physical cloud infrastructure. The primary goal of setting up this infrastructure is to evaluate dynamic VM consolidation approaches, which include integrated algorithms from existing relevant research. An open-source VM consolidation framework, OpenStack NEAT, is adopted, and experiments are conducted on a multi-node OpenStack cloud with Apache Spark as the big data platform. Open-source OpenStack has been deployed because it enables rapid innovation and boosts scalability as well as resource utilization. Additionally, this research work investigates performance based on service level agreement (SLA) metrics and the energy usage of compute hosts. Relevant results concerning the best-performing combination of algorithms are presented and discussed.

Energy Efficient and VM Consolidation Framework using Improved Spider Monkey Optimization Algorithm

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c6390.0910321 ◽

2021 ◽

Vol 10 (3) ◽

pp. 21-26

Author(s):

Kethavath Prem Kumar ◽

Thirumalaisamy Ragunathan ◽

Devara Vasumathi ◽

...

Keyword(s):

Cloud Computing ◽

Energy Efficient ◽

Data Centers ◽

Virtual Machines ◽

Service Level Agreement ◽

Service Level ◽

Energy Usage ◽

Energy Efficient Computing ◽

Spider Monkey Optimization ◽

VM Consolidation

Cloud computing is increasingly used to run information technology services because of benefits that include dynamically improved resource planning and a new service delivery method. Cloud computing works by allowing client devices to access data over the internet from remote servers, computers, and databases. An internet connection links the front end (client device, network, browser, and software application) with the back end, which consists of servers, computers, and databases. While satisfying the demands of the Service Level Agreement (SLA), cloud service providers should also reduce energy usage. Cloud providers offer capacity-reservation systems that let users customize Virtual Machines (VMs) with specified age and geographic resources, which reduces the amount to be paid for cloud services. To overcome the aforementioned issue, an Improved Spider Monkey Optimization (ISMO) approach is proposed for cloud center optimization. The VM consolidation architecture based on the proposed ISMO algorithm decreases energy usage while attempting to prevent SLA breaches, and so provides energy-efficient computing in data centers. The availability of hosts or VMs for task execution is measured by fitness. If the number of tasks to be handled increases, the hosts of VMs should be available in the right state. The proposed ISMO method achieves an energy efficiency of 28266, whereas the existing algorithms show energy efficiencies of 6009 and 10001.

Distance Aware VM allocation process to minimize energy consumption in cloud computing

Recent Patents on Computer Science ◽

10.2174/2213275912666191023143709 ◽

2019 ◽

Vol 12 ◽

Author(s):

Gurpreet Singh ◽

Manish Mahajan ◽

Rajni Mohana

Keyword(s):

Resource Allocation ◽

Cloud Computing ◽

Energy Consumption ◽

Virtual Machine ◽

Virtual Machines ◽

Research Work ◽

Service Level Agreement ◽

Service Level ◽

Physical Machine ◽

Allocation Process

BACKGROUND: Cloud computing is considered an on-demand service resource, with applications delivered from data centers on a pay-per-use basis. To allocate resources appropriately and satisfy user needs, an effective and reliable resource allocation method is required. Because of increased user demand, resource allocation has become a complex and challenging task: when a physical machine is overloaded, Virtual Machines share its load by utilizing the physical machine's resources. Previous studies fall short on energy consumption and time management when Virtual Machines on different servers are kept in the turned-on state. AIM AND OBJECTIVE: The main aim of this research work is to propose an effective resource allocation scheme for allocating Virtual Machines from an ad hoc sub-server with Virtual Machines. EXECUTION MODEL: The research has been carried out in two stages: first, the Virtual Machines and Physical Machines are located with the server, and subsequently the allocation is cross-validated. A Modified Best Fit Decreasing algorithm is used to sort the Virtual Machines, and Multi-Machine Job Scheduling is used to place jobs on an appropriate host. An Artificial Neural Network, used as a classifier, allocates jobs to the hosts. Service Level Agreement violation and energy consumption are the measures considered, and fruitful results have been obtained, with a reduction of 37.7 in energy consumption and a 15% improvement in Service Level Agreement violation.
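
The abstract names a Modified Best Fit Decreasing algorithm for ordering Virtual Machines but does not give its details. As a rough illustration of the general idea only, the Python sketch below implements plain best-fit-decreasing placement on a single resource. The function name, the single abstract capacity unit, and the tie-breaking by name are all assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, Optional


def best_fit_decreasing(vm_demands: Dict[str, int],
                        host_capacity: Dict[str, int]) -> Dict[str, Optional[str]]:
    """Place VMs on hosts, largest demand first, choosing the tightest fit.

    Hypothetical illustration: demands and capacities share one abstract
    unit (for example a CPU share). Returns VM -> host, or None if no
    host can take the VM.
    """
    free = dict(host_capacity)  # remaining capacity per host
    placement: Dict[str, Optional[str]] = {}
    # "Decreasing": handle the most demanding VMs first (ties by name).
    for vm, demand in sorted(vm_demands.items(), key=lambda kv: (-kv[1], kv[0])):
        candidates = [h for h, cap in free.items() if cap >= demand]
        if not candidates:
            placement[vm] = None
            continue
        # "Best fit": pick the host left with the least spare capacity.
        best = min(candidates, key=lambda h: (free[h] - demand, h))
        free[best] -= demand
        placement[vm] = best
    return placement
```

Packing the largest VMs first tends to leave fewer hosts partially filled, which is why this family of heuristics is often used when the goal is to keep as few physical machines switched on as possible.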

A Novel Call Admission Control Algorithm for Next Generation Wireless Mobile Communication

International Journal of Rough Sets and Data Analysis ◽

10.4018/ijrsda.2017070106 ◽

2017 ◽

Vol 4 (3) ◽

pp. 83-95

Author(s):

T. A. Chavan ◽

P. Saras

Keyword(s):

Wireless Network ◽

Research Work ◽

Service Level Agreement ◽

Priority Queue ◽

Service Level ◽

Next Generation ◽

Handoff Call ◽

Call Acceptance ◽

Customer Services ◽

Wireless Mobile Communication

Wireless communication technology is progressing rapidly. With this change in technology, customer services for multimedia and non-multimedia traffic are increasing day by day. Because the resources of a wireless network are limited, an efficient call admission control (CAC) algorithm is needed to enhance Quality of Service (QoS) levels for end users. QoS enhancement in a wireless network is related to making efficient use of current network resources and the optimization of the users. Call acceptance is one of the challenges for CAC in mobile cellular networks: accepting a new call into a resource-limited wireless network should not violate the Service Level Agreements (SLAs) of ongoing conversations. In next-generation wireless networks, CAC has a direct impact on QoS for user calls and on overall system performance. To handle handoff calls and new calls in cellular networks, channel reservation schemes have already been proposed that reserve system bandwidth for higher-priority calls. These earlier schemes do not reach the required level of satisfaction because the reserved bandwidth is not allocated properly when the handoff rate is low. The authors present a new channel borrowing scheme in which new non-real-time (NRT) calls can use reserved channels. An NRT call borrows a reserved channel on a temporary basis; if a handoff call then enters the current cell and no other channel is available, the handoff call pre-empts the channel from an earlier borrowing NRT user, if one exists. The pre-empted NRT call is kept in a priority queue and served when a channel becomes free. The number of NRT calls in the queue should not be large, to avoid delayed service. The fundamental objective is to design the system for the proposed scheme, evaluate its results, and compare them with those of the existing system. The results show that the proposed scheme decreases call dropping probability, with a slight increase in call blocking rate under a high-density handoff call rate.
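
The borrowing and pre-emption rules described in this abstract can be sketched as a toy model. The Python class below is an illustration under stated assumptions, not the authors' implementation: the class and method names, the split into "shared" and "reserved" channel pools, the FIFO queue standing in for the priority queue, and the queue limit of 4 are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Optional


class Cell:
    """Toy cell with shared channels and channels reserved for handoff calls."""

    def __init__(self, shared: int, reserved: int, max_queue: int = 4) -> None:
        self.free: Dict[str, int] = {"shared": shared, "reserved": reserved}
        self.holding: Dict[str, str] = {}    # call id -> kind of channel held
        self.borrowers: Deque[str] = deque()  # NRT calls on reserved channels
        self.queue: Deque[str] = deque()      # pre-empted NRT calls (FIFO here)
        self.max_queue = max_queue            # keep the queue short (assumed limit)

    def _take(self, call: str, kind: str) -> None:
        self.free[kind] -= 1
        self.holding[call] = kind

    def new_nrt_call(self, call: str) -> str:
        """Admit a new NRT call, borrowing a reserved channel if needed."""
        if self.free["shared"] > 0:
            self._take(call, "shared")
            return "admitted"
        if self.free["reserved"] > 0:
            self._take(call, "reserved")
            self.borrowers.append(call)
            return "borrowed"
        return "blocked"

    def handoff_call(self, call: str) -> str:
        """Admit a handoff call, pre-empting the earliest borrower if needed."""
        for kind in ("reserved", "shared"):
            if self.free[kind] > 0:
                self._take(call, kind)
                return "admitted"
        if self.borrowers:
            victim = self.borrowers.popleft()
            self.holding[call] = self.holding.pop(victim)
            # The pre-empted call waits if there is room; otherwise it is lost.
            if len(self.queue) < self.max_queue:
                self.queue.append(victim)
            return "preempted"
        return "dropped"

    def release(self, call: str) -> Optional[str]:
        """Free a channel; hand it to a waiting NRT call if one is queued."""
        kind = self.holding.pop(call)
        if call in self.borrowers:
            self.borrowers.remove(call)
        if self.queue:
            resumed = self.queue.popleft()
            self.holding[resumed] = kind
            if kind == "reserved":
                self.borrowers.append(resumed)
            return resumed
        self.free[kind] += 1
        return None
```

In this sketch a handoff call is only dropped when every channel is busy and no borrowed channel can be reclaimed, which mirrors the trade-off reported in the abstract: fewer dropped handoffs at the price of some extra waiting or blocking for NRT calls.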

High performance deep learning techniques for big data analytics

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.5032 ◽

2018 ◽

Vol 30 (23) ◽

pp. e5032

Author(s):

Maozhen Li

Keyword(s):

Big Data ◽

Deep Learning ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Learning Techniques

Big Data and IT Network Data Visualization

International Journal of Mathematical Engineering and Management Sciences ◽

10.33889/ijmems.2018.3.1-002 ◽

2018 ◽

Vol 3 (1) ◽

pp. 9-16 ◽

Cited By ~ 3

Author(s):

Lidong Wang

Keyword(s):

Big Data ◽

Network Analysis ◽

Graphics Processing Units ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Network Visualization ◽

Network Data ◽

Graphics Processing ◽

Performance Computing

Visualization with graphs is popular in the data analysis of Information Technology (IT) networks, or computer networks. An IT network is often modelled as a graph, with hosts as nodes and traffic as flows on the edges. This paper introduces general visualization methods, presents applications and technological progress of visualization in IT network analysis and of big data in IT network visualization, and discusses the challenges of visualization and Big Data analytics in IT network visualization. Big Data analytics with High Performance Computing (HPC) techniques, especially Graphics Processing Units (GPUs), helps accelerate IT network analysis and visualization.
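
To make the graph model concrete, here is a minimal Python sketch that builds a host-to-host traffic graph from flow records and ranks hosts by total traffic. The record layout `(source, destination, bytes)` and both function names are assumptions made for this example; the paper does not prescribe a data format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Flow = Tuple[str, str, int]  # (source host, destination host, bytes), assumed layout


def build_traffic_graph(flows: Iterable[Flow]) -> Dict[str, Dict[str, int]]:
    """Aggregate flows into graph[src][dst] = total bytes sent from src to dst."""
    graph: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for src, dst, nbytes in flows:
        graph[src][dst] += nbytes
    return {src: dict(dsts) for src, dsts in graph.items()}


def busiest_hosts(graph: Dict[str, Dict[str, int]], top: int = 3) -> List[Tuple[str, int]]:
    """Rank hosts by bytes sent plus bytes received, highest first."""
    totals: Dict[str, int] = defaultdict(int)
    for src, dsts in graph.items():
        for dst, nbytes in dsts.items():
            totals[src] += nbytes
            totals[dst] += nbytes
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
```

A structure like this is the usual input to a layout or drawing step, where edge weights can drive line thickness and host totals can drive node size.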

High Performance Storage for Big Data Analytics and Visualization

Advances in Data Mining and Database Management - Handbook of Research on Big Data Storage and Visualization Techniques ◽

10.4018/978-1-5225-3142-5.ch010 ◽

2018 ◽

pp. 254-275

Author(s):

Armando Fandango ◽

William Rivera

Keyword(s):

Big Data ◽

High Speed ◽

High Performance ◽

File System ◽

Predictive Analytics ◽

Big Data Analytics ◽

File Systems ◽

Distributed Applications ◽

System Level ◽

File Formats

Scientific Big Data gathered at exascale needs to be stored, retrieved and manipulated. The storage stack for scientific Big Data includes a file system at the system level for the physical organization of the data, and a file format and input/output (I/O) system at the application level for the logical organization of the data; both must be of the high-performance variety for exascale. A high-performance file system is designed for concurrent access, high-speed transmission and fault tolerance. High-performance file formats and I/O are designed to give parallel and distributed applications easy and fast access to Big Data. These specialized file formats make it easier to store and access Big Data for scientific visualization and predictive analytics. This chapter provides a brief review of the characteristics of high-performance file systems such as Lustre and GPFS, and high-performance file formats such as HDF5, NetCDF, MPI-IO, and HDFS.

Intelligent Big Data Analytics

Advances in Business Information Systems and Analytics - Maximizing Business Performance and Efficiency Through Intelligent Systems ◽

10.4018/978-1-5225-2234-8.ch003 ◽

2017 ◽

pp. 50-72 ◽

Cited By ~ 2

Author(s):

Dheeraj Malhotra ◽

Neha Verma ◽

Om Prakash Rishi ◽

Jatinder Singh

Keyword(s):

Big Data ◽

System Design ◽

Research Work ◽

Big Data Analytics ◽

Map Reduce ◽

Time Intervals ◽

Online Commerce ◽

Online Purchase ◽

Online Retailers ◽

Novel Approaches

With the explosive increase in regular e-commerce users, online commerce companies must offer more customer-friendly websites that satisfy the personalized requirements of online customers in order to grow their market share against the competition. Different individuals have different purchase requirements at different time intervals, so online retailers often need to deploy novel approaches to identify customers' latest purchase requirements. This research work proposes a novel MR Apriori algorithm and the system design of a tool called IMSS-SE, which blends the benefits of an Apriori-based MapReduce framework with intelligent technologies for B2C e-commerce, helping online users easily search and rank e-commerce websites that can satisfy their personalized online purchase requirements. An extensive experimental evaluation shows that the proposed system satisfies the personalized search requirements of e-commerce users better than generic search engines.
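
The abstract does not describe how its MR Apriori algorithm works internally. As a generic illustration of how Apriori's level-wise counting maps onto a map step and a reduce step, the Python sketch below runs both phases in a single process. All function names are hypothetical, and a real MapReduce job would distribute the map and reduce phases across workers rather than use plain loops.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Iterator, Optional, Set, Tuple

Itemset = FrozenSet[str]


def map_phase(transaction: Itemset, k: int,
              prev_frequent: Optional[Set[Itemset]]) -> Iterator[Tuple[Itemset, int]]:
    """Emit (candidate, 1) for each k-item subset that survives Apriori pruning."""
    for combo in combinations(sorted(transaction), k):
        # Apriori property: every (k-1)-subset of a frequent set is frequent.
        if prev_frequent is not None and any(
            frozenset(sub) not in prev_frequent for sub in combinations(combo, k - 1)
        ):
            continue
        yield frozenset(combo), 1


def reduce_phase(pairs: Iterable[Tuple[Itemset, int]]) -> Counter:
    """Sum the emitted counts per candidate itemset."""
    counts: Counter = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def apriori(transactions: Iterable[Iterable[str]], min_support: int) -> Dict[Itemset, int]:
    """Return every itemset that occurs in at least min_support transactions."""
    data = [frozenset(t) for t in transactions]
    result: Dict[Itemset, int] = {}
    prev: Optional[Set[Itemset]] = None
    k = 1
    while True:
        counts = reduce_phase(p for t in data for p in map_phase(t, k, prev))
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        if not frequent:
            return result
        result.update(frequent)
        prev = set(frequent)
        k += 1
```

Each pass over the data corresponds to one map and reduce round, and the pruning step keeps the number of emitted candidates manageable as the itemset size grows.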

Synchronizing Execution of Big Data in Distributed and Parallelized Environments

Big Data ◽

10.4018/978-1-4666-9840-6.ch071 ◽

2016 ◽

pp. 1555-1581

Author(s):

Gueyoung Jung ◽

Tridib Mukherjee

Keyword(s):

Big Data ◽

Distributed System ◽

Data Analytics ◽

High Performance ◽

Large Scale ◽

Big Data Analytics ◽

Loosely Coupled ◽

Current Trends ◽

Distributed Computing Infrastructures ◽

Performance Computing

In the modern information era, the amount of data has exploded, and current trends indicate further exponential growth in the future. This enormous amount of data, referred to as big data, has given rise to the problem of finding the “needle in the haystack” (i.e., extracting meaningful information from big data). Many researchers and practitioners are focusing on big data analytics to address the problem. One of the major issues in this regard is the computation requirement of big data analytics. In recent years, the proliferation of many loosely coupled distributed computing infrastructures (e.g., modern public, private, and hybrid clouds, high performance computing clusters, and grids) has enabled high computing capability to be offered for large-scale computation. This has allowed the execution of big data analytics to gather pace across organizations and enterprises. However, even with high computing capability, it is a big challenge to efficiently extract valuable information from such vast amounts of data. Hence, unprecedented scalability of performance is required to deal with the execution of big data analytics. A big question in this regard is how to maximally leverage the high computing capabilities of the aforementioned loosely coupled distributed infrastructures to ensure fast and accurate execution of big data analytics. This chapter therefore focuses on synchronous parallelization of big data analytics over a distributed system environment to optimize performance.

Approaches of enhancing interoperations among high performance computing and big data analytics via augmentation

Cluster Computing ◽

10.1007/s10586-019-02960-y ◽

2019 ◽

Vol 23 (2) ◽

pp. 953-988 ◽

Cited By ~ 3

Author(s):

Ajeet Ram Pathak ◽

Manjusha Pandey ◽

Siddharth S. Rautaray

Keyword(s):

Big Data ◽

High Performance Computing ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Performance Computing

Virtual Machine Consolidation with Minimization of Migration Thrashing for Cloud Data Centers

Mathematical Problems in Engineering ◽

10.1155/2020/7848232 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Xialin Liu ◽

Junsheng Wu ◽

Gang Sha ◽

Shuqin Liu

Keyword(s):

Virtual Machine ◽

Carbon Dioxide Emissions ◽

Data Centers ◽

Virtual Machines ◽

Service Level Agreement ◽

High Capacity ◽

Electrical Energy ◽

Cloud Data ◽

Cloud Data Centers ◽

VM Consolidation

Cloud data centers consume huge amounts of electrical energy, resulting in high operating costs and carbon dioxide emissions. Virtual machine (VM) consolidation uses live migration to transfer VMs among physical servers in order to improve resource utilization and energy efficiency in cloud data centers. Most current VM consolidation approaches tend to migrate aggressively for some types of applications, such as large-capacity applications like speech recognition, image processing, and decision support systems. These approaches generate high migration thrashing because VMs are consolidated onto servers according to their instantaneous resource usage, without considering their overall and long-term utilization. The proposed approach, dynamic consolidation with minimization of migration thrashing (DCMMT), prioritizes VMs with high capacity and significantly reduces migration thrashing and the number of migrations while ensuring the service-level agreement (SLA), since it keeps VMs that are likely to suffer from migration thrashing on the same physical servers instead of migrating them. We have performed experiments using real workload traces and compared the approach with existing aggressive-migration-based solutions; through simulations, we show that our approach improves the migration thrashing metric by about 28%, the number of migrations metric by about 21%, and the SLA violation (SLAV) metric by about 19%.
