Resource provisioning for data-intensive applications with deadline constraints on hybrid clouds using Aneka

Shared data-aware dynamic resource provisioning and task scheduling for data intensive applications on hybrid clouds using Aneka

Future Generation Computer Systems ◽

10.1016/j.future.2020.01.038 ◽

2020 ◽

Vol 106 ◽

pp. 595-606

Author(s):

Shreshth Tuli ◽

Rajinder Sandhu ◽

Rajkumar Buyya

Keyword(s):

Task Scheduling ◽

Resource Provisioning ◽

Hybrid Clouds ◽

Data Intensive ◽

Shared Data ◽

Dynamic Resource Provisioning ◽

Dynamic Resource ◽

Data Intensive Applications ◽

And Task

Distributed Energy-Efficient Scheduling for Data-Intensive Applications with Deadline Constraints on Data Grids

2008 IEEE International Performance, Computing and Communications Conference ◽

10.1109/pccc.2008.4745123 ◽

2008 ◽

Cited By ~ 9

Author(s):

Cong Liu ◽

Xiao Qin ◽

Santosh Kulkarni ◽

Chengjun Wang ◽

Shuang Li ◽

...

Keyword(s):

Energy Efficient ◽

Distributed Energy ◽

Data Grids ◽

Data Intensive ◽

Deadline Constraints ◽

Data Intensive Applications ◽

Energy Efficient Scheduling

Towards Data Intensive Many-Task Computing

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Data Intensive Distributed Computing ◽

10.4018/978-1-61520-971-2.ch002 ◽

2012 ◽

pp. 28-73 ◽

Cited By ~ 8

Author(s):

Ioan Raicu ◽

Ian Foster ◽

Yong Zhao ◽

Alex Szalay ◽

Philip Little ◽

...

Keyword(s):

High Performance ◽

File Systems ◽

Data Locality ◽

Resource Provisioning ◽

Parallel File Systems ◽

Data Intensive ◽

Dynamic Resource Provisioning ◽

Rate Of Increase ◽

Parallel File ◽

Data Intensive Applications

Many-task computing aims to bridge the gap between two computing paradigms: high-throughput computing and high-performance computing. Traditional techniques to support many-task computing commonly found in scientific computing (i.e., the reliance on parallel file systems with static configurations) do not scale to today’s largest systems for data-intensive applications, as the rate of increase in the number of processors per system is outgrowing the rate of performance increase of parallel file systems. In this chapter, the authors argue that in such circumstances, data locality is critical to the successful and efficient use of large distributed systems for data-intensive applications. They propose a “data diffusion” approach to enable data-intensive many-task computing. They define an abstract model for data diffusion, define and implement scheduling policies with heuristics that optimize real-world performance, and develop a competitive online cache eviction policy. They also report many empirical experiments that explore the benefits of data diffusion under both static and dynamic resource provisioning, demonstrating approaches that improve both performance and scalability.

Adaptive Performance Modeling of Data-intensive Workloads for Resource Provisioning in Virtualized Environment

ACM Transactions on Modeling and Performance Evaluation of Computing Systems ◽

10.1145/3442696 ◽

2021 ◽

Vol 5 (4) ◽

pp. 1-24

Author(s):

Hosein Mohammadi Makrani ◽

Hossein Sayadi ◽

Najmeh Nazari ◽

Sai Manoj Pudukotai Dinakarrao ◽

Avesta Sasan ◽

...

Keyword(s):

Optimization Technique ◽

Big Data Analytics ◽

Resource Provisioning ◽

Optimal Configuration ◽

Adaptive Performance ◽

Data Intensive ◽

Resource Matching ◽

Cross Platform ◽

The Cost ◽

Data Intensive Applications

The processing of data-intensive workloads is a challenging and time-consuming task that often requires massive infrastructure to ensure fast data analysis. The cloud platform is the most popular and powerful scale-out infrastructure to perform big data analytics and eliminate the need to maintain expensive and high-end computing resources at the user side. The performance and the cost of such infrastructure depend on the overall server configuration, such as processor, memory, network, and storage configurations. In addition to the cost of owning or maintaining the hardware, the heterogeneity in the server configuration further expands the selection space, leading to non-convergence. The challenge is further exacerbated by the dependency of the application’s performance on the underlying hardware. Despite an increasing interest in resource provisioning, little work has been done to develop accurate and practical models that proactively predict the performance of data-intensive applications for a given server configuration and provision a cost-optimal configuration online. In this work, through a comprehensive real-system empirical analysis of performance, we address these challenges by introducing ProMLB: a proactive machine-learning-based methodology for resource provisioning. We first characterize diverse types of data-intensive workloads across different types of server architectures. This characterization helps accurately capture applications’ behavior and train a model to predict their performance. Then, ProMLB builds a set of cross-platform performance models for each application. Based on the developed predictive model, ProMLB uses an optimization technique to identify a close-to-optimal configuration that minimizes the product of execution time and cost. Compared to the oracle scheduler, ProMLB achieves 91% accuracy in terms of application-resource matching. On average, ProMLB improves the performance and resource utilization by 42.6% and 41.1%, respectively, compared to the baseline scheduler. Moreover, ProMLB improves the performance per cost by 2.5× on average.
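
The objective named in the abstract above, minimizing the product of execution time and cost, can be illustrated with a small sketch. This is not ProMLB's implementation: the `Candidate` type, the `pick_configuration` helper, the configuration names, and all numbers are hypothetical, and the predicted times are assumed to come from some already-trained performance model. Cost is assumed here to be predicted runtime multiplied by an hourly price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical server configuration with a predicted runtime."""
    name: str
    predicted_seconds: float   # assumed output of a performance model
    dollars_per_hour: float    # assumed on-demand price

    @property
    def cost(self) -> float:
        # Assumed cost model: runtime in hours times the hourly price.
        return self.predicted_seconds / 3600.0 * self.dollars_per_hour

    @property
    def time_cost_product(self) -> float:
        # The quantity to minimize: execution time multiplied by cost.
        return self.predicted_seconds * self.cost


def pick_configuration(candidates):
    """Return the candidate with the smallest time-cost product."""
    if not candidates:
        raise ValueError("at least one candidate configuration is required")
    return min(candidates, key=lambda c: c.time_cost_product)


# Illustrative, made-up inputs.
candidates = [
    Candidate("small", predicted_seconds=7200.0, dollars_per_hour=0.10),
    Candidate("medium", predicted_seconds=3600.0, dollars_per_hour=0.25),
    Candidate("large", predicted_seconds=2400.0, dollars_per_hour=0.80),
]
best = pick_configuration(candidates)
print(best.name, best.time_cost_product)
```

With these made-up numbers the fastest configuration is not selected, because its higher price outweighs the shorter runtime in the product. An exhaustive `min` is only practical for a small candidate set; the abstract mentions an optimization technique for the larger selection space but does not specify it.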

Data Intensive Cloud Computing

Big Data ◽

10.4018/978-1-4666-9840-6.ch029 ◽

2016 ◽

pp. 639-654

Author(s):

Jayalakshmi D. S. ◽

R. Srinivasan ◽

K. G. Srinivasa

Keyword(s):

Cloud Computing ◽

Big Data ◽

Cluster Computing ◽

Resource Provisioning ◽

Data Intensive ◽

Scientific Value ◽

Data Intensive Applications ◽

Cloud Applications ◽

Problem Data ◽

Huge Challenge

Processing Big Data is a huge challenge for today's technology. There is a need to find, apply, and analyze new ways of computing to make use of Big Data so as to derive business and scientific value from it. Cloud computing, with its promise of seemingly infinite computing resources, is seen as the solution to this problem. Data-intensive computing on the cloud builds upon already mature parallel and distributed computing technologies such as HPC, grid, and cluster computing. However, handling Big Data in the cloud presents its own challenges. In this chapter, we analyze issues specific to data-intensive cloud computing and provide a study of available solutions in programming models, data distribution and replication, and resource provisioning and scheduling with reference to data-intensive applications in the cloud. Future directions for further research enabling data-intensive applications in cloud environments are identified.

Data Intensive Cloud Computing

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Advanced Research on Cloud Computing Design and Applications ◽

10.4018/978-1-4666-8676-2.ch019 ◽

2015 ◽

pp. 305-320

Author(s):

Jayalakshmi D. S. ◽

R. Srinivasan ◽

K. G. Srinivasa

Keyword(s):

Cloud Computing ◽

Big Data ◽

Cluster Computing ◽

Resource Provisioning ◽

Data Intensive ◽

Scientific Value ◽

Data Intensive Applications ◽

Cloud Applications ◽

Problem Data ◽

Huge Challenge

Processing Big Data is a huge challenge for today's technology. There is a need to find, apply, and analyze new ways of computing to make use of Big Data so as to derive business and scientific value from it. Cloud computing, with its promise of seemingly infinite computing resources, is seen as the solution to this problem. Data-intensive computing on the cloud builds upon already mature parallel and distributed computing technologies such as HPC, grid, and cluster computing. However, handling Big Data in the cloud presents its own challenges. In this chapter, we analyze issues specific to data-intensive cloud computing and provide a study of available solutions in programming models, data distribution and replication, and resource provisioning and scheduling with reference to data-intensive applications in the cloud. Future directions for further research enabling data-intensive applications in cloud environments are identified.

6G Enabled Smart Infrastructure for Sustainable Society: Opportunities, Challenges, and Research Roadmap

Sensors ◽

10.3390/s21051709 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1709

Author(s):

Agbotiname Lucky Imoize ◽

Oluwadara Adedeji ◽

Nistha Tandiya ◽

Sachin Shetty

Keyword(s):

Wireless Communication ◽

Psychological Health ◽

Future Research ◽

Agriculture Education ◽

Social Psychological ◽

Research Issues ◽

Data Intensive ◽

Wireless Communication Network ◽

Data Intensive Applications

The 5G wireless communication network is currently faced with the challenge of limited data speed exacerbated by the proliferation of billions of data-intensive applications. To address this problem, researchers are developing cutting-edge technologies for the envisioned 6G wireless communication standards to satisfy the escalating wireless services demands. Though some of the candidate technologies in the 5G standards will apply to 6G wireless networks, key disruptive technologies that will guarantee the desired quality of physical experience to achieve ubiquitous wireless connectivity are expected in 6G. This article first provides a foundational background on the evolution of different wireless communication standards to have a proper insight into the vision and requirements of 6G. Second, we provide a panoramic view of the enabling technologies proposed to facilitate 6G and introduce emerging 6G applications such as multi-sensory–extended reality, digital replica, and more. Next, the technology-driven challenges, social, psychological, health and commercialization issues posed to actualizing 6G, and the probable solutions to tackle these challenges are discussed extensively. Additionally, we present new use cases of the 6G technology in agriculture, education, media and entertainment, logistics and transportation, and tourism. Furthermore, we discuss the multi-faceted communication capabilities of 6G that will contribute significantly to global sustainability and how 6G will bring about a dramatic change in the business arena. Finally, we highlight the research trends, open research issues, and key take-away lessons for future research exploration in 6G wireless communication.

Exploratory Development of Data-intensive Applications

Proceedings of the International Conference on the Art, Science, and Engineering of Programming - Programming '17 ◽

10.1145/3079368.3079399 ◽

2017 ◽

Cited By ~ 1

Author(s):

Patrick Rein ◽

Marcel Taeumel ◽

Robert Hirschfeld ◽

Michael Perscheid

Keyword(s):

Data Intensive ◽

Data Intensive Applications

EZIOTracer

ACM SIGOPS Operating Systems Review ◽

10.1145/3469379.3469391 ◽

2021 ◽

Vol 55 (1) ◽

pp. 88-98

Author(s):

Mohammed Islam Naas ◽

François Trahay ◽

Alexis Colin ◽

Pierre Olivier ◽

Stéphane Rubini ◽

...

Keyword(s):

Analysis Framework ◽

Comprehensive Understanding ◽

Kernel Space ◽

Data Intensive ◽

Storage Performance ◽

Performance Requirements ◽

Memory Footprint ◽

Extreme Performance ◽

Data Intensive Applications ◽

Kernel Level

Tracing is a popular method for evaluating, investigating, and modeling the performance of today's storage systems. Tracing has become crucial with the increase in complexity of modern storage applications and systems, which manipulate an ever-increasing amount of data and are subject to extreme performance requirements. Many tracing tools exist that focus on either the user level or the kernel level; however, we observe the lack of a unified tracer targeting both levels, which prevents a comprehensive understanding of modern applications' storage performance profiles. In this paper, we present EZIOTracer, a unified I/O tracer for both (Linux) kernel and user spaces, targeting data-intensive applications. EZIOTracer is composed of a userland tracer and a kernel-space tracer, complemented by a trace analysis framework able to merge the output of the two tracers and, in particular, to relate user-level events to kernel-level ones and vice versa. On the kernel side, EZIOTracer relies on eBPF to offer safe, low-overhead, low-memory-footprint, and flexible tracing capabilities. Using the FIO benchmark, we demonstrate the ability of EZIOTracer to track down I/O performance issues by relating events recorded at both the kernel and user levels. We show that this can be achieved with a relatively low overhead that ranges from 2% to 26% depending on the I/O intensity.

Domain Metric Driven Decomposition of Data-Intensive Applications

2020 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW) ◽

10.1109/issrew51248.2020.00071 ◽

2020 ◽

Author(s):

Matteo Camilli ◽

Carmine Colarusso ◽

Barbara Russo ◽

Eugenio Zimeo

Keyword(s):

Data Intensive ◽

Data Intensive Applications
