Resource provisioning for data-intensive applications with deadline constraints on hybrid clouds using Aneka

2018 ◽  
Vol 79 ◽  
pp. 765-775 ◽  
Author(s):  
Adel Nadjaran Toosi ◽  
Richard O. Sinnott ◽  
Rajkumar Buyya
Author(s):  
Ioan Raicu ◽  
Ian Foster ◽  
Yong Zhao ◽  
Alex Szalay ◽  
Philip Little ◽  
...  

Many-task computing aims to bridge the gap between two computing paradigms: high-throughput computing and high-performance computing. Traditional techniques to support many-task computing commonly found in scientific computing (i.e., the reliance on parallel file systems with static configurations) do not scale to today’s largest systems for data-intensive applications, as the number of processors per system is growing faster than the performance of parallel file systems. In this chapter, the authors argue that in such circumstances, data locality is critical to the successful and efficient use of large distributed systems for data-intensive applications. They propose a “data diffusion” approach to enable data-intensive many-task computing. They define an abstract model for data diffusion, define and implement scheduling policies with heuristics that optimize real-world performance, and develop a competitive online caching eviction policy. They also report extensive empirical experiments that explore the benefits of data diffusion under both static and dynamic resource provisioning, demonstrating approaches that improve both performance and scalability.
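The role of data locality in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical cluster of nodes with small per-node caches, sends each task to a node that already holds its input file, and uses plain LRU eviction as a stand-in for the chapter's competitive eviction policy, which the abstract does not specify. All class and function names are illustrative.

```python
from collections import OrderedDict


class NodeCache:
    """Per-node file cache with LRU eviction (an assumed stand-in policy)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.files = OrderedDict()  # file name -> True, oldest entry first

    def access(self, name: str) -> bool:
        """Record an access; return True on a cache hit."""
        hit = name in self.files
        if hit:
            self.files.move_to_end(name)  # mark as most recently used
        else:
            if len(self.files) >= self.capacity:
                self.files.popitem(last=False)  # evict least recently used
            self.files[name] = True
        return hit


def dispatch(task_file: str, nodes: dict) -> str:
    """Prefer a node that already caches the file; else the emptiest cache."""
    for node_id, cache in nodes.items():
        if task_file in cache.files:
            return node_id
    return min(nodes, key=lambda n: len(nodes[n].files))


# Hypothetical workload: each task reads exactly one input file.
nodes = {"n0": NodeCache(2), "n1": NodeCache(2)}
placements = []
hits = 0
for task_file in ["a", "b", "a", "c", "b"]:
    node = dispatch(task_file, nodes)
    placements.append(node)
    hits += nodes[node].access(task_file)

print(placements, hits)
```

In this toy run, repeated accesses to the same file are routed to the node that already cached it, so they are served without going back to shared storage.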


Author(s):  
Hosein Mohammadi Makrani ◽  
Hossein Sayadi ◽  
Najmeh Nazari ◽  
Sai Manoj Pudukotai Dinakarrao ◽  
Avesta Sasan ◽  
...  

The processing of data-intensive workloads is a challenging and time-consuming task that often requires massive infrastructure to ensure fast data analysis. The cloud platform is the most popular and powerful scale-out infrastructure to perform big data analytics and eliminate the need to maintain expensive and high-end computing resources at the user side. The performance and the cost of such infrastructure depend on the overall server configuration, such as processor, memory, network, and storage configurations. In addition to the cost of owning or maintaining the hardware, the heterogeneity in the server configuration further expands the selection space, leading to non-convergence. The challenge is further exacerbated by the dependency of the application’s performance on the underlying hardware. Despite an increasing interest in resource provisioning, little work has been done to develop accurate and practical models that proactively predict the performance of data-intensive applications for a given server configuration and provision a cost-optimal configuration online. In this work, through a comprehensive real-system empirical analysis of performance, we address these challenges by introducing ProMLB: a proactive machine-learning-based methodology for resource provisioning. We first characterize diverse types of data-intensive workloads across different types of server architectures. The characterization aids in accurately capturing applications’ behavior and training a model to predict their performance. Then, ProMLB builds a set of cross-platform performance models for each application. Based on the developed predictive model, ProMLB uses an optimization technique to identify a close-to-optimal configuration that minimizes the product of execution time and cost. Compared to the oracle scheduler, ProMLB achieves 91% accuracy in terms of application-resource matching. On average, ProMLB improves the performance and resource utilization by 42.6% and 41.1%, respectively, compared to the baseline scheduler. Moreover, ProMLB improves the performance per cost by 2.5× on average.
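The selection criterion named in this abstract, minimizing the product of execution time and cost, can be illustrated with a minimal sketch. The configuration names, predicted run times, and hourly prices below are hypothetical, and the exhaustive search is only a stand-in for the optimization technique ProMLB uses, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A server configuration with a model-predicted run time (all values assumed)."""

    name: str
    predicted_seconds: float
    dollars_per_hour: float

    @property
    def cost(self) -> float:
        # Monetary cost of one run at the predicted duration.
        return self.predicted_seconds / 3600.0 * self.dollars_per_hour

    @property
    def time_cost_product(self) -> float:
        # Objective to minimize: execution time multiplied by cost.
        return self.predicted_seconds * self.cost


def pick_configuration(candidates):
    """Return the candidate with the smallest time-cost product."""
    if not candidates:
        raise ValueError("no candidate configurations supplied")
    return min(candidates, key=lambda c: c.time_cost_product)


# Hypothetical predictions for a single application.
candidates = [
    Candidate("small", predicted_seconds=7200.0, dollars_per_hour=0.10),
    Candidate("medium", predicted_seconds=3600.0, dollars_per_hour=0.25),
    Candidate("large", predicted_seconds=1800.0, dollars_per_hour=0.80),
]
best = pick_configuration(candidates)
print(best.name, best.time_cost_product)
```

With these made-up numbers the most expensive machine per hour still wins, because the objective weights run time twice: once directly and once through the cost term.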


Big Data ◽  
2016 ◽  
pp. 639-654
Author(s):  
Jayalakshmi D. S. ◽  
R. Srinivasan ◽  
K. G. Srinivasa

Processing Big Data is a huge challenge for today's technology. There is a need to find, apply and analyze new ways of computing to make use of Big Data so as to derive business and scientific value from it. Cloud computing, with its promise of seemingly infinite computing resources, is seen as the solution to this problem. Data-intensive computing on the cloud builds upon the already mature parallel and distributed computing technologies such as HPC, grid and cluster computing. However, handling Big Data in the cloud presents its own challenges. In this chapter, we analyze issues specific to data-intensive cloud computing and provide a study of available solutions in programming models, data distribution and replication, and resource provisioning and scheduling, with reference to data-intensive applications in the cloud. Future directions for further research enabling data-intensive applications in cloud environments are identified.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1709
Author(s):  
Agbotiname Lucky Imoize ◽  
Oluwadara Adedeji ◽  
Nistha Tandiya ◽  
Sachin Shetty

The 5G wireless communication network is currently faced with the challenge of limited data speed exacerbated by the proliferation of billions of data-intensive applications. To address this problem, researchers are developing cutting-edge technologies for the envisioned 6G wireless communication standards to satisfy the escalating wireless services demands. Though some of the candidate technologies in the 5G standards will apply to 6G wireless networks, key disruptive technologies that will guarantee the desired quality of physical experience to achieve ubiquitous wireless connectivity are expected in 6G. This article first provides a foundational background on the evolution of different wireless communication standards to have a proper insight into the vision and requirements of 6G. Second, we provide a panoramic view of the enabling technologies proposed to facilitate 6G and introduce emerging 6G applications such as multi-sensory–extended reality, digital replica, and more. Next, the technology-driven challenges, social, psychological, health and commercialization issues posed to actualizing 6G, and the probable solutions to tackle these challenges are discussed extensively. Additionally, we present new use cases of the 6G technology in agriculture, education, media and entertainment, logistics and transportation, and tourism. Furthermore, we discuss the multi-faceted communication capabilities of 6G that will contribute significantly to global sustainability and how 6G will bring about a dramatic change in the business arena. Finally, we highlight the research trends, open research issues, and key take-away lessons for future research exploration in 6G wireless communication.


2021 ◽  
Vol 55 (1) ◽  
pp. 88-98
Author(s):  
Mohammed Islam Naas ◽  
François Trahay ◽  
Alexis Colin ◽  
Pierre Olivier ◽  
Stéphane Rubini ◽  
...  

Tracing is a popular method for evaluating, investigating, and modeling the performance of today's storage systems. Tracing has become crucial with the increase in complexity of modern storage applications and systems, which manipulate an ever-increasing amount of data and are subject to extreme performance requirements. There exist many tracing tools focusing on either the user level or the kernel level; however, we observe the lack of a unified tracer targeting both levels, which prevents a comprehensive understanding of modern applications' storage performance profiles. In this paper, we present EZIOTracer, a unified I/O tracer for both (Linux) kernel and user spaces, targeting data-intensive applications. EZIOTracer is composed of a userland tracer and a kernel-space tracer, complemented by a trace analysis framework able to merge the output of the two tracers and, in particular, to relate user-level events to kernel-level ones and vice versa. On the kernel side, EZIOTracer relies on eBPF to offer safe, low-overhead, low-memory-footprint, and flexible tracing capabilities. Using the FIO benchmark, we demonstrate the ability of EZIOTracer to track down I/O performance issues by relating events recorded at both the kernel and user levels. We show that this can be achieved with a relatively low overhead that ranges from 2% to 26% depending on the I/O intensity.
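The idea of merging two traces and relating user-level events to kernel-level ones can be sketched as follows. This is an illustration only: the `Event` record, its fields, and the sample event names are assumptions, not EZIOTracer's actual trace format or API, and the matching rule (same process, timestamp inside the user-level call) is one simple choice among many.

```python
import heapq
from typing import NamedTuple


class Event(NamedTuple):
    """A hypothetical trace record shared by both tracers."""

    timestamp_ns: int
    level: str  # "user" or "kernel"
    pid: int
    name: str


def merge_traces(user_events, kernel_events):
    """Merge two timestamp-sorted traces into one timeline."""
    return list(heapq.merge(user_events, kernel_events,
                            key=lambda e: e.timestamp_ns))


def kernel_events_within(merged, pid, start_ns, end_ns):
    """Kernel events of one process that fall inside a user-level call."""
    return [e for e in merged
            if e.level == "kernel" and e.pid == pid
            and start_ns <= e.timestamp_ns <= end_ns]


# Made-up sample traces; each list is already sorted by timestamp.
user = [
    Event(100, "user", 42, "write() enter"),
    Event(400, "user", 42, "write() exit"),
]
kernel = [
    Event(150, "kernel", 42, "vfs_write"),
    Event(250, "kernel", 7, "block_rq_issue"),
    Event(300, "kernel", 42, "block_rq_issue"),
]

merged = merge_traces(user, kernel)
related = kernel_events_within(merged, pid=42, start_ns=100, end_ns=400)
print([e.name for e in related])
```

Here the kernel event belonging to process 7 is excluded, so only the kernel activity triggered during process 42's `write()` call is attributed to it.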

