scholarly journals An Efficient Grid Scheduling Algorithm with Fault Tolerance and User Satisfaction

2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
P. Keerthika ◽  
N. Kasthuri

Problem Statement. The advances in human civilization lead to more complications in problem solving. Grid computing serves as an efficient technology in solving those complicated problems. In computational grids, the grid scheduler schedules the task and finds the appropriate resource for each task. The scheduler must consider several factors such as user demand, communication time, failure handling mechanisms, and reduced makespan. Most of the existing algorithms do not consider user satisfaction. Thus a scheduling algorithm that handles failure of resources and achieves user satisfaction gains more importance.Approach. A new bicriteria scheduling algorithm (BSA) that considers user satisfaction along with fault tolerance has been introduced. The main contribution of this paper includes achieving user satisfaction along with fault tolerance and minimizing the makespan of jobs.Results. The performance of this proposed algorithm is evaluated using GridSim based on makespan and number of jobs completed successfully within user deadline.Conclusions/Recommendations. The proposed BSA algorithm achieves reduced makespan and better hit rate with higher user satisfaction and fault tolerance.

Author(s):  
MALARVIZHI NANDAGOPAL ◽  
S. GAJALAKSHMI ◽  
V. RHYMEND UTHARIARAJ

Computational grids have the potential for solving large-scale scientific applications using heterogeneous and geographically distributed resources. In addition to the challenges of managing and scheduling these applications, reliability challenges arise because of the unreliable nature of grid infrastructure. Two major problems that are critical to the effective utilization of computational resources are efficient scheduling of jobs and providing fault tolerance in a reliable manner. This paper addresses these problems by combining the checkpoint replication based fault tolerance mechanism with minimum total time to release (MTTR) job scheduling algorithm. TTR includes the service time of the job, waiting time in the queue, transfer of input and output data to and from the resource. The MTTR algorithm minimizes the response time by selecting a computational resource based on job requirements, job characteristics, and hardware features of the resources. The fault tolerance mechanism used here sets the job checkpoints based on the resource failure rate. If resource failure occurs, the job is restarted from its last successful state using a checkpoint file from another grid resource. Globus ToolKit is used as the grid middleware to set up a grid environment and evaluate the performance of the proposed approach. The monitoring tools Ganglia and Network Weather Service are used to gather hardware and network details, respectively. The experimental results demonstrate that, the proposed approach effectively schedule the grid jobs with fault-tolerant way thereby reduces TTR of the jobs submitted in the grid. Also, it increases the percentage of jobs completed within specified deadline and making the grid trustworthy.


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
P. Keerthika ◽  
P. Suresh

Grid environment consists of millions of dynamic and heterogeneous resources. A grid environment which deals with computing resources is computational grid and is meant for applications that involve larger computations. A scheduling algorithm is said to be efficient if and only if it performs better resource allocation even in case of resource failure. Allocation of resources is a tedious issue since it has to consider several requirements such as system load, processing cost and time, user’s deadline, and resource failure. This work attempts to design a resource allocation algorithm which is budget constrained and also targets load balancing, fault tolerance, and user satisfaction by considering the above requirements. The proposed Multiconstrained Load Balancing Fault Tolerant algorithm (MLFT) reduces the schedule makespan, schedule cost, and task failure rate and improves resource utilization. The proposed MLFT algorithm is evaluated using Gridsim toolkit and the results are compared with the recent algorithms which separately concentrate on all these factors. The comparison results ensure that the proposed algorithm works better than its counterparts.


2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
P. Suresh ◽  
P. Balasubramanie

Grid computing is a collection of computational and data resources, providing the means to support both computational intensive applications and data intensive applications. In order to improve the overall performance and efficient utilization of the resources, an efficient load balanced scheduling algorithm has to be implemented. The scheduling approach also needs to consider user demand to improve user satisfaction. This paper proposes a dynamic hierarchical load balancing approach which considers load of each resource and performs load balancing. It minimizes the response time of the jobs and improves the utilization of the resources in grid environment. By considering the user demand of the jobs, the scheduling algorithm also improves the user satisfaction. The experimental results show the improvement of the proposed load balancing method.


2013 ◽  
Vol 2013 ◽  
pp. 1-13 ◽  
Author(s):  
Piyush Chauhan ◽  
Nitin

Due to monetary limitation, small organizations cannot afford high end supercomputers to solve highly complex tasks. P2P (peer to peer) grid computing is being used nowadays to break complex task into subtasks in order to solve them on different grid resources. Workflows are used to represent these complex tasks. Finishing such complex task in a P2P grid requires scheduling subtasks of workflow in an optimized manner. Several factors play their part in scheduling decisions. The genetic algorithm is very useful in scheduling DAG (directed acyclic graph) based task. Benefit of a genetic algorithm is that it takes into consideration multiple criteria while scheduling. In this paper, we have proposed a precedence level based genetic algorithm (PLBGSA), which yields schedules for workflows in a decentralized fashion. PLBGSA is compared with existing genetic algorithm based scheduling techniques. Fault tolerance is a desirable trait of a P2P grid scheduling algorithm due to the untrustworthy nature of grid resources. PLBGSA handles faults efficiently.


2010 ◽  
Vol 18 (1) ◽  
pp. 51-76
Author(s):  
Qian Zhu ◽  
Gagan Agrawal

In this paper, we consider the problem of supporting fault tolerance foradaptiveandtime-criticalapplications in heterogeneous and unreliable grid computing environments. Our goal for this class of applications is to optimize a user-specifiedbenefit functionwhile meeting the time deadline. Our first contribution in this paper is a multi-objective optimization algorithm for scheduling the application onto the most efficient and reliable resources. In this way, the processing can achieve the maximum benefit while also maximizing thesuccess-rate, which is the probability of finishing execution without failures. However, for the cases where failures do occur, we have developed ahybrid failure recoveryscheme to ensure that the application can complete within the pre-specified time interval. Our experimental results show that our scheduling algorithm can achieve better benefit when compared to several heuristics-based greedy scheduling algorithms, while still having a negligible overhead. Benefit is further improved when we apply the hybrid failure recovery scheme, and the success-rate becomes 100%.


Author(s):  
Xavier Besseron ◽  
Mohamed-Slim Bouguerra ◽  
Thierry Gautier ◽  
Erik Saule ◽  
Denis Trystram

Symmetry ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 2270
Author(s):  
Sina Zangbari Koohi ◽  
Nor Asilah Wati Abdul Hamid ◽  
Mohamed Othman ◽  
Gafurjan Ibragimov

High-performance computing comprises thousands of processing powers in order to deliver higher performance computation than a typical desktop computer or workstation in order to solve large problems in science, engineering, or business. The scheduling of these machines has an important impact on their performance. HPC’s job scheduling is intended to develop an operational strategy which utilises resources efficiently and avoids delays. An optimised schedule results in greater efficiency of the parallel machine. In addition, processes and network heterogeneity is another difficulty for the scheduling algorithm. Another problem for parallel job scheduling is user fairness. One of the issues in this field of study is providing a balanced schedule that enhances efficiency and user fairness. ROA-CONS is a new job scheduling method proposed in this paper. It describes a new scheduling approach, which is a combination of an updated conservative backfilling approach further optimised by the raccoon optimisation algorithm. This algorithm also proposes a technique of selection that combines job waiting and response time optimisation with user fairness. It contributes to the development of a symmetrical schedule that increases user satisfaction and performance. In comparison with other well-known job scheduling algorithms, the simulation assesses the effectiveness of the proposed method. The results demonstrate that the proposed strategy offers improved schedules that reduce the overall system’s job waiting and response times.


Sign in / Sign up

Export Citation Format

Share Document