INHIBITOR: An intrusion tolerant scheduling algorithm in cloud-based scientific workflow system

2021 ◽  
Vol 114 ◽  
pp. 272-284
Author(s):  
Yawen Wang ◽  
Yunfei Guo ◽  
Wenbo Wang ◽  
Hao Liang ◽  
Shumin Huo
2021 ◽  
Vol 7 ◽  
pp. e747
Author(s):  
Mazen Farid ◽  
Rohaya Latip ◽  
Masnida Hussin ◽  
Nor Asilah Wati Abdul Hamid

Background: Recent technological developments have enabled more scientific solutions to be executed on cloud platforms. Cloud-based scientific workflows are exposed to various risks, such as security breaches and unauthorized access to resources. By attacking side channels or virtual machines, attackers may destroy servers, causing interruptions, delays, or incorrect output. Because cloud-based scientific workflows are often used for vital computation-intensive tasks, their failure can come at a great cost. Methodology: To increase workflow reliability, we propose the Fault and Intrusion-tolerant Workflow Scheduling algorithm (FITSW). The proposed workflow system uses task executors consisting of many virtual machines to carry out workflow tasks. FITSW duplicates each sub-task three times, uses an intermediate data decision-making mechanism, and then employs a deadline partitioning method to determine sub-deadlines for each sub-task. In this way, task scheduling becomes dynamic, driven by the resource flow. The proposed technique generates or recycles task executors, keeps the workflow clean, and improves efficiency. Experiments were conducted on WorkflowSim to evaluate the effectiveness of FITSW using metrics such as task completion rate, success rate, and completion time. Results: The results show that, compared with the intrusion-tolerant scientific workflow (ITSW) system, FITSW raises the success rate by about 12%, improves the task completion rate by 6.2%, and reduces the completion time by about 15.6%.
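The replication and sub-deadline steps named in this abstract can be illustrated with a minimal Python sketch. It is not the FITSW implementation: it assumes that the intermediate data decision is a strict majority vote over the replica outputs and that sub-deadlines are split in proportion to estimated runtimes along a simple chain of sub-tasks. The abstract specifies neither rule, and the names `majority_vote` and `partition_deadline` are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the output shared by a strict majority of replicas, else None.

    Assumption: replicas of one sub-task produce comparable (hashable) outputs.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def partition_deadline(runtimes: Sequence[float], deadline: float) -> List[float]:
    """Split a workflow deadline into cumulative sub-deadlines.

    Assumption: sub-tasks run one after another, and each receives a share
    of the deadline proportional to its estimated runtime.
    """
    total = sum(runtimes)
    if total <= 0:
        raise ValueError("estimated runtimes must sum to a positive value")
    sub_deadlines = []
    elapsed = 0.0
    for runtime in runtimes:
        elapsed += runtime
        sub_deadlines.append(deadline * elapsed / total)
    return sub_deadlines
```

With three replicas, one corrupted output is outvoted by the two agreeing ones, while three mutually different outputs yield `None`, which a scheduler could treat as a signal to rerun the sub-task.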


2014 ◽  
Vol 22 (3) ◽  
pp. 277
Author(s):  
Qiao Huijie ◽  
Lin Congtian ◽  
Wang Jiangning ◽  
Ji Liqiang

2012 ◽  
Vol 9 ◽  
pp. 1604-1613 ◽  
Author(s):  
Marcin Płóciennik ◽  
Michał Owsiak ◽  
Tomasz Zok ◽  
Bartek Palak ◽  
Antonio Gómez-Iglesias ◽  
...  

Author(s):  
Hajar Hamidian ◽  
Shiyong Lu ◽  
Satyendra Rana ◽  
Farshad Fotouhi ◽  
Hamid Soltanian-Zadeh

2016 ◽  
Author(s):  
Andrea Manconi ◽  
Marco Moscatelli ◽  
Matteo Gnocchi ◽  
Giuliano Armano ◽  
Luciano Milanesi

Motivation: Recent advances in genome sequencing and in the biological data analysis technologies used in bioinformatics have led to a fast and continuous increase in biological data. The difficulty of managing the huge amounts of data currently available to researchers, and the need to obtain results within a reasonable time, have led to the use of distributed and parallel computing infrastructures for their analysis. Recently, bioinformatics has begun exploring new approaches based on hardware accelerators such as GPUs. From an architectural perspective, GPUs are very different from traditional CPUs. CPUs are composed of a few cores with large cache memories and can handle a few software threads at a time. GPUs, by contrast, are equipped with hundreds of cores able to handle thousands of threads simultaneously, so that a very high level of parallelism can be reached. The use of GPUs in recent years has resulted in significant performance increases for certain applications. Although GPUs are increasingly used in bioinformatics, most laboratories do not have access to a GPU cluster or server. In this context, it is very important to provide services that make these tools usable. Methods: A web-based platform has been implemented to enable researchers to perform their analyses on dedicated GPU-based computing resources. To this end, a GPU cluster equipped with 16 NVIDIA Tesla K20c cards has been configured. The infrastructure has been built upon the Galaxy technology [1]. Galaxy is an open, web-based scientific workflow system for data-intensive biomedical research, accessible to researchers who do not have programming experience. Galaxy provides a public server, but it does not provide support for GPU computing. By default, Galaxy is designed to run jobs on local systems. However, it can also be configured to run jobs on a cluster. 
The front-end Galaxy application runs on a single server, while tools are run on the cluster nodes. To this end, Galaxy supports several distributed resource managers so that different clusters can be used. For this specific case, in our opinion SLURM [2] is the most suitable workload manager to manage and control jobs. SLURM is a highly configurable workload and resource manager, and it is currently used on six of the ten most powerful computers in the world, including Piz Daint, which uses over 5000 NVIDIA Tesla K20 GPUs. Results: GPU-based tools [3] devised by our group for quality control of NGS data have been used to test the infrastructure. Initially, this activity required changes to the tools in order to optimize their parallelization on the cluster according to the adopted workload manager. Subsequently, the tools were converted into web-based services accessible through the Galaxy portal. Abstract truncated at 3,000 characters - the full version is available in the pdf file.


Author(s):  
Jasraj Meena ◽  
Manu Vardhan

Cloud computing is used to deliver IT resources over the internet. Owing to its popularity, most scientific workflows are now being moved to this environment. Many algorithms have been proposed in the literature to schedule scientific workflows in the cloud, but their execution cost is very high and they do not meet the user-defined deadline constraint. This paper focuses on satisfying the user-defined deadline of a scientific workflow while minimizing the total execution cost. To achieve this, we propose a Cost-Effective under Deadline (CEuD) constraint workflow scheduling algorithm. The proposed CEuD algorithm considers all the essential features of the cloud and addresses major issues such as performance variation and acquisition delay. We compared the proposed CEuD algorithm with existing algorithms from the literature on scientific workflows (i.e., Montage, Epigenomics, and CyberShake) and obtained better results in minimizing the overall execution cost of the workflow while satisfying the user-defined deadline.
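The trade-off this abstract describes, minimizing cost while still meeting a deadline, can be sketched for a single task in Python. This is not the CEuD algorithm, whose details the abstract does not give. The sketch assumes hourly billing rounded up, a fixed acquisition delay before a virtual machine becomes usable, and performance variation modeled as a fractional slowdown. The `VmType` class, its fields, and `cheapest_vm_under_deadline` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VmType:
    name: str              # hypothetical label for a VM offering
    speed: float           # work units processed per hour
    price_per_hour: float  # assumed to be billed per started hour


def cheapest_vm_under_deadline(
    work: float,
    deadline_hours: float,
    vm_types: Sequence[VmType],
    acquisition_delay_hours: float = 0.0,
    slowdown: float = 0.0,
) -> Optional[VmType]:
    """Pick the cheapest VM type that still finishes before the deadline.

    `slowdown` is the assumed fraction of nominal speed lost to
    performance variation (0.0 means no loss).
    """
    if not 0.0 <= slowdown < 1.0:
        raise ValueError("slowdown must be in [0, 1)")
    best: Optional[VmType] = None
    best_cost = float("inf")
    for vm in vm_types:
        runtime = work / (vm.speed * (1.0 - slowdown))
        if acquisition_delay_hours + runtime > deadline_hours:
            continue  # this VM type would miss the deadline
        cost = math.ceil(runtime) * vm.price_per_hour
        if cost < best_cost:
            best, best_cost = vm, cost
    return best
```

A faster, pricier VM can turn out cheaper overall when it shortens the billed runtime enough, which is why the sketch compares total cost and not hourly price.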

