Aeromancer: A Workflow Manager for Large-Scale MapReduce-Based Scientific Workflows

Author(s):  
Nabeel Mohamed ◽  
Nabanita Maji ◽  
Jing Zhang ◽  
Nataliya Timoshevskaya ◽  
Wu-Chun Feng
Author(s):  
Ewa Deelman ◽  
Ann Chervenak

Scientific applications such as those in astronomy, earthquake science, gravitational-wave physics, and others have embraced workflow technologies to do large-scale science. Workflows enable researchers to collaboratively design, manage, and obtain results that involve hundreds of thousands of steps, access terabytes of data, and generate similar amounts of intermediate and final data products. Although workflow systems are able to facilitate the automated generation of data products, many issues still remain to be addressed. These issues exist in different forms in the workflow lifecycle. This chapter describes a workflow lifecycle as consisting of a workflow generation phase where the analysis is defined, the workflow planning phase where resources needed for execution are selected, the workflow execution part, where the actual computations take place, and the result, metadata, and provenance storing phase. The authors discuss the issues related to data management at each step of the workflow cycle. They describe challenge problems and illustrate them in the context of real-life applications. They discuss the challenges, possible solutions, and open issues faced when mapping and executing large-scale workflows on current cyberinfrastructure. They particularly emphasize the issues related to the management of data throughout the workflow lifecycle.


2021 ◽  
Author(s):  
Sridevi S ◽  
Jeevaa Katiravan Jeevaa Katiravan

Abstract Scientific workflows deserve the emerging attention in sophisticated large-scale scientific problem-solving environments. Though a single task failure occurs in workflow based applications, due to its task dependency nature the reliability of the overall system will be affected drastically. Hence rather than reactive fault tolerant approaches, proactive measures are vital in scientific workflows. This work puts forth an attempt to concentrate on the exploration issue of structuring an Exotic Intelligent Water Drops - Support Vector Regression-based approach for task failure prognostication which facilitates proactive fault tolerance in scientific workflow applications. The failure prediction models in this study have been implemented through SVR-based machine learning approaches and its precision accuracy is optimized by IWDA and various performance metrics were evaluated. The experimental results prove that the proposed approach performs better compared with the other existing techniques.


2019 ◽  
Vol 20 (2) ◽  
pp. 237-258
Author(s):  
Avinash Kaur ◽  
Pooja Gupta ◽  
Manpreet Singh

Scientific Workflow is a composition of both coarse-grained and fine-grained computational tasks displaying varying execution requirements. Large-scale data transfer is involved in scientific workflows, so efficient techniques are required to reduce the makespan of the workflow. Task clustering is an efficient technique used in such a scenario that involves combining multiple tasks with shorter execution time into a single cluster to be executed on a resource. This leads to a reduction of scheduling overheads in scientific workflows and thus improvement of performance. However available task clustering methods involve clustering the tasks horizontally without the consideration of the structure of tasks in a workflow. We propose hybrid balanced task clustering algorithm that uses the parameter of impact factor of workflows along with the structure of workflow. According to this technique, tasks can be considered for clustering either vertically or horizontally based on the value of the impact factor. This minimizes the system overheads and the makespan for execution of a workflow. A simulation based evaluation is performed on real workflows that shows the proposed algorithm is efficient in recommending clusters. It shows improvement of 5-10\% in makespan time of workflow depending on the type of workflow used.


Sign in / Sign up

Export Citation Format

Share Document