GridAssist, a User Friendly Grid-Based Workflow Management Tool

Author(s):  
M. ter Linden ◽  
H. de Wolf ◽  
R. Grim
2020 ◽  
Vol 15 ◽  
Author(s):  
Akshatha Prasanna ◽  
Vidya Niranjan

Background: Since bacteria are the earliest known organisms, there has been significant interest in their variety and biology, especially as they relate to human health. Recent advances in metagenomic sequencing (mNGS), a culture-independent sequencing technology, have accelerated developments in clinical microbiology and our understanding of pathogens. Objective: For the implementation of mNGS in routine clinical practice to become feasible, a practical and scalable strategy for the study of mNGS data is essential. This study presents a robust automated pipeline to analyze clinical metagenomic data for pathogen identification and classification. Method: The proposed Clin-mNGS pipeline is an integrated, open-source, scalable, reproducible, and user-friendly framework scripted using the Snakemake workflow management software. The implementation avoids the hassle of manually installing and configuring the multiple command-line tools and their dependencies. The approach screens pathogens directly from clinical raw reads and generates a consolidated report for each sample. Results: The pipeline is demonstrated using publicly available data and is tested on a desktop Linux system and a high-performance cluster. The study compares variability in results from different tools and versions. Tool versions are user-modifiable. The pipeline outputs quality checks, filtered reads, host-subtracted reads, assembled contigs, assembly metrics, relative abundances of bacterial species, antimicrobial resistance genes, plasmid detection, and virulence factor identification. The results obtained from the pipeline are evaluated based on sensitivity and positive predictive value. Conclusion: Clin-mNGS is an automated Snakemake pipeline validated for the analysis of microbial clinical metagenomic reads to perform taxonomic classification and antimicrobial resistance prediction.
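
The abstract above evaluates results by sensitivity and positive predictive value (PPV). As a minimal sketch of those two metrics, assuming the evaluation compares a set of detected species against a known truth set, the Python below computes sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP). The function name and the species lists are hypothetical illustrations, not part of Clin-mNGS or its data.

```python
def sensitivity_and_ppv(expected, detected):
    """Return (sensitivity, PPV) for detected taxa against a known truth set."""
    expected, detected = set(expected), set(detected)
    tp = len(expected & detected)  # true positives: present and reported
    fn = len(expected - detected)  # false negatives: present but missed
    fp = len(detected - expected)  # false positives: reported but absent
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Hypothetical truth set and pipeline calls, for illustration only.
truth = {"Escherichia coli", "Klebsiella pneumoniae",
         "Staphylococcus aureus", "Enterococcus faecalis"}
calls = {"Escherichia coli", "Klebsiella pneumoniae",
         "Staphylococcus aureus", "Pseudomonas aeruginosa"}

sens, ppv = sensitivity_and_ppv(truth, calls)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

With three of four expected species recovered and one spurious call, both metrics come out at 0.75 in this toy case.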


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Marius Welzel ◽  
Anja Lange ◽  
Dominik Heider ◽  
Michael Schwarz ◽  
Bernd Freisleben ◽  
...  

Abstract Background: Sequencing of marker genes amplified from environmental samples, known as amplicon sequencing, allows us to resolve some of the hidden diversity and elucidate evolutionary relationships and ecological processes among complex microbial communities. The analysis of large numbers of samples at high sequencing depths generated by high-throughput sequencing technologies requires efficient, flexible, and reproducible bioinformatics pipelines. Only a few existing workflows can be run in a user-friendly, scalable, and reproducible manner on different computing devices using an efficient workflow management system. Results: We present Natrix, an open-source bioinformatics workflow for preprocessing raw amplicon sequencing data. The workflow contains all analysis steps from quality assessment, read assembly, dereplication, chimera detection, split-sample merging, and sequence representative assignment (OTUs or ASVs) to the taxonomic assignment of sequence representatives. The workflow is written using Snakemake, a workflow management engine for developing data analysis workflows. In addition, Conda is used for version control. Thus, Snakemake ensures reproducibility and Conda offers version control of the utilized programs. The encapsulation of rules and their dependencies supports hassle-free sharing of rules between workflows and easy adaptation and extension of existing workflows. Natrix is freely available on GitHub (https://github.com/MW55/Natrix) or as a Docker container on DockerHub (https://hub.docker.com/r/mw55/natrix). Conclusion: Natrix is a user-friendly and highly extensible workflow for processing Illumina amplicon data.
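
Dereplication, one of the preprocessing steps listed in the abstract above, collapses identical reads into a single representative with an abundance count. The Python below is a minimal sketch of that idea only. It is not Natrix's implementation, and the function name, the case normalization, and the example reads are assumptions made for illustration.

```python
from collections import Counter


def dereplicate(sequences):
    """Collapse identical sequences into (sequence, abundance) pairs."""
    # Assumption: reads that differ only in letter case are treated as identical.
    counts = Counter(seq.upper() for seq in sequences)
    # Most abundant first; ties broken alphabetically for a deterministic order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical reads, for illustration only.
reads = ["ACGT", "acgt", "TTGA", "ACGT", "TTGA", "GGCC"]
for seq, abundance in dereplicate(reads):
    print(seq, abundance)
```

Keeping the abundance alongside each unique sequence lets later steps, such as chimera detection or OTU/ASV assignment, work on far fewer records without losing count information.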


Author(s):  
Cesare Pautasso

Model-driven architecture (MDA), design and transformation techniques can be applied with success to the domain of business process modeling (BPM) with the goal of making the vision of business-driven development a reality. This chapter is centered on the idea of compiling business process models for executing them, and how this idea has been driving the design of the JOpera for Eclipse workflow management tool. JOpera presents users with a simple, graph-based process modeling language with a visual representation of both control and data-flow aspects. As an intermediate representation, the graphs are converted into Event-Condition-Action rules, which are further compiled into Java bytecode for efficient execution. These transformations of process models are performed by the JOpera process compiler in a completely transparent way, where the generated executable artefacts are kept hidden from users at all times (i.e., even for debugging process executions, which is done by augmenting the original, high-level notation). The author evaluates his approach by discussing how using a compiler has opened up several possibilities for performing optimization on the generated code and also simplified the design of the corresponding workflow engine architecture.
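
The conversion of a process graph into Event-Condition-Action (ECA) rules described above can be illustrated with a small sketch. The Python below is not JOpera's compiler: JOpera compiles its rules to Java bytecode, whereas this sketch simply interprets them. It assumes a control-flow graph given as a mapping from each task to its predecessors, and every name in it (`Rule`, `graph_to_rules`, `run`, the task names) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Rule:
    event: str                             # task whose completion triggers the rule
    condition: Callable[[Set[str]], bool]  # checked against the set of finished tasks
    action: str                            # task to start when the condition holds


def graph_to_rules(deps: Dict[str, List[str]]) -> List[Rule]:
    """Emit one ECA rule per control-flow edge (predecessor -> task)."""
    rules = []
    for task, preds in deps.items():
        for pred in preds:
            # Condition: every predecessor of the target task has finished.
            rules.append(Rule(
                event=pred,
                condition=lambda done, preds=tuple(preds): all(p in done for p in preds),
                action=task,
            ))
    return rules


def run(deps: Dict[str, List[str]], tasks: Dict[str, Callable[[], None]]) -> List[str]:
    """Execute tasks by firing rules; return the order in which tasks ran."""
    rules = graph_to_rules(deps)
    done: Set[str] = set()
    order: List[str] = []
    queue = [t for t, preds in deps.items() if not preds]  # entry tasks
    while queue:
        task = queue.pop(0)
        tasks[task]()
        done.add(task)
        order.append(task)
        for rule in rules:
            ready = rule.event == task and rule.condition(done)
            if ready and rule.action not in done and rule.action not in queue:
                queue.append(rule.action)
    return order


# Hypothetical process: 'store' waits for both 'validate' and 'enrich'.
deps = {"fetch": [], "validate": ["fetch"], "enrich": ["fetch"],
        "store": ["validate", "enrich"]}
tasks = {name: (lambda: None) for name in deps}
print(run(deps, tasks))
```

Because each rule carries its own condition, the join before `store` needs no special handling in the engine loop, which hints at why a rule-based intermediate form can keep the engine simple.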


1998 ◽  
Vol 33 (3) ◽  
pp. 183-197 ◽  
Author(s):  
H.M.J. Goldschmidt ◽  
J.C.M. de Vries ◽  
G.G. van Merode ◽  
J.J.M. Derks

2021 ◽  
Vol 14 (1) ◽  
pp. 11-19
Author(s):  
Bogdan Văduva ◽  
Honoriu Vălean

Abstract: Nowadays, programmers write source code for inserting, editing and deleting records of a relational table. The majority of commercial relational databases include a specific management tool that offers such possibilities, and most database programmers take this ability for granted. When it comes to real-life applications, programmers use the Object-Oriented (OO) paradigm to build user-friendly windows/screens/forms for database operations. The current work shows a different approach using a Low-code CRUD (Create, Read, Update, Delete) framework. Views and guidelines on how to design a Low-code CRUD framework will be detailed. The “Low-code” motivation is that the new framework will make it possible to build fast and efficient complex applications with less code. It will be up to the reader to envision a specific framework.
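
To make the low-code CRUD idea above concrete, the following Python sketch derives all four operations from a table's own column metadata, so no per-table SQL has to be written by hand. It is an illustration under stated assumptions, not the framework the authors describe: it uses Python's standard `sqlite3` module, the `Crud` class and the `customer` table are hypothetical, and the table name is assumed to come from trusted developer input.

```python
import sqlite3


class Crud:
    """Generic CRUD helper driven by table metadata instead of per-table SQL."""

    def __init__(self, conn, table):
        self.conn, self.table = conn, table
        # Discover column names from the database itself.
        # Assumption: 'table' is trusted developer input, never end-user input.
        cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT 0')
        self.columns = [col[0] for col in cursor.description]

    def _check(self, names):
        unknown = set(names) - set(self.columns)
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")

    def create(self, **values):
        self._check(values)
        cols = ", ".join(f'"{c}"' for c in values)
        marks = ", ".join("?" for _ in values)
        cur = self.conn.execute(
            f'INSERT INTO "{self.table}" ({cols}) VALUES ({marks})',
            tuple(values.values()))
        return cur.lastrowid

    def read(self, rowid):
        return self.conn.execute(
            f'SELECT * FROM "{self.table}" WHERE rowid = ?', (rowid,)).fetchone()

    def update(self, rowid, **values):
        self._check(values)
        sets = ", ".join(f'"{c}" = ?' for c in values)
        self.conn.execute(
            f'UPDATE "{self.table}" SET {sets} WHERE rowid = ?',
            (*values.values(), rowid))

    def delete(self, rowid):
        self.conn.execute(f'DELETE FROM "{self.table}" WHERE rowid = ?', (rowid,))


# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
customers = Crud(conn, "customer")
rid = customers.create(name="Ana", city="Cluj")
customers.update(rid, city="Baia Mare")
print(customers.read(rid))
customers.delete(rid)
print(customers.read(rid))
```

Column names are validated against the discovered metadata before being placed in SQL text, while values always go through `?` placeholders. A form generator could be layered on the same `columns` list.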


2020 ◽  
Author(s):  
Marius Welzel ◽  
Anja Lange ◽  
Dominik Heider ◽  
Michael Schwarz ◽  
Bernd Freisleben ◽  
...  

Abstract: Sequencing of marker genes amplified from environmental samples, known as amplicon sequencing, allows us to resolve some of the hidden diversity and elucidate evolutionary relationships and ecological processes among complex microbial communities. The analysis of large numbers of samples at high sequencing depths generated by high-throughput sequencing technologies requires efficient, flexible, and reproducible bioinformatics pipelines. Only a few existing workflows can be run in a user-friendly, scalable, and reproducible manner on different computing devices using an efficient workflow management system. We present Natrix, an open-source bioinformatics workflow for preprocessing raw amplicon sequencing data. The workflow contains all analysis steps from quality assessment, read assembly, dereplication, chimera detection, split-sample merging, and sequence representative assignment (OTUs or ASVs) to the taxonomic assignment of sequence representatives. The workflow is written using Snakemake, a workflow management engine for developing data analysis workflows. In addition, Conda is used for version control. Thus, Snakemake ensures reproducibility and Conda offers version control of the utilized programs. The encapsulation of rules and their dependencies supports hassle-free sharing of rules between workflows and easy adaptation and extension of existing workflows. Natrix is freely available on GitHub (https://github.com/MW55/Natrix).


2010 ◽  
Vol 219 (7) ◽  
pp. 072022 ◽  
Author(s):  
D C Vanderster ◽  
F Brochu ◽  
G Cowan ◽  
U Egede ◽  
J Elmsheuser ◽  
...  

2020 ◽  
Vol 4 (s1) ◽  
pp. 32-32
Author(s):  
Laura Nelle Hanson ◽  
Jennifer Weis ◽  
Sasa Andrijasevic ◽  
Sharon Elcombe ◽  
Rachel Hardtke ◽  
...  

OBJECTIVES/GOALS: A workflow management tool is essential to support consistent processes and make the next steps of the study process transparent. Prior to this tool, staff relied upon extensive training and coaching on the study process. While resources and guidelines exist, finding them takes staff additional time and leaves room for confusion and rework. Implementation of a systematic workflow management tool was identified as a critical need in order to support streamlined processes, improve transparency, support business continuity, and accelerate the study process. METHODS/STUDY POPULATION: This effort was undertaken as part of the Protocol Lifecycle Management effort to implement a comprehensive clinical trial management system for clinical research studies. Mayo Clinic has designed a workflow management tool within the Velos eResearch system. The workflow manager is dynamic and will present specific activities based on the study design and responses to data entered on the ad hoc forms. A Workflow Build group contributed to the design of the workflow in order to reflect appropriate, current operational processes. The workflow was vetted and validated with research teams. In addition to designing activities, planned dates and target timelines were established for relevant workflows to help promote transparency in the study start-up timelines and allow study staff to identify overdue activities. Study status controls were designed in the workflow to prevent study staff from inadvertently changing the status until the appropriate activities are complete. RESULTS/ANTICIPATED RESULTS: A dynamic workflow has been designed and implemented in the Velos eResearch system to support Mayo Clinic research sites. This system will be implemented on February 24, 2020 for all consenting studies.
DISCUSSION/SIGNIFICANCE OF IMPACT: The implementation of this workflow management tool is critical to help support research operations in a large, academic medical center. Benefits to implementation are expected to include improved transparency in the study status and next steps, reductions in rework due to confusion in next steps, better understanding from new staff in the appropriate study process, and improved timelines for study start-up. As we prepare for the implementation of the Velos eResearch system at Mayo Clinic, the workflow management tool has been identified in training sessions as a positive benefit.


2010 ◽  
Vol 01 (04) ◽  
pp. 419-429 ◽  
Author(s):  
A. Beck ◽  
T. Ganslandt ◽  
M. Hummel ◽  
M. Kiehntopf ◽  
U. Sax ◽  
...  

Summary Objective: In recent years, large biobanks have been established within translational research projects, mostly supported by homegrown, proprietary software solutions. No general requirements for biobanking IT infrastructures have been published yet. This paper presents an exemplary biobanking IT architecture, a requirements specification for a biorepository management tool and exemplary illustrations of three major types of requirements. Methods: We conducted a comprehensive literature review of biobanking IT solutions and established an interdisciplinary expert panel to create the requirements specification. The exemplary illustrations were derived from a requirements analysis within two university hospitals. Results: The requirements specification comprises a catalog with more than 130 detailed requirements grouped into 3 major categories and 20 subcategories. Special attention is given to multitenancy capabilities in order to support the project-specific definition of varying research and biobanking contexts; the definition of workflows to track sample processing, sample transportation and sample storage; and the automated integration of preanalytic handling and storage robots. Conclusion: IT support for biobanking projects can be based on a federated architectural framework comprising primary data sources for clinical annotations, a pseudonymization service, a clinical data warehouse with a flexible and user-friendly query interface and a biorepository management system. Flexibility and scalability of all such components are vital since large medical facilities such as university hospitals will have to support biobanking for varying monocentric and multicentric research scenarios and multiple medical clients.

