Arteria: An automation system for a sequencing core facility

2017 ◽  
Author(s):  
Johan Dahlberg ◽  
Johan Hermansson ◽  
Steinar Sturlaugsson ◽  
Pontus Larsson

Abstract: Arteria is an automation system aimed at sequencing core facilities. It is built on existing open source technologies, with a modular design allowing for a community-driven effort to create plug-and-play micro-services. Herein we describe the Arteria system and elaborate on the underlying conceptual framework. The Arteria system breaks down into three conceptual levels: orchestration, process, and execution. At the orchestration level it uses an event-based model of automation. It models processes, e.g. the steps involved in processing sequencing data, as workflows and executes these in a micro-service-based environment. This creates a system that is both flexible and scalable. The Arteria Project code is available as open source software at http://www.github.com/arteria-project.

GigaScience ◽  
2019 ◽  
Vol 8 (12) ◽  
Author(s):  
Johan Dahlberg ◽  
Johan Hermansson ◽  
Steinar Sturlaugsson ◽  
Mariya Lysenkova ◽  
Patrik Smeds ◽  
...  

Abstract

Background: In recent years, nucleotide sequencing has become increasingly instrumental in both research and clinical settings. This has led to an explosive growth in sequencing data produced worldwide. As the amount of data increases, so does the need for automated solutions for data processing and analysis. The concept of workflows has gained favour in the bioinformatics community, but there is little in the scientific literature describing end-to-end automation systems. Arteria is an automation system that aims at providing a solution to the data-related operational challenges that face sequencing core facilities.

Findings: Arteria is built on existing open source technologies, with a modular design allowing for a community-driven effort to create plug-and-play micro-services. In this article we describe the system, elaborate on the underlying conceptual framework, and present an example implementation. Arteria can be reduced to 3 conceptual levels: orchestration (using an event-based model of automation), process (the steps involved in processing sequencing data, modelled as workflows), and execution (using a series of RESTful micro-services). This creates a system that is both flexible and scalable. Arteria-based systems have been successfully deployed at 3 sequencing core facilities. The Arteria Project code, written largely in Python, is available as open source software, and more information can be found at https://arteria-project.github.io/.

Conclusions: We describe the Arteria system and the underlying conceptual framework, demonstrating how this model can be used to automate data handling and analysis in the context of a sequencing core facility.
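
The three conceptual levels named in the abstract can be illustrated with a minimal Python sketch. This is not Arteria's actual code or API: every class, function, and event name below is hypothetical, and the "services" are plain in-process functions standing in for the RESTful micro-services so that the example runs without a network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Execution level: each callable stands in for one RESTful micro-service.
# A real deployment would make HTTP requests here; these are hypothetical
# local functions used only for illustration.
Service = Callable[[str], str]


def demultiplex(runfolder: str) -> str:
    return f"demultiplexed {runfolder}"


def quality_check(runfolder: str) -> str:
    return f"checked {runfolder}"


# Process level: a workflow is modelled as an ordered list of step names.
@dataclass
class Workflow:
    name: str
    steps: List[str]

    def run(self, services: Dict[str, Service], runfolder: str) -> List[str]:
        # Execute each step in order by delegating to the named service.
        return [services[step](runfolder) for step in self.steps]


# Orchestration level: incoming events are matched to workflows by rules.
@dataclass
class Orchestrator:
    services: Dict[str, Service]
    rules: Dict[str, Workflow] = field(default_factory=dict)

    def on(self, event_type: str, workflow: Workflow) -> None:
        # Register which workflow should start when an event type arrives.
        self.rules[event_type] = workflow

    def emit(self, event_type: str, runfolder: str) -> List[str]:
        # Events without a matching rule are ignored.
        workflow = self.rules.get(event_type)
        if workflow is None:
            return []
        return workflow.run(self.services, runfolder)


if __name__ == "__main__":
    orchestrator = Orchestrator(
        services={"demultiplex": demultiplex, "quality_check": quality_check}
    )
    orchestrator.on(
        "runfolder_ready", Workflow("process_run", ["demultiplex", "quality_check"])
    )
    for result in orchestrator.emit("runfolder_ready", "run_001"):
        print(result)
```

Keeping the three levels in separate objects means a service can be swapped or a workflow reordered without touching the event rules, which is one plausible reading of the flexibility the abstract claims.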


Author(s):  
James A. Cowling ◽  
Christopher V. Morgan ◽  
Robert Cloutier

The systems engineering discipline has made great strides in developing a manageable approach to system development. This is predicated on thoroughly articulating the stakeholder requirements. However, in some engineering environments, requirements are changing faster than they can be captured and realized, making this 'traditional' form of systems engineering less tenable. An iterative system refinement approach, characterized by open systems development, may be a more appropriate and timely response for fast-changing needs. The open systems development approach has been utilized in a number of domains including open source software, Wikipedia®, and open innovation in manufacturing. However, open systems development appears difficult to recreate successfully, and while domain tradecraft advice is often available, no engineering management methodology has emerged to improve the likelihood of success. The authors discuss the essential features of openness in these three domains and use them to propose a conceptual framework for the further exploration of the effect of governance in determining success in such open endeavors. It is the authors' hope that further research to apply this conceptual framework to open source software projects may reveal some rudimentary elements of a management methodology for environments where requirements are highly uncertain or volatile, or where 'traditional' systems engineering is otherwise sub-optimal.


Author(s):  
Donald Wynn Jr.

This study examines the concept of an ecosystem as originated in the field of ecology and applied to open source software projects. Additionally, a framework for assessing the three dimensions of ecosystem health is defined and explained using examples from a specific open source ecosystem. The conceptual framework is explained in the context of a case study for a sponsored open source ecosystem. The framework and case study highlight a number of characteristics and aspects of these ecosystems which can be evaluated by existing and potential members to gauge the health and sustainability of open source projects and the products and services they produce.


2021 ◽  
Author(s):  
Luca De Sabato ◽  
Gabriele Vaccari ◽  
Arnold Knijn ◽  
Giovanni Ianiro ◽  
Ilaria Di Bartolo ◽  
...  

Abstract

Background: Since its first appearance in December 2019, the novel Severe Acute Respiratory Syndrome Coronavirus type 2 (SARS-CoV-2) has spread worldwide, causing an increasing number of cases and deaths (35,537,491 and 1,042,798, respectively, at the time of writing; https://covid19.who.int). Similarly, the number of complete viral genome sequences produced by Next Generation Sequencing (NGS) has increased exponentially. NGS enables the rapid accumulation of a large number of sequences. However, bioinformatics analyses are critical and require combined approaches for data analysis, which can be challenging for non-bioinformaticians.

Results: A user-friendly and sequencing platform-independent bioinformatics pipeline, named SARS-CoV-2 RECoVERY (REconstruction of CoronaVirus gEnomes & Rapid analYsis), has been developed to build complete SARS-CoV-2 genomes from raw sequencing reads and to investigate variants. The genomes built by SARS-CoV-2 RECoVERY were compared with those obtained using other available software, and SARS-CoV-2 RECoVERY showed comparable or better performance. Depending on the number of reads, complete genome reconstruction and variant analysis can be achieved in less than one hour. The pipeline was implemented in the multi-usage open-source Galaxy platform, allowing easy access to the software and providing computational and storage resources to the community.

Conclusions: SARS-CoV-2 RECoVERY is a piece of software intended for the scientific community working on SARS-CoV-2 phylogeny and molecular characterisation, providing a performant tool for complete reconstruction and variant analysis of the viral genome. Additionally, the simple software interface and the ability to use it through a Galaxy instance, without the need to set up computing and storage infrastructure, make SARS-CoV-2 RECoVERY a resource also for virologists with little or no bioinformatics skills.

Availability and implementation: The pipeline SARS-CoV-2 RECoVERY (REconstruction of COronaVirus gEnomes & Rapid analYsis) is implemented in the Galaxy instance ARIES (https://aries.iss.it).


2019 ◽  
Author(s):  
Ayman Yousif ◽  
Nizar Drou ◽  
Jillian Rowe ◽  
Mohammed Khalfan ◽  
Kristin C Gunsalus

Abstract

Background: As high-throughput sequencing applications continue to evolve, the rapid growth in quantity and variety of sequence-based data calls for the development of new software libraries and tools for data analysis and visualization. Often, effective use of these tools requires computational skills beyond those of many researchers. To ease this computational barrier, we have created a dynamic web-based platform, NASQAR (Nucleic Acid SeQuence Analysis Resource).

Results: NASQAR offers a collection of custom and publicly available open-source web applications that make extensive use of a variety of R packages to provide interactive data analysis and visualization. The platform is publicly accessible at http://nasqar.abudhabi.nyu.edu/. Open-source code is on GitHub at https://github.com/nasqar/NASQAR, and the system is also available as a Docker image at https://hub.docker.com/r/aymanm/nasqarall. NASQAR is a collaboration between the core bioinformatics teams of the NYU Abu Dhabi and NYU New York Centers for Genomics and Systems Biology.

Conclusions: NASQAR empowers non-programming experts with a versatile and intuitive toolbox to easily and efficiently explore, analyze, and visualize their Transcriptomics data interactively. Popular tools for a variety of applications are currently available, including Transcriptome Data Preprocessing, RNA-seq Analysis (including Single-cell RNA-seq), Metagenomics, and Gene Enrichment.


2011 ◽  
Vol 2011 ◽  
pp. 1-9 ◽  
Author(s):  
Robert Oostenveld ◽  
Pascal Fries ◽  
Eric Maris ◽  
Jan-Mathijs Schoffelen

This paper describes FieldTrip, an open source software package that we developed for the analysis of MEG, EEG, and other electrophysiological data. The software is implemented as a MATLAB toolbox and includes a complete set of consistent and user-friendly high-level functions that allow experimental neuroscientists to analyze experimental data. It includes algorithms for simple and advanced analysis, such as time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. Its implementation as a toolbox allows the user to perform elaborate and structured analyses of large data sets using the MATLAB command line and batch scripting. Furthermore, users and developers can easily extend the functionality and implement new algorithms. The modular design facilitates reuse in other software packages.


2020 ◽  
Vol 71 (1) ◽  
pp. 43-48
Author(s):  
Bettina Gierke

Abstract: As part of the DFG funding programme Fachinformationsdienste (FID, Specialised Information Services), the FID Buch-, Bibliotheks- und Informationswissenschaft (FID BBI; book, library, and information science), a cooperation between the Herzog August Bibliothek Wolfenbüttel and the Universitätsbibliothek Leipzig, began its work in October 2017. Its aim is to ensure top-level provision of literature for researchers in these and neighbouring disciplines. To this end, the FID BBI has developed a discovery tool based on the open source software VuFind. One challenge for the FID BBI is processing a wide variety of data sources, because its subject areas are very broad. The portal offers a quick entry point for searching, but more complex search queries are also possible. Contact with the scholarly community that the FID BBI serves has high priority in order to meet the goals set by the Deutsche Forschungsgemeinschaft (German Research Foundation). Initial contact can be made via the discovery portal: https://katalog.fid-bbi.de.

