From peer-reviewed to peer-reproduced: a role for data standards, models and computational workflows in scholarly publishing

2014 ◽  
Author(s):  
Alejandra Gonzalez-Beltran ◽  
Peter Li ◽  
Jun Zhao ◽  
Maria Susana Avila-Garcia ◽  
Marco Roos ◽  
...  

Motivation: Reproducing the results from a scientific paper can be challenging due to the absence of the data and the computational tools required for their analysis. In addition, details of the procedures used to obtain the published results can be difficult to discern, because experiments are reported in natural language. The Investigation/Study/Assay (ISA), Nanopublications (NP) and Research Objects (RO) models are conceptual data modelling frameworks that can structure such information from scientific papers. Computational workflow platforms can also be used to reproduce analyses of data in a principled manner. We assessed the extent to which the ISA, NP and RO models, together with the Galaxy workflow system, can capture the experimental processes and reproduce the findings of a previously published paper reporting on the development of SOAPdenovo2, a de novo genome assembler.

Results: Executable workflows were developed using Galaxy which reproduced results consistent with the published findings. A structured representation of the information in the SOAPdenovo2 paper was produced by combining the ISA, NP and RO models. By structuring the information in the published paper using these data and scientific workflow modelling frameworks, it was possible to explicitly declare elements of experimental design, variables and findings. The models served as guides in the curation of scientific information, and this led to the identification of inconsistencies in the original published paper, allowing its authors to publish corrections in the form of an erratum.

Availability: SOAPdenovo2 scripts, data and results are available through the GigaScience Database: http://dx.doi.org/10.5524/100044; the workflows are available from GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk; and the representations using the ISA, NP and RO models are available through the SOAPdenovo2 case study website: http://isa-tools.github.io/soapdenovo2/.

Contact: [email protected] and [email protected]
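A nanopublication packages a single assertion together with its provenance and publication metadata as named RDF graphs, which is what makes a claim from a paper individually citable and machine-checkable. Purely as an illustration (the URIs, the assertion and the curator below are invented for this sketch, not taken from the SOAPdenovo2 study), a minimal nanopublication could be built in Python with rdflib like this:

```python
from rdflib import Dataset, Literal, Namespace
from rdflib.namespace import RDF, XSD

# np: is the real nanopublication schema; EX is a made-up
# namespace for this demonstration only.
NP = Namespace("http://www.nanopub.org/nschema#")
PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/soapdenovo2-demo/")

ds = Dataset()

# Head graph: links the nanopublication to its three component graphs.
head = ds.graph(EX.head)
head.add((EX.pub, RDF.type, NP.Nanopublication))
head.add((EX.pub, NP.hasAssertion, EX.assertion))
head.add((EX.pub, NP.hasProvenance, EX.provenance))
head.add((EX.pub, NP.hasPublicationInfo, EX.pubinfo))

# Assertion graph: one scientific claim (values here are invented).
assertion = ds.graph(EX.assertion)
assertion.add((EX.assembly1, EX.n50ScaffoldLength,
               Literal(1050, datatype=XSD.integer)))

# Provenance graph: how the assertion was derived.
provenance = ds.graph(EX.provenance)
provenance.add((EX.assertion, PROV.wasDerivedFrom, EX.galaxyWorkflowRun1))

# Publication info graph: metadata about the nanopublication itself.
pubinfo = ds.graph(EX.pubinfo)
pubinfo.add((EX.pub, PROV.wasAttributedTo, EX.curator))

print(ds.serialize(format="trig"))
```

The three-graph structure is the point: the assertion is separated from the evidence for it and from the record of who published it, while the ISA and RO models play complementary roles, describing the experimental design and bundling the workflow artefacts, respectively.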

2016 ◽  
Author(s):  
Andrea Manconi ◽  
Marco Moscatelli ◽  
Matteo Gnocchi ◽  
Giuliano Armano ◽  
Luciano Milanesi

Motivation: Recent advances in genome sequencing and biological data analysis technologies have led to a fast and continuous increase in biological data. The difficulty of managing the huge amounts of data now available to researchers, and the need to obtain results within a reasonable time, have led to the use of distributed and parallel computing infrastructures for their analysis. Recently, bioinformatics has been exploring new approaches based on hardware accelerators such as GPUs. From an architectural perspective, GPUs are very different from traditional CPUs: the latter are devices composed of a few cores with large cache memories, able to handle a few software threads at a time, whereas the former are equipped with hundreds of cores able to handle thousands of threads simultaneously, so that a very high level of parallelism can be reached. The use of GPUs over recent years has resulted in significant performance increases for certain applications. Although GPUs are increasingly used in bioinformatics, most laboratories do not have access to a GPU cluster or server. In this context, it is very important to provide services that make these tools usable.

Methods: A web-based platform has been implemented to enable researchers to perform their analyses on dedicated GPU-based computing resources. To this end, a GPU cluster equipped with 16 NVIDIA Tesla K20c cards has been configured. The infrastructure has been built upon the Galaxy technology [1]. Galaxy is an open, web-based scientific workflow system for data-intensive biomedical research, accessible to researchers who do not have programming experience. Galaxy provides a public server, but it does not provide support for GPU computing. By default, Galaxy is designed to run jobs on local systems; however, it can also be configured to run jobs on a cluster. The front-end Galaxy application runs on a single server, while tools are run on cluster nodes. To this end, Galaxy supports different distributed resource managers so that different clusters can be used. For our specific case, SLURM [2] represents, in our opinion, the most suitable workload manager to manage and control jobs. SLURM is a highly configurable workload and resource manager, currently used on six of the ten most powerful computers in the world, including Piz Daint, which utilizes over 5000 NVIDIA Tesla K20 GPUs.

Results: GPU-based tools [3] devised by our group for quality control of NGS data have been used to test the infrastructure. Initially, this activity required changes to the tools to optimize their parallelization on the cluster according to the adopted workload manager. Subsequently, the tools were converted into web-based services accessible through the Galaxy portal. Abstract truncated at 3,000 characters; the full version is available in the PDF file.
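As background on how jobs reach GPU nodes under SLURM, the sketch below shows one generic way to submit a GPU job from Python. It is a minimal illustration under stated assumptions, not the authors' configuration: the partition name, script and tool command are invented, and the cluster is assumed to expose its GPUs through SLURM's generic-resource (gres) mechanism.

```python
import subprocess
import tempfile

# A hypothetical batch script requesting one GPU via SLURM's
# --gres directive. Partition name, time limit and tool command
# are placeholders, not the authors' actual setup.
batch_script = """#!/bin/bash
#SBATCH --job-name=ngs-qc
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00
#SBATCH --output=ngs-qc-%j.log

# Placeholder for a GPU-based quality-control tool.
./gpu_qc_tool --input reads.fastq --output report.txt
"""

def submit(script_text: str) -> str:
    """Write the script to a temporary file and submit it with sbatch,
    returning sbatch's confirmation line ('Submitted batch job N')."""
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as fh:
        fh.write(script_text)
        path = fh.name
    result = subprocess.run(["sbatch", path], capture_output=True,
                            text=True, check=True)
    return result.stdout.strip()

if __name__ == "__main__":
    print(submit(batch_script))
```

In the platform described above, Galaxy's job-runner configuration plays this submission role, so end users never write batch scripts by hand; the gres request is what routes a tool to a GPU-equipped node.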


2014 ◽  
Vol 22 (3) ◽  
pp. 277
Author(s):  
Qiao Huijie ◽  
Lin Congtian ◽  
Wang Jiangning ◽  
Ji Liqiang

Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Shawna Spoor ◽  
Connor Wytko ◽  
Brian Soto ◽  
Ming Chen ◽  
Abdullah Almsaeed ◽  
...  

Abstract: Online biological databases housing genomic, genetic and breeding data can be constructed using the Tripal toolkit. Tripal is an open-source, internationally developed framework that implements FAIR data principles and is meant to ease the burden of constructing such websites for research communities. Use of a common, open framework improves the sustainability and manageability of such a site. Site developers can create extensions for their site and in turn share those extensions with others. One challenge that community databases often face is the need to provide tools that let their users analyze increasingly large datasets using multiple software tools strung together in a scientific workflow on complicated computational resources. The Tripal Galaxy module, a 'plug-in' for Tripal, meets this need through integration of Tripal with the Galaxy Project workflow management system. Site developers can create workflows appropriate to the needs of their community using Galaxy and then share them for execution on their Tripal sites via automatically constructed, but configurable, web forms, or via an application programming interface that powers web-based analytical applications. The Tripal Galaxy module helps reduce duplication of effort by allowing site developers to spend time constructing workflows and building their applications rather than rebuilding infrastructure for job management of multi-step applications.
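Tripal itself is a PHP framework, and integrations like this talk to Galaxy through Galaxy's REST API. Purely to illustrate the kind of call involved, here is a sketch using bioblend, Galaxy's Python API client, rather than the module's own PHP code; the server URL, API key, input file and workflow choice are all invented for the example:

```python
from bioblend.galaxy import GalaxyInstance

# Connect to a Galaxy server; URL and API key are placeholders.
gi = GalaxyInstance(url="https://galaxy.example.org",
                    key="YOUR_API_KEY")

# Create a history to hold the inputs and results of this analysis.
history = gi.histories.create_history(name="tripal-demo")

# Upload an input file; the path is a placeholder.
upload = gi.tools.upload_file("reads.fastq", history["id"])
dataset_id = upload["outputs"][0]["id"]

# Pick the first available workflow; a real integration would look
# up the specific workflow shared by the site developer.
workflow = gi.workflows.get_workflows()[0]

# Map the uploaded dataset onto the workflow's first input slot and
# invoke the workflow; Galaxy then runs the steps asynchronously.
inputs = {"0": {"src": "hda", "id": dataset_id}}
invocation = gi.workflows.invoke_workflow(workflow["id"],
                                          inputs=inputs,
                                          history_id=history["id"])
print("Invocation state:", invocation["state"])
```

Driving Galaxy through its API like this is what lets a site wrap a multi-step workflow behind a simple web form: the form gathers the inputs, the API call launches the run, and job management stays Galaxy's problem.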


2016 ◽  
Vol 4 (1) ◽  
pp. 185-186
Author(s):  
Doncho Donev

PURPOSE: This book provides step-by-step guidance on developing a sound publication strategy and on how to prepare research papers and get them published. The book is a user-friendly guide, a route map for publishing that covers many topics, ranging from abstracts and blogs, tables and trial registration, to ethical principles and conventions for writing scientific papers. Publishing the results of scientific research in the form of a scientific paper is the ultimate goal and the final stage of every scientist's research. Writing and publishing papers is never going to be an easy task. With this book as their guide, researchers will be better informed and therefore should have an easier and altogether more pleasant path to publication, with clear direction on how to choose the right journal, avoid publication delays, resolve authorship disputes, and handle many other problems associated with scientific publishing.

CONTENTS: The 188 pages of the book are distributed across five chapters in Part I and 249 alphabetically ordered entries in Part II, creating an A to Z of publication strategy. The appendices contain four sections covering further reading, organizations, guidelines and principles of good publication practice for company-sponsored medical research. The book also includes key references and useful websites within many entries where these seemed helpful. The last ten pages present an index to help users find the information of interest.

CONCLUSION: The book is intended to help all authors, young and old, novice and experienced, to plan their research and publications effectively and to prepare manuscripts for journals and other publications, increasing the likelihood that their work will be published. Providing essential information on publishing strategy and process, the book should be extremely useful to everyone who wants to publish research results.


2020 ◽  
Vol 8 (1) ◽  
pp. 418-429
Author(s):  
Caswita

[FORUM GUMEULIS: EFFORTS TO IMPROVE TEACHER COMPETENCE IN WRITING SCIENTIFIC PAPERS IN TASIKMALAYA CITY]. The purpose of this study is to describe efforts to improve teacher competency in writing scientific papers through the activities of the Gumeulis forum in the city of Tasikmalaya: training, guidance, mentoring and hands-on practice. The research method used is qualitative research of the case-study type, with data collected through interviews, observation and study of documentation. The results showed that: (1) improvement of teacher competency in the city of Tasikmalaya in writing scientific papers is carried out more effectively through teacher writing forums; (2) the development of teachers' professional competence in writing scientific papers through the Gumeulis forum activities shows an increase in teacher competency; (3) through the Gumeulis forum, members interact and learn together to write scientific papers; (4) the Gumeulis forum can create a conducive academic climate for improving teacher competency in creating scientific papers. The research concludes that learning together in a community can improve the competence of teachers in writing scientific papers, because members can discuss and learn together under the guidance of senior teachers in the community.


Author(s):  
Alan Kelly

This chapter reviews the development of the modern scientific paper, from the sixteenth century forward, and explores the ways in which scientific information has been disseminated in the past. Great scientific advances of the past are discussed in the context of how they were first published, or otherwise brought to the attention of the broader scientific community, and the modern scientific publishing sector is explored. The types and categories of scientific journals are discussed, along with an overview of current publishing trends, such as the exponential increase in the number of journals, changes in the ways in which researchers access the literature, and in particular the emergence and current state of open access journals. In addition, various ways in which journals are ranked are discussed, and key trends in such rankings over the last ten years or so are explored.


Geophysics ◽  
1958 ◽  
Vol 23 (5) ◽  
pp. 944-952
Author(s):  
Lawrence Y. Faust

Oral and written presentations of a scientific paper require dissimilar preparation. Properly planned figures carry the theme in an oral presentation from introduction through conclusions; the accompanying comments by the speaker, using the slides as notes, explain and emphasize. Planning the figure sequence and practicing the running comments aid each other in assuring an optimum organization, and an integrated delivery results. The complete and permanent disclosure in the published paper, by contrast, uses figures and tables primarily for the display of data, while the text describes the research and carries the argument. The abstract, tables, figures, and figure captions should provide a good synopsis of the paper. Complete disclosure requires clear writing, which is attained through outside criticism and thoughtful, continued revision.

