From peer-reviewed to peer-reproduced: a role for data standards, models and computational workflows in scholarly publishing

2014 ◽  
Author(s):  
Alejandra Gonzalez-Beltran ◽  
Peter Li ◽  
Jun Zhao ◽  
Maria Susana Avila-Garcia ◽  
Marco Roos ◽  
...  

Motivation: Reproducing the results from a scientific paper can be challenging due to the absence of the data and the computational tools required for their analysis. In addition, details of the procedures used to obtain the published results can be difficult to discern, because experiments are reported in natural language. The Investigation/Study/Assay (ISA), Nanopublications (NP) and Research Objects (RO) models are conceptual data modelling frameworks that can structure such information from scientific papers. Computational workflow platforms can also be used to reproduce analyses of data in a principled manner. We assessed the extent to which the ISA, NP and RO models, together with the Galaxy workflow system, can capture the experimental processes and reproduce the findings of a previously published paper reporting on the development of SOAPdenovo2, a de novo genome assembler.

Results: Executable workflows were developed using Galaxy which reproduced results consistent with the published findings. A structured representation of the information in the SOAPdenovo2 paper was produced by combining the ISA, NP and RO models. By structuring the information in the published paper using these data and scientific workflow modelling frameworks, it was possible to explicitly declare elements of experimental design, variables and findings. The models served as guides in the curation of scientific information, and this led to the identification of inconsistencies in the original published paper, allowing its authors to publish corrections in the form of an erratum.

Availability: SOAPdenovo2 scripts, data and results are available through the GigaScience Database: http://dx.doi.org/10.5524/100044; the workflows are available from GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk; and the representations using the ISA, NP and RO models are available through the SOAPdenovo2 case study website: http://isa-tools.github.io/soapdenovo2/.

Contact: [email protected] and [email protected]
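A nanopublication packages a single assertion together with its provenance and publication metadata as named RDF graphs, which is what makes a claim from a paper individually citable and machine-checkable. Purely as an illustration (the URIs, the assertion and the curator below are invented for this sketch, not taken from the SOAPdenovo2 study), a minimal nanopublication could be built in Python with rdflib like this:

```python
from rdflib import Dataset, Literal, Namespace
from rdflib.namespace import RDF, XSD

# np: is the real nanopublication schema; EX is a made-up
# namespace for this demonstration only.
NP = Namespace("http://www.nanopub.org/nschema#")
PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/soapdenovo2-demo/")

ds = Dataset()

# Head graph: links the nanopublication to its three component graphs.
head = ds.graph(EX.head)
head.add((EX.pub, RDF.type, NP.Nanopublication))
head.add((EX.pub, NP.hasAssertion, EX.assertion))
head.add((EX.pub, NP.hasProvenance, EX.provenance))
head.add((EX.pub, NP.hasPublicationInfo, EX.pubinfo))

# Assertion graph: one scientific claim (values here are invented).
assertion = ds.graph(EX.assertion)
assertion.add((EX.assembly1, EX.n50ScaffoldLength,
               Literal(1050, datatype=XSD.integer)))

# Provenance graph: how the assertion was derived.
provenance = ds.graph(EX.provenance)
provenance.add((EX.assertion, PROV.wasDerivedFrom, EX.galaxyWorkflowRun1))

# Publication info graph: metadata about the nanopublication itself.
pubinfo = ds.graph(EX.pubinfo)
pubinfo.add((EX.pub, PROV.wasAttributedTo, EX.curator))

print(ds.serialize(format="trig"))
```

The three-graph structure is the point: the assertion is separated from the evidence for it and from the record of who published it, while the ISA and RO models play complementary roles, describing the experimental design and bundling the workflow artefacts, respectively.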

2016 ◽  
Author(s):  
Andrea Manconi ◽  
Marco Moscatelli ◽  
Matteo Gnocchi ◽  
Giuliano Armano ◽  
Luciano Milanesi

Motivation: Recent advances in genome sequencing and biological data analysis technologies have led to a fast and continuous increase in biological data. The difficulty of managing the huge amounts of data now available to researchers, and the need to obtain results within a reasonable time, have led to the use of distributed and parallel computing infrastructures for their analysis. Recently, bioinformatics has been exploring new approaches based on hardware accelerators such as GPUs. From an architectural perspective, GPUs are very different from traditional CPUs: the latter are devices composed of a few cores with large cache memories, able to handle a few software threads at a time, whereas the former are equipped with hundreds of cores able to handle thousands of threads simultaneously, so that a very high level of parallelism can be reached. The use of GPUs over recent years has resulted in significant performance increases for certain applications. Although GPUs are increasingly used in bioinformatics, most laboratories do not have access to a GPU cluster or server. In this context, it is very important to provide services that make these tools usable.

Methods: A web-based platform has been implemented to enable researchers to perform their analyses on dedicated GPU-based computing resources. To this end, a GPU cluster equipped with 16 NVIDIA Tesla K20c cards has been configured. The infrastructure has been built upon the Galaxy technology [1]. Galaxy is an open, web-based scientific workflow system for data-intensive biomedical research, accessible to researchers who do not have programming experience. Galaxy provides a public server, but it does not provide support for GPU computing. By default, Galaxy is designed to run jobs on local systems; however, it can also be configured to run jobs on a cluster. The front-end Galaxy application runs on a single server, while tools are run on cluster nodes. To this end, Galaxy supports different distributed resource managers so that different clusters can be used. For our specific case, SLURM [2] represents, in our opinion, the most suitable workload manager to manage and control jobs. SLURM is a highly configurable workload and resource manager, currently used on six of the ten most powerful computers in the world, including Piz Daint, which utilizes over 5000 NVIDIA Tesla K20 GPUs.

Results: GPU-based tools [3] devised by our group for quality control of NGS data have been used to test the infrastructure. Initially, this activity required changes to the tools to optimize their parallelization on the cluster according to the adopted workload manager. Subsequently, the tools were converted into web-based services accessible through the Galaxy portal. Abstract truncated at 3,000 characters; the full version is available in the PDF file.
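As background on how jobs reach GPU nodes under SLURM, the sketch below shows one generic way to submit a GPU job from Python. It is a minimal illustration under stated assumptions, not the authors' configuration: the partition name, script and tool command are invented, and the cluster is assumed to expose its GPUs through SLURM's generic-resource (gres) mechanism.

```python
import subprocess
import tempfile

# A hypothetical batch script requesting one GPU via SLURM's
# --gres directive. Partition name, time limit and tool command
# are placeholders, not the authors' actual setup.
batch_script = """#!/bin/bash
#SBATCH --job-name=ngs-qc
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00
#SBATCH --output=ngs-qc-%j.log

# Placeholder for a GPU-based quality-control tool.
./gpu_qc_tool --input reads.fastq --output report.txt
"""

def submit(script_text: str) -> str:
    """Write the script to a temporary file and submit it with sbatch,
    returning sbatch's confirmation line ('Submitted batch job N')."""
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as fh:
        fh.write(script_text)
        path = fh.name
    result = subprocess.run(["sbatch", path], capture_output=True,
                            text=True, check=True)
    return result.stdout.strip()

if __name__ == "__main__":
    print(submit(batch_script))
```

In the platform described above, Galaxy's job-runner configuration plays this submission role, so end users never write batch scripts by hand; the gres request is what routes a tool to a GPU-equipped node.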


2014 ◽  
Vol 22 (3) ◽  
pp. 277
Author(s):  
Qiao Huijie ◽  
Lin Congtian ◽  
Wang Jiangning ◽  
Ji Liqiang

Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Shawna Spoor ◽  
Connor Wytko ◽  
Brian Soto ◽  
Ming Chen ◽  
Abdullah Almsaeed ◽  
...  

Abstract: Online biological databases housing genomic, genetic and breeding data can be constructed using the Tripal toolkit. Tripal is an open-source, internationally developed framework that implements FAIR data principles and is meant to ease the burden of constructing such websites for research communities. Use of a common, open framework improves the sustainability and manageability of such a site. Site developers can create extensions for their site and in turn share those extensions with others. One challenge that community databases often face is the need to provide tools that let their users analyze increasingly large datasets using multiple software tools strung together in a scientific workflow on complicated computational resources. The Tripal Galaxy module, a 'plug-in' for Tripal, meets this need through integration of Tripal with the Galaxy Project workflow management system. Site developers can create workflows appropriate to the needs of their community using Galaxy and then share them for execution on their Tripal sites via automatically constructed, but configurable, web forms, or via an application programming interface that powers web-based analytical applications. The Tripal Galaxy module helps reduce duplication of effort by allowing site developers to spend time constructing workflows and building their applications rather than rebuilding infrastructure for job management of multi-step applications.
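Tripal itself is a PHP framework, and integrations like this talk to Galaxy through Galaxy's REST API. Purely to illustrate the kind of call involved, here is a sketch using bioblend, Galaxy's Python API client, rather than the module's own PHP code; the server URL, API key, input file and workflow choice are all invented for the example:

```python
from bioblend.galaxy import GalaxyInstance

# Connect to a Galaxy server; URL and API key are placeholders.
gi = GalaxyInstance(url="https://galaxy.example.org",
                    key="YOUR_API_KEY")

# Create a history to hold the inputs and results of this analysis.
history = gi.histories.create_history(name="tripal-demo")

# Upload an input file; the path is a placeholder.
upload = gi.tools.upload_file("reads.fastq", history["id"])
dataset_id = upload["outputs"][0]["id"]

# Pick the first available workflow; a real integration would look
# up the specific workflow shared by the site developer.
workflow = gi.workflows.get_workflows()[0]

# Map the uploaded dataset onto the workflow's first input slot and
# invoke the workflow; Galaxy then runs the steps asynchronously.
inputs = {"0": {"src": "hda", "id": dataset_id}}
invocation = gi.workflows.invoke_workflow(workflow["id"],
                                          inputs=inputs,
                                          history_id=history["id"])
print("Invocation state:", invocation["state"])
```

Driving Galaxy through its API like this is what lets a site wrap a multi-step workflow behind a simple web form: the form gathers the inputs, the API call launches the run, and job management stays Galaxy's problem.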


2016 ◽  
Vol 4 (1) ◽  
pp. 185-186
Author(s):  
Doncho Donev

PURPOSE: This book provides step-by-step guidance on developing a sound publication strategy and on how to prepare research papers and get them published. The book is a user-friendly guide, a route map for publishing that covers many topics, ranging from abstracts and blogs, tables and trial registration, to ethical principles and conventions for writing scientific papers. Publishing the results of scientific research in the form of a scientific paper is the ultimate goal and the final stage of every scientist's research. Writing and publishing papers is never going to be an easy task. With this book as their guide, researchers will be better informed and therefore should have an easier and altogether more pleasant path to publication, with clear direction on how to choose the right journal, avoid publication delays, resolve authorship disputes, and handle many other problems associated with scientific publishing.

CONTENTS: The 188 pages of the book are distributed across five chapters in Part I and 249 alphabetically ordered entries in Part II, creating an A to Z of publication strategy. The appendices contain four sections covering further reading, organizations, guidelines and principles of good publication practice for company-sponsored medical research. The book also includes key references and useful websites within many entries where these seemed helpful. The last ten pages present an index to help users find the information of interest.

CONCLUSION: The book is intended to help all authors, young and old, novice and experienced, to plan their research and publications effectively and to prepare manuscripts for journals and other publications, increasing the likelihood that their work will be published. Providing essential information on publishing strategy and process, the book should be extremely useful to everyone who wants to publish research results.


2020 ◽  
Vol 8 (1) ◽  
pp. 418-429
Author(s):  
Caswita

[FORUM GUMEULIS: EFFORTS TO IMPROVE TEACHER COMPETENCE IN WRITING SCIENTIFIC PAPERS IN TASIKMALAYA CITY]. The purpose of this study is to describe efforts to improve teacher competency in writing scientific papers through the activities of the Gumeulis forum in the city of Tasikmalaya: training, guidance, mentoring and hands-on practice. The research method used is qualitative research of the case-study type, with data collected through interviews, observation and study of documentation. The results showed that: (1) improvement of teacher competency in the city of Tasikmalaya in writing scientific papers is carried out more effectively through teacher writing forums; (2) the development of teachers' professional competence in writing scientific papers through the Gumeulis forum activities shows an increase in teacher competency; (3) through the Gumeulis forum, members interact and learn together to write scientific papers; (4) the Gumeulis forum can create a conducive academic climate for improving teacher competency in creating scientific papers. The research concludes that learning together in a community can improve the competence of teachers in writing scientific papers, because members can discuss and learn together under the guidance of senior teachers in the community.


Author(s):  
Alan Kelly

This chapter reviews the development of the modern scientific paper, from the sixteenth century forward, and explores the ways in which scientific information has been disseminated in the past. Great scientific advances of the past are discussed in the context of how they were first published, or otherwise brought to the attention of the broader scientific community, and the modern scientific publishing sector is explored. The types and categories of scientific journals are discussed, along with an overview of current publishing trends, such as the exponential increase in the number of journals, changes in the ways in which researchers access the literature, and in particular the emergence and current state of open access journals. In addition, various ways in which journals are ranked are discussed, and key trends in such rankings over the last ten years or so are explored.


Geophysics ◽  
1958 ◽  
Vol 23 (5) ◽  
pp. 944-952
Author(s):  
Lawrence Y. Faust

Oral and written presentations of a scientific paper require dissimilar preparation. Properly planned figures carry the theme in an oral presentation from introduction through conclusions; the accompanying comments by the speaker, using the slides as notes, explain and emphasize. Planning the figure sequence and practicing the running comments aid each other in assuring an optimum organization, and an integrated delivery results. The complete and permanent disclosure in the published paper, by contrast, uses figures and tables primarily for the display of data, while the text describes the research and carries the argument. The abstract, tables, figures, and figure captions should provide a good synopsis of the paper. Complete disclosure requires clear writing, which is attained through outside criticism and thoughtful, continued revision.

