A Scientific Workflow Framework Integrated with Object Deputy Model for Data Provenance

Author(s):  
Liwei Wang ◽  
Zhiyong Peng ◽  
Min Luo ◽  
Wenhao Ji ◽  
Zeqian Huang
2009 ◽  
Vol 31 (5) ◽  
pp. 721-732 ◽  
Author(s):  
Li-Wei WANG ◽  
Ze-Qian HUANG ◽  
Min LUO ◽  
Zhi-Yong PENG

Author(s):  
Khalid Belhajjame ◽  
Paolo Missier ◽  
Carole Goble

Data provenance is key to understanding and interpreting the results of scientific experiments. This chapter introduces and characterises data provenance in scientific workflows using illustrative examples taken from real-world workflows. The characterisation takes the form of a taxonomy that is used for comparing and analysing provenance capabilities supplied by existing scientific workflow systems.


Author(s):  
Sergio Manuel Serra da Cruz ◽  
Jose Antonio Pires do Nascimento

Reproducibility is a major feature of Science. Even agronomic research of exemplary quality may have irreproducible empirical findings because of random or systematic error. The ability to reproduce agronomic experiments based on statistical data and legacy scripts is not easily achieved. We propose RFlow, a tool that aids researchers in managing, sharing, and enacting scientific experiments that encapsulate legacy R scripts. RFlow transparently captures the provenance of scripts and endows experiments with reproducibility. Unlike existing computational approaches, RFlow is non-intrusive and does not require users to change the way they work; it wraps agronomic experiments in a scientific workflow system. Our computational experiments show that the tool can collect different types of provenance metadata from real experiments and enrich agronomic data with provenance metadata. This study shows the potential of RFlow to serve as the primary integration platform for legacy R scripts, with implications for other data- and compute-intensive agronomic projects.


2012 ◽  
Vol 7 (1) ◽  
pp. 163-173 ◽  
Author(s):  
Arif Shaon ◽  
Sarah Callaghan ◽  
Bryan Lawrence ◽  
Brian Matthews ◽  
Timothy Osborn ◽  
...  

Traditionally, the formal scientific output in most fields of natural science has been limited to peer-reviewed academic journal publications, with less attention paid to the chain of intermediate data results and their associated metadata, including provenance. In effect, this has constrained the representation and verification of the data provenance to the confines of the related publications. Detailed knowledge of a dataset’s provenance is essential to establish the pedigree of the data for its effective re-use, and to avoid redundant re-enactment of the experiment or computation involved. For open-access data, it is increasingly important to determine authenticity and quality, especially considering the growing volumes of datasets appearing in the public domain. To address these issues, we present an approach that combines the Digital Object Identifier (DOI), a widely adopted citation technique, with existing, widely adopted climate science data standards to formally publish detailed provenance of a climate research dataset as an associated scientific workflow. This is integrated with linked-data compliant data re-use standards (e.g. OAI-ORE) to enable a seamless link between a publication and the complete trail of lineage of the corresponding dataset, including the dataset itself.


Author(s):  
Todd Elsethagen ◽  
Eric Stephan ◽  
Bibi Raju ◽  
Malachi Schram ◽  
Matt MacDuff ◽  
...  

2014 ◽  
Vol 22 (3) ◽  
pp. 277
Author(s):  
Qiao Huijie ◽  
Lin Congtian ◽  
Wang Jiangning ◽  
Ji Liqiang
