Virtual Laboratories for Biodiversity Modelling: An Australian perspective

Author(s):  
Sarah Richmond ◽  
Chantal Huijbers

Recent technologies have enabled consistent and continuous collection of ecological data at high resolution across large spatial scales. The challenge remains, however, to bring these data together and expose them to the methods and tools needed to analyse the interaction between biodiversity and the environment. These challenges are mostly associated with the accessibility, visibility and interoperability of data, and with the technical computation needed to interpret the data. Australia has invested in digital research infrastructures through the National Collaborative Research Infrastructure Strategy (NCRIS). Here we present two platforms that provide easy access to global biodiversity, climate and environmental datasets, integrated with a suite of analytical tools and linked to high-performance cloud computing infrastructure. The Biodiversity and Climate Change Virtual Laboratory (BCCVL) is a point-and-click online platform for modelling species responses to environmental conditions, which provides an easy introduction to the scientific concepts of modelling without requiring the user to understand the underlying code. For ecologists who write their own modelling scripts, we have developed ecocloud: a new online environment that provides access to data connected with analysis tools such as RStudio and Jupyter Notebooks, as well as a virtual desktop environment, using Australia's national cloud computing infrastructure. ecocloud is built through collaborations among key facilities within the ecosciences domain, establishing a collective long-term vision of an ecosystem of infrastructure that enables reliable prediction of future environmental outcomes.
Underpinning these tools is an innovative training program, ecoEd, which provides cohesive training and skill development to enhance the translation of Australia's digital research infrastructures to the ecoscience community by educating and upskilling the next generation of environmental scientists and managers. Both platforms are built using a best-practice microservice model that allows for complete flexibility, scalability and stability in a cloud environment. Both the BCCVL and ecocloud are open-source developments and provide opportunities for interoperability with other platforms (e.g. the Atlas of Living Australia). In Australia, the same technical infrastructure is also used for a platform in the humanities and social science domain, indicating that the underlying technologies are not domain specific. We therefore welcome collaborations with other organisations to further develop these platforms for the wider bio- and ecoinformatics community. This presentation will showcase the tools, services, and underpinning infrastructure alongside our training and engagement framework as an exemplar in building platforms for next-generation biodiversity science.
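The point-and-click workflows in the BCCVL wrap species distribution modelling algorithms such as climate-envelope (BIOCLIM-style) methods. As a rough illustration of what such a model does under the hood (a minimal sketch on invented occurrence data, not the BCCVL's actual implementation), an envelope model simply learns the range of each environmental variable at known occurrences and predicts a site suitable when every variable falls inside those ranges:

```python
# Minimal climate-envelope ("BIOCLIM-style") species distribution model.
# Each occurrence record carries environmental values (here: annual mean
# temperature in degC, annual precipitation in mm); a site is predicted
# suitable when every variable lies within the range observed at the
# occurrences. All values below are invented for illustration.

def fit_envelope(occurrences):
    """Return per-variable (min, max) bounds from occurrence records."""
    n_vars = len(occurrences[0])
    return [(min(rec[i] for rec in occurrences),
             max(rec[i] for rec in occurrences)) for i in range(n_vars)]

def predict(envelope, site):
    """True if each environmental value falls inside the fitted bounds."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(site, envelope))

# (temperature, precipitation) at known occurrence sites
occ = [(18.2, 900.0), (21.5, 1100.0), (19.8, 950.0)]
env = fit_envelope(occ)

print(predict(env, (20.0, 1000.0)))  # inside the envelope -> True
print(predict(env, (25.0, 400.0)))   # too hot and too dry -> False
```

Real platforms add projection across gridded climate layers and more nuanced algorithms (GLMs, Maxent, and the like), but the fit/predict split shown here is the shape a point-and-click interface hides from the user.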

2015 ◽  
pp. 566-579
Author(s):  
Keyun Ruan

Cloud computing is a major transition, and it comes at a unique historical and strategic time for applying foundational design thinking to secure the next-generation computing infrastructure and enable waves of business and technological innovation. In this chapter, the researcher summarizes six key research and development areas for designing a forensic-enabling cloud ecosystem, including architecture and matrix, standardization and strategy, evidence segregation, security and forensic integration, legal framework, and privacy.


2014 ◽  
Vol 17 (1) ◽  
pp. 139-152 ◽  
Author(s):  
Raffaele Montella ◽  
Giulio Giunta ◽  
Giuliano Laccetti

2017 ◽  
Author(s):  
Wendy Sharples ◽  
Ilya Zhukov ◽  
Markus Geimer ◽  
Klaus Goergen ◽  
Stefan Kollet ◽  
...  

Abstract. Geoscientific modeling is constantly evolving, with next-generation geoscientific models and applications placing high demands on high-performance computing (HPC) resources. These demands are being met by new developments in HPC architectures, software libraries, and infrastructures. New HPC developments require new programming paradigms, leading to substantial investment in model porting, tuning, and refactoring of complicated legacy code in order to use these resources effectively. In addition to the challenge of new massively parallel HPC systems, reproducibility of simulation and analysis results is of great concern, as next-generation geoscientific models are based on complex model implementations and profiling, modeling, and data processing workflows. Thus, in order to reduce both the duration and the cost of code migration, and to aid in the development of new models or model components while ensuring reproducibility and sustainability over the complete data life cycle, a streamlined approach to profiling, porting, and provenance tracking is necessary. We propose a run control framework (RCF) integrated with a workflow engine which encompasses all stages of the modeling chain: (1) preprocessing of input, (2) compilation of code (including code instrumentation with performance analysis tools), (3) simulation run, and (4) postprocessing and analysis, to address these issues. Within this RCF, the workflow engine is used to create and manage benchmark or simulation parameter combinations and performs the documentation and data organization for reproducibility. This approach automates the process of porting and tuning, profiling, testing, and running a geoscientific model. We show that in using our run control framework, testing, benchmarking, profiling, and running models is less time consuming and more robust, resulting in more efficient use of HPC resources, more strategic code development, and enhanced data integrity and reproducibility.


2015 ◽  
Author(s):  
Pierre Carrier ◽  
Bill Long ◽  
Richard Walsh ◽  
Jef Dawson ◽  
Carlos P. Sosa ◽  
...  

High Performance Computing (HPC) best practice offers opportunities to implement lessons learned in areas such as computational chemistry and physics in genomics workflows, specifically Next-Generation Sequencing (NGS) workflows. In this study we briefly describe how distributed-memory parallelism can be an important enhancement to the performance and resource utilization of NGS workflows. We illustrate this point with results on the parallelization of the Inchworm module of the Trinity RNA-Seq pipeline for de novo transcriptome assembly. We show that these types of applications can scale to thousands of cores. Time scaling as well as memory scaling are discussed at length using two RNA-Seq datasets, targeting Mus musculus (mouse) and the axolotl (Mexican salamander). Details of the efficient MPI communication and its impact on performance are also shown. We hope to demonstrate that this type of parallelization approach can be extended to most types of bioinformatics workflows, with substantial benefits. The efficient, distributed-memory parallel implementation eliminates memory bottlenecks and dramatically accelerates NGS analysis. We further include a summary of programming paradigms available to the bioinformatics community, such as C++/MPI.
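The paper's parallel Inchworm is written in C++/MPI. The sketch below illustrates only the general owner-computes idea behind distributed k-mer counting: each rank scans the reads but stores only the k-mers whose hash maps to it, so the global table is partitioned across ranks with no duplication and no single-node memory bottleneck. Ranks are simulated sequentially in Python for clarity; the function names and reads are invented, and a real run would use MPI processes exchanging k-mers via collective communication.

```python
# Owner-computes partitioning for distributed k-mer counting (sketch).
# Rank r keeps only the k-mers with hash(kmer) % n_ranks == r, so the
# union of the per-rank tables equals the serial count with no overlap.

from collections import Counter

def kmers(read, k):
    return (read[i:i + k] for i in range(len(read) - k + 1))

def count_partitioned(reads, k, n_ranks):
    """Per-rank k-mer tables; rank r owns k-mers hashing to r."""
    tables = [Counter() for _ in range(n_ranks)]
    for read in reads:
        for km in kmers(read, k):
            tables[hash(km) % n_ranks][km] += 1
    return tables

reads = ["ACGTAC", "GTACGT"]
tables = count_partitioned(reads, k=3, n_ranks=4)

# Merging the per-rank tables reproduces the serial count exactly.
merged = Counter()
for t in tables:
    merged.update(t)
serial = Counter(km for r in reads for km in kmers(r, 3))
print(merged == serial)  # True
```

Because ownership is decided by a hash of the k-mer itself, every rank independently agrees on who owns what, which is what lets an MPI implementation route each k-mer to its owner without any central coordinator.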


Author(s):  
Wolfgang Gentzsch ◽  
Burak Yenier

The adoption of cloud computing for engineering and scientific applications still lags behind, although many cloud providers today offer powerful computing infrastructure as a service, and enterprises already make routine use of it. Reasons for this slow adoption are many: complex access to clouds, inflexible software licensing, time-consuming big data transfer, loss of control over assets, and service provider lock-in, to name a few. But recently, with the advent of the UberCloud's novel high-performance software container technology, many of these roadblocks are being removed. In this paper the authors describe the current status and landscape of clouds for engineers and scientists, the benefits and challenges, and how UberCloud provides an online solution platform and container technology which reduce or even remove many of the current roadblocks, and thus offer every engineer and scientist additional compute power on demand, in an easily accessible way.


2018 ◽  
Vol 11 (7) ◽  
pp. 2875-2895
Author(s):  
Wendy Sharples ◽  
Ilya Zhukov ◽  
Markus Geimer ◽  
Klaus Goergen ◽  
Sebastian Luehrs ◽  
...  

Abstract. Geoscientific modeling is constantly evolving, with next-generation geoscientific models and applications placing large demands on high-performance computing (HPC) resources. These demands are being met by new developments in HPC architectures, software libraries, and infrastructures. In addition to the challenge of new massively parallel HPC systems, reproducibility of simulation and analysis results is of great concern. This is due to the fact that next-generation geoscientific models are based on complex model implementations and profiling, modeling, and data processing workflows. Thus, in order to reduce both the duration and the cost of code migration, aid in the development of new models or model components, while ensuring reproducibility and sustainability over the complete data life cycle, an automated approach to profiling, porting, and provenance tracking is necessary. We propose a run control framework (RCF) integrated with a workflow engine as a best practice approach to automate profiling, porting, provenance tracking, and simulation runs. Our RCF encompasses all stages of the modeling chain: (1) preprocess input, (2) compilation of code (including code instrumentation with performance analysis tools), (3) simulation run, and (4) postprocessing and analysis, to address these issues. Within this RCF, the workflow engine is used to create and manage benchmark or simulation parameter combinations and performs the documentation and data organization for reproducibility. In this study, we outline this approach and highlight the subsequent developments scheduled for implementation born out of the extensive profiling of ParFlow. 
We show that in using our run control framework, testing, benchmarking, profiling, and running models is less time consuming and more robust than running geoscientific applications in an ad hoc fashion, resulting in more efficient use of HPC resources, more strategic code development, and enhanced data integrity and reproducibility.
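The four-stage chain described in the abstract can be pictured as a loop that threads data through the stages while hashing each stage's configuration and output into a provenance record; identical inputs then yield identical records, which is the reproducibility check such a record enables. The following is a toy stdlib-Python sketch with invented stage functions, not the authors' RCF or their workflow engine:

```python
# Toy run-control loop over the four-stage modeling chain:
# (1) preprocess, (2) compile/instrument, (3) simulate, (4) postprocess.
# Each stage's output is hashed into a provenance log for documentation
# and reproducibility checks. Stage bodies are placeholders.

import hashlib, json

def digest(obj):
    """Short, deterministic fingerprint of any JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()[:12]

def run_chain(params, stages):
    """Execute stages in order, threading data and logging provenance."""
    provenance = [("params", digest(params))]
    data = params
    for name, fn in stages:
        data = fn(data)
        provenance.append((name, digest(data)))
    return data, provenance

stages = [
    ("preprocess",  lambda p: {**p, "grid": p["nx"] * p["ny"]}),
    ("compile",     lambda d: {**d, "instrumented": True}),
    ("simulate",    lambda d: {**d, "result": d["grid"] * d["dt"]}),
    ("postprocess", lambda d: {"result": d["result"]}),
]

out, prov = run_chain({"nx": 10, "ny": 20, "dt": 0.5}, stages)
print(out)  # {'result': 100.0}

# Re-running the same parameters yields identical provenance hashes.
out2, prov2 = run_chain({"nx": 10, "ny": 20, "dt": 0.5}, stages)
print(prov == prov2)  # True
```

A production framework would additionally hash input files, record compiler flags and instrumentation options at stage (2), and persist the log alongside the simulation output; the fingerprint-per-stage structure stays the same.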


2013 ◽  
pp. 331-344 ◽  
Author(s):  
Keyun Ruan

Cloud computing is a major transition, and it comes at a unique historical and strategic time for applying foundational design thinking to secure the next-generation computing infrastructure and enable waves of business and technological innovation. In this chapter, the researcher summarizes six key research and development areas for designing a forensic-enabling cloud ecosystem, including architecture and matrix, standardization and strategy, evidence segregation, security and forensic integration, legal framework, and privacy.


Author(s):  
Giuliano Pelfer

This article describes how archaeological and historical research has grown into a multidisciplinary and interdisciplinary activity, driven by the availability of ever larger amounts of data for reconstructing historical and archaeological contexts at a global spatio-temporal scale. This growing body of information, increasingly integrated with data from the Earth sciences, has led to an exponential increase in complex datasets and in refined methods of analysis. To address these needs, the article discusses the ArchaeoGRID Science Gateway paradigm for accessing the ArchaeoGRID Cyberinfrastructure (CI), a Distributed Computing Infrastructure (DCI) that can supply the storage and computing resources needed to manage and analyse large amounts of archaeological and historical data. The ArchaeoGRID Science Gateway is emerging as a high-level web environment that gives non-specialist Virtual Research Communities (VRCs) of archaeologists and historians transparent access to DCIs such as local high-performance computing, Grids, and Clouds.

