synoptReg: An R package for computing a synoptic climate classification and a spatial regionalization of environmental data

Spatial entropy for biodiversity and environmental data: the R-package SpatEntropy

Environmental Modelling & Software ◽

10.1016/j.envsoft.2021.105149 ◽

2021 ◽

pp. 105149

Author(s):

Linda Altieri ◽

Daniela Cocchi ◽

Giulia Roli

Keyword(s):

R Package ◽

Environmental Data ◽

Spatial Entropy


easyFulcrum: An R package to process and analyze ecological sampling data generated using the Fulcrum mobile application

PLoS ONE ◽

10.1371/journal.pone.0254293 ◽

2021 ◽

Vol 16 (10) ◽

pp. e0254293 ◽

Cited By ~ 1

Author(s):

Matteo Di Bernardo ◽

Timothy A. Crombie ◽

Daniel E. Cook ◽

Erik C. Andersen

Keyword(s):

Mobile Application ◽

Large Scale ◽

Data Entry ◽

R Package ◽

Environmental Data ◽

Software Environment ◽

R Software ◽

Natural Sampling ◽

Disparate Data ◽

Ecological Sampling

Large-scale ecological sampling can be difficult and costly, especially for organisms that are too small to be easily identified in a natural environment by eye. Typically, these microscopic flora and fauna are sampled by collecting substrates from nature and then separating organisms from substrates in the laboratory. In many cases, diverse organisms can be identified to the species level using molecular barcodes. To facilitate large-scale ecological sampling of microscopic organisms, we used a geographic data-collection platform for mobile devices called Fulcrum that streamlines the organization of geospatial sampling data, substrate photographs, and environmental data at natural sampling sites. These sampling data are then linked to organism isolation data from the laboratory. Here, we describe the easyFulcrum R package, which can be used to clean, process, and visualize ecological field sampling and isolation data exported from the Fulcrum mobile application. We developed this package for wild nematode sampling, but it can be used with other organisms. The advantages of using Fulcrum combined with easyFulcrum are (1) the elimination of transcription errors by replacing manual data entry and/or spreadsheets with a mobile application, (2) the ability to clean, process, and visualize sampling data using a standardized set of functions in the R software environment, and (3) the ability to join disparate data to each other, including environmental data from the field and the molecularly defined identities of individual specimens isolated from samples.
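The joining of field records to laboratory isolation records described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than R and does not use easyFulcrum or the Fulcrum export format; the field names (`sample_id`, `isolate_id`, `species`, `substrate`) and all record values are hypothetical and chosen only to show a left join that keeps samples with no isolates.

```python
# Hypothetical records: field names and values are illustrative only,
# not the Fulcrum export schema and not real sampling data.
field_records = [
    {"sample_id": "S-001", "substrate": "leaf litter", "ambient_temp_c": 24.5},
    {"sample_id": "S-002", "substrate": "rotting fruit", "ambient_temp_c": 26.0},
]
isolation_records = [
    {"sample_id": "S-001", "isolate_id": "I-01", "species": "Caenorhabditis elegans"},
    {"sample_id": "S-001", "isolate_id": "I-02", "species": "Caenorhabditis briggsae"},
]


def left_join(field, isolations, key="sample_id"):
    """Attach each isolation record to its field sample.

    Samples without any isolation record are kept, with the isolation
    fields set to None, so no field observation is silently dropped.
    """
    by_key = {}
    for iso in isolations:
        by_key.setdefault(iso[key], []).append(iso)

    joined = []
    for sample in field:
        # One output row per matching isolate, or a single row with no isolate.
        for iso in by_key.get(sample[key], [None]):
            row = dict(sample)
            row.update(iso or {"isolate_id": None, "species": None})
            joined.append(row)
    return joined


joined_rows = left_join(field_records, isolation_records)
for row in joined_rows:
    print(row["sample_id"], row["isolate_id"], row["species"])
```

Keeping unmatched samples is a design choice of this sketch: a substrate that yielded no organisms is still an ecological observation.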


learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data

10.1101/2021.12.13.472185 ◽

2021 ◽

Author(s):

Cathy C. Westhues ◽

Henner Simianer ◽

Timothy M. Beissinger

Keyword(s):

Machine Learning ◽

Genomic Prediction ◽

Prediction Models ◽

R Package ◽

Fixed Number ◽

Environmental Data ◽

Weather Data ◽

Learning Methods ◽

Machine Learning Methods ◽

Daily Weather Data

We introduce the R package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial (MET) breeding data with machine learning-based models. learnMET allows the combination of genomic information with environmental data such as climate and/or soil characteristics. Notably, the package can incorporate weather data from field weather stations or retrieve global meteorological datasets from a NASA database. Daily weather data can be aggregated over day-windows defined by naive (for instance, windows with a fixed number of days) or phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient boosted trees, random forests, stacked ensemble models, and multi-layer perceptrons. These prediction models can be evaluated in a user-friendly way via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with MET experimental data. The package is fully open source and accessible on GitHub.
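The "naive" aggregation of daily weather data into windows with a fixed number of days can be sketched as follows. This is an illustration in Python, not learnMET code: the function name `aggregate_fixed_windows`, the use of the mean as the summary statistic, the choice of non-overlapping windows, and the temperature values are all assumptions.

```python
from statistics import mean


def aggregate_fixed_windows(daily_values, window_days):
    """Average daily values over consecutive, non-overlapping windows.

    A trailing window shorter than window_days is averaged over the
    days it actually contains (an assumption of this sketch).
    """
    if window_days <= 0:
        raise ValueError("window_days must be a positive integer")
    return [
        mean(daily_values[start:start + window_days])
        for start in range(0, len(daily_values), window_days)
    ]


# Seven made-up daily mean temperatures, aggregated in 3-day windows.
daily_temp = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0]
window_means = aggregate_fixed_windows(daily_temp, 3)
print(window_means)  # [12.0, 18.0, 22.0]
```

A phenological approach would replace the fixed `window_days` step with window boundaries taken from crop growth stages, which this sketch does not attempt.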


microeco: An R package for data mining in microbial community ecology

FEMS Microbiology Ecology ◽

10.1093/femsec/fiaa255 ◽

2020 ◽

Author(s):

Chi Liu ◽

Yaoming Cui ◽

Xiangzhen Li ◽

Minjie Yao

Keyword(s):

Data Mining ◽

Microbial Community ◽

Community Ecology ◽

Amplicon Sequencing ◽

R Package ◽

Environmental Data ◽

Venn Diagram ◽

Diversity Analysis ◽

Sequencing Data ◽

Microbial Community Ecology

A large amount of sequencing data is produced in microbial community ecology studies using high-throughput sequencing, especially amplicon-sequencing-based community data. After the initial bioinformatic analysis of amplicon sequencing data, performing the subsequent statistics and data mining based on the operational taxonomic unit and taxonomic assignment tables is still complicated and time-consuming. To address this problem, we present an integrated R package, ‘microeco’, as an analysis pipeline for treating microbial community and environmental data. This package was developed based on the R6 class system and combines a series of commonly used and advanced approaches in microbial community ecology research. The package includes classes for data preprocessing, taxa abundance plotting, Venn diagrams, alpha diversity analysis, beta diversity analysis, differential abundance testing and indicator taxon analysis, environmental data analysis, null model analysis, network analysis and functional analysis. Each class is designed to provide a set of approaches that are easily accessible to users. Compared with other R packages in the microbial ecology field, the microeco package is fast, flexible and modular, and provides powerful and convenient tools for researchers. The microeco package can be installed from CRAN (The Comprehensive R Archive Network) or GitHub (https://github.com/ChiLiubio/microeco).


EnvRtype: a software to interplay enviromics and quantitative genomics in agriculture

G3 Genes|Genomes|Genetics ◽

10.1093/g3journal/jkab040 ◽

2021 ◽

Vol 11 (4) ◽

Author(s):

Germano Costa-Neto ◽

Giovanni Galli ◽

Humberto Fanelli Carvalho ◽

José Crossa ◽

Roberto Fritsche-Neto

Keyword(s):

Large Scale ◽

Cost Effective ◽

Reaction Norm ◽

R Package ◽

Global Scale ◽

Environmental Data ◽

Fine Tuning ◽

Literature Mining ◽

Living Organisms ◽

Quantitative Genomics

Envirotyping is an essential technique used to unfold the nongenetic drivers associated with the phenotypic adaptation of living organisms. Here, we introduce the EnvRtype R package, a novel toolkit developed to interplay large-scale envirotyping data (enviromics) into quantitative genomics. To start a user-friendly envirotyping pipeline, this package offers: (1) remote sensing tools for collecting (get_weather and extract_GIS functions) and processing ecophysiological variables (processWTH function) from raw environmental data at single locations or worldwide; (2) environmental characterization by typing environments and profiling descriptors of environmental quality (env_typing function), in addition to gathering environmental covariables as quantitative descriptors for predictive purposes (W_matrix function); and (3) identification of environmental similarity that can be used as an enviromic-based kernel (env_typing function) in whole-genome prediction (GP), aimed at increasing ecophysiological knowledge in genomic best-unbiased predictions (GBLUP) and emulating reaction norm effects (get_kernel and kernel_model functions). We highlight literature mining concepts in fine-tuning envirotyping parameters for each plant species and target growing environments. We show that envirotyping for predictive breeding collects raw data and processes it in an eco-physiologically smart way. Examples of its use for creating global-scale envirotyping networks and integrating reaction-norm modeling in GP are also outlined. We conclude that EnvRtype provides a cost-effective envirotyping pipeline capable of providing high quality enviromic data for a diverse set of genomic-based studies, especially for increasing accuracy in GP across untested growing environments.
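The idea of turning a matrix of environmental covariables into an environmental-similarity kernel can be sketched generically. The abstract does not state which formula EnvRtype uses, so the following is an assumption: a plain linear kernel K = WWᵀ / q on column-standardized covariables, written in Python with made-up values. The function names and the data are hypothetical and are not the package's API.

```python
from statistics import mean, pstdev


def standardize_columns(matrix):
    """Center each covariable (column) to mean 0 and scale to unit variance.

    Columns with zero variance are set to 0.0 to avoid division by zero.
    """
    scaled_columns = []
    for column in zip(*matrix):
        mu, sd = mean(column), pstdev(column)
        scaled_columns.append([(v - mu) / sd if sd > 0 else 0.0 for v in column])
    return [list(row) for row in zip(*scaled_columns)]


def linear_similarity_kernel(covariables):
    """Return K = W W^T / q, where W is the standardized n x q covariable matrix."""
    w = standardize_columns(covariables)
    q = len(w[0])
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) / q for row_j in w]
        for row_i in w
    ]


# Three hypothetical environments (rows) described by two covariables
# (columns): mean temperature in degrees C and seasonal rainfall in mm.
covariables = [
    [20.0, 100.0],
    [22.0, 120.0],
    [30.0, 300.0],
]
kernel = linear_similarity_kernel(covariables)
for row in kernel:
    print([round(value, 3) for value in row])
```

In this toy input the first two environments are cooler and drier than the third, so their kernel entry is larger than either one's entry with the third environment.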


EnvRtype: a software to interplay enviromics and quantitative genomics in agriculture

10.1101/2020.10.14.339705 ◽

2020 ◽

Author(s):

Germano Costa-Neto ◽

Giovanni Galli ◽

Humberto Fanelli Carvalho ◽

José Crossa ◽

Roberto Fritsche-Neto

Keyword(s):

Large Scale ◽

Cost Effective ◽

Reaction Norm ◽

R Package ◽

Global Scale ◽

Environmental Data ◽

Fine Tuning ◽

Literature Mining ◽

Living Organisms ◽

Quantitative Genomics

Envirotyping is an essential technique used to unfold the non-genetic drivers associated with the phenotypic adaptation of living organisms. Here we introduce the EnvRtype R package, a novel toolkit developed to interplay large-scale envirotyping data (enviromics) into quantitative genomics. To start a user-friendly envirotyping pipeline, this package offers: (1) remote sensing tools for collecting (get_weather and extract_GIS functions) and processing ecophysiological variables (processWTH function) from raw environmental data at single locations or worldwide; (2) environmental characterization by typing environments and profiling descriptors of environmental quality (env_typing function), in addition to gathering environmental covariables as quantitative descriptors for predictive purposes (W_matrix function); and (3) identification of environmental similarity that can be used as an enviromic-based kernel (env_typing function) in whole-genome prediction (GP), aimed at increasing ecophysiological knowledge in genomic best-unbiased predictions (GBLUP) and emulating reaction norm effects (get_kernel and kernel_model functions). We highlight literature mining concepts in fine-tuning envirotyping parameters for each plant species and target growing environments. We show that envirotyping for predictive breeding collects raw data and processes it in an eco-physiologically-smart way. Examples of its use for creating global-scale envirotyping networks and integrating reaction-norm modeling in GP are also outlined. We conclude that EnvRtype provides a cost-effective envirotyping pipeline capable of providing high quality enviromic data for a diverse set of genomic-based studies, especially for increasing accuracy in GP across untested growing environments.


SWMPr: An R Package for Retrieving, Organizing, and Analyzing Environmental Data for Estuaries

The R Journal ◽

10.32614/rj-2016-015 ◽

2016 ◽

Vol 8 (1) ◽

pp. 219 ◽

Cited By ~ 1

Author(s):

Marcus W. Beck

Keyword(s):

R Package ◽

Environmental Data


A cloud-based toolbox for the versatile environmental annotation of biodiversity data

PLoS Biology ◽

10.1371/journal.pbio.3001460 ◽

2021 ◽

Vol 19 (11) ◽

pp. e3001460

Author(s):

Richard Li ◽

Ajay Ranipeta ◽

John Wilshire ◽

Jeremy Malczyk ◽

Michelle Duong ◽

...

Keyword(s):

R Package ◽

Scale Dependence ◽

Temporal Scale ◽

Environmental Data ◽

Task Management ◽

Biodiversity Data ◽

Web Based ◽

Continental Scale ◽

User Friendly ◽

Vast Range

A vast range of research applications in biodiversity sciences requires integrating primary species, genetic, or ecosystem data with other environmental data. This integration requires a consideration of the spatial and temporal scale appropriate for the data and processes in question. But a versatile and scale-flexible environmental annotation of biodiversity data remains constrained by technical hurdles. Existing tools have streamlined the intersection of occurrence records with gridded environmental data but have remained limited in their ability to address a range of spatial and temporal grains, especially for large datasets. We present the Spatiotemporal Observation Annotation Tool (STOAT), a cloud-based toolbox for flexible biodiversity–environment annotations. STOAT is optimized for large biodiversity datasets and allows user-specified spatial and temporal resolution and buffering in support of environmental characterizations that account for the uncertainty and scale of data and of relevant processes. The tool offers these services for a growing set of near-global, remotely sensed, or modeled environmental data, including Landsat, MODIS, EarthEnv, and CHELSA. STOAT includes a user-friendly, web-based dashboard that provides tools for annotation task management and result visualization, linked to Map of Life, and a dedicated R package (rstoat) for programmatic access. We demonstrate STOAT functionality with several examples that illustrate phenological variation and spatial and temporal scale dependence of environmental characteristics of birds at a continental scale. We expect STOAT to facilitate broader exploration and assessment of the scale dependence of observations and processes in ecology.


OPTIMOS PRIME: An R package for autoecological (optima and tolerance range) data calculation

10.1101/654152 ◽

2019 ◽

Author(s):

María Belén Sathicq ◽

María Mercedes Nicolosi Gelis ◽

Joaquín Cochero

Keyword(s):

Collaborative Work ◽

Species Abundance ◽

Weighted Average ◽

R Package ◽

Environmental Data ◽

Range Data ◽

List Type ◽

Tolerance Range ◽

Large Database ◽

Sample Data

Calculation of autoecological data, such as optima and tolerance ranges to environmental variables, can be useful to establish the distribution and abundance of the species. These calculations, although mathematically not complex, can be prone to error when using a large database. We present an R package (“optimos.prime”) that uses species’ abundance data and environmental data to calculate the optimum value and tolerance range of each species to each environmental factor, by weighted average. Additionally, the package can create caterpillar plots to show the results. Using sample data from a phytoplankton database, we exemplify the use of the R package and its functions. A stand-alone version for Windows is also provided, and source code and documents are freely available on GitHub to encourage collaborative work.
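The weighted-average calculation mentioned in this abstract can be illustrated with a small sketch. It assumes the common abundance-weighted formulation, in which the optimum is the abundance-weighted mean of the environmental variable and the tolerance is the abundance-weighted standard deviation around that optimum; the abstract does not give the exact formula used by optimos.prime, so this is an assumption. The code is Python rather than R, and the function name and data are hypothetical.

```python
from math import sqrt


def weighted_optimum_and_tolerance(abundances, env_values):
    """Abundance-weighted optimum and tolerance of one species for one variable.

    optimum   = sum(y_i * x_i) / sum(y_i)
    tolerance = sqrt(sum(y_i * (x_i - optimum)^2) / sum(y_i))
    where y_i is the abundance at site i and x_i the environmental value.
    """
    if len(abundances) != len(env_values):
        raise ValueError("abundances and env_values must have the same length")
    total = sum(abundances)
    if total <= 0:
        raise ValueError("the species must be present in at least one sample")
    optimum = sum(y * x for y, x in zip(abundances, env_values)) / total
    variance = sum(y * (x - optimum) ** 2 for y, x in zip(abundances, env_values)) / total
    return optimum, sqrt(variance)


# Made-up example: one species counted at three sites with pH 6, 7 and 8.
abundances = [1, 2, 1]
ph_values = [6.0, 7.0, 8.0]
optimum, tolerance = weighted_optimum_and_tolerance(abundances, ph_values)
tolerance_range = (optimum - tolerance, optimum + tolerance)
print(optimum, tolerance, tolerance_range)
```

Here the optimum is pH 7.0, because the species is most abundant at the middle site and the abundances are symmetric around it.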


Identification and Estimation of Causal Effects Using a Negative-Control Exposure in Time-Series Studies With Applications to Environmental Epidemiology

American Journal of Epidemiology ◽

10.1093/aje/kwaa172 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yuanyuan Yu ◽

Hongkai Li ◽

Xiaoru Sun ◽

Xinhui Liu ◽

Fan Yang ◽

...

Keyword(s):

Time Series ◽

Causal Effect ◽

Environmental Epidemiology ◽

R Package ◽

Negative Control ◽

Environmental Data ◽

Causal Effects ◽

Unbiased Estimation ◽

Data Sets ◽

Time Series Studies

The initial aim of environmental epidemiology is to estimate the causal effects of environmental exposures on health outcomes. However, because most environmental data sets lack sufficient covariates, current methods cannot adjust adequately for confounders, which inevitably leads to residual confounding. We propose a negative-control exposure based on a time-series studies (NCE-TS) model to effectively eliminate unobserved confounders using an after-outcome exposure as a negative-control exposure. We show that the causal effect is identifiable and can be estimated by the NCE-TS for continuous and categorical outcomes. Simulation studies indicate unbiased estimation by the NCE-TS model. The potential of NCE-TS is illustrated by 2 challenging applications: We found that living in areas with higher levels of surrounding greenness over 6 months was associated with less risk of stroke-specific mortality, based on the Shandong Ecological Health Cohort during January 1, 2010, to December 31, 2018. In addition, we found that the widely established negative association between temperature and cancer risks was actually caused by a number of unobserved confounders, according to the Global Open Database from 2003–2012. The proposed NCE-TS model is implemented in an R package (R Foundation for Statistical Computing, Vienna, Austria) called NCETS, freely available on GitHub.
