Struo: a pipeline for building custom databases for common metagenome profilers

2019 ◽  
Author(s):  
Jacobo de la Cuesta-Zuluaga ◽  
Ruth E. Ley ◽  
Nicholas D. Youngblut

Abstract Summary Taxonomic and functional information from microbial communities can be efficiently obtained by metagenome profiling, which requires databases of genes and genomes to which sequence reads are mapped. However, the databases that accompany metagenome profilers are not updated at a pace that matches the increase in available microbial genomes. To address this, we developed Struo, a modular pipeline that automates the acquisition of genomes from public repositories and the construction of custom databases for multiple metagenome profilers. The use of custom databases that broadly represent the known microbial diversity by incorporating novel genomes results in a substantial increase in mappability of reads in synthetic and real metagenome datasets. Availability and implementation Source code available for download at https://github.com/leylabmpi/Struo. Custom GTDB databases available at http://ftp.tue.mpg.de/ebio/projects/struo/.

2019 ◽  
Vol 36 (7) ◽  
pp. 2314-2315 ◽  
Author(s):  
Jacobo de la Cuesta-Zuluaga ◽  
Ruth E Ley ◽  
Nicholas D Youngblut

Abstract Summary Taxonomic and functional information from microbial communities can be efficiently obtained by metagenome profiling, which requires databases of genes and genomes to which sequence reads are mapped. However, the databases that accompany metagenome profilers are not updated at a pace that matches the increase in available microbial genomes, and unifying database content across metagenome profiling tools can be cumbersome. To address this, we developed Struo, a modular pipeline that automates the acquisition of genomes from public repositories and the construction of custom databases for multiple metagenome profilers. The use of custom databases that broadly represent the known microbial diversity by incorporating novel genomes results in a substantial increase in mappability of reads in synthetic and real metagenome datasets. Availability and implementation Source code available for download at https://github.com/leylabmpi/Struo. Custom Genome Taxonomy Database (GTDB) databases available at http://ftp.tue.mpg.de/ebio/projects/struo/. Supplementary information Supplementary data are available at Bioinformatics online.


2017 ◽  
Author(s):  
Matthew R. Olm ◽  
Christopher T. Brown ◽  
Brandon Brooks ◽  
Jillian F. Banfield

The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that sequentially applies a fast, inaccurate estimation of genome distance and a slow but accurate measure of average nucleotide identity to reduce the computational time for pair-wise genome set comparisons by orders of magnitude. We demonstrate its use in a study where we separately assembled each metagenome from time series datasets. Groups of essentially identical genomes were identified with dRep, and the best genome from each set was selected. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using the typical co-assembly method. Documentation is available at http://drep.readthedocs.io/en/master/ and source code is available at https://github.com/MrOlm/drep.
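The two-stage idea in the dRep abstract above (a cheap estimate first, an expensive measure only where it matters) can be illustrated with a toy sketch. This is not dRep's implementation: the k-mer Jaccard similarity and the `difflib` identity ratio are stand-ins chosen for illustration, and the function names, the thresholds (`coarse`, `fine`), and the example sequences are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def kmers(seq, k=4):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fast_similarity(a, b, k=4):
    """Cheap, rough estimate: Jaccard similarity of k-mer sets (illustrative stand-in)."""
    ka, kb = kmers(a, k), kmers(b, k)
    return len(ka & kb) / len(ka | kb)


def slow_identity(a, b):
    """Costlier comparison: difflib match ratio (illustrative stand-in for ANI)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def single_linkage(names, linked):
    """Group names into clusters, joining any pair for which linked(x, y) is true."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for x, y in combinations(names, 2):
        if linked(x, y):
            parent[find(x)] = find(y)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return list(clusters.values())


def two_stage_clusters(genomes, coarse=0.5, fine=0.95):
    """Cluster with the fast estimate, then refine each cluster with the slow measure.

    Returns the final clusters and the number of slow comparisons performed.
    """
    slow_calls = 0

    def fast_linked(x, y):
        return fast_similarity(genomes[x], genomes[y]) >= coarse

    def slow_linked(x, y):
        nonlocal slow_calls
        slow_calls += 1
        return slow_identity(genomes[x], genomes[y]) >= fine

    final = []
    for group in single_linkage(list(genomes), fast_linked):
        # The slow measure only ever sees pairs inside one primary cluster.
        final.extend(single_linkage(group, slow_linked))
    return final, slow_calls


if __name__ == "__main__":
    toy = {
        "A1": "ACGTTGCAAGCTTAGGCTAACCGTATCGGATC",
        "A2": "ACGTTGCAAGCTTAGACTAACCGTATCGGATC",  # one substitution vs A1
        "B1": "TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAA",
        "B2": "TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAA",
    }
    print(two_stage_clusters(toy))
```

The saving comes from the slow comparison running only on pairs that the fast pass already grouped together, rather than on every pair in the set.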


mSystems ◽  
2021 ◽  
Vol 6 (3) ◽  
Author(s):  
Alicia Clum ◽  
Marcel Huntemann ◽  
Brian Bushnell ◽  
Brian Foster ◽  
Bryce Foster ◽  
...  

ABSTRACT The DOE Joint Genome Institute (JGI) Metagenome Workflow performs metagenome data processing, including assembly; structural, functional, and taxonomic annotation; and binning of metagenomic data sets that are subsequently included into the Integrated Microbial Genomes and Microbiomes (IMG/M) (I.-M. A. Chen, K. Chu, K. Palaniappan, A. Ratner, et al., Nucleic Acids Res, 49:D751–D763, 2021, https://doi.org/10.1093/nar/gkaa939) comparative analysis system and provided for download via the JGI data portal (https://genome.jgi.doe.gov/portal/). This workflow scales to run on thousands of metagenome samples per year, which can vary by the complexity of microbial communities and sequencing depth. Here, we describe the different tools, databases, and parameters used at different steps of the workflow to help with the interpretation of metagenome data available in IMG and to enable researchers to apply this workflow to their own data. We use 20 publicly available sediment metagenomes to illustrate the computing requirements for the different steps and highlight the typical results of data processing. The workflow modules for read filtering and metagenome assembly are available as a workflow description language (WDL) file (https://code.jgi.doe.gov/BFoster/jgi_meta_wdl). The workflow modules for annotation and binning are provided as a service to the user community at https://img.jgi.doe.gov/submit and require filling out the project and associated metadata descriptions in the Genomes OnLine Database (GOLD) (S. Mukherjee, D. Stamatis, J. Bertsch, G. Ovchinnikova, et al., Nucleic Acids Res, 49:D723–D733, 2021, https://doi.org/10.1093/nar/gkaa983).
IMPORTANCE The DOE JGI Metagenome Workflow is designed for processing metagenomic data sets starting from Illumina fastq files. It performs data preprocessing, error correction, assembly, structural and functional annotation, and binning. The results of processing are provided in several standard formats, such as fasta and gff, and can be used for subsequent integration into the Integrated Microbial Genomes and Microbiomes (IMG/M) system where they can be compared to a comprehensive set of publicly available metagenomes. As of 30 July 2020, 7,155 JGI metagenomes have been processed by the DOE JGI Metagenome Workflow. Here, we present a metagenome workflow developed at the JGI that generates rich data in standard formats and has been optimized for downstream analyses ranging from assessment of the functional and taxonomic composition of microbial communities to genome-resolved metagenomics and the identification and characterization of novel taxa. This workflow is currently being used to analyze thousands of metagenomic data sets in a consistent and standardized manner.


2021 ◽  
Author(s):  
Anastasia Arturovna Semenova ◽  
Yulia Konstantinovna Yushina ◽  
Maria Alexandrovna Grudistova ◽  
Elena Viktorovna Zaiko ◽  
...  

The article discusses the results of a study of the microbial diversity of objects in the production environment of two meat processing enterprises, including the antibiotic resistance of isolated strains of pathogenic microorganisms and their ability to form biofilms.


2020 ◽  
Author(s):  
Xun Zhu ◽  
Ti-Cheng Chang ◽  
Richard Webby ◽  
Gang Wu

Abstract idCOV is a phylogenetic pipeline for quickly identifying the clades of SARS-CoV-2 virus isolates from raw sequencing data based on a selected clade-defining marker list. Using a public dataset, we show that idCOV makes calls equivalent to those annotated by Nextstrain.org on all three common clade systems, working directly from user-uploaded FASTQ files. Web and equivalent command-line interfaces are available. It can be deployed in any Linux environment, including personal computers, HPC systems, and the cloud. The source code is available at https://github.com/xz-stjude/idcov. Installation documentation can be found at https://github.com/xz-stjude/idcov/blob/master/README.md.


el–Hayah ◽  
2012 ◽  
Vol 1 (4) ◽  
Author(s):  
Prihastuti Prihastuti

Soils are made up of organic and inorganic material. The organic soil component contains all the living creatures in the soil and the dead ones in various stages of decomposition. Biological activity in soil helps to recycle nutrients, decompose organic matter so that nutrients become available for plant uptake, stabilize humus, and form soil particles.
The extent of microbial diversity in soil is seen as critical to the maintenance of soil health and quality, as a wide range of microbes is involved in important soil functions. Ecologically managed soils have a greater quantity and diversity of soil microbes. The two main drivers of soil microbial community structure, plant type and soil type, are thought to exert their function in a complex manner. The fact that in some situations the soil type and in others the plant type is the key factor determining soil microbial diversity is related to the complexity of the microbial interactions in soil, including interactions between microbes and soil and between microbes and plants.
The basic premise of organic soil stewardship is that maintaining a biologically active soil environment keeps all plant nutrients present in the soil. The diversity of microbial communities has an influence on ecological function and resilience to disturbances in soil ecosystems. Relationships are often observed between the extent of microbial diversity in soil, soil and plant quality, and ecosystem sustainability. If it is possible to shift soil microbial communities, agricultural management can be directed toward maximizing the quality of the soil microbial community in terms of disease suppression.
Keywords: structure, microbial, implication, sustainable agriculture


2020 ◽  
Author(s):  
N Goonasekera ◽  
A Mahmoud ◽  
J Chilton ◽  
E Afgan

Abstract Summary The existence of more than 100 public Galaxy servers with service quotas is indicative of the need for an increased availability of compute resources for Galaxy to use. The GalaxyCloudRunner enables a Galaxy server to easily expand its available compute capacity by sending user jobs to cloud resources. User jobs are routed to the acquired resources based on a set of configurable rules, and the resources can be dynamically acquired from any of four popular cloud providers (AWS, Azure, GCP, or OpenStack) in an automated fashion. Availability and implementation GalaxyCloudRunner is implemented in Python and leverages Docker containers. The source code is MIT licensed and available at https://github.com/cloudve/galaxycloudrunner. The documentation is available at http://gcr.cloudve.org/. Contact Enis Afgan. Supplementary information None.


2017 ◽  
Vol 5 (42) ◽  
Author(s):  
Juhi Gupta ◽  
Rashmi Rathour ◽  
Madan Kumar ◽  
Indu Shekhar Thakur

ABSTRACT We report the soil microbial diversity and functional aspects related to degradation of recalcitrant compounds, determined using a metagenomic approach, in a landfill lysimeter prepared with soil from Ghazipur landfill site, New Delhi, India. Metagenomic analysis revealed the presence and functional diversity of complex microbial communities responsible for waste degradation.


Microbiology ◽  
2021 ◽  
Vol 167 (10) ◽  
Author(s):  
Aarón Barraza ◽  
Juan J. Montes-Sánchez ◽  
M. Goretty Caamal-Chan ◽  
Abraham Loera-Muro

Arid plant communities provide variable diets that can affect digestive microbial communities of free-foraging ruminants. Thus, we used next-generation sequencing of 16S and 18S rDNA to characterize microbial communities in the rumen (regurgitated digesta) and large intestine (faeces) and diet composition of lactating creole goats from five flocks grazing in native plant communities in the Sonoran Desert in the rainy season. The bacterial communities in the rumen and large intestine of the five flocks had similar alpha diversity (Chao1, Shannon, and Simpson indices). However, bacterial community compositions were different: a bacterial community dominated by Proteobacteria in the rumen transitioned to a community dominated by Firmicutes in the large intestine. Bacterial communities of the rumen were similar across flocks, as were large-intestine communities. Archaea had a minimal presence in the goat digestive tract. We detected the phyla Basidiomycota, Ascomycota, and Apicomplexa as the main fungi and protozoa. Analyses suggested different diet compositions; forbs and grasses composed the bulk of plants in the rumen, and forbs and shrubs in faeces. Therefore, lactating goats consuming different diets in the Sonoran Desert in the rainy season share a similar core bacterial community in the rumen and another in the large intestine, and present low archaeal abundance.


mSphere ◽  
2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Sarah L. Lebeis

ABSTRACT Sarah Lebeis studies the assembly and function of plant microbiomes. In this mSphere of Influence article, she reflects on how the paper “Functional Overlap of the Arabidopsis Leaf and Root Microbiota” (Y. Bai, D. B. Müller, G. Srinivas, R. Garrido-Oter, et al., Nature 528:364-369, 2015, https://doi.org/10.1038/nature16192) provided a roadmap for how large culture collections composed of well-characterized bacterial isolates provide essential resources to test hypotheses concerning microbial communities.

