A window into lysogeny: Revealing temperate phage biology with transcriptomics

Mapping Intimacies ◽

10.1101/787010 ◽

2019 ◽

Cited By ~ 1

Author(s):

Siân V. Owen ◽

Rocío Canals ◽

Nicolas Wenner ◽

Disa L. Hammarlöf ◽

Carsten Kröger ◽

...

Keyword(s):

Molecular Biology ◽

Sequence Data ◽

Temperate Phage ◽

Phage Resistance ◽

Bacterial Cells ◽

Rna Seq ◽

Bacterial Genomes ◽

Transcriptomic Data ◽

Dna Sequence Data ◽

Bacterial Rna

Abstract. Integrated phage elements, known as prophages, are a pervasive feature of bacterial genomes. Prophages can enhance the fitness of their bacterial hosts by conferring beneficial functions, such as virulence, stress tolerance or phage resistance, which are encoded by accessory loci. Whilst the majority of phage-encoded genes are repressed during lysogeny, accessory loci are often highly expressed. However, novel prophage accessory loci are challenging to identify based on DNA sequence data alone. Here, we use bacterial RNA-seq data to examine the transcriptional landscapes of five Salmonella prophages. We show that transcriptomic data can be used to heuristically enrich for prophage features that are highly expressed within bacterial cells and often represent functionally important accessory loci. Using this approach we identify a novel anti-sense RNA species in prophage BTP1, STnc6030, which mediates superinfection exclusion of phage BTP1 and immunity to closely related phages. Bacterial transcriptomic datasets are a powerful tool to explore the molecular biology of temperate phages.
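The enrichment idea in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the feature names, the TPM values, and the ten-fold-over-median cutoff are all hypothetical choices made only to show how highly expressed prophage features might be flagged against a mostly repressed background.

```python
from statistics import median


def highly_expressed(features, fold_over_median=10.0):
    """Return feature names whose expression far exceeds the prophage median.

    features: dict mapping feature name -> expression (e.g. TPM) for the
    annotated features of ONE prophage. The cutoff is an illustrative
    assumption, not a value taken from the paper.
    """
    if not features:
        return []
    # Most prophage genes are repressed during lysogeny, so the median is
    # used as a rough "repressed background" level (floored at 1.0).
    baseline = max(median(features.values()), 1.0)
    hits = [name for name, tpm in features.items()
            if tpm >= fold_over_median * baseline]
    # Highest expression first.
    return sorted(hits, key=lambda name: -features[name])


# Toy values, invented for illustration only.
toy_prophage = {
    "repressor": 120.0,
    "capsid": 3.0,
    "tail": 2.0,
    "lysis": 4.0,
    "accessory_1": 800.0,
}
print(highly_expressed(toy_prophage))  # ['accessory_1', 'repressor']
```

Features flagged this way are only candidates; as the abstract notes, the approach is heuristic, and any hit would still need experimental follow-up.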

Generating DNA sequence data with limited resources for molecular biology: Lessons from a barcoding project in Indonesia

Applications in Plant Sciences ◽

10.1002/aps3.1167 ◽

2018 ◽

Vol 6 (7) ◽

pp. e01167 ◽

Cited By ~ 2

Author(s):

Gillian H. Dean ◽

Rani Asmarayani ◽

Marlina Ardiyani ◽

Yessi Santika ◽

Teguh Triono ◽

...

Keyword(s):

Molecular Biology ◽

Dna Sequence ◽

Sequence Data ◽

Limited Resources ◽

Dna Sequence Data

OperomeDB: A Database of Condition-Specific Transcription Units in Prokaryotic Genomes

BioMed Research International ◽

10.1155/2015/318217 ◽

2015 ◽

Vol 2015 ◽

pp. 1-10 ◽

Cited By ~ 3

Author(s):

Kashish Chetal ◽

Sarath Chandra Janga

Keyword(s):

User Interface ◽

Operon Structure ◽

Rna Seq ◽

Bacterial Genomes ◽

Substantial Fraction ◽

Experimental Conditions ◽

Transcriptomic Data ◽

Operon Prediction ◽

Wide Range ◽

Prokaryotic Genomes

Background. In prokaryotic organisms, a substantial fraction of adjacent genes are organized into operons, that is, codirectionally organized genes that share a common promoter and terminator. Although several available operon databases provide information with varying levels of reliability, very few resources provide experimentally supported results. Therefore, we believe that the biological community could benefit from having a new operon prediction database with operons predicted using next-generation RNA-seq datasets. Description. We present OperomeDB, a database which provides an ensemble of all the predicted operons for bacterial genomes using available RNA-sequencing datasets across a wide range of experimental conditions. Although several studies have recently confirmed that prokaryotic operon structure is dynamic with significant alterations across environmental and experimental conditions, there are no comprehensive databases for studying such variations across prokaryotic transcriptomes. Currently our database contains nine bacterial organisms and 168 transcriptomes for which we predicted operons. The user interface is simple and easy to use for visualizing, downloading, and querying data. In addition, because of its ability to load custom datasets, users can also compare their datasets with publicly available transcriptomic data of an organism. Conclusion. OperomeDB as a database should not only aid experimental groups working on transcriptome analysis of specific organisms but also enable studies related to computational and comparative operomics.
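As a rough illustration of why operon calls can differ between conditions, the sketch below groups adjacent same-strand genes whenever the RNA-seq coverage across the gap between them stays above a threshold. This is a simplified stand-in, not the prediction method used by OperomeDB; the function name, the input layout, and the `min_cov` value are assumptions made for the example.

```python
def group_operons(genes, intergenic_cov, min_cov=5.0):
    """Group genes into putative condition-specific transcription units.

    genes: list of (name, strand) tuples in genomic order.
    intergenic_cov: intergenic_cov[i] is the mean read coverage between
        genes[i] and genes[i + 1] in one RNA-seq sample.
    min_cov: hypothetical coverage threshold for calling co-transcription.
    """
    if not genes:
        return []
    operons = [[genes[0][0]]]
    for i in range(1, len(genes)):
        same_strand = genes[i][1] == genes[i - 1][1]
        covered = intergenic_cov[i - 1] >= min_cov
        if same_strand and covered:
            operons[-1].append(genes[i][0])   # extend the current unit
        else:
            operons.append([genes[i][0]])     # start a new unit
    return operons


# Toy input, invented for illustration only.
genes = [("a", "+"), ("b", "+"), ("c", "+"), ("d", "-")]
coverage = [20.0, 1.0, 30.0]
print(group_operons(genes, coverage))  # [['a', 'b'], ['c'], ['d']]
```

Running the same gene list against coverage from a different sample can yield a different grouping, which is the kind of condition-dependent variation the abstract describes.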

A hybrid capture RNA bait set for resolving genetic and evolutionary relationships in angiosperms from deep phylogeny to intraspecific lineage hybridization

10.1101/2021.09.06.456727 ◽

2021 ◽

Author(s):

Michelle Waycott ◽

Jent Kornelis van Dijk ◽

Ed Biffin

Keyword(s):

Sequence Data ◽

Evolutionary Relationships ◽

Plastid Gene ◽

High Quality ◽

Transcriptomic Data ◽

Dna Sequence Data ◽

Evolutionary Studies ◽

High Quality Sequence ◽

Targeted Capture ◽

Genomic Regions

Novel multi-gene targeted capture probes have been developed with the objective of obtaining multi-locus high quality sequence reads across any angiosperm lineage. Using existing genomic and transcriptomic data, two independent single assay probe/bait sets have been developed, the first targeting conserved exons from 20 low copy nuclear genes (OzBaits_NR V1.0) and the second, 19 plastid gene regions (OZBaits_CP V1.0). These universal bait sets can efficiently generate DNA sequence data that are suitable for systematics and evolutionary studies of flowering plants. The bait sets can be ordered as Daicel-Arbor Sciences custom myBaits. We demonstrate the utility of the bait set in consistently recovering the targeted genomic regions across an evolutionarily broad range of angiosperm taxa.

A Challenge to Integrate Bioinformatics and Biodiversity Informatics Data as Museomics

Biodiversity Information Science and Standards ◽

10.3897/biss.2.26102 ◽

2018 ◽

Vol 2 ◽

pp. e26102 ◽

Cited By ~ 1

Author(s):

Takeru Nakazato

Keyword(s):

Gene Expression ◽

Phylogenetic Analysis ◽

Molecular Biology ◽

Life Science ◽

Sequence Data ◽

Molecular Data ◽

Biodiversity Informatics ◽

Science Data ◽

Dna Sequence Data ◽

Ngs Data

Museum-preserved samples are attracting attention as a rich resource for DNA studies. Museomics aims to link DNA sequence data back to the museum collection. Molecular biologists are interested in morphological information including body size, pattern, and colors, and sequence data have also become essential for biodiversity research as evidence for species identification and phylogenetic analysis. For more than 30 years, molecular data, such as DNA and protein sequences, have been captured by the DNA Data Bank of Japan (DDBJ), the European Bioinformatics Institute (EBI, UK), and the National Center for Biotechnology Information (NCBI, US) under the International Nucleotide Sequence Database Collaboration (INSDC). INSDC provides collected molecular data to researchers as public databases including GenBank for DNA sequences and Gene Expression Omnibus (GEO) for gene expression. These three institutes synchronize archived data and publish all data on an FTP (File Transfer Protocol) site so that it is available for big data analysis. In recent years, high-throughput sequencing technology, also called next-generation sequencing (NGS) technology, has been widely utilized for molecular biology including genomics, transcriptomics, and metagenomics. Biodiversity researchers also focus on NGS data for DNA barcoding and phylogenetic analysis as well as molecular biology. Additionally, a portable NGS platform, MinION (Oxford Nanopore Technologies), has been launched, enabling biodiversity researchers to perform DNA sequencing in the field. Along with GenBank and GEO data, INSDC accepts NGS data and provides a public primary database, called the Sequence Read Archive (SRA). As of March 2018, 6.4 Peta Bases of NGS data is freely available under more than 130,000 projects in SRA. The Database Center for Life Science (DBCLS) provides a search engine for public NGS data, called DBCLS SRA (http://sra.dbcls.jp/) in collaboration with DDBJ. 
SRA contains not only raw sequence reads or processed data mapped to genome, but also information on the experimental design, including project types, sequencing platforms, and sample species. Researchers can use this data to refine their search results. We also linked publications referring to NGS data to the corresponding SRA entries. The mission of DBCLS is to accelerate the accessibility of life science data. Collected data used to be described in the Excel-readable tabular format, but these formats are difficult to merge with other databases because of the ambiguity of labels. To overcome this difficulty, we recently integrated life science data with Semantic Web technology. We held annual meetings to integrate life science data, called BioHackathons, in which researchers from all over the world participated. UniProt and Ensembl databases currently provide an RDF (Resource Description Framework) version of curated genome and protein data, respectively. In the biodiversity domain, there are many databases such as GBIF (The Global Biodiversity Information Facility) for species occurrence records, EoL (The Encyclopedia of Life) as a knowledge base of all species, and BoL (The Barcode of Life) for DNA barcoding data. RDF is utilized to describe Darwin Core based data so that bioinformatics and biodiversity informatics researchers can technically merge both types of data. Currently, specimen data and DNA sequence data are not linked. Museomics starts with cross-referencing specimen and sequence IDs and by making data sources comply with an existing standard.

Using DNA sequence data to investigate the invasive spotted lanternfly's origin, parasitoids, and microbial associates

10.1603/ice.2016.109017 ◽

2016 ◽

Author(s):

Julie M. Urban

Keyword(s):

Dna Sequence ◽

Sequence Data ◽

Dna Sequence Data

Integration of transcriptomic data identifies key hallmark genes in hypertrophic cardiomyopathy

BMC Cardiovascular Disorders ◽

10.1186/s12872-021-02147-7 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Jing Xu ◽

Xiangdong Liu ◽

Qiming Dai

Keyword(s):

Machine Learning ◽

Hypertrophic Cardiomyopathy ◽

Heart Diseases ◽

Expression Patterns ◽

Support Vector ◽

Rna Seq ◽

Ppi Network ◽

Learning Methods ◽

Transcriptomic Data ◽

Machine Learning Methods

Background. Hypertrophic cardiomyopathy (HCM) represents one of the most common inherited heart diseases. To identify key molecules involved in the development of HCM, gene expression patterns of the heart tissue samples in HCM patients from multiple microarray and RNA-seq platforms were investigated. Methods. The significant genes were obtained through the intersection of two gene sets, corresponding to the identified differentially expressed genes (DEGs) within the microarray data and within the RNA-seq data. Those genes were further ranked using the minimum-Redundancy Maximum-Relevance feature selection algorithm. Moreover, the genes were assessed by three different machine learning methods for classification: support vector machines, random forest and k-Nearest Neighbor. Results. Outstanding results were achieved by taking exclusively the top eight genes of the ranking into consideration. Since the eight genes were identified as candidate HCM hallmark genes, the interactions between them and known HCM disease genes were explored through the protein–protein interaction (PPI) network. Most candidate HCM hallmark genes were found to have direct or indirect interactions with known HCM disease genes in the PPI network, particularly the hub genes JAK2 and GADD45A. Conclusions. This study highlights the value of transcriptomic data integration, in combination with machine learning methods, for providing insight into the key hallmark genes in the genetic etiology of HCM.
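The first step of the Methods, intersecting the DEG sets from the two platforms, can be sketched as follows. The gene names, statistics, and cutoffs here are placeholders rather than values from the study, and the later ranking (mRMR) and classification steps are not reproduced.

```python
def significant_genes(results, max_padj=0.05, min_abs_log2fc=1.0):
    """Return the set of genes passing hypothetical DEG thresholds.

    results: dict mapping gene -> (log2 fold change, adjusted p-value).
    Both cutoffs are illustrative assumptions, not the study's settings.
    """
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if padj <= max_padj and abs(log2fc) >= min_abs_log2fc
    }


# Toy differential-expression tables, invented for illustration only.
microarray = {"GENE_A": (1.8, 0.001), "GENE_B": (-1.2, 0.03), "GENE_C": (0.2, 0.6)}
rnaseq = {"GENE_A": (2.1, 0.0005), "GENE_B": (-0.4, 0.2), "GENE_D": (1.5, 0.01)}

# Keep only genes called differentially expressed on BOTH platforms.
shared = significant_genes(microarray) & significant_genes(rnaseq)
print(sorted(shared))  # ['GENE_A']
```

Requiring agreement across platforms trades sensitivity for robustness: a gene such as GENE_B in the toy data, significant on only one platform, is dropped before any ranking takes place.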

Taxonomic Re-Examination of Nine Rosellinia Types (Ascomycota, Xylariales) Stored in the Saccardo Mycological Collection

Microorganisms ◽

10.3390/microorganisms9030666 ◽

2021 ◽

Vol 9 (3) ◽

pp. 666

Author(s):

Niccolò Forin ◽

Alfredo Vizzini ◽

Federico Fainelli ◽

Enrico Ercole ◽

Barbara Baldan

Keyword(s):

Sequence Data ◽

Phylogenetic Analyses ◽

Illumina Miseq ◽

Botanical Garden ◽

Molecular Study ◽

Its2 Sequence ◽

Type Specimens ◽

Its1 Sequence ◽

Dna Sequence Data ◽

Nomen Novum

In a recent monograph on the genus Rosellinia, type specimens worldwide were revised and re-classified using a morphological approach. Among them, some came from Pier Andrea Saccardo’s fungarium stored in the Herbarium of the Padova Botanical Garden. In this work, we taxonomically re-examine via a morphological and molecular approach nine different Rosellinia sensu Saccardo types. ITS1 and/or ITS2 sequences were successfully obtained applying Illumina MiSeq technology and phylogenetic analyses were carried out in order to elucidate their current taxonomic position. Only the ITS1 sequence was recovered for Rosellinia areolata, while for R. geophila, only the ITS2 sequence was recovered. We proposed here new combinations for Rosellinia chordicola, R. geophila and R. horridula, while for R. ambigua, R. areolata, R. australis, R. romana and R. somala, we did not suggest taxonomic changes compared to the current ones. The name Rosellinia subsimilis Sacc. is invalid, as it is a later homonym of R. subsimilis P. Karst. & Starbäck. Therefore, we introduced Coniochaeta dakotensis as a nomen novum for R. subsimilis Sacc. This is the first time that these types have been subjected to a molecular study. Our results demonstrate that old types are an important source of DNA sequence data for taxonomic re-examinations.

Sequence data from isolated lichen-associated melanized fungi enhance delimitation of two new lineages within Chaetothyriomycetidae

Mycological Progress ◽

10.1007/s11557-021-01706-8 ◽

2021 ◽

Vol 20 (7) ◽

pp. 911-927

Author(s):

Lucia Muggia ◽

Yu Quan ◽

Cécile Gueidan ◽

Abdullah M. S. Al-Hatmi ◽

Martin Grube ◽

...

Keyword(s):

Sequence Data ◽

Single Species ◽

Sister Group ◽

Asexual Propagation ◽

Dna Sequence Data ◽

Wide Range ◽

The Family ◽

Rock Inhabiting Fungi ◽

Stable Habitat

Abstract. Lichen thalli provide a long-lived and stable habitat for colonization by a wide range of microorganisms. Increased interest in these lichen-associated microbial communities has revealed an impressive diversity of fungi, including several novel lineages which still await formal taxonomic recognition. Among these, members of the Eurotiomycetes and Dothideomycetes usually occur asymptomatically in the lichen thalli, even if they share ancestry with fungi that may be parasitic on their host. Mycelia of the isolates are characterized by melanized cell walls and the fungi display exclusively asexual propagation. Their taxonomic placement requires, therefore, the use of DNA sequence data. Here, we consider recently published sequence data from lichen-associated fungi and characterize and formally describe two new, individually monophyletic lineages at family, genus, and species levels. The Pleostigmataceae fam. nov. and Melanina gen. nov. both comprise rock-inhabiting fungi that associate with epilithic, crust-forming lichens in subalpine habitats. The phylogenetic placement and the monophyly of Pleostigmataceae lack statistical support, but the family was resolved as sister to the order Verrucariales. This family comprises the species Pleostigma alpinum sp. nov., P. frigidum sp. nov., P. jungermannicola, and P. lichenophilum sp. nov. The placement of the genus Melanina is supported as a lineage within the Chaetothyriales. To date, this genus comprises the single species M. gunde-cimermaniae sp. nov. and forms a sister group to a large lineage including Herpotrichiellaceae, Chaetothyriaceae, Cyphellophoraceae, and Trichomeriaceae. The new phylogenetic analysis of the subclass Chaetothyriomycetidae provides new insight into genus and family level delimitation and classification of this ecologically diverse group of fungi.

DNA sonification for public engagement in bioinformatics

BMC Research Notes ◽

10.1186/s13104-021-05685-7 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Heleen Plaisier ◽

Thomas R. Meagher ◽

Daniel Barker

Keyword(s):

Dna Sequence ◽

Public Engagement ◽

Sequence Data ◽

Sensory Perception ◽

Data Representation ◽

Sequence Information ◽

Dna Sequence Data ◽

Public Events ◽

Dna Base ◽

Alternative Means

Objective. Visualisation methods, primarily color-coded representation of sequence data, have been a predominant means of representation of DNA data. Algorithmic conversion of DNA sequence data to sound (sonification) represents an alternative means of representation that uses a different range of human sensory perception. We propose that sonification has value for public engagement with DNA sequence information because it has potential to be entertaining as well as informative. We conduct preliminary work to explore the potential of DNA sequence sonification in public engagement with bioinformatics. We apply a simple sonification technique for DNA, in which each DNA base is represented by a specific note. Additionally, a beat may be added to indicate codon boundaries or for musical effect. We report a brief analysis from public engagement events we conducted that featured this method of sonification. Results. We report on use of DNA sequence sonification at two public events. Sonification has potential in public engagement with bioinformatics, both as a means of data representation and as a means to attract an audience to a drop-in stand. We also discuss further directions for research on integration of sonification into bioinformatics public engagement and education.
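The technique described in this abstract, one note per base with an optional beat at codon boundaries, is simple enough to sketch. The particular base-to-note assignment below is hypothetical (the abstract does not state which notes were used), and the sketch only produces a list of symbolic events rather than audio.

```python
# Hypothetical base-to-note assignment; the abstract does not specify one.
BASE_TO_NOTE = {"A": "C4", "C": "E4", "G": "G4", "T": "A4"}


def sonify(sequence, mark_codons=True):
    """Convert a DNA string into a list of symbolic sound events.

    Each base becomes one note name. When mark_codons is True, a "beat"
    event is inserted before the first base of every codon (reading frame
    assumed to start at position 0). Unrecognised symbols become "rest".
    """
    events = []
    for i, base in enumerate(sequence.upper()):
        if mark_codons and i % 3 == 0:
            events.append("beat")
        events.append(BASE_TO_NOTE.get(base, "rest"))
    return events


print(sonify("ATGGCC"))
# ['beat', 'C4', 'A4', 'G4', 'beat', 'G4', 'E4', 'E4']
```

The event list could then be handed to any synthesiser or MIDI library of choice; that rendering step is deliberately left out here.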

The state of the art in soybean transcriptomics resources and gene coexpression networks

in silico Plants ◽

10.1093/insilicoplants/diab005 ◽

2021 ◽

Author(s):

Fabricio Almeida-Silva ◽

Kanhu C Moharana ◽

Thiago M Venancio

Keyword(s):

State Of The Art ◽

The State ◽

Gene Coexpression Network ◽

Rna Seq ◽

Transcriptomic Data ◽

The Past ◽

Gene Coexpression ◽

Genomics Research ◽

Public Repositories ◽

Coexpression Networks

Abstract. In the past decade, over 3000 samples of soybean transcriptomic data have accumulated in public repositories. Here, we review the state of the art in soybean transcriptomics, highlighting the major microarray and RNA-seq studies that investigated soybean transcriptional programs in different tissues and conditions. Further, we propose approaches for integrating such big data using gene coexpression networks and outline important web resources that may facilitate soybean data acquisition and analysis, contributing to the acceleration of soybean breeding and functional genomics research.
