Analysis of the genomic basis of functional diversity in dinoflagellates using a transcriptome-based sequence similarity network

2018
Vol 27 (10)
pp. 2365-2380
Author(s):
Arnaud Meng
Erwan Corre
Ian Probert
Andres Gutierrez-Rodriguez
Raffaele Siano
...
2017
Author(s):
Arnaud Meng
Erwan Corre
Ian Probert
Andres Gutierrez-Rodriguez
Raffaele Siano
...

ABSTRACT Dinoflagellates are one of the most abundant and functionally diverse groups of eukaryotes. Despite an overall scarcity of genomic information for dinoflagellates, constantly emerging high-throughput sequencing resources can be used to characterize and compare these organisms. We assembled de novo and processed 46 dinoflagellate transcriptomes and used a sequence similarity network (SSN) to compare the underlying genomic basis of functional features within the group. This approach constitutes the most comprehensive picture to date of the genomic potential of dinoflagellates. A core proteome composed of 252 connected components (CCs) of putative conserved protein domains (pCDs) was identified. Of these, 206 were novel and 16 lacked any functional annotation in public databases. Integration of functional information in our network analyses allowed investigation of pCDs specifically associated with functional traits. With respect to toxicity, sequences homologous to those of proteins involved in toxin biosynthesis pathways (e.g. sxtA1-4 and sxtG) were not specific to known toxin-producing species. Although not fully specific to symbiosis, the most represented functions associated with proteins involved in the symbiotic trait were related to membrane processes and ion transport. Overall, our SSN approach identified 45,207 and 90,794 specific and constitutive pCDs for the toxic and symbiotic species, respectively, represented in our analyses. Of these, 56% and 57%, respectively (i.e. 25,393 and 52,193 pCDs), completely lacked annotation in public databases. This underscores how much remains unknown, while emphasizing the potential of SSNs to identify candidate pCDs for further functional genomic characterization.
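The connected components (CCs) mentioned in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical input of pairwise hits `(seq_a, seq_b, identity)` and a hypothetical `min_identity` cutoff; neither the input format nor the threshold is taken from the paper, and the authors' actual pipeline may differ.

```python
from collections import defaultdict

def connected_components(edges, min_identity=0.8):
    """Group sequence IDs into connected components of a similarity network.

    edges: iterable of (seq_a, seq_b, identity) tuples (hypothetical format).
    Only pairs with identity >= min_identity become network edges.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity in edges:
        ra, rb = find(a), find(b)  # registers both nodes, even for weak hits
        if identity >= min_identity and ra != rb:
            parent[ra] = rb

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    # Largest components first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))

# Illustrative hits between made-up domain identifiers.
hits = [
    ("sp1_d1", "sp2_d1", 0.92),
    ("sp2_d1", "sp3_d1", 0.85),
    ("sp1_d2", "sp3_d2", 0.88),
    ("sp2_d1", "sp1_d2", 0.40),  # too weak: does not join the two groups
]
components = connected_components(hits)
```

With these toy hits, the weak 0.40 link is ignored, so the network splits into two components; lowering the cutoff merges them into one.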


2020
Vol 48 (5)
pp. 1941-1951
Author(s):
Elizabeth C. Gray
Daniel M. Beringer
Michelle M. Meyer

Structured cis-regulatory RNAs have evolved across all domains of life, highlighting the utility and plasticity of RNA as a regulatory molecule. Homologous RNA sequences and structures often have similar functions, but homology may also be deceiving. The challenges that derive from trying to assign function to structure and vice versa are not trivial. Bacterial riboswitches, viral and eukaryotic IRESes, CITEs, and 3′ UTR elements employ an array of mechanisms to exert their effects. Bioinformatic searches coupled with biochemical and functional validation have elucidated some shared and many unique ways cis-regulators are employed in mRNA transcripts. As cis-regulatory RNAs are resolved in greater detail, it is increasingly apparent that shared homology can mask the full spectrum of mRNA cis-regulator functional diversity. Furthermore, similar functions may be obscured by lack of obvious sequence similarity. Thus looking beyond homology is crucial for furthering our understanding of RNA-based regulation.


2020
Author(s):  
Javier M. González

ABSTRACT The superfamily of metallo-β-lactamases (MBL) comprises an ancient group of proteins found in all domains of life, sharing a characteristic αββα fold and a histidine-rich motif for binding of transition metal ions, with the ability to catalyze a variety of hydrolysis and redox reactions. Herein, structural homology and sequence similarity network (SSN) analysis are used to assist the phylogenetic reconstruction of the MBL superfamily, introducing tanglegrams to evaluate structure-function relationships. SSN neighborhood connectivity is applied for spotting protein families within SSN clusters, showing that 98% of the superfamily remains to be explored experimentally. Further SSN research is suggested in order to determine their topological properties, which will be instrumental for the improvement of automated sequence annotation methods.
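As a rough illustration of neighborhood connectivity, the sketch below assumes the usual network-analysis definition, namely the mean degree of a node's neighbors. The abstract does not spell out the exact procedure used, so the function name and the toy edge list are hypothetical.

```python
from collections import defaultdict

def neighborhood_connectivity(edges):
    """Return {node: mean degree of its neighbors} for an undirected network.

    edges: iterable of (node_a, node_b) pairs; self-loops are ignored.
    Assumed definition: average connectivity of all neighbors of a node.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return {
        node: sum(len(adjacency[n]) for n in neighbors) / len(neighbors)
        for node, neighbors in adjacency.items()
    }

# Toy SSN: a hub with three neighbors, one of which has an extra neighbor.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
nc = neighborhood_connectivity(toy_edges)
```

Nodes inside the same family tend to share similar values under this measure, which is one plausible reason it could help delineate families within a cluster.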


2021
Vol 59 (10)
pp. 931-940
Author(s):
Bin Wei
Ya-Kun Wang
Jin-Biao Yu
Si-Jia Wang
Yan-Lei Yu
...

2016
Vol 110 (3)
pp. 495a
Author(s):
Geng-Ming Hu
Te-Lun Mai
Chi-Ming Chen

2008
Vol 4 (5)
pp. e1000063
Author(s):
Nan Song
Jacob M. Joseph
George B. Davis
Dannie Durand

Author(s):  
Shu Cheng
Slim Karkar
Eric Bapteste
Nathan Yee
Paul Falkowski
...

2021
Vol 118 (4)
pp. e2018289118
Author(s):
Katherine H. O’Toole
Barbara Imperiali
Karen N. Allen

The monotopic phosphoglycosyl transferase (monoPGT) superfamily comprises over 38,000 nonredundant sequences represented in bacterial and archaeal domains of life. Members of the superfamily catalyze the first membrane-committed step in en bloc oligosaccharide biosynthetic pathways, transferring a phosphosugar from a soluble nucleoside diphosphosugar to a membrane-resident polyprenol phosphate. The singularity of the monoPGT fold and its employment in the pivotal first membrane-committed step allow confident assignment of both protein and corresponding pathway. The diversity of the family is revealed by the generation and analysis of a sequence similarity network for the superfamily, with fusion of monoPGTs with other pathway members being the most frequent and extensive elaboration. Three common fusions were identified: sugar-modifying enzymes, glycosyl transferases, and regulatory domains. Additionally, unexpected fusions of the monoPGT with members of the polytopic PGT superfamily were discovered, implying a possible evolutionary link through the shared polyprenol phosphate substrate. Notably, a phylogenetic reconstruction of the monoPGT superfamily shows a radial burst of functionalization, with a minority of members comprising only the minimal PGT catalytic domain. The commonality and identity of the fusion partners in the monoPGT superfamily are consistent with advantageous colocalization of pathway members at membrane interfaces.


2019
Vol 20 (S16)
Author(s):
Dan Liu
Yingjun Ma
Xingpeng Jiang
Tingting He

Abstract

Background: Viruses are closely linked to bacteria and human diseases. Predicting associations between viruses and hosts is important for understanding the dynamics and complex functional networks of microbial communities. With the rapid development of metagenomic sequencing, methods based on sequence similarity and genomic homology have been used to predict virus-host associations. However, these methods ignore the known virus-host association network.

Results: We propose a kernelized logistic matrix factorization method that integrates different sources of information to predict potential virus-host associations on a heterogeneous network (ILMF-VH), which is constructed by connecting a virus network with a host network based on known virus-host associations. The virus network is constructed from oligonucleotide frequency measurements, and the host network is constructed by integrating oligonucleotide frequency similarity and Gaussian interaction profile kernel similarity through similarity network fusion. The host prediction accuracy of our method is better than that of other methods. In addition, case studies show that the host of crAssphage predicted by ILMF-VH is consistent with the presumed host in previous studies, and another potential host, Escherichia coli, is also predicted.

Conclusions: The proposed model is an effective computational tool for predicting interactions between viruses and hosts, and it has great potential for discovering novel hosts of viruses.
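The Gaussian interaction profile kernel similarity named in the Results can be sketched as follows. This assumes the commonly used form K(i, j) = exp(-γ‖y_i − y_j‖²) with γ set to a bandwidth parameter divided by the mean squared norm of the profiles; the abstract does not state the normalisation used in ILMF-VH, and the function name, `gamma_prime` parameter, and toy profiles are hypothetical.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between binary interaction profiles.

    profiles: list of equal-length 0/1 lists, one per host (or virus); entry k
    marks a known association with partner k.
    gamma_prime: bandwidth parameter (assumed default of 1.0).
    """
    norms = [sum(v * v for v in p) for p in profiles]
    mean_norm = sum(norms) / len(norms)
    if mean_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_norm  # bandwidth normalised by mean profile size

    n = len(profiles)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel

# Toy example: three hosts, three viruses; hosts 0 and 1 share the same viruses.
host_profiles = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
K = gip_kernel(host_profiles)
```

Hosts with identical known-association profiles get a similarity of 1, while hosts with disjoint profiles get a value closer to 0. Such a matrix could then be fused with a sequence-based similarity matrix, as the abstract describes.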

