Transposon insertional mutagenesis in Saccharomyces uvarum reveals trans-acting effects influencing species-dependent essential genes

Mapping Intimacies ◽

10.1101/218305 ◽

2017 ◽

Cited By ~ 1

Author(s):

Monica R. Sanchez ◽

Celia Payen ◽

Frances Cheong ◽

Blake T. Hovde ◽

Sarah Bissonnette ◽

...

Keyword(s):

Insertional Mutagenesis ◽

Gene Annotation ◽

Experimental Testing ◽

Sequence Similarity ◽

Normal Function ◽

Essential Genes ◽

Model Organisms ◽

Comparative Genomic ◽

Saccharomyces Uvarum ◽

Cellular Processes

Abstract To understand how complex genetic networks perform and regulate diverse cellular processes, the function of each individual component must be defined. Comprehensive phenotypic studies of mutant alleles have been successful in model organisms in determining what processes depend on the normal function of a gene. These results are often ported to newly sequenced genomes by using sequence homology. However, sequence similarity does not always mean identical function or phenotype, suggesting that new methods are required to functionally annotate newly sequenced species. We have implemented comparative analysis by high-throughput experimental testing of gene dispensability in Saccharomyces uvarum, a sister species of S. cerevisiae. We created haploid and heterozygous diploid Tn7 insertional mutagenesis libraries in S. uvarum to identify species-dependent essential genes, with the goal of detecting genes with divergent functions and/or different genetic interactions. Comprehensive gene dispensability comparisons with S. cerevisiae predicted diverged dispensability at 12% of conserved orthologs, and validation experiments confirmed 22 differentially essential genes. Surprisingly, despite their differences in essentiality, these genes were capable of cross-species complementation, demonstrating that trans-acting factors that are background-dependent contribute to differential gene essentiality. This study demonstrates that direct experimental testing of gene disruption phenotypes across species can inform comparative genomic analyses and improve gene annotation. Our method can be widely applied in microorganisms to further our understanding of genome evolution.

Colocality to Cofunctionality: Eukaryotic Gene Neighborhoods as a Resource for Function Discovery

Molecular Biology and Evolution ◽

10.1093/molbev/msaa221 ◽

2020 ◽

Cited By ~ 1

Author(s):

Fatima Foflonker ◽

Crysten E Blaby-Haas

Keyword(s):

Green Algae ◽

Sequence Similarity ◽

Phosphoglycerate Kinase ◽

Orthologous Gene ◽

Model Organisms ◽

Comparative Genomic ◽

Evolutionary Origins ◽

Eukaryotic Gene ◽

Arsenic Detoxification ◽

Function Discovery

Abstract Diverging from the classic paradigm of random gene order in eukaryotes, gene proximity can be leveraged to systematically identify functionally related gene neighborhoods in eukaryotes, utilizing techniques pioneered in bacteria. Current methods of identifying gene neighborhoods typically rely on sequence similarity to characterized gene products. However, this approach is not robust for nonmodel organisms like algae, which are evolutionarily distant from well-characterized model organisms. Here, we utilize a comparative genomic approach to identify evolutionarily conserved proximal orthologous gene pairs conserved across at least two taxonomic classes of green algae. A total of 317 gene neighborhoods were identified. In some cases, gene proximity appears to have been conserved since before the streptophyte–chlorophyte split, 1,000 Ma. Using functional inferences derived from reconstructed evolutionary relationships, we identified several novel functional clusters. A putative mycosporine-like amino acid, “sunscreen,” neighborhood contains genes similar to either vertebrate or cyanobacterial pathways, suggesting a novel mosaic biosynthetic pathway in green algae. One of two putative arsenic-detoxification neighborhoods includes an organoarsenical transporter (ArsJ), a glyceraldehyde 3-phosphate dehydrogenase-like gene, homologs of which are involved in arsenic detoxification in bacteria, and a novel algal-specific phosphoglycerate kinase-like gene. Mutants of the ArsJ-like transporter and phosphoglycerate kinase-like genes in Chlamydomonas reinhardtii were found to be sensitive to arsenate, providing experimental support for the role of these identified neighbors in resistance to arsenate. Potential evolutionary origins of neighborhoods are discussed, and updated annotations for formerly poorly annotated genes are presented, highlighting the potential of this strategy for functional annotation.

Transposon insertional mutagenesis in Saccharomyces uvarum reveals trans-acting effects influencing species-dependent essential genes

Genome Research ◽

10.1101/gr.232330.117 ◽

2019 ◽

Vol 29 (3) ◽

pp. 396-406 ◽

Cited By ~ 12

Author(s):

Monica R. Sanchez ◽

Celia Payen ◽

Frances Cheong ◽

Blake T. Hovde ◽

Sarah Bissonnette ◽

...

Keyword(s):

Insertional Mutagenesis ◽

Essential Genes ◽

Saccharomyces Uvarum

PIC-Me: paralogs and isoforms classifier based on machine-learning approaches

BMC Bioinformatics ◽

10.1186/s12859-021-04229-x ◽

2021 ◽

Vol 22 (S11) ◽

Author(s):

Jooseong Oh ◽

Sung-Gwon Lee ◽

Chungoo Park

Keyword(s):

Machine Learning ◽

Large Scale ◽

Gene Annotation ◽

Sequence Similarity ◽

Global Analysis ◽

Model Organism ◽

Model Organisms ◽

Support Vector ◽

Learning Approaches ◽

Rna Seq

Abstract Background: Paralogs formed through gene duplication and isoforms formed through alternative splicing have been important processes for increasing protein diversity and maintaining cellular homeostasis. Despite their recognized importance and the advent of large-scale genomic and transcriptomic analyses, paradoxically, accurate annotations of all gene loci to allow the identification of paralogs and isoforms remain surprisingly incomplete. In particular, the global analysis of the transcriptome of a non-model organism for which there is no reference genome is especially challenging. Results: To reliably discriminate between the paralogs and isoforms in RNA-seq data, we redefined the pre-existing sequence features (sequence similarity, inverse count of consecutive identical or non-identical blocks, and match-mismatch fraction) previously derived from full-length cDNAs and EST sequences and described newly discovered genomic and transcriptomic features (twilight zone of protein sequence alignment and expression level difference). In addition, the effectiveness and relevance of the proposed features were verified with two widely used support vector machine (SVM) and random forest (RF) models. From nine RNA-seq datasets, all AUC (area under the curve) scores of ROC (receiver operating characteristic) curves were over 0.9 in the RF model and significantly higher than those in the SVM model. Conclusions: In this study, using an RF model with five proposed RNA-seq features, we implemented our method called Paralogs and Isoforms Classifier based on Machine-learning approaches (PIC-Me) and showed that it outperformed an existing method. Finally, we envision that our tool will be a valuable computational resource for the genomics community to help with gene annotation and will aid in comparative transcriptomics and evolutionary genomics studies, especially those on non-model organisms.

BITACORA: A comprehensive tool for the identification and annotation of gene families in genome assemblies

10.1101/593889 ◽

2019 ◽

Cited By ~ 1

Author(s):

Joel Vizueta ◽

Alejandro Sánchez-Gracia ◽

Julio Rozas

Keyword(s):

Dna Sequences ◽

Gene Annotation ◽

Sequence Similarity ◽

Gene Families ◽

Genomic Research ◽

Model Organisms ◽

Large Gene ◽

Genomic Annotation ◽

Gene Models ◽

Genome Assemblies

Abstract Gene annotation is a critical bottleneck in genomic research, especially for the comprehensive study of very large gene families in the genomes of non-model organisms. Despite the recent progress in automatic methods, the tools developed for this task often produce inaccurate annotations, such as fused, chimeric, partial or even completely absent gene models for many family copies, which require considerable extra effort to amend. Here we present BITACORA, a bioinformatics solution that integrates sequence similarity search tools and Perl scripts to facilitate both the curation of these inaccurate annotations and the identification of previously undetected gene family copies directly from DNA sequences. We tested the performance of the BITACORA pipeline in annotating the members of two chemosensory gene families of different sizes in seven available chelicerate genome drafts. Despite the relatively high fragmentation of some of these drafts, BITACORA was able to improve the annotation of many members of these families and detected thousands of new chemoreceptors encoded in genome sequences. The program generates an output file in the general feature format (GFF), with both curated and novel gene models, and a FASTA file with the predicted proteins. These outputs can be easily integrated in genomic annotation editors, greatly facilitating subsequent manual annotation and downstream evolutionary analyses.

Identification of Novel Toxin Genes from the Stinging Nettle Caterpillar Parasa lepida (Cramer, 1799): Insights into the Evolution of Lepidoptera Toxins

Insects ◽

10.3390/insects12050396 ◽

2021 ◽

Vol 12 (5) ◽

pp. 396

Author(s):

Natrada Mitpuangchon ◽

Kwan Nualcharoen ◽

Singtoe Boonrotpong ◽

Patamarerk Engsontia

Keyword(s):

Protease Inhibitors ◽

Proteolytic Enzymes ◽

Gene Annotation ◽

Sequence Similarity ◽

New Drugs ◽

Toxin Gene ◽

Cone Snail ◽

Stinging Nettle ◽

Toxin Genes ◽

Nettle Caterpillar

Many animal species can produce venom for defense, predation, and competition. The venom usually contains diverse peptide and protein toxins, including neurotoxins, proteolytic enzymes, protease inhibitors, and allergens. Some drugs for cancer and neurological disorders, as well as some analgesics, were developed based on animal toxin structures and functions. Several caterpillar species possess venoms that cause varying effects on humans both locally and systemically. However, toxins from only a few species have been investigated, limiting the full understanding of Lepidoptera toxin diversity and evolution. We used the RNA-seq technique to identify toxin genes from the stinging nettle caterpillar, Parasa lepida (Cramer, 1799). We constructed a transcriptome from caterpillar urticating hairs and reported 34,968 unique transcripts. Using our toxin gene annotation pipeline, we identified 168 candidate toxin genes, including protease inhibitors, proteolytic enzymes, and allergens. The 21 novel P. lepida Knottin-like peptides, which do not show sequence similarity to any known peptide, have predicted 3D structures similar to tarantula, scorpion, and cone snail neurotoxins. We highlighted the importance of convergent evolution in Lepidoptera toxin evolution and the possible mechanisms. This study opens a new path to understanding the hidden diversity of Lepidoptera toxins, which could be a fruitful source for developing new drugs.

Cross-Predicting Essential Genes between Two Model Eukaryotic Species Using Machine Learning

International Journal of Molecular Sciences ◽

10.3390/ijms22105056 ◽

2021 ◽

Vol 22 (10) ◽

pp. 5056

Author(s):

Tulio L. Campos ◽

Pasi K. Korhonen ◽

Neil D. Young

Keyword(s):

Machine Learning ◽

Experimental Studies ◽

Essential Genes ◽

Subcellular Localisation ◽

Cross Prediction ◽

Gene Essentiality ◽

C Elegans ◽

Cellular Processes ◽

Eukaryotic Species ◽

The Cross

Experimental studies of Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular and cellular processes in metazoans at large. Since the publication of their genomes, functional genomic investigations have identified genes that are essential or non-essential for survival in each species. Recently, a range of features linked to gene essentiality have been inferred using a machine learning (ML)-based approach, allowing essentiality predictions within a species. Nevertheless, predictions between species are still elusive. Here, we undertake a comprehensive study using ML to discover and validate features of essential genes common to both C. elegans and D. melanogaster. We demonstrate that the cross-species prediction of gene essentiality is possible using a subset of features linked to nucleotide/protein sequences, protein orthology and subcellular localisation, single-cell RNA-seq, and histone methylation markers. Complementary analyses showed that essential genes are enriched for transcription and translation functions and are preferentially located away from heterochromatin regions of C. elegans and D. melanogaster chromosomes. The present work should enable the cross-prediction of essential genes between model and non-model metazoans.

Microbial Functional Responses to Cholesterol Catabolism in Denitrifying Sludge

mSystems ◽

10.1128/msystems.00113-18 ◽

2018 ◽

Vol 3 (5) ◽

Cited By ~ 6

Author(s):

Sean Ting-Shyang Wei ◽

Yu-Wei Wu ◽

Tzong-Huei Lee ◽

Yi-Shiang Huang ◽

Cheng-Yu Yang ◽

...

Keyword(s):

16S Rrna ◽

De Novo ◽

Anthropogenic Impacts ◽

Sequence Similarity ◽

Community Level ◽

Model Organisms ◽

Substrate Uptake ◽

Functional Responses ◽

Content Type ◽

Rare Biosphere

ABSTRACT The 2,3-seco pathway, the pathway for anaerobic cholesterol degradation, has been established in the denitrifying betaproteobacterium Sterolibacterium denitrificans. However, knowledge of how microorganisms respond to cholesterol at the community level is elusive. Here, we applied mesocosm incubation and 16S rRNA sequencing to reveal that, in denitrifying sludge communities, three betaproteobacterial operational taxonomic units (OTUs) with low (94% to 95%) 16S rRNA sequence similarity to Stl. denitrificans are cholesterol degraders and members of the rare biosphere. Metatranscriptomic and metabolite analyses show that these degraders adopt the 2,3-seco pathway to sequentially catalyze the side chain and sterane of cholesterol and that two molybdoenzymes (steroid C25 dehydrogenase and 1-testosterone dehydrogenase/hydratase) are crucial for these bioprocesses, respectively. The metatranscriptome further suggests that these betaproteobacterial degraders display chemotaxis and motility toward cholesterol and that FadL-like transporters may be the key components for substrate uptake. Also, these betaproteobacteria are capable of transporting micronutrients and synthesizing cofactors essential for cellular metabolism and cholesterol degradation; however, the required cobalamin is possibly provided by cobalamin-de novo-synthesizing gamma-, delta-, and betaproteobacteria via the salvage pathway. Overall, our results indicate that the ability to degrade cholesterol in sludge communities is reserved for certain rare biosphere members and that C25 dehydrogenase can serve as a biomarker for sterol degradation in anoxic environments.

IMPORTANCE Steroids are ubiquitous and abundant natural compounds that display recalcitrance. Biodegradation via sludge communities in wastewater treatment plants is the primary removal process for steroids. To date, compared to studies for aerobic steroid degradation, the knowledge of anaerobic degradation of steroids has been based on only a few model organisms. Due to the increase of anthropogenic impacts, steroid inputs may affect microbial diversity and functioning in ecosystems. Here, we first investigated microbial functional responses to cholesterol, the most abundant steroid in sludge, at the community level. Our metagenomic and metatranscriptomic analyses revealed that the capacities for cholesterol approach, uptake, and degradation are unique traits of certain low-abundance betaproteobacteria, indicating the importance of the rare biosphere in bioremediation. Apparent expression of genes involved in cofactor de novo synthesis and salvage pathways suggests that these micronutrients play important roles for cholesterol degradation in sludge communities.

Increased burden of deleterious variants in essential genes in autism spectrum disorder

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1613195113 ◽

2016 ◽

Vol 113 (52) ◽

pp. 15054-15059 ◽

Cited By ~ 30

Author(s):

Xiao Ji ◽

Rachel L. Kember ◽

Christopher D. Brown ◽

Maja Bućan

Keyword(s):

Autism Spectrum Disorder ◽

Large Scale ◽

De Novo ◽

Autism Spectrum ◽

Essential Genes ◽

Spectrum Disorder ◽

Model Organisms ◽

Disease Genes ◽

Priority List ◽

Human Orthologs

Autism spectrum disorder (ASD) is a heterogeneous, highly heritable neurodevelopmental syndrome characterized by impaired social interaction, communication, and repetitive behavior. It is estimated that hundreds of genes contribute to ASD. We asked if genes with a strong effect on survival and fitness contribute to ASD risk. Human orthologs of genes with an essential role in pre- and postnatal development in the mouse [essential genes (EGs)] are enriched for disease genes and under strong purifying selection relative to human orthologs of mouse genes with a known nonlethal phenotype [nonessential genes (NEGs)]. This intolerance to deleterious mutations, commonly observed haploinsufficiency, and the importance of EGs in development suggest a possible cumulative effect of deleterious variants in EGs on complex neurodevelopmental disorders. With a comprehensive catalog of 3,915 mammalian EGs, we provide compelling evidence for a stronger contribution of EGs to ASD risk compared with NEGs. By examining the exonic de novo and inherited variants from 1,781 ASD quartet families, we show a significantly higher burden of damaging mutations in EGs in ASD probands compared with their non-ASD siblings. The analysis of EGs in the developing brain identified clusters of coexpressed EGs implicated in ASD. Finally, we suggest a high-priority list of 29 EGs with potential ASD risk as targets for future functional and behavioral studies. Overall, we show that large-scale studies of gene function in model organisms provide a powerful approach for prioritization of genes and pathogenic variants identified by sequencing studies of human disease.

The FHA domain mediates phosphoprotein interactions

Journal of Cell Science ◽

10.1242/jcs.113.23.4143 ◽

2000 ◽

Vol 113 (23) ◽

pp. 4143-4149 ◽

Cited By ~ 7

Author(s):

J. Li ◽

G.I. Lee ◽

S.R. Van Doren ◽

J.C. Walker

Keyword(s):

Sequence Similarity ◽

Sequence Motif ◽

Amino Acid Residues ◽

Fha Domain ◽

Tertiary Structures ◽

Cellular Processes ◽

Forkhead Transcription Factors ◽

Binding Domains ◽

Cycle Arrest ◽

Protein Kinase Signaling

The forkhead-associated (FHA) domain is a phosphopeptide-binding domain first identified in a group of forkhead transcription factors but is present in a wide variety of proteins from both prokaryotes and eukaryotes. In yeast and human, many proteins containing an FHA domain are found in the nucleus and involved in DNA repair, cell cycle arrest, or pre-mRNA processing. In plants, the FHA domain is part of a protein that is localized to the plasma membrane and participates in the regulation of receptor-like protein kinase signaling pathways. Recent studies show that a functional FHA domain consists of 120–140 amino acid residues, which is significantly larger than the sequence motif first described. Although FHA domains do not exhibit extensive sequence similarity, they share similar secondary and tertiary structures, featuring a sandwich of two anti-parallel β-sheets. One intriguing finding is that FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine, distinguishing them from other well-studied phosphoprotein-binding domains. The diversity of proteins containing FHA domains and potential differences in binding specificities suggest the FHA domain is involved in coordinating diverse cellular processes.

A Septin Cytoskeleton-Targeting Small Molecule, Forchlorfenuron, Inhibits Epithelial Migration via Septin-Independent Perturbation of Cellular Signaling

Cells ◽

10.3390/cells9010084 ◽

2019 ◽

Vol 9 (1) ◽

pp. 84 ◽

Cited By ~ 3

Author(s):

Lei Sun ◽

Xuelei Cao ◽

Susana Lechuga ◽

Alex Feygin ◽

Nayden G. Naydenov ◽

...

Keyword(s):

Epithelial Cells ◽

Cellular Systems ◽

Model Organisms ◽

Cellular Functions ◽

Epithelial Migration ◽

Cellular Processes ◽

Dependent Inhibition ◽

Erk Activity ◽

Ht 29 ◽

Target Effects

Septins are GTP-binding proteins that self-assemble into high-order cytoskeletal structures, filaments, and rings. The septin cytoskeleton has a number of cellular functions, including regulation of cytokinesis, cell migration, vesicle trafficking, and receptor signaling. A plant cytokinin, forchlorfenuron (FCF), interacts with septin subunits, resulting in the altered organization of the septin cytoskeleton. Although FCF has been extensively used to examine the roles of septins in various cellular processes, its specificity and possible off-target effects in vertebrate systems have not been investigated. In the present study, we demonstrate that FCF inhibits spontaneous, as well as hepatocyte growth factor-induced, migration of HT-29 and DU145 human epithelial cells. Additionally, FCF increases paracellular permeability of HT-29 cell monolayers. These inhibitory effects of FCF persist in epithelial cells where the septin cytoskeleton has been disassembled by either CRISPR/Cas9-mediated knockout or siRNA-mediated knockdown of septin 7, indicating off-target effects of FCF. Biochemical analysis reveals that FCF-dependent inhibition of the motility of control and septin-depleted cells is accompanied by decreased expression of the c-Jun transcription factor and inhibited ERK activity. The described off-target effects of FCF strongly suggest that caution is warranted when using this compound to examine the biological functions of septins in cellular systems and model organisms.
