Single-strand DNA processing: phylogenomics and sequence diversity of a superfamily of potential prokaryotic HuH endonucleases

Mapping Intimacies ◽

10.1101/279083 ◽

2018 ◽

Author(s):

Yves Quentin ◽

Patricia Siguier ◽

Mick Chandler ◽

Gwennaele Fichant

Keyword(s):

Protein Interactions ◽

Insertion Sequence ◽

Sequence Similarity ◽

Family Members ◽

Secondary Structures ◽

Single Strand ◽

Cell Physiology ◽

Taxonomic Distribution ◽

Specific Distribution ◽

A Genome

Abstract. Background: Some mobile genetic elements target the lagging strand template during DNA replication. Bacterial examples are the insertion sequences IS608 and ISDra2 (IS200/IS605 family members). They use obligatory single-stranded circular DNA intermediates for excision and insertion and encode a transposase, TnpAIS200, which recognizes subterminal secondary structures at the insertion sequence ends. Similar secondary structures, Repeated Extragenic Palindromes (REP), are present in many bacterial genomes. TnpAIS200-related proteins, TnpAREP, have been identified and could be responsible for REP sequence proliferation. These proteins share a conserved HuH/Tyrosine core domain responsible for catalysis and are involved in processes of ssDNA cleavage and ligation. Our goal is to characterize the diversity of these proteins, collectively referred to as the TnpAY1 family. Results: A genome-wide analysis of sequences similar to TnpAIS200 and TnpAREP in prokaryotes revealed a large number of family members with a wide taxonomic distribution. These can be arranged into three distinct classes and 12 subclasses based on sequence similarity. One subclass includes sequences similar to TnpAIS200. Proteins from other subclasses are not associated with typical insertion sequence features. These are characterized by specific additional domains possibly involved in protein/DNA or protein/protein interactions. Their genes are found in more than 25% of species analyzed. They exhibit a patchy taxonomic distribution consistent with dissemination by horizontal gene transfers followed by loss. The tnpAREP genes of five subclasses are flanked by typical REP sequences in a REPtron-like arrangement. Four distinct REP types were characterized with a subclass-specific distribution. Other subclasses are not associated with REP sequences but have a large conserved domain located at the C-terminal end of their sequence. This unexpected diversity suggests that, while most likely involved in processing single-strand DNA, proteins from different subfamilies may play a number of different roles. Conclusions: We established a detailed classification of TnpAY1 proteins, consolidated by the analysis of the conserved core domains and the characterization of additional domains. The data obtained illustrate the unexpected diversity of the TnpAY1 family and provide a strong framework for future evolutionary and functional studies. Through their potential function in ssDNA editing, they may confer adaptive responses to host cell physiology and metabolism.


Identification and characterization of Fep15, a new selenocysteine-containing member of the Sep15 protein family

Biochemical Journal ◽

10.1042/bj20051569 ◽

2006 ◽

Vol 394 (3) ◽

pp. 575-579 ◽

Cited By ~ 28

Author(s):

Sergey V. Novoselov ◽

Deame Hua ◽

Alexey V. Lobanov ◽

Vadim N. Gladyshev

Keyword(s):

Mammalian Cells ◽

Insertion Sequence ◽

Phylogenetic Analyses ◽

Dependent Manner ◽

Putative Active Site ◽

A Genome ◽

Sequence Elements ◽

Identification And Characterization ◽

Insertion Sequence Elements

Sec (selenocysteine) is a rare amino acid in proteins. It is co-translationally inserted into proteins at UGA codons with the help of SECIS (Sec insertion sequence) elements. A full set of selenoproteins within a genome, known as the selenoproteome, is highly variable in different organisms. However, most of the known eukaryotic selenoproteins are represented in the mammalian selenoproteome. In addition, many of these selenoproteins have cysteine orthologues. Here, we describe a new selenoprotein, designated Fep15, which is distantly related to members of the 15 kDa selenoprotein (Sep15) family. Fep15 is absent in mammals, can be detected only in fish and is present in these organisms only in the selenoprotein form. In contrast with other members of the Sep15 family, which contain a putative active site composed of Sec and cysteine, Fep15 has only Sec. When transiently expressed in mammalian cells, Fep15 incorporated Sec in an SECIS- and SBP2 (SECIS-binding protein 2)-dependent manner and was targeted to the endoplasmic reticulum by its N-terminal signal peptide. Phylogenetic analyses of Sep15 family members suggest that Fep15 evolved by gene duplication.


Swim-exercised mice show a decreased level of protein O-GlcNAcylation and expression of O-GlcNAc transferase in heart

Journal of Applied Physiology ◽

10.1152/japplphysiol.00147.2011 ◽

2011 ◽

Vol 111 (1) ◽

pp. 157-162 ◽

Cited By ~ 37

Author(s):

Darrell D. Belke

Keyword(s):

Protein Interactions ◽

Contractile Function ◽

Cell Physiology ◽

Protein Protein Interactions ◽

Training Exercise ◽

General Protein ◽

Exercise Induced ◽

Glcnac Transferase ◽

The Impact ◽

Physiological Hypertrophy

Swim-training exercise in mice leads to cardiac remodeling associated with an improvement in contractile function. Protein O-linked N-acetylglucosamine (O-GlcNAcylation) is a posttranslational modification of serine and threonine residues capable of altering protein-protein interactions affecting gene transcription, cell signaling pathways, and general cell physiology. Increased levels of protein O-GlcNAcylation in the heart have been associated with pathological conditions such as diabetes, ischemia, and hypertrophic heart failure. In contrast, the impact of physiological exercise on protein O-GlcNAcylation in the heart is currently unknown. Swim-training exercise in mice was associated with the development of a physiological hypertrophy characterized by an improvement in contractile function relative to sedentary mice. General protein O-GlcNAcylation was significantly decreased in swim-exercised mice. This effect was mirrored in the level of O-GlcNAcylation of individual proteins such as SP1. The decrease in protein O-GlcNAcylation was associated with a decrease in the expression of O-GlcNAc transferase (OGT) and glutamine-fructose amidotransferase (GFAT) 2 mRNA. O-GlcNAcase (OGA) activity was actually lower in swim-trained than sedentary hearts, suggesting that it did not contribute to the decreased protein O-GlcNAcylation. Thus it appears that exercise-induced physiological hypertrophy is associated with a decrease in protein O-GlcNAcylation, which could potentially contribute to changes in gene expression and other physiological changes associated with exercise.


Nonomuraea montanisoli sp. nov., isolated from mountain forest soil

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.004695 ◽

2021 ◽

Author(s):

Suchart Chanama ◽

Chanwit Suriyachadkun ◽

Manee Chanama

Keyword(s):

16S Rrna ◽

16S Rrna Gene ◽

Related Species ◽

Sequence Similarity ◽

Diaminopimelic Acid ◽

Mountain Forest ◽

Rrna Gene ◽

Content Type ◽

Link Type ◽

A Genome

A novel actinomycete, strain SMC 257T, was isolated from a soil sample collected from a mountain forest in Nan Province, Thailand. Strain SMC 257T formed tightly closed spiral spore chains on aerial mycelia. A polyphasic approach was used for the taxonomic study of this strain. Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain SMC 257T belonged to the genus Nonomuraea, and the closest phylogenetically related species were Nonomuraea roseoviolacea subsp. carminata JCM 9946T (98.9 % 16S rRNA gene sequence similarity), Nonomuraea rhodomycinica TBRC 6557T (98.4 %), and Nonomuraea roseoviolacea subsp. roseoviolacea JCM 3145T (98.3 %). Genome sequencing revealed a genome size of 9.76 Mbp and a G+C content of 72.3 mol%. The genome average nucleotide identity (ANI) and digital DNA–DNA hybridization (dDDH) values between this novel strain and its closest related species were below the species boundaries of 95–96 % and 70 %, respectively. The cell wall peptidoglycan contained meso-diaminopimelic acid. The whole-cell sugars were glucose, ribose, madurose and mannose. The major menaquinone was MK-9(H4). The polar lipid profile consisted of phosphatidylethanolamine, hydroxyphosphatidylethanolamine, lysophosphatidylethanolamine, diphosphatidylglycerol, N-phosphatidylglycerol, phosphatidylinositol and phosphatidylinositol mannosides. The predominant cellular fatty acids were C17 : 0 10-methyl and iso-C16 : 0. Based on comparative analysis of phenotypic, chemotaxonomic and genotypic data, strain SMC 257T is considered to represent a novel species of the genus Nonomuraea, for which the name Nonomuraea montanisoli is proposed. The type strain is SMC 257T (=TBRC 13065T=NBRC 114772T).


Hansschlegelia quercus sp. nov., a novel methylotrophic bacterium isolated from oak buds

INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY ◽

10.1099/ijsem.0.004323 ◽

2020 ◽

Vol 70 (8) ◽

pp. 4646-4652 ◽

Cited By ~ 5

Author(s):

Nadezhda V. Agafonova ◽

Elena N. Kaparullina ◽

Denis S. Grouzdev ◽

Nina V. Doronina

Keyword(s):

16S Rrna ◽

16S Rrna Gene ◽

Type Species ◽

Sequence Similarity ◽

Methylotrophic Bacterium ◽

Rrna Gene ◽

Methylotrophic Bacteria ◽

Content Type ◽

Link Type ◽

A Genome

Novel aerobic, restricted facultatively methylotrophic bacteria were isolated from buds of English oak (Quercus robur L.; strain DubT) and northern red oak (Quercus rubra L.; strain KrD). The isolates were Gram-negative, asporogenous, motile short rods that multiplied by binary fission. They utilized methanol, methylamine and a few polycarbon compounds as carbon and energy sources. Optimal growth occurred at 25 °C and pH 7.5. The dominant phospholipids were phosphatidylethanolamine, phosphatidylcholine, diphosphatidylglycerol and phosphatidylglycerol. The major cellular fatty acids were C18 : 1 ω7c, 11-methyl C18 : 1 ω7c and C16 : 0. The major ubiquinone was Q-10. Analysis of 16S rRNA gene sequences showed that the strains were closely related to members of the genus Hansschlegelia: Hansschlegelia zhihuaiae S113T (97.5–98.0 %), Hansschlegelia plantiphila S1T (97.4–97.6 %) and Hansschlegelia beijingensis PG04T (97.0–97.2 %). The 16S rRNA gene sequence similarity between strains DubT and KrD was 99.7 %, and the DNA–DNA hybridization (DDH) result between the strains was 85 %. The ANI and the DDH values between strain DubT and H. zhihuaiae S113T were 80.1 and 21.5 %, respectively. Genome sequencing of the strain DubT revealed a genome size of 3.57 Mbp and a G+C content of 67.0 mol%. Based on the results of the phenotypic, chemotaxonomic and genotypic analyses, it is proposed that the isolates be assigned to the genus Hansschlegelia as Hansschlegelia quercus sp. nov. with the type strain DubT (=VKM B-3284T=CCUG 73648T=JCM 33463T).
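The genomic comparison reported in this abstract can be checked against commonly used species boundaries with a short sketch. This is only an illustration: the cutoffs (ANI 95 %, taken as the lower end of the usual 95–96 % range, and DDH 70 %) and the function name are assumptions, not values or methods stated in this abstract.

```python
def below_species_boundary(ani_percent, ddh_percent,
                           ani_cutoff=95.0, ddh_cutoff=70.0):
    """Return True when both values fall below the assumed species cutoffs.

    The default cutoffs are conventional values and are an assumption here;
    they are not taken from the abstract above.
    """
    return ani_percent < ani_cutoff and ddh_percent < ddh_cutoff


# ANI and DDH reported above for strain DubT versus H. zhihuaiae S113T
print(below_species_boundary(80.1, 21.5))  # True
```

Under these assumed cutoffs, a pair that passes this check would typically be treated as belonging to different species.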


Prioritization of genes associated with the pathogenesis of leukosis in cattle

Vavilov Journal of Genetics and Breeding ◽

10.18699/vj18.451 ◽

2019 ◽

Vol 22 (8) ◽

pp. 1063-1069 ◽

Cited By ~ 1

Author(s):

N. S. Yudin ◽

N. L. Podkolodnyy ◽

T. A. Agarkova ◽

E. V. Ignatieva

Keyword(s):

Protein Interactions ◽

Genome Wide Association Study ◽

Association Studies ◽

Mammalian Species ◽

Genome Wide Association ◽

Farm Animals ◽

Genome Wide Association Studies ◽

Protein Protein Interactions ◽

Genome Wide ◽

A Genome

Selection by means of genetic markers is a promising approach to the eradication of infectious diseases in farm animals, especially in the absence of effective methods of treatment and prevention. Bovine leukemia virus (BLV) is spread throughout the world and represents one of the biggest problems for livestock production and food security in Russia. However, recent genome-wide association studies have shown that sensitivity/resistance to BLV is polygenic. The aim of this study was to create a catalog of cattle genes and genes of other mammalian species involved in the pathogenesis of BLV-induced infection and to perform gene prioritization using bioinformatics methods. Based on manually collected information from a range of open sources, a total of 446 genes were included in the catalog. The following criteria were used to prioritize the 446 genes from the catalog: (1) the gene is associated with leukemia according to a genome-wide association study; (2) the gene is associated with leukemia according to a case-control study; (3) the role of the gene in leukemia development has been studied using knockout mice; (4) protein-protein interactions exist between the gene-encoded protein and either viral particles or individual viral proteins; (5) the gene is annotated with Gene Ontology terms that are overrepresented for a given list of genes; (6) the gene participates in biological pathways from the KEGG or REACTOME databases that are overrepresented for a given list of genes; (7) the protein encoded by the gene has a high number of protein-protein interactions with proteins encoded by other genes from the catalog. Based on each criterion, a rank was assigned to each gene. The ranks were then summed, and an overall rank was determined. Prioritization of the 446 candidate genes allowed us to identify 5 genes of interest (TNF, LTB, BOLA-DQA1, BOLA-DRB3, ATF2), which can affect the sensitivity/resistance of cattle to leukemia.
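The per-criterion ranking and overall rank described in this abstract can be sketched as a simple rank aggregation. This is a minimal sketch under stated assumptions: the ranks are combined by summing (one reading of the abstract), tied genes share the mean rank, and the gene names and scores below are hypothetical placeholders, not data from the study.

```python
from typing import Dict, List


def rank_descending(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank genes so that 1 is the highest score; ties share the mean rank."""
    ordered = sorted(scores, key=lambda gene: -scores[gene])
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of genes tied with ordered[i]
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = mean_rank
        i = j + 1
    return ranks


def prioritize(criteria: List[Dict[str, float]]) -> List[str]:
    """Sum per-criterion ranks and return genes from best to worst."""
    genes = list(criteria[0])
    totals = {gene: 0.0 for gene in genes}
    for scores in criteria:
        ranks = rank_descending(scores)
        for gene in genes:
            totals[gene] += ranks[gene]
    # A lower rank sum means higher priority; the name breaks remaining ties
    return sorted(genes, key=lambda gene: (totals[gene], gene))


# Hypothetical scores: a yes/no criterion and an interaction count
toy_criteria = [
    {"geneA": 1, "geneB": 0, "geneC": 1},
    {"geneA": 5, "geneB": 2, "geneC": 9},
]
print(prioritize(toy_criteria))  # ['geneC', 'geneA', 'geneB']
```

Summing ranks rather than raw scores keeps criteria on different scales (binary evidence versus interaction counts) from dominating one another, which is one plausible motivation for a rank-based scheme.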


A Genome-Wide Arrayed cDNA Screen to Identify Functional Modulators of α7 Nicotinic Acetylcholine Receptors

SLAS DISCOVERY Advancing Life Sciences ◽

10.1177/1087057116676086 ◽

2016 ◽

Vol 22 (2) ◽

pp. 155-165 ◽

Cited By ~ 2

Author(s):

Elizabeth B. Rex ◽

Nikhil Shukla ◽

Shenyan Gu ◽

David Bredt ◽

Daniel DiSepio

Keyword(s):

Cdna Library ◽

High Throughput ◽

Protein Interactions ◽

Nicotinic Acetylcholine Receptors ◽

Calcium Flux ◽

Functional Assay ◽

Α7 Nicotinic Acetylcholine Receptor ◽

Nicotinic Acetylcholine ◽

Genome Wide ◽

A Genome

Cellular signaling is in part regulated by the composition and subcellular localization of a series of protein interactions that collectively form a signaling complex. Using the α7 nicotinic acetylcholine receptor (α7nAChR) as a proof-of-concept target, we developed a platform to identify functional modulators (or auxiliary proteins) of α7nAChR signaling. The Broad cDNA library was transiently cotransfected with α7nAChR cDNA in HEK293T cells in a high-throughput fashion. Using this approach in combination with a functional assay, we identified positive modulators of α7nAChR activity. We identified known positive modulators/auxiliary proteins present in the cDNA library that regulate α7nAChR signaling, in addition to identifying novel modulators of α7nAChR signaling. These included NACHO, SPDYE11, TCF4, and ZC3H12A, all of which increased PNU-120596-mediated nicotine-dependent calcium flux. Importantly, these auxiliary proteins did not modulate GluR1(o)-mediated Ca flux. To elucidate a possible mechanism of action, we employed an α7nAChR-HA surface staining assay. NACHO enhanced α7nAChR surface expression; however, the mechanism responsible for the SPDYE11-, TCF4-, and ZC3H12A-dependent modulation of α7nAChR has yet to be defined. This report describes the development and validation of a high-throughput, genome-wide cDNA screening platform coupled to FLIPR functional assays in order to identify functional modulators of α7nAChR signaling.


Colibacter massiliensis gen. nov. sp. nov., a novel Gram-stain-positive anaerobic diplococcal bacterium, isolated from the human left colon

Scientific Reports ◽

10.1038/s41598-019-53791-1 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Hussein Anani ◽

Rita Abou Abdallah ◽

May Khoder ◽

Anthony Fontanini ◽

Morgane Mailhe ◽

...

Keyword(s):

16S Rrna ◽

Sequence Similarity ◽

Left Colon ◽

Deficiency Anemia ◽

Rrna Gene ◽

Gram Stain ◽

16S Rrna Sequence ◽

A Genome ◽

The Family ◽

Bacterium Strain

Abstract The gut microbiota is considered to play a key role in human health. As a consequence, deciphering its microbial diversity is mandatory. A polyphasic taxonogenomic strategy based on the combination of phenotypic and genomic analyses was used to characterize a new bacterium, strain Marseille-P2911. This strain was isolated from a left colon sample of a 60-year-old man who underwent a colonoscopy for an etiological investigation of iron-deficiency anemia in Marseille, France. On the basis of 16S rRNA sequence comparison, the closest phylogenetic neighbor was Anaeroglobus geminatus (94.59% 16S rRNA gene sequence similarity) within the family Veillonellaceae. Cells were anaerobic, Gram-stain-positive, non-spore-forming, catalase/oxidase negative cocci grouped in pairs. The bacterium was able to grow at 37 °C after 2 days of incubation. Strain Marseille-P2911 exhibited a genome size of 1,715,864 bp with a 50.2% G + C content, and digital DNA-DNA hybridization (dDDH) and OrthoANI values with A. geminatus of only 19.1 ± 4.5% and 74.42%, respectively. Because the latter value is lower than the threshold for genus delineation (80.5%), we propose the creation of the new genus Colibacter gen. nov., with strain Marseille-P2911T (=DSM 103304 = CSUR P2911) being the type strain of the new species Colibacter massiliensis gen. nov., sp. nov.


A Comprehensive Survey on the Terpene Synthase Gene Family Provides New Insight into Its Evolutionary Patterns

Genome Biology and Evolution ◽

10.1093/gbe/evz142 ◽

2019 ◽

Vol 11 (8) ◽

pp. 2078-2098 ◽

Cited By ~ 8

Author(s):

Shu-Ye Jiang ◽

Jingjing Jin ◽

Rajani Sarojam ◽

Srinivasan Ramachandran

Keyword(s):

Gene Family ◽

Large Scale ◽

Family Members ◽

Terpene Synthase ◽

Limited Information ◽

Terpene Synthases ◽

Genome Wide ◽

A Genome ◽

Family Expansion ◽

Insight Into

Abstract Terpenes are organic compounds and play important roles in plant growth and development as well as in mediating interactions of plants with the environment. Terpene synthases (TPSs) are the key enzymes responsible for the biosynthesis of terpenes. Although some species were employed for the genome-wide identification and characterization of the TPS family, limited information is available regarding the evolution, expansion, and retention mechanisms occurring in this gene family. We performed a genome-wide identification of the TPS family members in 50 sequenced genomes. Additionally, we also characterized the TPS family from aromatic spearmint and basil plants using RNA-Seq data. No TPSs were identified in algae genomes but the remaining plant species encoded various numbers of the family members ranging from 2 to 79 full-length TPSs. Some species showed lineage-specific expansion of certain subfamilies, which might have contributed toward species or ecotype divergence or environmental adaptation. A large-scale family expansion was observed mainly in dicot and monocot plants, which was accompanied by frequent domain loss. Both tandem and segmental duplication significantly contributed toward family expansion and expression divergence and played important roles in the survival of these expanded genes. Our data provide new insight into the TPS family expansion and evolution and suggest that TPSs might have originated from isoprenyl diphosphate synthase genes.


Structure-based prediction of ligand–protein interactions on a genome-wide scale

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1705381114 ◽

2017 ◽

Vol 114 (52) ◽

pp. 13685-13690 ◽

Cited By ~ 21

Author(s):

Howook Hwang ◽

Fabian Dey ◽

Donald Petrey ◽

Barry Honig

Keyword(s):

Binding Site ◽

Protein Interactions ◽

Kinase Inhibitors ◽

Structural Information ◽

Structural Alignment ◽

Scoring Function ◽

A Genome ◽

Small Molecule Ligands ◽

Wide Scale ◽

Approved Drugs

We report a template-based method, LT-scanner, which scans the human proteome using protein structural alignment to identify proteins that are likely to bind ligands that are present in experimentally determined complexes. A scoring function that rapidly accounts for binding site similarities between the template and the proteins being scanned is a crucial feature of the method. The overall approach is first tested based on its ability to predict the residues on the surface of a protein that are likely to bind small-molecule ligands. The algorithm that we present, LBias, is shown to compare very favorably to existing algorithms for binding site residue prediction. LT-scanner’s performance is evaluated based on its ability to identify known targets of Food and Drug Administration (FDA)-approved drugs and it too proves to be highly effective. The specificity of the scoring function that we use is demonstrated by the ability of LT-scanner to identify the known targets of FDA-approved kinase inhibitors based on templates involving other kinases. Combining sequence with structural information further improves LT-scanner performance. The approach we describe is extendable to the more general problem of identifying binding partners of known ligands even if they do not appear in a structurally determined complex, although this will require the integration of methods that combine protein structure and chemical compound databases.


Inferring Genome Trees by Using a Filter To Eliminate Phylogenetically Discordant Sequences and a Distance Matrix Based on Mean Normalized BLASTP Scores

Journal of Bacteriology ◽

10.1128/jb.184.8.2072-2080.2002 ◽

2002 ◽

Vol 184 (8) ◽

pp. 2072-2080 ◽

Cited By ~ 75

Author(s):

G. D. Paul Clarke ◽

Robert G. Beiko ◽

Mark A. Ragan ◽

Robert L. Charlebois

Keyword(s):

Sequence Similarity ◽

Distance Matrix ◽

Bootstrap Support ◽

Systematic Bias ◽

Reading Frame ◽

Topological Features ◽

A Genome ◽

Pairwise Sequence Similarity ◽

Similarity Relationships ◽

Genome Tree

ABSTRACT Darwin's paradigm holds that the diversity of present-day organisms has arisen via a process of genetic descent with modification, as on a bifurcating tree. Evidence is accumulating that genes are sometimes transferred not along lineages but rather across lineages. To the extent that this is so, Darwin's paradigm can apply only imperfectly to genomes, potentially complicating or perhaps undermining attempts to reconstruct historical relationships among genomes (i.e., a genome tree). Whether most genes in a genome have arisen via treelike (vertical) descent or by lateral transfer across lineages can be tested if enough complete genome sequences are used. We define a phylogenetically discordant sequence (PDS) as an open reading frame (ORF) that exhibits patterns of similarity relationships statistically distinguishable from those of most other ORFs in the same genome. PDSs represent between 6.0 and 16.8% (mean, 10.8%) of the analyzable ORFs in the genomes of 28 bacteria, eight archaea, and one eukaryote (Saccharomyces cerevisiae). In this study we developed and assessed a distance-based approach, based on mean pairwise sequence similarity, for generating genome trees. Exclusion of PDSs improved bootstrap support for basal nodes but altered few topological features, indicating that there is little systematic bias among PDSs. Many but not all features of the genome tree from which PDSs were excluded are consistent with the 16S rRNA tree.
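A distance between two genomes based on mean normalized similarity scores can be sketched as follows. The abstract does not spell out the normalization, so this sketch assumes one common choice: each best-hit score is divided by the query ORF's self-hit score, the values are averaged per direction, the two directions are averaged, and the distance is one minus that mean. The function names and the toy scores are hypothetical.

```python
from typing import Dict


def mean_normalized_similarity(best_hits: Dict[str, float],
                               self_scores: Dict[str, float]) -> float:
    """Mean of best-hit score / self-hit score over the ORFs of one genome.

    Normalizing by the self-hit score is an assumption of this sketch.
    """
    values = [
        best_hits[orf] / self_scores[orf]
        for orf in best_hits
        if self_scores.get(orf, 0) > 0
    ]
    return sum(values) / len(values) if values else 0.0


def genome_distance(hits_ab: Dict[str, float], self_a: Dict[str, float],
                    hits_ba: Dict[str, float], self_b: Dict[str, float]) -> float:
    """Symmetric distance: 1 minus the average of the two directional means."""
    similarity = 0.5 * (
        mean_normalized_similarity(hits_ab, self_a)
        + mean_normalized_similarity(hits_ba, self_b)
    )
    return 1.0 - similarity


# Toy scores: genome A has two ORFs with hits in B, genome B has one with a hit in A
distance = genome_distance(
    hits_ab={"a1": 50.0, "a2": 100.0}, self_a={"a1": 100.0, "a2": 200.0},
    hits_ba={"b1": 75.0}, self_b={"b1": 100.0},
)
print(distance)  # 0.375
```

Filling a matrix with such pairwise distances would give the input for a distance-based tree method; excluding phylogenetically discordant ORFs would simply mean dropping them from the hit dictionaries before averaging.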
