StructAnalyzer - a tool for sequence vs. structure similarity analysis

Jakub Wiedemann; Maciej Miłostan

doi:10.18388/abp.2016_1333

Acta Biochimica Polonica ◽

10.18388/abp.2016_1333 ◽

2017 ◽

Vol 63 (4) ◽

Cited By ~ 4

Author(s):

Jakub Wiedemann ◽

Maciej Miłostan

Keyword(s):

Sequence Similarity ◽

General Rule ◽

Structural Diversity ◽

Structural Similarity ◽

Focus Attention ◽

Rna Structures ◽

Structural Variants ◽

Tertiary Structures ◽

Structure Similarity ◽

Primary Structures

In the world of RNAs and proteins, similarities at the level of primary structure between two comparable molecules usually correspond to structural similarities at the tertiary level. In other words, measures of sequence and structure similarity are in general correlated: a high value of sequence similarity implies a high value of structural similarity. However, important exceptions that run contrary to this general rule can be identified. It is possible to find similar structures with very different sequences and also similar sequences with very different structures. In this paper we focus on the latter case and propose a tool, called StructAnalyzer, that supports analysis of the relations between sequence and structure similarities. Recognizing the diversity of tertiary structures among molecules with very similar primary structures may be the key to a better understanding of the mechanisms that influence the folding of RNA or proteins and, as a result, their function. StructAnalyzer allows exploration and visualization of structural diversity in relation to sequence similarity. We show how the tool can be used to screen RNA structures in the PDB for sequences with structural variants.
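
The kind of screening described in this abstract can be illustrated with a minimal sketch. It is not StructAnalyzer's actual method or interface: the function names, the thresholds, the use of percent identity over pre-aligned sequences, and the use of RMSD as the structural distance are all assumptions made for illustration, and the data are made up.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    # Gap characters ("-") never count as a match.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def flag_structural_variants(sequences, rmsd, min_identity=90.0, min_rmsd=5.0):
    """Return pairs whose sequences are similar but whose structures differ.

    sequences: dict mapping a (hypothetical) molecule id to its aligned sequence.
    rmsd: dict mapping frozenset({id1, id2}) to a precomputed structural distance.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    flagged = []
    for p, q in combinations(sorted(sequences), 2):
        identity = percent_identity(sequences[p], sequences[q])
        distance = rmsd[frozenset((p, q))]
        if identity >= min_identity and distance >= min_rmsd:
            flagged.append((p, q, identity, distance))
    return flagged


# Toy, made-up data: mol_A and mol_B share a sequence but differ in structure.
sequences = {
    "mol_A": "GGCUAGCUAGCC",
    "mol_B": "GGCUAGCUAGCC",
    "mol_C": "AUAUCCGGAUAU",
}
rmsd = {
    frozenset(("mol_A", "mol_B")): 8.2,
    frozenset(("mol_A", "mol_C")): 9.5,
    frozenset(("mol_B", "mol_C")): 1.1,
}
print(flag_structural_variants(sequences, rmsd))
```

In this toy input only the mol_A/mol_B pair is reported, since it is the only pair that is both highly similar in sequence and distant in structure.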

Structural diversity in metal ion chelation and the structure of uroporphyrinogen III synthase

Biochemical Society Transactions ◽

10.1042/bst0300595 ◽

2002 ◽

Vol 30 (4) ◽

pp. 595-600 ◽

Cited By ~ 21

Author(s):

H. L. Schubert ◽

E. Raux ◽

M. A. A. Matthews ◽

J. D. Phillips ◽

K. S. Wilson ◽

...

Keyword(s):

Active Site ◽

Metal Ion ◽

Structural Diversity ◽

Structural Similarity ◽

Ring Closure ◽

Catalytic Reactions ◽

Central Metal ◽

Tertiary Structures ◽

Bacillus Subtilis ◽

Metal Ion Chelation

All tetrapyrroles are synthesized through a branched pathway, and although each tetrapyrrole receives unique modifications around the ring periphery, they all share the unifying feature of a central metal ion. Each pathway maintains a unique metal ion chelatase, and several tertiary structures have been determined, including those of the protoporphyrin ferrochelatase from both human and Bacillus subtilis, and the cobalt chelatase CbiK. These enzymes exhibit strong structural similarity and appear to function by a similar mechanism. Met8p, from Saccharomyces cerevisiae, catalyses ferrochelation during the synthesis of sirohaem, and the structure reveals a novel chelatase architecture whereby both ferrochelation and NAD+-dependent dehydrogenation take place in a single bifunctional active site. Asp-141 appears to participate in both catalytic reactions. The final common biosynthetic step in tetrapyrrole biosynthesis is the generation of uroporphyrinogen by uroporphyrinogen III synthase, whereby the D ring of hydroxymethylbilane is flipped during ring closure to generate the asymmetrical structure of uroporphyrinogen III. The recently derived structure of uroporphyrinogen III synthase reveals a bi-lobed structure in which the active site lies between the domains.

Structure Unveils Relationships between RNA Virus Polymerases

Viruses ◽

10.3390/v13020313 ◽

2021 ◽

Vol 13 (2) ◽

pp. 313

Author(s):

Heli A. M. Mönttinen ◽

Janne J. Ravantti ◽

Minna M. Poranen

Keyword(s):

Phylogenetic Tree ◽

Rna Viruses ◽

Rna Virus ◽

Sequence Similarity ◽

Protein Structures ◽

Structural Similarity ◽

Functional Differentiation ◽

Comparison Method ◽

Homologous Structure ◽

Biological Entities

RNA viruses are the fastest evolving known biological entities. Consequently, the sequence similarity between homologous viral proteins disappears quickly, limiting the usability of traditional sequence-based phylogenetic methods in the reconstruction of relationships and evolutionary history among RNA viruses. Protein structures, however, typically evolve more slowly than sequences, and structural similarity can still be evident when no sequence similarity can be detected. Here, we used an automated structural comparison method, homologous structure finder, for comprehensive comparisons of viral RNA-dependent RNA polymerases (RdRps). We identified a common structural core of 231 residues for all the structurally characterized viral RdRps, covering segmented and non-segmented negative-sense, positive-sense, and double-stranded RNA viruses infecting both prokaryotic and eukaryotic hosts. The grouping and branching of the viral RdRps in the structure-based phylogenetic tree follow their functional differentiation. The RdRps using protein primer, RNA primer, or self-priming mechanisms have evolved independently of each other, and the RdRps cluster into two large branches based on the transcription mechanism used. The structure-based distance tree presented here follows the recently established RdRp-based RNA virus classification at genus, subfamily, family, order, class and subphylum ranks. However, the topology of our phylogenetic tree suggests an alternative phylum level organization.

Structural similarity of chymopapain forms as indicated by circular dichroism

Biochemical Journal ◽

10.1042/bj2570183 ◽

1989 ◽

Vol 257 (1) ◽

pp. 183-186 ◽

Cited By ~ 9

Author(s):

S Solis-Mendiola ◽

R Zubillaga-Luna ◽

A Rojo-Dominguez ◽

A Hernandez-Arana

Keyword(s):

Liquid Chromatography ◽

High Resolution ◽

Cation Exchange ◽

Proteolytic Activity ◽

Structural Similarity ◽

Tertiary Structures ◽

Exchange Column ◽

Cation Exchange Column ◽

Modified Proteins ◽

Circular Dichroism

Four chymopapain forms were isolated by high-resolution liquid chromatography on a cation-exchange column. The three major forms possess nearly identical secondary and tertiary structures, as judged from their c.d. spectra; these components showed similar proteolytic activity and Mr values close to that of papain. The fourth isolated component seems to be a mixture of modified proteins.

Structural and functional analysis of the Na+/H+ exchanger

Biochemical Journal ◽

10.1042/bj20061062 ◽

2007 ◽

Vol 401 (3) ◽

pp. 623-633 ◽

Cited By ~ 165

Author(s):

Emily R. Slepkov ◽

Jan K. Rainey ◽

Brian D. Sykes ◽

Larry Fliegel

Keyword(s):

Intracellular Ph ◽

Sequence Similarity ◽

Structural Data ◽

Structural Similarity ◽

Integral Membrane Protein ◽

Volume Control ◽

Amino Acid Residues ◽

Physiological Processes ◽

High Resolution Structure ◽

Extracellular Sodium

The mammalian NHE (Na+/H+ exchanger) is a ubiquitously expressed integral membrane protein that regulates intracellular pH by removing a proton in exchange for an extracellular sodium ion. Of the nine known isoforms of the mammalian NHEs, the first isoform discovered (NHE1) is the most thoroughly characterized. NHE1 is involved in numerous physiological processes in mammals, including regulation of intracellular pH, cell-volume control, cytoskeletal organization, heart disease and cancer. NHE comprises two domains: an N-terminal membrane domain that functions to transport ions, and a C-terminal cytoplasmic regulatory domain that regulates the activity and mediates cytoskeletal interactions. Although the exact mechanism of transport by NHE1 remains elusive, recent studies have identified amino acid residues that are important for NHE function. In addition, progress has been made regarding the elucidation of the structure of NHEs. Specifically, the structure of a single TM (transmembrane) segment from NHE1 has been solved, and the high-resolution structure of the bacterial Na+/H+ antiporter NhaA has recently been elucidated. In this review we discuss what is known about both functional and structural aspects of NHE1. We relate the known structural data for NHE1 to the NhaA structure, where TM IV of NHE1 shows surprising structural similarity with TM IV of NhaA, despite little primary sequence similarity. Further experiments that will be required to fully understand the mechanism of transport and regulation of the NHE1 protein are discussed.

The FHA domain mediates phosphoprotein interactions

Journal of Cell Science ◽

10.1242/jcs.113.23.4143 ◽

2000 ◽

Vol 113 (23) ◽

pp. 4143-4149 ◽

Cited By ~ 7

Author(s):

J. Li ◽

G.I. Lee ◽

S.R. Van Doren ◽

J.C. Walker

Keyword(s):

Sequence Similarity ◽

Sequence Motif ◽

Amino Acid Residues ◽

Fha Domain ◽

Tertiary Structures ◽

Cellular Processes ◽

Forkhead Transcription Factors ◽

Binding Domains ◽

Cycle Arrest ◽

Protein Kinase Signaling

The forkhead-associated (FHA) domain is a phosphopeptide-binding domain first identified in a group of forkhead transcription factors but is present in a wide variety of proteins from both prokaryotes and eukaryotes. In yeast and human, many proteins containing an FHA domain are found in the nucleus and involved in DNA repair, cell cycle arrest, or pre-mRNA processing. In plants, the FHA domain is part of a protein that is localized to the plasma membrane and participates in the regulation of receptor-like protein kinase signaling pathways. Recent studies show that a functional FHA domain consists of 120–140 amino acid residues, which is significantly larger than the sequence motif first described. Although FHA domains do not exhibit extensive sequence similarity, they share similar secondary and tertiary structures, featuring a sandwich of two anti-parallel β-sheets. One intriguing finding is that FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine, distinguishing them from other well-studied phosphoprotein-binding domains. The diversity of proteins containing FHA domains and potential differences in binding specificities suggest the FHA domain is involved in coordinating diverse cellular processes.

Group A Streptococcus T Antigens Have a Highly Conserved Structure Concealed under a Heterogeneous Surface That Has Implications for Vaccine Design

Infection and Immunity ◽

10.1128/iai.00205-19 ◽

2019 ◽

Vol 87 (6) ◽

Cited By ~ 2

Author(s):

Paul G. Young ◽

Jeremy M. Raynes ◽

Jacelyn M. Loh ◽

Thomas Proft ◽

Edward N. Baker ◽

...

Keyword(s):

Sequence Similarity ◽

Sequence Divergence ◽

Structural Similarity ◽

T Antigen ◽

T Antigens ◽

Content Type ◽

Group A ◽

Efficacious Vaccine ◽

High Level ◽

Conserved Core

Group A Streptococcus (GAS) (Streptococcus pyogenes) is an important human pathogen associated with significant global morbidity and mortality for which there is no safe and efficacious vaccine. The T antigen, a protein that polymerizes to form the backbone of the GAS pilus structure, is a potential vaccine candidate. Previous surveys of the tee gene, which encodes the T antigen, have identified 21 different tee types and subtypes such that any T antigen-based vaccine must be multivalent and carefully designed to provide broad strain coverage. In this study, the crystal structures of three two-domain T antigens (T3.2, T13, and T18.1) were determined and found to have remarkable structural similarity to the previously reported T1 antigen, despite moderate overall sequence similarity. This has enabled reliable modeling of all major two-domain T antigens to reveal that T antigen sequence variation is distributed along the full length of the protein and shields a highly conserved core. Immunoassays performed with sera from immunized animals and commercial T-typing sera identified a significant cross-reactive antibody response between T18.1, T18.2, T3.2, and T13. The existence of shared epitopes between T antigens, combined with the remarkably conserved structure and high level of surface sequence divergence, has important implications for the design of multivalent T antigen-based vaccines.

An NMR-based approach reveals the core structure of the functional domain of SINEUP lncRNAs

Nucleic Acids Research ◽

10.1093/nar/gkaa598 ◽

2020 ◽

Vol 48 (16) ◽

pp. 9346-9360

Author(s):

Takako Ohyama ◽

Hazuki Takahashi ◽

Harshita Sharma ◽

Toshio Yamazaki ◽

Stefano Gustincich ◽

...

Keyword(s):

Nuclear Magnetic Resonance ◽

Computational Prediction ◽

Functional Domain ◽

Rna Structures ◽

Tertiary Structures ◽

Functional Roles ◽

The Core ◽

Non Coding Rnas ◽

Dynamic Domain

Long non-coding RNAs (lncRNAs) are attracting widespread attention for their emerging regulatory, transcriptional, epigenetic, structural and various other functions. Comprehensive transcriptome analysis has revealed that retrotransposon elements (REs) are transcribed and enriched in lncRNA sequences. However, the functions of lncRNAs and the molecular roles of the embedded REs are largely unknown. The secondary and tertiary structures of lncRNAs and their embedded REs are likely to have essential functional roles, but experimental determination and reliable computational prediction of large RNA structures have been extremely challenging. We report here the nuclear magnetic resonance (NMR)-based secondary structure determination of the 167-nt inverted short interspersed nuclear element (SINE) B2, which is embedded in antisense Uchl1 lncRNA and upregulates the translation of sense Uchl1 mRNAs. By using NMR ‘fingerprints’ as a sensitive probe in the domain survey, we successfully divided the full-length inverted SINE B2 into minimal units made of two discrete structured domains and one dynamic domain without altering their original structures after careful boundary adjustments. This approach allowed us to identify a structured domain in nucleotides 31–119 of the inverted SINE B2. This approach will be applicable to determining the structures of other regulatory lncRNAs.

RepeatsDB in 2021: improved data and extended classification for protein tandem repeat structures

Nucleic Acids Research ◽

10.1093/nar/gkaa1097 ◽

2020 ◽

Vol 49 (D1) ◽

pp. D452-D457

Author(s):

Lisanna Paladin ◽

Martina Bevilacqua ◽

Sara Errigo ◽

Damiano Piovesan ◽

Ivan Mičetić ◽

...

Keyword(s):

Protein Data Bank ◽

Tandem Repeat ◽

Tandem Repeats ◽

Classification Scheme ◽

Sequence Similarity ◽

Protein Structures ◽

Hierarchical Classification ◽

Structural Similarity ◽

Data Bank ◽

Similarity Class

The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification combining top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) requiring sequence similarity and describing repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.

Characterization of the gene celD and its encoded product 1,4-β-d-glucan glucohydrolase D from Pseudomonas fluorescens subsp. cellulosa

Biochemical Journal ◽

10.1042/bj2850947 ◽

1992 ◽

Vol 285 (3) ◽

pp. 947-955 ◽

Cited By ~ 30

Author(s):

J E Rixon ◽

L M A Ferreira ◽

A J Durrant ◽

J I Laurie ◽

G P Hazlewood ◽

...

Keyword(s):

Pseudomonas Fluorescens ◽

Plant Cell Wall ◽

Genomic Library ◽

Cell Envelope ◽

Sequence Similarity ◽

Significant Proportion ◽

Structural Similarity ◽

Reading Frame ◽

E Coli ◽

Beta Glucosidase

A genomic library of Pseudomonas fluorescens subsp. cellulosa DNA constructed in pUC18 and expressed in Escherichia coli was screened for recombinants expressing 4-methylumbelliferyl beta-D-glucoside hydrolysing activity (MUGase). A single MUGase-positive clone was isolated. The MUGase hydrolysed cellobiose, cellotriose, cellotetraose, cellopentaose and cellohexaose to glucose, by sequentially cleaving glucose residues from the non-reducing end of the cello-oligosaccharides. The Km values for cellobiose and cellohexaose hydrolysis were 1.2 mM and 28 microM respectively. The enzyme exhibited no activity against soluble or insoluble cellulose, xylan and xylobiose. Thus the MUGase is classified as a 1,4-beta-D-glucan glucohydrolase (EC 3.2.1.74) and is designated 1,4-beta-D-glucan glucohydrolase D (CELD). When expressed by E. coli, CELD was located in the cell-envelope fraction; a significant proportion of the native enzyme was also associated with the cell envelope when synthesized by its endogenous host. The nucleotide sequence of the gene, celD, which encodes CELD, revealed an open reading frame of 2607 bp, encoding a protein of M(r) 92,000. The deduced primary structure of CELD was confirmed by the M(r) of CELD (85,000) expressed by E. coli and P. fluorescens subsp. cellulosa, and by the experimentally determined N-terminus of the enzyme purified from E. coli, which showed identity with residues 52-67 of the celD translated sequence. The structure of the N-terminal region of full-length CELD was similar to the signal peptides of P. fluorescens subsp. cellulosa plant-cell-wall hydrolases. Deletion of the N-terminal 47 residues of CELD solubilized MUGase activity in E. coli. CELD exhibited sequence similarity with beta-glucosidase B of Clostridium thermocellum, particularly in the vicinity of the active-site aspartate residue, but did not display structural similarity with the mature forms of cellulases and xylanases expressed by P. fluorescens subsp. cellulosa.

Distribution and Characterization of AKT Homologs in the Tangerine Pathotype of Alternaria alternata

Phytopathology ◽

10.1094/phyto.2000.90.7.762 ◽

2000 ◽

Vol 90 (7) ◽

pp. 762-768 ◽

Cited By ~ 63

Author(s):

A. Masunaka ◽

A. Tanaka ◽

T. Tsuge ◽

T. L. Peever ◽

L. W. Timmer ◽

...

Keyword(s):

Alternaria Alternata ◽

Sequence Similarity ◽

Structural Similarity ◽

Toxin Production ◽

Brown Spot ◽

Japanese Pear ◽

Black Rot ◽

Acid Moiety ◽

High Sequence Similarity ◽

Rough Lemon

The tangerine pathotype of Alternaria alternata produces a host-selective toxin (HST), known as ACT-toxin, and causes Alternaria brown spot disease of citrus. The structure of ACT-toxin is closely related to AK- and AF-toxins, which are HSTs produced by the Japanese pear and strawberry pathotypes of A. alternata, respectively. AC-, AK-, and AF-toxins are chemically similar and share a 9,10-epoxy-8-hydroxy-9-methyl-decatrienoic acid moiety. Two genes controlling AK-toxin biosynthesis (AKT1 and AKT2) were recently cloned from the Japanese pear pathotype of A. alternata. Portions of these genes were used as heterologous probes in Southern blots that detected homologs in 13 isolates of A. alternata tangerine pathotype from Minneola tangelo in Florida. Partial sequencing of the homologs in one of these isolates demonstrated high sequence similarity to AKT1 (89.8%) and to AKT2 (90.7%). AKT homologs were not detected in nine isolates of A. alternata from rough lemon, six isolates of nonpathogenic A. alternata, and one isolate of A. citri that causes citrus black rot. The presence of homologs in the Minneola isolates and not in the rough lemon isolates, nonpathogens or black rot isolates correlates perfectly with pathogenicity on Iyo tangerine and ACT-toxin production. Functionality of the homologs was demonstrated by detection of transcripts using reverse transcription-polymerase chain reaction (RT-PCR) in total RNA of the tangerine pathotype of A. alternata. The high sequence similarity of AKT and AKT homologs in the tangerine pathotype, combined with the structural similarity of AK-toxin and ACT-toxin, may indicate that these homologs are involved in the biosynthesis of the decatrienoic acid moiety of ACT-toxin.