High-resolution structure of NodZ fucosyltransferase involved in the biosynthesis of the nodulation factor.

Krzysztof Brzezinski; Tomasz Stepkowski; Santosh Panjikar; Grzegorz Bujacz; Mariusz Jaskolski

doi:10.18388/abp.2007_3227

Acta Biochimica Polonica ◽

10.18388/abp.2007_3227 ◽

2007 ◽

Vol 54 (3) ◽

pp. 537-549 ◽

Cited By ~ 17

Author(s):

Krzysztof Brzezinski ◽

Tomasz Stepkowski ◽

Santosh Panjikar ◽

Grzegorz Bujacz ◽

Mariusz Jaskolski

Keyword(s):

Catalytic Mechanism ◽

Broad Class ◽

Sequence Motifs ◽

Sequence Alignments ◽

Crystal Forms ◽

Conserved Sequence ◽

Terminal Domain ◽

Phosphate Ions ◽

High Resolution Structure ◽

Nodulation Factor

The fucosyltransferase NodZ is involved in the biosynthesis of the nodulation factor in nitrogen-fixing symbiotic bacteria. It catalyzes α1,6 transfer of L-fucose from GDP-fucose to the reducing residue of the synthesized Nod oligosaccharide. We present the structure of the NodZ protein from Bradyrhizobium expressed in Escherichia coli and crystallized in the presence of phosphate ions in two crystal forms. The enzyme is arranged into two domains of nearly equal size. Although NodZ falls in one broad class (GT-B) with other two-domain glycosyltransferases, the topology of its domains deviates from the canonical Rossmann fold, with particularly high distortions in the N-terminal domain. Mutational data combined with structural and sequence alignments indicate residues of potential importance in GDP-fucose binding or in the catalytic mechanism. They are all clustered in three conserved sequence motifs located in the C-terminal domain.

The first crystal structure of the peptidase domain of the U32 peptidase family

Acta Crystallographica Section D Biological Crystallography ◽

10.1107/s1399004715019549 ◽

2015 ◽

Vol 71 (12) ◽

pp. 2505-2512 ◽

Cited By ~ 4

Author(s):

Magdalena Schacherl ◽

Angelika A. M. Montada ◽

Elena Brunstein ◽

Ulrich Baumann

Keyword(s):

Crystal Structure ◽

Catalytic Mechanism ◽

Quaternary Structure ◽

Catalytic Domain ◽

Three Dimensional ◽

Zinc Ion ◽

Crystal Structure Analysis ◽

Dimensional Structure ◽

Sequence Motifs ◽

Conserved Sequence

The U32 family is a collection of over 2500 annotated peptidases in the MEROPS database with unknown catalytic mechanism. They mainly occur in bacteria and archaea, but a few representatives have also been identified in eukarya. Many of the U32 members have been linked to pathogenicity, such as proteins from Helicobacter and Salmonella. The first crystal structure analysis of a U32 catalytic domain from Methanopyrus kandleri (gene mk0906) reveals a modified (βα)8 TIM-barrel fold with some unique features. The connecting segment between strands β7 and β8 is extended and helix α7 is located on top of the C-terminal end of the barrel body. The protein exhibits a dimeric quaternary structure in which a zinc ion is symmetrically bound by histidine and cysteine side chains from both monomers. These residues reside in conserved sequence motifs. No typical proteolytic motifs are discernible in the three-dimensional structure, and biochemical assays failed to demonstrate proteolytic activity. A tunnel in which an acetate ion is bound is located in the C-terminal part of the β-barrel. Two hydrophobic grooves lead to a tunnel at the C-terminal end of the barrel in which an acetate ion is bound. One of the grooves binds to a Strep-Tag II of another dimer in the crystal lattice. Thus, these grooves may be binding sites for hydrophobic peptides or other ligands.

Database Mining for Novel Bacterial β-Etherases, Glutathione-Dependent Lignin-Degrading Enzymes

Applied and Environmental Microbiology ◽

10.1128/aem.02026-19 ◽

2019 ◽

Vol 86 (2) ◽

Cited By ~ 2

Author(s):

Hauke Voß ◽

Carina Amata Heck ◽

Marcus Schallmey ◽

Anett Schallmey

Keyword(s):

Biochemical Characterization ◽

Phylogenetic Analyses ◽

Sequence Motifs ◽

Database Mining ◽

Sequence Alignments ◽

Renewable Source ◽

Aromatic Polymer ◽

Conserved Sequence ◽

Peptide Pattern ◽

Lignin Depolymerization

ABSTRACT Lignin is the most abundant aromatic polymer in nature and a promising renewable source for the provision of aromatic platform chemicals and biofuels. β-Etherases are enzymes with a promising potential for application in lignin depolymerization due to their selectivity in the cleavage of β-O-4 aryl ether bonds. However, only a very limited number of these enzymes have been described and characterized so far. Using peptide pattern recognition (PPR) as well as phylogenetic analyses, 96 putatively novel β-etherases have been identified, some even originating from bacteria outside the order Sphingomonadales. A set of 13 diverse enzymes was selected for biochemical characterization, and β-etherase activity was confirmed for all of them. Some enzymes displayed up to 3-fold higher activity than previously known β-etherases. Moreover, conserved sequence motifs specific for either LigE- or LigF-type enzymes were deduced from multiple-sequence alignments and the PPR-derived peptides. In combination with structural information available for the β-etherases LigE and LigF, insight into the potential structural and/or functional role of conserved residues within these sequence motifs is provided. Phylogenetic analyses further suggest the presence of additional bacterial enzymes with potential β-etherase activity outside the classical LigE- and LigF-type enzymes as well as the recently described heterodimeric β-etherases.

IMPORTANCE The use of biomass as a renewable source and replacement for crude oil for the provision of chemicals and fuels is of major importance for current and future societies. Lignin, the most abundant aromatic polymer in nature, holds promise as a renewable starting material for the generation of required aromatic structures. However, a controlled and selective lignin depolymerization to yield desired aromatic structures is a very challenging task. In this regard, bacterial β-etherases are especially interesting, as they are able to cleave the most abundant bond type in lignin with high selectivity. With this study, we significantly expanded the toolbox of available β-etherases for application in lignin depolymerization and discovered more active as well as diverse enzymes than previously known. Moreover, the identification of further β-etherases by sequence database mining in the future will be facilitated considerably through our deduced etherase-specific sequence motifs.

Cloning and sequencing of four new mammalian monocarboxylate transporter (MCT) homologues confirms the existence of a transporter family with an ancient past

Biochemical Journal ◽

10.1042/bj3290321 ◽

1998 ◽

Vol 329 (2) ◽

pp. 321-328 ◽

Cited By ~ 240

Author(s):

Nigel T. Price ◽

Vicky N. Jackson ◽

Andrew P. Halestrap

Keyword(s):

Monocarboxylate Transporter ◽

Cdna Libraries ◽

Sequence Motifs ◽

Intracellular Loop ◽

Cloning And Sequencing ◽

Sequence Alignments ◽

Yeast Saccharomyces Cerevisiae ◽

Multiple Sequence ◽

Conserved Sequence ◽

Transporter Family

Measurement of monocarboxylate transport kinetics in a range of cell types has provided strong circumstantial evidence for a family of monocarboxylate transporters (MCTs). Two mammalian MCT isoforms (MCT1 and MCT2) and a chicken isoform (REMP or MCT3) have already been cloned, sequenced and expressed, and another MCT-like sequence (XPCT) has been identified. Here we report the identification of new human MCT homologues in the database of expression sequence tags and the cloning and sequencing of four new full-length MCT-like sequences from human cDNA libraries, which we have denoted MCT3, MCT4, MCT5 and MCT6. Northern blotting revealed a unique tissue distribution for the expression of mRNA for each of the seven putative MCT isoforms (MCT1-MCT6 and XPCT). All sequences were predicted to have 12 transmembrane (TM) helical domains with a large intracellular loop between TM6 and TM7. Multiple sequence alignments showed identities ranging from 20% to 55%, with the greatest conservation in the predicted TM regions and more variation in the C-terminal than the N-terminal region. Searching of additional sequence databases identified candidate MCT homologues from the yeast Saccharomyces cerevisiae, the nematode worm Caenorhabditis elegans and the archaebacterium Sulfolobus solfataricus. Together these sequences constitute a new family of transporters with some strongly conserved sequence motifs, the possible functions of which are discussed.
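
The identity percentages quoted above come from pairwise comparisons within a multiple sequence alignment. As a rough illustration only, the sketch below computes percent identity for two already-aligned sequences. The function name, the gap character, and the choice of denominator (columns where neither sequence has a gap) are assumptions made for this example; the abstract does not say which definition of identity the authors used, and other conventions give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of an alignment.

    Assumption: columns containing a gap in either row are skipped,
    so the denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # ignore gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy rows, not real MCT sequences.
    print(percent_identity("ACDE-G", "ACDFHG"))
```

Excluding gapped columns tends to raise the reported identity relative to dividing by the full alignment length, so the convention should be stated whenever such figures are compared across studies.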

Proteins Binding to the Carbohydrate HNK-1: Common Origins?

International Journal of Molecular Sciences ◽

10.3390/ijms22158116 ◽

2021 ◽

Vol 22 (15) ◽

pp. 8116

Author(s):

Gaston Castillo ◽

Ralf Kleene ◽

Melitta Schachner ◽

Gabriele Loers ◽

Andrew E. Torda

Keyword(s):

Large Scale ◽

System Development ◽

Sequence Similarity ◽

High Mobility ◽

Binding Constants ◽

Nervous System Development ◽

Sequence Motifs ◽

Sequence Alignments ◽

Conserved Sequence ◽

Bona Fide

The human natural killer (HNK-1) carbohydrate plays important roles during nervous system development, regeneration after trauma and synaptic plasticity. Four proteins have been identified as receptors for HNK-1: the laminin adhesion molecule, high-mobility group box 1 and 2 (also called amphoterin) and cadherin 2 (also called N-cadherin). Because of HNK-1's importance, we asked whether additional receptors for HNK-1 exist and whether the four identified proteins share any similarity in their primary structures. A set of 40,000 sequences homologous to the known HNK-1 receptors was selected and used for large-scale sequence alignments and motif searches. Although there are conserved regions and highly conserved sites within each of these protein families, there was no sequence similarity or conserved sequence motifs found to be shared by all families. Since HNK-1 receptors have not been compared regarding binding constants and since it is not known whether the sulfated or non-sulfated part of HNK-1 represents the structurally crucial ligand, the receptors are more heterogeneous in primary structure than anticipated, possibly involving different receptor or ligand regions. We thus conclude that the primary protein structure may not be the sole determinant for a bona fide HNK-1 receptor, rendering receptor structure more complex than originally assumed.

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Bioinformatics ◽

10.1093/bioinformatics/btab083 ◽

2021 ◽

Author(s):

Yanrong Ji ◽

Zhihan Zhou ◽

Han Liu ◽

Ramana V Davuluri

Keyword(s):

Dna Sequences ◽

Regulatory Elements ◽

Ease Of Use ◽

Fine Tuning ◽

Supplementary Information ◽

Sequence Motifs ◽

Semantic Relationship ◽

Accurate Identification ◽

Conserved Sequence ◽

Genome Wide

Abstract Motivation: Deciphering the language of non-coding DNA is one of the fundamental problems in genome research. Gene regulatory code is highly complex due to the existence of polysemy and distant semantic relationship, which previous informatics methods often fail to capture especially in data-scarce scenarios. Results: To address this challenge, we developed a novel pre-trained bidirectional encoder representation, named DNABERT, to capture global and transferrable understanding of genomic DNA sequences based on up and downstream nucleotide contexts. We compared DNABERT to the most widely used programs for genome-wide regulatory elements prediction and demonstrate its ease of use, accuracy and efficiency. We show that the single pre-trained transformers model can simultaneously achieve state-of-the-art performance on prediction of promoters, splice sites and transcription factor binding sites, after easy fine-tuning using small task-specific labeled data. Further, DNABERT enables direct visualization of nucleotide-level importance and semantic relationship within input sequences for better interpretability and accurate identification of conserved sequence motifs and functional genetic variant candidates. Finally, we demonstrate that pre-trained DNABERT with human genome can even be readily applied to other organisms with exceptional performance. We anticipate that the pre-trained DNABERT model can be fine-tuned to many other sequence analysis tasks. Availability and implementation: The source code, pretrained and finetuned model for DNABERT are available at GitHub (https://github.com/jerryji1993/DNABERT). Supplementary information: Supplementary data are available at Bioinformatics online.

High Resolution Structure of the N-terminal Domain of Tissue Inhibitor of Metalloproteinases-2 and Characterization of Its Interaction Site with Matrix Metalloproteinase-3

Journal of Biological Chemistry ◽

10.1074/jbc.273.34.21736 ◽

1998 ◽

Vol 273 (34) ◽

pp. 21736-21743 ◽

Cited By ~ 65

Author(s):

Frederick W. Muskett ◽

Tom A. Frenkiel ◽

James Feeney ◽

Robert B. Freedman ◽

Mark D. Carr ◽

...

Keyword(s):

High Resolution ◽

Matrix Metalloproteinase ◽

Tissue Inhibitor Of Metalloproteinases ◽

Tissue Inhibitor ◽

Interaction Site ◽

Matrix Metalloproteinase 3 ◽

Terminal Domain ◽

High Resolution Structure ◽

Resolution Structure

Olfactory expression of trace amine-associated receptors requires cooperative cis-acting enhancers

Nature Communications ◽

10.1038/s41467-021-23824-3 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Ami Shah ◽

Madison Ratkowski ◽

Alessandro Rosa ◽

Paul Feinstein ◽

Thomas Bozza

Keyword(s):

Gene Expression ◽

Large Family ◽

Sequence Motifs ◽

Specific Expression ◽

Cis Acting ◽

Conserved Sequence ◽

Trace Amine ◽

Sequence Elements ◽

Cell Type Specific Expression ◽

Cell Type Specific

Abstract Olfactory sensory neurons express a large family of odorant receptors (ORs) and a small family of trace amine-associated receptors (TAARs). While both families are subject to so-called singular expression (expression of one allele of one gene), the mechanisms underlying TAAR gene choice remain obscure. Here, we report the identification of two conserved sequence elements in the mouse TAAR cluster (T-elements) that are required for TAAR gene expression. We observed that cell-type-specific expression of a TAAR-derived transgene required either T-element. Moreover, deleting either element reduced or abolished expression of a subset of TAAR genes, while deleting both elements abolished olfactory expression of all TAARs in cis with the mutation. The T-elements exhibit several features of known OR enhancers but also contain highly conserved, unique sequence motifs. Our data demonstrate that TAAR gene expression requires two cooperative cis-acting enhancers and suggest that ORs and TAARs share similar mechanisms of singular expression.

Antimutator mutations in the alpha subunit of Escherichia coli DNA polymerase III: identification of the responsible mutations and alignment with other DNA polymerases.

Genetics ◽

10.1093/genetics/134.4.1039 ◽

1993 ◽

Vol 134 (4) ◽

pp. 1039-1044 ◽

Cited By ~ 2

Author(s):

I J Fijalkowska ◽

R M Schaaper

Keyword(s):

Escherichia Coli ◽

Amino Acid ◽

Dna Polymerase ◽

Dna Polymerases ◽

Alpha Subunit ◽

Sequence Motifs ◽

Dna Polymerase Iii ◽

Conserved Sequence ◽

Polymerase Iii ◽

Dna Replication Errors

Abstract The dnaE gene of Escherichia coli encodes the DNA polymerase (alpha subunit) of the main replicative enzyme, DNA polymerase III holoenzyme. We have previously identified this gene as the site of a series of seven antimutator mutations that specifically decrease the level of DNA replication errors. Here we report the nucleotide sequence changes in each of the different antimutator dnaE alleles. For each a single, but different, amino acid substitution was found among the 1,160 amino acids of the protein. The observed substitutions are generally nonconservative. All affected residues are located in the central one-third of the protein. Some insight into the function of the regions of polymerase III containing the affected residues was obtained by amino acid alignment with other DNA polymerases. We followed the principles developed in 1990 by M. Delarue et al. who have identified in DNA polymerases from a large number of prokaryotic and eukaryotic sources three highly conserved sequence motifs, which are suggested to contain components of the polymerase active site. We succeeded in finding these three conserved motifs in polymerase III as well. However, none of the amino acid substitutions responsible for the antimutator phenotype occurred at these sites. This and other observations suggest that the effect of these mutations may be exerted indirectly through effects on polymerase conformation and/or DNA/polymerase interactions.

p190 RhoGAP, the major RasGAP-associated protein, binds GTP directly.

Molecular and Cellular Biology ◽

10.1128/mcb.14.11.7173 ◽

1994 ◽

Vol 14 (11) ◽

pp. 7173-7181 ◽

Cited By ~ 40

Author(s):

R Foster ◽

K Q Hu ◽

D A Shaywitz ◽

J Settleman

Keyword(s):

Structural Features ◽

Cellular Protein ◽

Small Gtpases ◽

Sequence Motifs ◽

Amino Terminal ◽

Terminal Domain ◽

Gtp Binding ◽

Guanine Nucleotide Binding ◽

Complex Forms ◽

Carboxy Terminal

In mitogenically stimulated cells, a specific complex forms between the Ras GTPase-activating protein (RasGAP) and the cellular protein p190. We have previously reported that p190 contains a carboxy-terminal domain that functions as a GAP for the Rho family GTPases. Thus, the RasGAP-p190 complex may serve to couple Ras- and Rho-mediated signalling pathways. In addition to its RhoGAP domain, p190 contains an amino-terminal domain that contains sequence motifs found in all known GTPases. Here, we report that p190 binds GTP and GDP through this conserved domain and that the structural requirements for binding are similar to those seen with other GTPases. While the purified protein is unable to hydrolyze GTP, we detect an activity in cell lysates that can promote GTP hydrolysis by p190. A mutated form of p190 that fails to bind nucleotide retains its RasGAP binding and RhoGAP activities, indicating that GTP binding by p190 is not required for these functions. The sequence of p190 in the GTP-binding domain, which shares structural features with both the Ras-like small GTPases and the larger G proteins, suggests that this protein defines a novel class of guanine nucleotide-binding proteins.

Molecular and Cytological Analyses of Large Tracks of Centromeric DNA Reveal the Structure and Evolutionary Dynamics of Maize Centromeres

Genetics ◽

10.1093/genetics/163.2.759 ◽

2003 ◽

Vol 163 (2) ◽

pp. 759-770 ◽

Cited By ~ 5

Author(s):

Kiyotaka Nagaki ◽

Junqi Song ◽

Robert M Stupar ◽

Alexander S Parokonny ◽

Qiaoping Yuan ◽

...

Keyword(s):

Evolutionary Dynamics ◽

Artificial Chromosome ◽

Grass Species ◽

Sequence Motifs ◽

Long Terminal Repeats ◽

Satellite Repeat ◽

Sequence Comparisons ◽

Centromeric Dna ◽

Conserved Sequence ◽

Satellite Sequences

Abstract We sequenced two maize bacterial artificial chromosome (BAC) clones anchored by the centromere-specific satellite repeat CentC. The two BACs, consisting of ∼200 kb of cytologically defined centromeric DNA, are composed exclusively of satellite sequences and retrotransposons that can be classified as centromere specific or noncentromere specific on the basis of their distribution in the maize genome. Sequence analysis suggests that the original maize sequences were composed of CentC arrays that were expanded by retrotransposon invasions. Seven centromere-specific retrotransposons of maize (CRM) were found in BAC 16H10. The CRM elements inserted randomly into either CentC monomers or other retrotransposons. Sequence comparisons of the long terminal repeats (LTRs) of individual CRM elements indicated that these elements transposed within the last 1.22 million years. We observed that all of the previously reported centromere-specific retrotransposons in rice and barley, which belong to the same family as the CRM elements, also recently transposed with the oldest element having transposed ∼3.8 million years ago. Highly conserved sequence motifs were found in the LTRs of the centromere-specific retrotransposons in the grass species, suggesting that the LTRs may be important for the centromere specificity of this retrotransposon family.
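
The abstract does not say how the transposition times were derived from the LTR comparisons. One widely used approach relies on the fact that the two LTRs of a retrotransposon are identical at insertion, so their divergence K accumulates on both copies and the age is roughly T = K / (2r) for a substitution rate r. The sketch below illustrates that calculation with a Jukes-Cantor correction. The function name, the correction model, and the rate used in the example are all assumptions for illustration and should not be read as the authors' procedure or parameters.

```python
import math


def ltr_insertion_age(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimate insertion age (years) from two aligned LTR sequences.

    Assumptions: the inputs are already aligned and of equal length,
    only A/C/G/T columns are compared, and divergence is corrected
    with the Jukes-Cantor (JC69) model before applying T = K / (2r).
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("aligned LTRs must have the same length")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")

    compared = 0
    differences = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a in "ACGT" and b in "ACGT":  # skip gaps and ambiguous bases
            compared += 1
            if a != b:
                differences += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = differences / compared  # observed proportion of differing sites
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("divergence too high for the JC69 correction")

    k = -0.75 * math.log(1.0 - 4.0 * p / 3.0)  # JC69 distance
    return k / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # Toy sequences and a placeholder rate, not values from the paper.
    print(ltr_insertion_age("ACGTACGTAC", "ACGTACGTAT", 1e-8))
```

The resulting age scales inversely with the assumed rate, so estimates such as those quoted above are only as reliable as the substitution rate chosen for the lineage.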
