Analysis and prediction of functional sub-types from protein sequence alignments

Sridhar S. Hannenhalli; Robert B. Russell

doi:10.1006/jmbi.2000.4036

Crystallization of an atypical short-chain dehydrogenase from Vibrio vulnificus lacking the conserved catalytic tetrad

Acta Crystallographica Section F Structural Biology and Crystallization Communications, 2012, Vol 68 (7), pp. 771-774

doi:10.1107/s1744309112018672

Cited By ~ 3

Author(s): Geraldine Buysschaert; Kenneth Verstraete; Savvas N. Savvides; Bjorn Vergauwen

Keyword(s): Protein Sequence; Scientific Literature; Vibrio Vulnificus; Short Chain; Reduced Form; Molecular Replacement; Structural Studies; Sequence Alignments; Crystal Forms; Short Chain Dehydrogenase

Short-chain dehydrogenases/reductases (SDRs) are a rapidly expanding superfamily of enzymes that are found in all kingdoms of life. Hallmarked by a highly conserved Asn-Ser-Tyr-Lys catalytic tetrad, SDRs have a broad substrate spectrum and play diverse roles in key metabolic processes. Locus tag VVA1599 in Vibrio vulnificus encodes a short-chain dehydrogenase (hereafter referred to as SDRvv) which lacks the signature catalytic tetrad of SDR members. Structure-based protein sequence alignments have suggested that SDRvv may harbour a unique binding site for its nicotinamide cofactor. To date, structural studies of SDRs with altered catalytic centres are underrepresented in the scientific literature, thus limiting understanding of their spectrum of substrate and cofactor preferences. Here, the expression, purification and crystallization of recombinant SDRvv are presented. Two well diffracting crystal forms could be obtained by cocrystallization in the presence of the reduced form of the phosphorylated nicotinamide cofactor NADPH. The collected data were of sufficient quality for successful structure determination by molecular replacement and subsequent refinement. This work sets the stage for deriving the identity of the natural substrate of SDRvv and the structure–function landscape of typical and atypical SDRs.

PROMALS web server for accurate multiple protein sequence alignments

Nucleic Acids Research, 2007, Vol 35 (Web Server), pp. W649-W652

doi:10.1093/nar/gkm227

Cited By ~ 46

Author(s): J. Pei; B.-H. Kim; M. Tang; N. V. Grishin

Keyword(s): Protein Sequence; Web Server; Sequence Alignments; Multiple Protein

Optimizing the size of the sequence profiles to increase the accuracy of protein sequence alignments generated by profile-profile algorithms

Bioinformatics, 2008, Vol 24 (9), pp. 1145-1153

doi:10.1093/bioinformatics/btn097

Cited By ~ 3

Author(s): A. Poleksic; M. Fienup

Keyword(s): Protein Sequence; Sequence Alignments; Sequence Profiles

Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation

Bioinformatics, 1993, Vol 9 (6), pp. 745-756

doi:10.1093/bioinformatics/9.6.745

Cited By ~ 94

Author(s): Craig D. Livingstone; Geoffrey J. Barton

Keyword(s): Protein Sequence; Hierarchical Analysis; Sequence Alignments; Residue Conservation

Determination of reliable regions in protein sequence alignments

Protein Engineering Design and Selection, 1990, Vol 3 (7), pp. 565-569

doi:10.1093/protein/3.7.565

Cited By ~ 60

Author(s): Martin Vingron; Patrick Argos

Keyword(s): Protein Sequence; Sequence Alignments

HPMV: Human protein mutation viewer — relating sequence mutations to protein sequence architecture and function changes

Journal of Bioinformatics and Computational Biology, 2015, Vol 13 (05), pp. 1550028

doi:10.1142/s0219720015500286

Cited By ~ 2

Author(s): Westley Arthur Sherman; Durga Bhavani Kuchibhatla; Vachiranee Limviphuvadh; Sebastian Maurer-Stroh; Birgit Eisenhaber; ...

Keyword(s): Protein Sequence; Genetic Disorders; Human Protein; Post Translational Modification; Sequence Alignments; Multiple Sequence; Sequence Architecture; Protein Mutation; Architectural Features; Protein Mutations

Next-generation sequencing advances are rapidly expanding the number of human mutations to be analyzed for causative roles in genetic disorders. Our Human Protein Mutation Viewer (HPMV) is intended to explore the biomolecular mechanistic significance of non-synonymous human mutations in protein-coding genomic regions. The tool helps to assess whether protein mutations affect the occurrence of sequence-architectural features (globular domains, targeting signals, post-translational modification sites, etc.). As input, HPMV accepts protein mutations as UniProt accessions with mutations (e.g. HGVS nomenclature), genome coordinates, or FASTA sequences. As output, HPMV provides an interactive cartoon showing the mutations in relation to elements of the sequence architecture. A large variety of protein sequence architectural features were selected for their particular relevance to mutation interpretation. Clicking a sequence feature in the cartoon expands a tree view of additional information including multiple sequence alignments of conserved domains and a simple 3D viewer mapping the mutation to known PDB structures, if available. The cartoon is also correlated with a multiple sequence alignment of similar sequences from other organisms. In cases where a mutation is likely to have a straightforward interpretation (e.g. a point mutation disrupting a well-understood targeting signal), this interpretation is suggested. The interactive cartoon can be downloaded as a standalone viewer in Java jar format to be saved and viewed later with only a standard Java runtime environment. The HPMV website is: http://hpmv.bii.a-star.edu.sg/ .

Using homology relations within a database markedly boosts protein sequence similarity search

Proceedings of the National Academy of Sciences, 2015, Vol 112 (22), pp. 7003-7008

doi:10.1073/pnas.1424324112

Cited By ~ 4

Author(s): Jing Tong; Ruslan I. Sadreyev; Jimin Pei; Lisa N. Kinch; Nick V. Grishin

Keyword(s): Protein Sequence; Sequence Similarity; Large Fraction; Query Sequence; Homology Search; Sequence Alignments; Current Sequence; Detection Quality; And Function; Precision Rate

Inference of homology from protein sequences provides an essential tool for analyzing protein structure, function, and evolution. Current sequence-based homology search methods are still unable to detect many similarities evident from protein spatial structures. In computer science a search engine can be improved by considering networks of known relationships within the search database. Here, we apply this idea to protein-sequence–based homology search and show that it dramatically enhances the search accuracy. Our new method, COMPADRE (COmparison of Multiple Protein sequence Alignments using Database RElationships) assesses the relationship between the query sequence and a hit in the database by considering the similarity between the query and hit’s known homologs. This approach increases detection quality, boosting the precision rate from 18% to 83% at half-coverage of all database homologs. The increased precision rate allows detection of a large fraction of protein structural relationships, thus providing structure and function predictions for previously uncharacterized proteins. Our results suggest that this general approach is applicable to a wide variety of methods for detection of biological similarities. The web server is available at prodata.swmed.edu/compadre.
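The core idea in the abstract above, scoring a hit by also looking at how the query relates to the hit's known homologs, can be illustrated with a toy rescoring step. This is only a minimal sketch under stated assumptions, not the COMPADRE algorithm: the function name `rescore_hits`, the plain averaging rule, and the example scores are all hypothetical and chosen for illustration.

```python
from statistics import mean


def rescore_hits(direct_scores, homologs):
    """Rescore each database entry using its known homologs (illustrative only).

    direct_scores: dict mapping entry ID -> query-vs-entry similarity score
    homologs:      dict mapping entry ID -> list of IDs of its known homologs

    Assumption: the new score is the plain average of the entry's own score
    and the query's scores against that entry's homologs.
    """
    rescored = {}
    for hit, score in direct_scores.items():
        neighbour_scores = [
            direct_scores[h] for h in homologs.get(hit, ()) if h in direct_scores
        ]
        rescored[hit] = mean([score, *neighbour_scores])
    return rescored


# Hypothetical data: entry B scores poorly on its own, but its known
# homologs A and C score well against the query.
direct = {"A": 10.0, "B": 2.0, "C": 9.0, "D": 3.0}
known = {"A": ["C"], "B": ["A", "C"], "C": ["A"], "D": []}

rescored = rescore_hits(direct, known)
ranking = sorted(rescored, key=rescored.get, reverse=True)
```

In this toy data, entry B ranks below D on direct score alone but moves above it once the scores of its homologs are taken into account, which is the kind of effect the abstract describes in general terms.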

Predicting reliable regions in protein sequence alignments

Bioinformatics, 2002, Vol 18 (2), pp. 306-314

doi:10.1093/bioinformatics/18.2.306

Cited By ~ 52

Author(s): M. Cline; R. Hughey; K. Karplus

Keyword(s): Protein Sequence; Sequence Alignments

Empirical Analysis of Protein Insertions and Deletions Determining Parameters for the Correct Placement of Gaps in Protein Sequence Alignments

Journal of Molecular Biology, 2004, Vol 341 (2), pp. 617-631

doi:10.1016/j.jmb.2004.05.045

Cited By ~ 55

Author(s): Mike S.S. Chang; Steven A. Benner

Keyword(s): Empirical Analysis; Protein Sequence; Sequence Alignments; Insertions And Deletions; Correct Placement

A novel sequence alignment algorithm based on deep learning of the protein folding code

Bioinformatics, 2020

doi:10.1093/bioinformatics/btaa810

Cited By ~ 1

Author(s): Mu Gao; Jeffrey Skolnick

Keyword(s): Protein Folding; Deep Learning; Sequence Alignment; Protein Sequence; Protein Structures; Supplementary Information; Alignment Algorithm; Sequence Alignments; Alignment Algorithms; Structural Alignments

Motivation: From evolutionary interference, function annotation to structural prediction, protein sequence comparison has provided crucial biological insights. While many sequence alignment algorithms have been developed, existing approaches often cannot detect hidden structural relationships in the ‘twilight zone’ of low sequence identity. To address this critical problem, we introduce a computational algorithm that performs protein Sequence Alignments from deep-Learning of Structural Alignments (SAdLSA, silent ‘d’). The key idea is to implicitly learn the protein folding code from many thousands of structural alignments using experimentally determined protein structures.

Results: To demonstrate that the folding code was learned, we first show that SAdLSA trained on pure α-helical proteins successfully recognizes pairs of structurally related pure β-sheet protein domains. Subsequent training and benchmarking on larger, highly challenging datasets show significant improvement over established approaches. For challenging cases, SAdLSA is ∼150% better than HHsearch for generating pairwise alignments and ∼50% better for identifying the proteins with the best alignments in a sequence library. The time complexity of SAdLSA is O(N) thanks to GPU acceleration.

Availability and implementation: Datasets and source codes of SAdLSA are available free of charge for academic users at http://sites.gatech.edu/cssb/sadlsa/.

Contact: [email protected] or [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
