The CATH extended protein-family database: Providing structural annotations for genome sequences

2009 ◽  
Vol 11 (2) ◽  
pp. 233-244 ◽  
Author(s):  
Frances M.G. Pearl ◽  
David Lee ◽  
James E. Bray ◽  
Daniel W.A. Buchan ◽  
Adrian J. Shepherd ◽  
...  
2000 ◽  
Vol 28 (1) ◽  
pp. 273-276 ◽  
Author(s):  
H. Huang

2002 ◽  
Vol 18 (12) ◽  
pp. 1666-1672 ◽  
Author(s):  
A. J. Shepherd ◽  
N. J. Martin ◽  
R. G. Johnson ◽  
P. Kellam ◽  
C. A. Orengo

PROTEOMICS ◽  
2002 ◽  
Vol 2 (1) ◽  
pp. 11-21 ◽  
Author(s):  
Christine A. Orengo ◽  
James E. Bray ◽  
Daniel W. A. Buchan ◽  
Andrew Harrison ◽  
David Lee ◽  
...  

1999 ◽  
Vol 27 (1) ◽  
pp. 272-274 ◽  
Author(s):  
C. H. Wu ◽  
S. Shivakumar ◽  
H. Huang

2020 ◽  
Author(s):  
Elena Tea Russo ◽  
Alessandro Laio ◽  
Marco Punta

As the UniProt database approaches the 200-million-entry mark, the vast majority of the proteins it contains lack any experimental validation of their functions. In this context, the identification of homologous relationships between proteins remains the single most widely applicable tool for generating functional and structural hypotheses in silico. Although many databases exist that classify proteins and protein domains into homologous families, large sections of the sequence space remain unassigned. We introduce DPCfam, a new unsupervised procedure that uses sequence alignments and Density Peak Clustering to automatically classify homologous protein regions. Here, we present a proof-of-principle experiment based on the analysis of two clans from the Pfam protein family database. Our tests indicate that automatically generated DPCfam clusters are generally evolutionarily accurate, corresponding to one or more Pfam families, and that they cover a significant fraction of known homologs. Overall, DPCfam shows potential both for assisting manual annotation efforts (domain discovery, detection of classification inconsistencies, improvement of family coverage and boosting of clan membership) and as a stand-alone tool for unsupervised classification of sparsely annotated protein datasets such as those from environmental metagenomics studies (domain discovery, analysis of domain diversity). The algorithm implementation used in this paper is available at https://gitlab.com/ETRu/dpcfam (requires Python 3 and a C++ compiler; runs on Linux systems); data are available at https://zenodo.org/record/3934399
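The Density Peak Clustering step named in this abstract can be illustrated on toy data. The sketch below is a generic, minimal version of the technique applied to 2D points, not the DPCfam pipeline itself (which operates on sequence alignments). The function name, the cutoff `d_c`, and the choice of picking a fixed number of centres by the product of density and separation are all assumptions made for illustration.

```python
import math

def density_peak_clustering(points, d_c, n_clusters):
    """Toy density peak clustering (hypothetical helper, not DPCfam code)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from highest to lowest density (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank_of = {i: r for r, i in enumerate(order)}

    # delta: distance to the nearest point that ranks higher in density.
    delta = [0.0] * n
    nearest_higher = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # convention for the densest point
            continue
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i] = dist[i][j]
        nearest_higher[i] = j

    # Centres: points with the largest rho * delta (dense and well separated).
    centres = sorted(range(n),
                     key=lambda i: (-rho[i] * delta[i], rank_of[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centres):
        labels[i] = c

    # Every other point inherits the label of its nearest denser neighbour.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels

# Two well-separated toy blobs, each a unit square plus its midpoint.
blob_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
blob_b = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels = density_peak_clustering(blob_a + blob_b, d_c=1.0, n_clusters=2)
```

On this toy input the two midpoints have the highest density and become the cluster centres, so each blob receives its own label.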


2018 ◽  
Vol 200 (7) ◽  
Author(s):  
Tino Krell

ABSTRACT Two-component systems (TCSs) exist in bacteria and archaea. In contrast to the knowledge of bacterial TCSs, little information is available on their archaeal counterparts. In the current issue of Journal of Bacteriology, Galperin and coworkers present a bioinformatics analysis of TCS genes from archaeal genome sequences (M. Y. Galperin, K. S. Makarova, Y. I. Wolf, and E. V. Koonin, J Bacteriol 200:e00681-17, 2018, https://doi.org/10.1128/JB.00681-17). This study identifies different aspects in which TCS-mediated signaling differs in bacteria and archaea and forms a sound basis for the experimental design of studies to increase our knowledge of this poorly investigated protein family.


2004 ◽  
Vol 33 (Database issue) ◽  
pp. D226-D229 ◽  
Author(s):  
T. Meinel

2011 ◽  
Vol 5 ◽  
pp. BBI.S7316 ◽  
Author(s):  
Shaneka S. Simmons ◽  
Raphael D. Isokpehi ◽  
Shyretha D. Brown ◽  
Donee L. Mcallister ◽  
Charnia C. Hall ◽  
...  

Rhodopseudomonas palustris, a nonsulphur purple photosynthetic bacterium, has been extensively investigated for its metabolic versatility, including its ability to produce hydrogen gas from sunlight and biomass. The availability of the finished genome sequences of six R. palustris strains (BisA53, BisB18, BisB5, CGA009, HaA2 and TIE-1) combined with online bioinformatics software for integrated analysis presents new opportunities to determine the genomic basis of the metabolic versatility and ecological lifestyles of this bacterial species. The purpose of this investigation was to compare the functional annotations available for multiple R. palustris genomes to identify annotations that can be further investigated for strain-specific or uniquely shared phenotypic characteristics. A total of 2,355 protein family Pfam domain annotations were clustered based on presence or absence in the six genomes. The clustering process identified groups of functional annotations including those that could be verified as strain-specific or uniquely shared phenotypes. For example, genes encoding water/glycerol transport were present in the genome sequences of strains CGA009 and BisB5, but absent in strains BisA53, BisB18, HaA2 and TIE-1. Protein structural homology modeling predicted that the two orthologous 240 aa R. palustris aquaporins have water-specific transport function. Based on observations in other microbes, the presence of aquaporin in R. palustris strains may improve freeze tolerance in natural conditions of rapid freezing such as nitrogen fixation at low temperatures where access to liquid water is a limiting factor for nitrogenase activation. In the case of adaptive loss of aquaporin genes, strains may be better adapted to survive in conditions of high-sugar content such as fermentation of biomass for biohydrogen production. Finally, web-based resources were developed to allow for interactive, user-defined selection of the relationship between protein family annotations and the R. palustris genomes.
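The presence/absence comparison described in this abstract can be sketched in its simplest form: group annotations that share the same presence profile across the six strains. The abstract does not say which clustering procedure was used, so exact-profile grouping is an assumption here. The function names and the `hypothetical_domain_*` entries are placeholders; only the water/glycerol transport pattern (present in CGA009 and BisB5) comes from the abstract.

```python
from collections import defaultdict

# Strain names as listed in the abstract.
STRAINS = ("BisA53", "BisB18", "BisB5", "CGA009", "HaA2", "TIE-1")

def group_by_presence(annotations):
    """Group annotation names by their presence/absence profile.

    `annotations` maps an annotation name to the set of strains carrying it.
    Returns a dict from a tuple of booleans (one per strain) to a name list.
    """
    groups = defaultdict(list)
    for name, present_in in annotations.items():
        profile = tuple(strain in present_in for strain in STRAINS)
        groups[profile].append(name)
    return dict(groups)

def strains_for(profile):
    """Translate a boolean profile back into strain names."""
    return [strain for strain, flag in zip(STRAINS, profile) if flag]

# Toy input: only the first entry reflects a pattern reported in the abstract;
# the hypothetical_domain_* entries are invented placeholders.
toy = {
    "water/glycerol transport": {"CGA009", "BisB5"},
    "hypothetical_domain_A": set(STRAINS),
    "hypothetical_domain_B": {"CGA009", "BisB5"},
    "hypothetical_domain_C": {"TIE-1"},
}
groups = group_by_presence(toy)
```

A profile carried by a single strain marks a strain-specific candidate, while a profile shared by a small subset of strains, such as CGA009 and BisB5, marks a uniquely shared one.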


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 1975
Author(s):  
Matt Jeffryes ◽  
Alex Bateman

Protein family databases are an important tool for biologists trying to dissect the function of proteins. Comparing potential new families to the thousands of existing entries is an important task when operating a protein family database. This comparison helps to understand whether a collection of protein regions forms a novel family or has overlaps with existing families of proteins. In this paper, we describe a method for performing this analysis with an adjustable level of accuracy, depending on the desired speed, enabling interactive comparisons. This method is based upon the MinHash algorithm, which we have further extended to calculate the Jaccard containment rather than the Jaccard index of the original MinHash technique. Testing this method with the Pfam protein family database, we are able to compare potential new families to the over 17,000 existing families in Pfam in less than a second, with little loss in accuracy.
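The difference between the Jaccard index and the Jaccard containment mentioned in this abstract can be sketched with a small MinHash example. This is an illustrative sketch, not the authors' implementation: it estimates the Jaccard index from signatures and then derives the containment |A ∩ B| / |A| from that estimate and the known set sizes, which is one common approach and may differ from the extension described in the paper. The helper names, the k-mer length, and the number of hash functions are assumptions.

```python
import hashlib

def kmers(sequence, k=3):
    """Return the set of overlapping k-mers in a sequence (k is an assumption)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def _hash(item, seed):
    """Seeded 64-bit hash built from BLAKE2b; one seed per hash function."""
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def minhash_signature(items, num_hashes=256):
    """MinHash signature: the minimum hash value under each seeded function."""
    if not items:
        raise ValueError("cannot sketch an empty set")
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates |A ∩ B| / |A ∪ B|."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def estimate_containment(sig_a, sig_b, size_a, size_b):
    """Estimate |A ∩ B| / |A| from the Jaccard estimate and the set sizes."""
    j = estimate_jaccard(sig_a, sig_b)
    # From J = I / (|A| + |B| - I), solve for the intersection size I.
    intersection = j * (size_a + size_b) / (1 + j)
    return min(1.0, intersection / size_a)

# Toy sets: `small` is fully contained in `large`; `other` shares nothing.
small = {f"k{i}" for i in range(100)}
large = {f"k{i}" for i in range(400)}
other = {f"z{i}" for i in range(100)}
sig_small, sig_large, sig_other = (minhash_signature(s) for s in (small, large, other))
small_in_large = estimate_containment(sig_small, sig_large, len(small), len(large))
large_in_small = estimate_containment(sig_large, sig_small, len(large), len(small))
```

Containment is asymmetric: a small candidate family fully covered by a large existing family scores high when measured against the candidate, even though the Jaccard index of the pair is low. Raising `num_hashes` trades speed for accuracy, which mirrors the adjustable accuracy described in the abstract.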


2010 ◽  
Vol 192 (8) ◽  
pp. 2068-2076 ◽  
Author(s):  
Morten Kjos ◽  
Lars Snipen ◽  
Zhian Salehian ◽  
Ingolf F. Nes ◽  
Dzung B. Diep

ABSTRACT The Abi protein family consists of putative membrane-bound metalloproteases. While they are involved in membrane anchoring of proteins in eukaryotes, little is known about their function in prokaryotes. In some known bacteriocin loci, Abi genes have been found downstream of bacteriocin structural genes (e.g., pln locus from Lactobacillus plantarum and sag locus from Streptococcus pyogenes), where they probably are involved in self-immunity. By modifying the profile hidden Markov model used to select Abi proteins in the Pfam protein family database, we show that this family is larger than presently recognized. Using bacteriocin-associated Abi genes as a means to search for novel bacteriocins in sequenced genomes, seven new bacteriocin-like loci were identified in Gram-positive bacteria. One such locus, from Lactobacillus sakei 23K, was selected for further experimental study, and it was confirmed that the bacteriocin-like genes (skkAB) exhibited antimicrobial activity when expressed in a heterologous host and that the associated Abi gene (skkI) conferred immunity against the cognate bacteriocin. Similar investigation of the Abi gene plnI and the Abi-like gene plnL from L. plantarum also confirmed their involvement in immunity to their cognate bacteriocins (PlnEF and PlnJK, respectively). Interestingly, the immunity genes from these three systems conferred a high degree of cross-immunity against each other's bacteriocins, suggesting the recognition of a common receptor. Site-directed mutagenesis demonstrated that the conserved motifs constituting the putative proteolytic active site of the Abi proteins are essential for the immunity function of SkkI, and to our knowledge, this represents a new concept in self-immunity.

