protein family database Latest Research Papers

DPCfam: a new method for unsupervised protein family classification

As the UniProt database approaches the 200 million entry mark, the vast majority of the proteins it contains lack any experimental validation of their function. In this context, the identification of homologous relationships between proteins remains the single most widely applicable tool for generating functional and structural hypotheses in silico. Although many databases classify proteins and protein domains into homologous families, large sections of the sequence space remain unassigned. We introduce DPCfam, a new unsupervised procedure that uses sequence alignments and Density Peak Clustering to automatically classify homologous protein regions. Here, we present a proof-of-principle experiment based on the analysis of two clans from the Pfam protein family database. Our tests indicate that the clusters DPCfam generates automatically are generally evolutionarily accurate, corresponding to one or more Pfam families, and that they cover a significant fraction of known homologs. Overall, DPCfam shows potential both for assisting manual annotation efforts (domain discovery, detection of classification inconsistencies, improvement of family coverage and boosting of clan membership) and as a stand-alone tool for unsupervised classification of sparsely annotated protein datasets such as those from environmental metagenomics studies (domain discovery, analysis of domain diversity). The algorithm implementation used in this paper is available at https://gitlab.com/ETRu/dpcfam (requires Python 3 and a C++ compiler; runs on Linux systems); data are available at https://zenodo.org/record/3934399
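For readers unfamiliar with Density Peak Clustering, the following is a minimal sketch of the general technique, not the DPCfam implementation. It assumes a precomputed symmetric distance matrix (in a sequence setting this might be derived from alignments; here it is built from hypothetical 2D toy points), a hand-picked cutoff `d_c`, and a fixed number of clusters in place of the decision-graph inspection usually used to choose cluster centers. All function and parameter names are illustrative.

```python
import math


def density_peak_clustering(dist, d_c, n_clusters):
    """Toy Density Peak Clustering on a symmetric distance matrix.

    dist       -- list of lists, dist[i][j] is the distance between items i and j
    d_c        -- cutoff distance used to count neighbours (assumed, hand-picked)
    n_clusters -- number of cluster centers to keep (a simplification)
    """
    n = len(dist)
    # Local density: number of other items closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Rank items by decreasing density; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank_of = {i: r for r, i in enumerate(order)}

    delta = [0.0] * n            # distance to the nearest higher-ranked item
    nearest_higher = [-1] * n    # index of that item
    for r, i in enumerate(order):
        if r == 0:
            # The densest item has no higher-ranked neighbour.
            delta[i] = max(dist[i])
            continue
        j = min(order[:r], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_higher[i] = j

    # Centers: items with the largest rho * delta (ties favour higher rank,
    # which guarantees the densest item is always selected).
    centers = sorted(range(n),
                     key=lambda i: (-rho[i] * delta[i], rank_of[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every other item inherits the label of its nearest higher-ranked item,
    # which has already been labelled because we walk in rank order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels


def toy_distance_matrix(points):
    """Euclidean distance matrix for hypothetical 2D toy points."""
    return [[math.dist(p, q) for q in points] for p in points]


if __name__ == "__main__":
    blob_a = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05)]
    blob_b = [(5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (5.05, 5.05)]
    print(density_peak_clustering(toy_distance_matrix(blob_a + blob_b), 0.2, 2))
```

Taking a distance matrix rather than coordinates keeps the sketch applicable to items, such as sequence regions, that have pairwise distances but no natural embedding.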

Characterization of Prenylated C-terminal Peptides Using a Novel Capture Technique Coupled with LCMS

10.1101/2020.01.15.908152 ◽

2020 ◽

Author(s):

James A. Wilkins ◽

Krista Kaasik ◽

Robert J. Chalkley ◽

Alma Burlingame

Keyword(s):

Cell Culture ◽

High Performance ◽

Direct Approach ◽

Metabolic Labeling ◽

Tissue Samples ◽

Mixed Disulfide ◽

Cell Extracts ◽

Protein Family Database ◽

Prenylated Proteins

Post-translational modifications play a critical and diverse role in regulating cellular activities. Despite their fundamental importance to cellular function, there has been no report to date of an effective generalized approach to the targeting, extraction and characterization of the critical C-terminal regions of natively prenylated proteins. Various chemical modification and metabolic labeling strategies have been reported, but they are limited to cell culture systems and do not allow analysis of tissue samples. The chemical characteristics (hydrophobicity, low abundance, highly basic charge) of many of the C-terminal regions of prenylated proteins have impaired the use of standard proteomic workflows. In this context, we sought a direct approach to examine these proteins in tissue without the use of labeling. Here we demonstrate that prenylated proteins can be captured on chromatographic resins functionalized with mixed disulfide functions. Protease treatment of resin-bound proteins using chymotryptic digestion revealed peptides from many known prenylated proteins. Exposure of the protease-treated resin to reducing agents and hydro-organic mixtures released C-terminal peptides with intact prenyl groups along with other enzymatic modifications expected in this protein family. Database and search parameters were selected to allow for C-terminal modifications unique to these molecules, such as CAAX box processing and C-terminal methylation. In summary, we present a direct approach to enrich prenylated proteins from tissue and cell extracts and to obtain molecular-level detail about their prenylation using high performance LCMS, without the need for metabolic labeling or derivatization.

Rapid identification of novel protein families using similarity searches

F1000Research ◽

10.12688/f1000research.17315.1 ◽

2018 ◽

Vol 7 ◽

pp. 1975

Author(s):

Matt Jeffryes ◽

Alex Bateman

Keyword(s):

Rapid Identification ◽

Jaccard Index ◽

Protein Family ◽

Important Task ◽

Protein Families ◽

P Protein ◽

Protein Family Database ◽

Similarity Searches ◽

Novel Protein

Protein family databases are an important tool for biologists trying to dissect the function of proteins. Comparing potential new families to the thousands of existing entries is an important task when operating a protein family database. This comparison helps to understand whether a collection of protein regions forms a novel family or has overlaps with existing families of proteins. In this paper, we describe a method for performing this analysis with an adjustable level of accuracy, depending on the desired speed, enabling interactive comparisons. This method is based upon the MinHash algorithm, which we have further extended to calculate the Jaccard containment rather than the Jaccard index of the original MinHash technique. Testing this method with the Pfam protein family database, we are able to compare potential new families to the over 17,000 existing families in Pfam in less than a second, with little loss in accuracy.
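The difference between the Jaccard index and the Jaccard containment can be illustrated with a small MinHash sketch. This is not the authors' implementation: it assumes each family is a set of string identifiers (the `seq…` names below are hypothetical), uses seeded BLAKE2b hashes as the hash family, and derives containment from the estimated Jaccard index and the known set sizes via |A ∩ B| = J(|A| + |B|) / (1 + J). The number of hash functions plays the role of the speed/accuracy knob.

```python
import hashlib


def _hash(item, seed):
    """64-bit hash of an item under a given seed (one member of the hash family)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """MinHash signature: the minimum hash value of the set under each seed."""
    items = set(items)
    if not items:
        raise ValueError("cannot sketch an empty set")
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree, an estimate of |A∩B| / |A∪B|."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def estimate_containment(sig_a, size_a, sig_b, size_b):
    """Estimate |A∩B| / |A| from the signatures and the known set sizes."""
    j = estimate_jaccard(sig_a, sig_b)
    intersection = j * (size_a + size_b) / (1.0 + j)
    # Sampling noise can push the estimate slightly above 1, so clip it.
    return min(1.0, intersection / size_a)


if __name__ == "__main__":
    # Hypothetical candidate family fully contained in a larger existing family.
    candidate = {f"seq{i}" for i in range(50)}
    existing = {f"seq{i}" for i in range(200)}
    sig_c = minhash_signature(candidate, 512)
    sig_e = minhash_signature(existing, 512)
    print("Jaccard index  ~", estimate_jaccard(sig_c, sig_e))
    print("Containment    ~", estimate_containment(sig_c, len(candidate),
                                                   sig_e, len(existing)))
```

In this example the true Jaccard index is only 0.25 even though every member of the candidate set already belongs to the existing family, whereas the true containment is 1.0. That asymmetry is why containment is the more informative measure when asking whether a small candidate family overlaps a large existing one.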

The Abi Proteins and Their Involvement in Bacteriocin Self-Immunity

Journal of Bacteriology ◽

10.1128/jb.01553-09 ◽

2010 ◽

Vol 192 (8) ◽

pp. 2068-2076 ◽

Cited By ~ 61

Author(s):

Morten Kjos ◽

Lars Snipen ◽

Zhian Salehian ◽

Ingolf F. Nes ◽

Dzung B. Diep

Keyword(s):

Receptor Site ◽

Protein Family ◽

Site Directed Mutagenesis ◽

Gram Positive Bacteria ◽

Common Receptor ◽

Protein Family Database ◽

Immunity Genes ◽

Membrane Anchoring ◽

Immunity Function ◽

High Degree

The Abi protein family consists of putative membrane-bound metalloproteases. While they are involved in membrane anchoring of proteins in eukaryotes, little is known about their function in prokaryotes. In some known bacteriocin loci, Abi genes have been found downstream of bacteriocin structural genes (e.g., pln locus from Lactobacillus plantarum and sag locus from Streptococcus pyogenes), where they probably are involved in self-immunity. By modifying the profile hidden Markov model used to select Abi proteins in the Pfam protein family database, we show that this family is larger than presently recognized. Using bacteriocin-associated Abi genes as a means to search for novel bacteriocins in sequenced genomes, seven new bacteriocin-like loci were identified in Gram-positive bacteria. One such locus, from Lactobacillus sakei 23K, was selected for further experimental study, and it was confirmed that the bacteriocin-like genes (skkAB) exhibited antimicrobial activity when expressed in a heterologous host and that the associated Abi gene (skkI) conferred immunity against the cognate bacteriocin. Similar investigation of the Abi gene plnI and the Abi-like gene plnL from L. plantarum also confirmed their involvement in immunity to their cognate bacteriocins (PlnEF and PlnJK, respectively). Interestingly, the immunity genes from these three systems conferred a high degree of cross-immunity against each other's bacteriocins, suggesting the recognition of a common receptor. Site-directed mutagenesis demonstrated that the conserved motifs constituting the putative proteolytic active site of the Abi proteins are essential for the immunity function of SkkI, and to our knowledge, this represents a new concept in self-immunity.

The CATH extended protein-family database: Providing structural annotations for genome sequences

Protein Science ◽

10.1110/ps.16802 ◽

2009 ◽

Vol 11 (2) ◽

pp. 233-244 ◽

Cited By ~ 23

Author(s):

Frances M.G. Pearl ◽

David Lee ◽

James E. Bray ◽

Daniel W.A. Buchan ◽

Adrian J. Shepherd ◽

...

Keyword(s):

Protein Family ◽

Genome Sequences ◽

Protein Family Database

The SYSTERS Protein Family Database in 2005

Nucleic Acids Research ◽

10.1093/nar/gki030 ◽

2004 ◽

Vol 33 (Database issue) ◽

pp. D226-D229 ◽

Cited By ~ 23

Author(s):

T. Meinel

Keyword(s):

Protein Family ◽

Protein Family Database

Faculty Opinions recommendation of PFDB: a generic protein family database integrating the CATH domain structure database with sequence based protein family resources.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1011539.181067 ◽

2003 ◽

Author(s):

Janet Thornton

Keyword(s):

Domain Structure ◽

Protein Family ◽

Family Resources ◽

Structure Database ◽

Protein Family Database

PFDB: a generic protein family database integrating the CATH domain structure database with sequence based protein family resources

Bioinformatics ◽

10.1093/bioinformatics/18.12.1666 ◽

2002 ◽

Vol 18 (12) ◽

pp. 1666-1672 ◽

Cited By ~ 8

Author(s):

A. J. Shepherd ◽

N. J. Martin ◽

R. G. Johnson ◽

P. Kellam ◽

C. A. Orengo

Keyword(s):

Domain Structure ◽

Protein Family ◽

Family Resources ◽

Structure Database ◽

Protein Family Database

The CATH protein family database: A resource for structural and functional annotation of genomes

PROTEOMICS ◽

10.1002/1615-9861(200201)2:1<11::aid-prot11>3.0.co;2-t ◽

2002 ◽

Vol 2 (1) ◽

pp. 11-21 ◽

Cited By ~ 51

Author(s):

Christine A. Orengo ◽

James E. Bray ◽

Daniel W. A. Buchan ◽

Andrew Harrison ◽

David Lee ◽

...

Keyword(s):

Functional Annotation ◽

Protein Family ◽

Protein Family Database

ProClass protein family database

Nucleic Acids Research ◽

10.1093/nar/28.1.273 ◽

2000 ◽

Vol 28 (1) ◽

pp. 273-276 ◽

Cited By ~ 9

Author(s):

H. Huang

Keyword(s):

Protein Family ◽

Protein Family Database

