A protein standard that emulates homology for the characterization of protein inference algorithms

2017 ◽  
Author(s):  
Matthew The ◽  
Fredrik Edfors ◽  
Yasset Perez-Riverol ◽  
Samuel H. Payne ◽  
Michael R. Hoopmann ◽  
...  

Abstract: A natural way to benchmark the performance of an analytical experimental setup is to use samples of known content and see to what degree one can correctly infer the content of such a sample from the data. For shotgun proteomics, one of the inherent problems of interpreting data is that the measured analytes are peptides, not the actual proteins themselves. Because some proteins share proteolytic peptides, more than one set of proteins could account for a given set of peptides, so mechanisms are needed that infer proteins from lists of detected peptides. A weakness of commercially available samples of known content is that they consist of proteins deliberately selected to produce tryptic peptides that are unique to a single protein. Such samples therefore do not expose any of the complications of protein inference. For a realistic benchmark of protein inference procedures, there is consequently a need for samples of known content in which the present proteins share peptides with known absent proteins. Here, we present such a standard, based on E. coli-expressed human protein fragments. To illustrate its usage, we benchmark a set of different protein inference procedures on the data. We observe that inference procedures excluding shared peptides provide more accurate error estimates than methods that include information from shared peptides, while still giving reasonable performance in terms of the number of identified proteins. We also demonstrate that using a sample of known protein content without shared tryptic peptides can give a false sense of accuracy for many protein inference methods.
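The contrast between inference that excludes shared peptides and inference that includes them can be sketched with a toy peptide-to-protein map. The mapping, protein names, and helper functions below are illustrative assumptions, not the benchmark's implementation:

```python
# Illustrative sketch (toy data, not the authors' code): compare protein
# inference using only peptides unique to one protein against inference
# that also credits shared peptides.

# Detected peptides and the proteins they could originate from.
peptide_to_proteins = {
    "PEPTIDEA": {"P1"},          # unique to P1
    "PEPTIDEB": {"P1", "P2"},    # shared: P2 has no unique evidence
    "PEPTIDEC": {"P3"},          # unique to P3
}

def infer_unique_only(pep_map):
    """Report only proteins supported by at least one unique peptide."""
    return {next(iter(prots)) for prots in pep_map.values() if len(prots) == 1}

def infer_with_shared(pep_map):
    """Report every protein matched by any detected peptide."""
    proteins = set()
    for prots in pep_map.values():
        proteins |= prots
    return proteins

print(sorted(infer_unique_only(peptide_to_proteins)))   # ['P1', 'P3']
print(sorted(infer_with_shared(peptide_to_proteins)))   # ['P1', 'P2', 'P3']
```

In this toy case the shared-peptide method reports P2 even though every peptide attributable to P2 is equally explained by P1, which mirrors how shared peptides can inflate apparent accuracy on standards without homology.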

2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Jesse G. Meyer

In the postgenome era, biologists have sought to measure the complete complement of proteins, termed the proteome. Currently, the most effective way to measure the proteome is shotgun, or bottom-up, proteomics, in which the proteome is digested into peptides that are identified and then mapped back to proteins by protein inference. Despite continuous improvements to all steps of the shotgun proteomics workflow, observed proteome coverage is often low, and some proteins are identified by a single peptide sequence. Complete proteome sequence coverage would allow comprehensive characterization of RNA splicing variants and all posttranslational modifications, which would drastically improve the accuracy of biological models. There are many reasons for the sequence coverage deficit, but ultimately peptide length determines sequence observability: peptides that are too short are lost because they match many protein sequences and their true origin is ambiguous, while the maximum observable peptide length is determined by several analytical challenges. This paper explores computationally how the peptide lengths produced by several common proteome digestion methods limit observable proteome coverage. Iterative proteome cleavage strategies are also explored. These simulations reveal that proteome coverage can be maximized by an iterative digestion protocol involving multiple proteases and chemical cleavages, theoretically allowing 92.9% proteome coverage.
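The kind of simulation described can be sketched as an in-silico digest under assumed parameters: a simplified trypsin rule that cleaves after every K/R (ignoring the proline exception), an assumed observable length window of 7–35 residues, and a toy sequence. None of these choices are taken from the paper itself:

```python
# Minimal in-silico digestion sketch (assumed parameters, toy sequence):
# theoretical coverage = fraction of residues landing in peptides whose
# length falls inside the observable window.
import re

def digest(seq):
    """Cleave after every K or R (simplified trypsin rule, no proline rule)."""
    return [p for p in re.split(r"(?<=[KR])", seq) if p]

def theoretical_coverage(seq, min_len=7, max_len=35):
    """Fraction of residues falling in peptides of observable length."""
    observable = sum(len(p) for p in digest(seq) if min_len <= len(p) <= max_len)
    return observable / len(seq)

protein = "MKWVTFISLLLLFSSAYSRGVFRRDTHK"  # toy 28-residue sequence
print(digest(protein))                     # ['MK', 'WVTFISLLLLFSSAYSR', 'GVFR', 'R', 'DTHK']
print(round(theoretical_coverage(protein), 3))  # 0.607
```

Only one of the five tryptic peptides falls in the observable window here, so theoretical coverage is 17/28; running the same tally with additional proteases on the residues still uncovered is the essence of the iterative strategies the paper evaluates.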


2018 ◽  
Author(s):  
Julian Uszkoreit ◽  
Yasset Perez-Riverol ◽  
Britta Eggers ◽  
Katrin Marcus ◽  
Martin Eisenacher

Abstract: Proteomics using LC-MS/MS has become one of the main methods to analyze the proteins in biological samples at high throughput. But existing mass spectrometry instruments are still limited with respect to resolution and measurable mass ranges, which is one of the main reasons why shotgun proteomics is the predominant approach. Here, proteins are digested, so that peptides are identified and quantified instead. While often neglected, the important step of protein inference must be conducted to go from the identified peptides back to the actual proteins in the original sample.

In this work, we highlight some of the previously published and newly added features of the tool PIA – Protein Inference Algorithms, which helps the user with the protein inference of measured samples. We also highlight the importance of using PSI standard file formats, as PIA is currently the only software supporting all available standards used for spectrum identification and protein inference. Additionally, we briefly describe the benefits of working with workflow environments for proteomics analyses and show the new features of the PIA nodes for the KNIME Analytics Platform. Finally, we benchmark PIA against a recently published dataset for isoform detection.

PIA is open source and available for download on GitHub (https://github.com/mpc-bioinformatics/pia) or directly via the community extensions inside the KNIME Analytics Platform.


2018 ◽  
Vol 17 (5) ◽  
pp. 1879-1886 ◽  
Author(s):  
Matthew The ◽  
Fredrik Edfors ◽  
Yasset Perez-Riverol ◽  
Samuel H. Payne ◽  
Michael R. Hoopmann ◽  
...  

2019 ◽  
Author(s):  
Priya Prakash ◽  
Travis Lantz ◽  
Krupal P. Jethava ◽  
Gaurav Chopra

Amyloid plaques found in the brains of Alzheimer's disease (AD) patients consist primarily of amyloid beta 1-42 (Aβ42). Commercially, Aβ42 is synthesized using peptide synthesizers. We describe a robust methodology for expression of recombinant human Aβ(M1-42) in Rosetta(DE3)pLysS and BL21(DE3)pLysS competent E. coli with refined and rapid analytical purification techniques. The peptide is isolated and purified from the transformed cells using an optimized reverse-phase HPLC protocol with commonly available C18 columns, yielding high amounts of peptide (~15-20 mg per 1 L culture) in a short time. The recombinant Aβ(M1-42) forms characteristic aggregates similar to those of synthetic Aβ42, as verified by western blots and atomic force microscopy, warranting future biological use. Our rapid, refined, and robust technique for purifying human Aβ(M1-42) can be used to produce chemical probes for several downstream in vitro and in vivo assays to facilitate AD research.


2018 ◽  
Vol 34 (3) ◽  
pp. 267-278
Author(s):  
Ashraf A. Abd El-Tawab ◽  
Mohamed G. Aggour ◽  
Fatma I. El- Hofy ◽  
Marwa M. Y. El- Mesalami

Microbiology ◽  
2006 ◽  
Vol 152 (7) ◽  
pp. 2129-2135 ◽  
Author(s):  
Taku Oshima ◽  
Francis Biville

Functional characterization of unknown genes is currently a major task in biology. The search for gene function involves a combination of various in silico, in vitro and in vivo approaches. Available knowledge from the study of more than 21 LysR-type regulators in Escherichia coli has facilitated the classification of new members of the family. From sequence similarities and its location on the E. coli chromosome, it is suggested that ygiP encodes a LysR-type regulator controlling the expression of a neighbouring operon; this operon encodes the two subunits of tartrate dehydratase (TtdA, TtdB) and YgjE, an integral inner-membrane protein possibly involved in tartrate uptake. Expression of tartrate dehydratase, which converts tartrate to oxaloacetate, is required for anaerobic growth on glycerol as carbon source in the presence of tartrate. Here, it has been demonstrated that disruption of ygiP, ttdA or ygjE abolishes tartrate-dependent anaerobic growth on glycerol. It has also been shown that tartrate-dependent induction of the ttdA-ttdB-ygjE operon requires a functional YgiP.


Author(s):  
Fatma Ben Abid ◽  
Clement K. M. Tsui ◽  
Yohei Doi ◽  
Anand Deshmukh ◽  
Christi L. McElheny ◽  
...  

Abstract: One hundred forty-nine carbapenem-resistant Enterobacterales from clinical samples obtained between April 2014 and November 2017 were subjected to whole genome sequencing and multi-locus sequence typing. Klebsiella pneumoniae (81, 54.4%) and Escherichia coli (38, 25.5%) were the most common species. Genes encoding metallo-β-lactamases were detected in 68 (45.8%) isolates, and OXA-48-like enzymes in 60 (40.3%). blaNDM-1 (45; 30.2%) and blaOXA-48 (29; 19.5%) were the most frequent. KPC-encoding genes were identified in 5 (3.6%) isolates. The most common sequence types were E. coli ST410 (8; 21.1%) and ST38 (7; 18.4%), and K. pneumoniae ST147 (13; 16%) and ST231 (7; 8.6%).


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Xinyu Li ◽  
Wei Zhang ◽  
Jianming Zhang ◽  
Guang Li

Abstract

Background: Given expression data, gene regulatory network (GRN) inference approaches try to determine regulatory relations. However, current inference methods ignore the inherent topological characteristics of GRNs to some extent, leading to structures that lack clear biological explanation. To increase the biophysical meaning of inferred networks, this study performed data-driven module detection before network inference. Gene modules were identified by decomposition-based methods.

Results: ICA-decomposition-based module detection methods have been used to detect functional modules directly from transcriptomic data. Experiments on time-series expression, curated and scRNA-seq datasets suggested advantages of the proposed ModularBoost method over established methods, especially in efficiency and accuracy. For scRNA-seq datasets, the ModularBoost method outperformed other candidate inference algorithms.

Conclusions: As a complicated task, GRN inference can be decomposed into several tasks of reduced complexity. Using identified gene modules as topological constraints, the initial inference problem can be accomplished by inferring intra-modular and inter-modular interactions respectively. Experimental outcomes suggest that the proposed ModularBoost method can improve the accuracy and efficiency of inference algorithms by introducing topological constraints.
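The module-constrained idea can be sketched as follows. This is a hedged illustration, not the ModularBoost algorithm: it uses absolute Pearson correlation as a stand-in edge score, and the expression matrix and module assignment are toy data; the point is only that restricting candidate pairs to a module structure shrinks the search space:

```python
# Toy sketch of module-constrained network inference: score only gene
# pairs permitted by the module assignment, keep pairs above a threshold.
import numpy as np

rng = np.random.default_rng(0)
expr = rng.normal(size=(5, 50))                 # 5 genes x 50 samples
expr[1] = expr[0] + 0.1 * rng.normal(size=50)   # gene 1 tracks gene 0

modules = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}        # gene -> module id

def infer_edges(expr, modules, threshold=0.8):
    """Score intra-module pairs only; return pairs above the threshold."""
    corr = np.corrcoef(expr)
    n = expr.shape[0]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if modules[i] == modules[j] and abs(corr[i, j]) >= threshold:
                edges.append((i, j))
    return edges

print(infer_edges(expr, modules))
```

With 5 genes there are 10 unordered pairs but only 4 intra-modular candidates; inter-modular interactions would then be inferred in a separate, smaller pass, which is the complexity reduction the abstract describes.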

