Proteomics: The protein complement of the genome

2003 ◽  
Vol 25 (1) ◽  
pp. 7-9
Author(s):  
Hannes Ponstingl ◽  
Janet M. Thornton

Recent advances in protein separation technology and mass spectrometry (MS) have enabled the systematic identification and quantification of large sets of proteins from an organelle, cell type or organism. In principle, protein isoforms, enzymically modified variants and protein complexes can be studied, for instance, at a certain stage in development or in response to stress or more subtle changes of the environment. An important pre-clinical application is the search for protein markers in body fluids for diagnostic purposes. Such proteomics studies can increasingly be performed at high-throughput rates reminiscent of those of genomic sequencing or the monitoring of messenger RNA levels. Thus, large sets of proteins can be monitored simultaneously in a single experiment. Proteomics data will increasingly be followed up by investigations of the three-dimensional structures of proteins and protein complexes at atomic detail in large-scale structural proteomics projects. In this article we attempt to give a flavour of what seem to us to be important experimental developments and to point to links with bioinformatics resources where appropriate.

eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Christoph N Schlaffner ◽  
Konstantin Kahnert ◽  
Jan Muntel ◽  
Ruchi Chauhan ◽  
Bernhard Y Renard ◽  
...  

Improvements in LC-MS/MS methods and technology have enabled the identification of thousands of modified peptides in a single experiment. However, protein regulation by post-translational modifications (PTMs) is not binary, making methods to quantify the modification extent crucial to understanding the role of PTMs. Here, we introduce FLEXIQuant-LF, a software tool for large-scale identification of differentially modified peptides and quantification of their modification extent without knowledge of the types of modifications involved. We developed FLEXIQuant-LF using label-free quantification of unmodified peptides and robust linear regression to quantify the modification extent of peptides. As proof of concept, we applied FLEXIQuant-LF to data-independent-acquisition (DIA) data of the anaphase promoting complex/cyclosome (APC/C) during mitosis. The unbiased FLEXIQuant-LF approach to assess the modification extent in quantitative proteomics data provides a better understanding of the function and regulation of PTMs. The software is available at https://github.com/SteenOmicsLab/FLEXIQuantLF.
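The idea of reading modification extent off a robust regression of unmodified peptide intensities can be sketched as follows. This is a minimal illustration, not the FLEXIQuant-LF implementation: the abstract only states that robust linear regression is used, so the Theil-Sen estimator, the function names, and the intensity values below are all assumptions made for the example.

```python
import numpy as np

def theil_sen_fit(x, y):
    """Robust line fit: median of pairwise slopes, then median intercept.

    Stand-in robust estimator chosen for this sketch (an assumption).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)      # all index pairs with i < j
    dx = x[j] - x[i]
    keep = dx != 0                           # skip pairs with identical x
    slope = np.median((y[j] - y[i])[keep] / dx[keep])
    intercept = np.median(y - slope * x)
    return slope, intercept

def modification_extent(reference, sample):
    """Fraction of each peptide missing relative to the robust trend line.

    Peptides of one protein should scale together between reference and
    sample; an unmodified peptide that falls below the line has presumably
    been converted to some modified form.
    """
    reference = np.asarray(reference, dtype=float)
    sample = np.asarray(sample, dtype=float)
    slope, intercept = theil_sen_fit(reference, sample)
    expected = slope * reference + intercept
    return np.clip(1.0 - sample / expected, 0.0, 1.0)

# Hypothetical unmodified-peptide intensities for one protein.
reference = np.array([100.0, 200.0, 300.0, 400.0, 500.0, 600.0])
sample = 0.5 * reference      # protein is half as abundant in the sample
sample[2] *= 0.4              # third peptide: 60% lost to modification

extent = modification_extent(reference, sample)
print(np.round(extent, 3))
```

Because the fit is robust, the single depleted peptide does not drag the trend line down, so its deficit is measured against the behaviour of the other peptides. No knowledge of the modification type is needed, since only the unmodified form is quantified.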


2021 ◽  
Author(s):  
Samuel W Olson ◽  
Anne-Marie W Turner ◽  
J Winston Arney ◽  
Irfana Saleem ◽  
Chase A Weidmann ◽  
...  

7SK is a highly conserved non-coding RNA that regulates eukaryotic transcription by sequestering positive transcription elongation factor b (P-TEFb). 7SK regulatory function likely entails changes in RNA structure, but characterizing dynamic RNA-protein complexes in cells has remained an unsolved challenge. We describe a new chemical probing strategy (DANCE-MaP) that uses maximum likelihood deconvolution and probabilistic read assignment to define simultaneously (i) per-nucleotide reactivity profiles, (ii) direct base pairing interactions, and (iii) tertiary and higher-order interactions for each conformation of multi-state RNA structural ensembles, all from a single experiment. We show that human 7SK RNA, despite significant heterogeneity, intrinsically codes for a large-scale structural switch that couples dissolution of the P-TEFb binding site to structural remodeling at distal release factor binding sites. The 7SK structural equilibrium is regulated by cell type, shifts dynamically in response to cell growth and stress, and can be exogenously targeted to modulate transcription in cells. Our data indicate that the 7SK structural ensemble functions as an integrator of diverse cellular signals to control transcription elongation in environment- and cell-specific ways, and establish DANCE-MaP as a powerful strategy for comprehensively defining RNA structure and dynamics in cells.


2020 ◽  
Author(s):  
Yusuke Matsui ◽  
Yuichi Abe ◽  
Kohei Uno ◽  
Satoru Miyano

Abstract Motivation: The full picture of abnormalities in protein complexes in cancer remains largely unknown. Comparing the co-expression structure of each protein complex between tumor and normal groups could help us understand the cancer-specific dysfunction of proteins. However, the technical limitations of mass spectrometry-based proteomics and biological variations contaminating the protein expression with noise lead to non-negligible over- or underestimation of co-expression. Results: We propose a robust algorithm for identifying protein complex aberrations in cancer based on differential protein co-expression testing. Our copula-based method improves identification accuracy on noisy data over a conventional linear correlation-based approach. As an application, we show that important protein complexes can be identified along with regulatory signaling pathways, and even drug targets can be identified using large-scale proteomics data from renal cancer. The proposed approach goes beyond traditional linear correlations to provide insights into higher order differential co-expression structures. Availability and Implementation: https://github.com/ymatts/[email protected]
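Why a copula-based view is less sensitive to noise than linear correlation can be sketched with a toy example. This is not the authors' test statistic: it only shows the rank transform to pseudo-observations (the empirical-copula scale) and a dependence measure computed on that scale. The function names and the data are hypothetical.

```python
import numpy as np

def pseudo_observations(values):
    """Map values to (0, 1) by rank, the empirical-copula scale (assumes no ties)."""
    values = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(values)) + 1
    return ranks / (len(values) + 1)

def rank_dependence(x, y):
    """Correlation of pseudo-observations (equals Spearman's rho without ties)."""
    u, v = pseudo_observations(x), pseudo_observations(y)
    return float(np.corrcoef(u, v)[0, 1])

def differential_coexpression(tumor_pair, normal_pair):
    """Difference in rank-based dependence between two groups (illustrative)."""
    return rank_dependence(*tumor_pair) - rank_dependence(*normal_pair)

# Two perfectly co-expressed proteins, with one gross measurement outlier.
protein_a = np.arange(1.0, 11.0)
protein_b = protein_a.copy()
protein_b[-1] = -1000.0

linear = float(np.corrcoef(protein_a, protein_b)[0, 1])   # Pearson on raw values
ranked = rank_dependence(protein_a, protein_b)            # copula-scale dependence
print(linear, ranked)
```

In this toy case the single outlier flips the sign of the linear correlation, while the rank-based measure still reports a positive association. A full method would compare the whole copula between groups rather than a single summary number.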


2020 ◽  
Author(s):  
Konstantin Kahnert ◽  
Christoph N. Schlaffner ◽  
Jan Muntel ◽  
Ruchi Chauhan ◽  
Bernhard Y. Renard ◽  
...  

Abstract Improvements in LC-MS/MS methods and technology have enabled the identification of thousands of modified peptides in a single experiment. However, protein regulation by post-translational modifications (PTMs) is not binary, making methods to quantify the modification extent crucial to understanding the role of PTMs. Here, we introduce FLEXIQuant-LF, a software tool for large-scale identification of differentially modified peptides and quantification of their modification extent without prior knowledge of the type of modification. We developed FLEXIQuant-LF using label-free quantification of unmodified peptides and robust linear regression to quantify the modification extent of peptides. As proof of concept, we applied FLEXIQuant-LF to data-independent-acquisition (DIA) data of the anaphase promoting complex/cyclosome (APC/C) during mitosis. The unbiased FLEXIQuant-LF approach to assess the modification extent in quantitative proteomics data provides a better understanding of the function and regulation of PTMs. The software is available at https://github.com/SteenOmicsLab/FLEXIQuantLF.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Fernando Pozo ◽  
Laura Martinez-Gomez ◽  
Thomas A Walsh ◽  
José Manuel Rodriguez ◽  
Tomas Di Domenico ◽  
...  

Abstract Alternative splicing of messenger RNA can generate an array of mature transcripts, but it is not clear how many go on to produce functionally relevant protein isoforms. There is only limited evidence for alternative proteins in proteomics analyses and data from population genetic variation studies indicate that most alternative exons are evolving neutrally. Determining which transcripts produce biologically important isoforms is key to understanding isoform function and to interpreting the real impact of somatic mutations and germline variations. Here we have developed a method, TRIFID, to classify the functional importance of splice isoforms. TRIFID was trained on isoforms detected in large-scale proteomics analyses and distinguishes these biologically important splice isoforms with high confidence. Isoforms predicted as functionally important by the algorithm had measurable cross species conservation and significantly fewer broken functional domains. Additionally, exons that code for these functionally important protein isoforms are under purifying selection, while exons from low scoring transcripts largely appear to be evolving neutrally. TRIFID has been developed for the human genome, but it could in principle be applied to other well-annotated species. We believe that this method will generate valuable insights into the cellular importance of alternative splicing.


2021 ◽  
Vol 22 (15) ◽  
pp. 7773
Author(s):  
Neann Mathai ◽  
Conrad Stork ◽  
Johannes Kirchmair

Experimental screening of large sets of compounds against macromolecular targets is a key strategy to identify novel bioactivities. However, large-scale screening requires substantial experimental resources and is time-consuming and challenging. Therefore, small to medium-sized compound libraries with a high chance of producing genuine hits on an arbitrary protein of interest would be of great value to fields related to early drug discovery, in particular biochemical and cell research. Here, we present a computational approach that incorporates drug-likeness, predicted bioactivities, biological space coverage, and target novelty, to generate optimized compound libraries with maximized chances of producing genuine hits for a wide range of proteins. The computational approach evaluates drug-likeness with a set of established rules, predicts bioactivities with a validated, similarity-based approach, and optimizes the composition of small sets of compounds towards maximum target coverage and novelty. We found that, in comparison to the random selection of compounds for a library, our approach generates substantially improved compound sets. Quantified as the “fitness” of compound libraries, the calculated improvements ranged from +60% (for a library of 15,000 compounds) to +184% (for a library of 1000 compounds). The best of the optimized compound libraries prepared in this work are available for download as a dataset bundle (“BonMOLière”).
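The target-coverage part of such an optimization can be illustrated with a greedy selection. This is a sketch under strong assumptions, not the authors' optimizer: the abstract does not describe the algorithm, and the compound names, the predicted target sets, and the `greedy_library` function are all hypothetical. Drug-likeness filtering and novelty scoring are left out.

```python
def greedy_library(predicted_targets, size):
    """Pick up to `size` compounds, each time adding the one that covers
    the most not-yet-covered predicted targets (greedy maximum coverage)."""
    candidates = {name: set(targets) for name, targets in predicted_targets.items()}
    chosen, covered = [], set()
    while candidates and len(chosen) < size:
        # Iterate in sorted order so ties resolve deterministically by name.
        best = max(sorted(candidates), key=lambda name: len(candidates[name] - covered))
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, covered

# Hypothetical compounds mapped to their predicted protein targets.
predicted_targets = {
    "cmpd_A": {"T1", "T2", "T3"},
    "cmpd_B": {"T1", "T2"},
    "cmpd_C": {"T4"},
    "cmpd_D": {"T3", "T4", "T5"},
}

chosen, covered = greedy_library(predicted_targets, size=2)
print(chosen, sorted(covered))
```

A random pair of compounds from this toy set may cover as few as three targets, while the greedy pair covers all five. This is the kind of gap that a "fitness" comparison against random selection is meant to quantify.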


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Harshi Weerakoon ◽  
Jeremy Potriquet ◽  
Alok K. Shah ◽  
Sarah Reed ◽  
Buddhika Jayakody ◽  
...  

Abstract Data independent analysis (DIA) exemplified by sequential window acquisition of all theoretical mass spectra (SWATH-MS) provides robust quantitative proteomics data, but the lack of a public primary human T-cell spectral library is a current resource gap. Here, we report the generation of a high-quality spectral library containing data for 4,833 distinct proteins from human T-cells across genetically unrelated donors, covering ~24% of the proteins in the UniProt/SwissProt reviewed human proteome. SWATH-MS analysis of 18 primary T-cell samples using the new human T-cell spectral library reliably identified and quantified 2,850 proteins at 1% false discovery rate (FDR). In comparison, the larger Pan-human spectral library identified and quantified 2,794 T-cell proteins in the same dataset. As the libraries identified an overlapping set of proteins, combining the two libraries resulted in quantification of 4,078 human T-cell proteins. Collectively, this large data archive will be a useful public resource for human T-cell proteomic studies. The human T-cell library is available at SWATHAtlas and the data are available via ProteomeXchange (PXD019446 and PXD019542) and PeptideAtlas (PASS01587).
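The protein counts quoted above allow a quick consistency check by inclusion-exclusion. The sketch assumes that the combined figure of 4,078 is the union of the two per-library protein sets, which the abstract implies but does not state outright. The variable names are illustrative.

```python
# Counts quoted in the abstract.
t_cell_library_hits = 2850      # proteins quantified with the T-cell library
pan_human_library_hits = 2794   # proteins quantified with the Pan-human library
combined_hits = 4078            # proteins quantified with both libraries combined

# Inclusion-exclusion, assuming combined_hits is the union of the two sets.
shared = t_cell_library_hits + pan_human_library_hits - combined_hits
only_t_cell = t_cell_library_hits - shared
only_pan_human = pan_human_library_hits - shared

# Proteome size implied by "4,833 proteins is about 24%" (approximate, since 24% is rounded).
library_size = 4833
implied_proteome = library_size / 0.24

print(shared, only_t_cell, only_pan_human, round(implied_proteome))
```

Under that assumption the two libraries share 1,566 quantified proteins, and each contributes more than 1,200 proteins that the other misses, which is why combining them raises the total so markedly.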


2002 ◽  
Vol 3 (3) ◽  
pp. 221-225

In recent months a bumper crop of genomes has been completed, including the fission yeast (Schizosaccharomyces pombe) and rice (Oryza sativa). Two large-scale studies of Saccharomyces cerevisiae protein complexes provided a picture of the eukaryotic proteome as a network of complexes. Amongst the other stories of interest was a demonstration that proteomic analysis of blood samples can be used to detect ovarian cancer, perhaps even as early as stage I.

