Exhaustively Identifying Cross-Linked Peptides with a Linear Computational Complexity

2016 ◽  
Author(s):  
Fengchao Yu ◽  
Ning Li ◽  
Weichuan Yu

Abstract: Chemical cross-linking coupled with mass spectrometry is a powerful tool for studying protein-protein interactions and protein conformations. Two linked peptides are ionized and fragmented to produce a tandem mass spectrum, so the spectrum contains ions from both peptides and the peptide identification problem becomes a peptide-pair identification problem. Most existing tools do not search all possible pairs because a naive search takes quadratic time. Consequently, a significant percentage of linked peptides are missed. In our earlier work, we developed a tool named ECL to search all pairs of peptides exhaustively. While ECL does not miss any linked peptides, it is very slow because of its quadratic computational complexity, especially when the database is large. Furthermore, ECL uses a score function without statistical calibration, while researchers [1,2] have demonstrated that a statistically calibrated score function can achieve higher sensitivity than an uncalibrated one. Here, we propose an advanced version of ECL, named ECL 2.0. It achieves linear time and space complexity by taking advantage of the additive property of a score function. It can analyze a typical data set containing tens of thousands of spectra against a large-scale database containing thousands of proteins in a few hours. Comparison with five other state-of-the-art tools shows that ECL 2.0 is much faster than pLink, StavroX, ProteinProspector, and ECL. Kojak is the only tool that is faster than ECL 2.0, but Kojak does not exhaustively search all possible peptide pairs. We also adopt an e-value estimation method to calibrate the original score. Comparison shows that ECL 2.0 has the highest sensitivity among the state-of-the-art tools. An experiment using a large-scale in vivo cross-linking data set demonstrates that ECL 2.0 is the only tool that can find PSMs passing the false discovery rate threshold. The results illustrate that exhaustive search and a well-calibrated score function are useful for finding PSMs in a huge search space.
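The abstract does not spell out how an additive score removes the quadratic pair search, so the following is only a minimal sketch of the general idea, not ECL 2.0's actual implementation. It assumes the pair score is the sum of two per-chain scores and that a valid pair must satisfy mass(a) + mass(b) + linker mass ≈ precursor mass. The function name `best_pair_linear`, the mass binning scheme, and all numbers are hypothetical.

```python
def best_pair_linear(chains, precursor_mass, linker_mass,
                     bin_width=0.01, tol_bins=1):
    """Find the best-scoring chain pair whose masses add up to the precursor.

    chains: list of (name, mass, score), where score is the per-chain score
    against one spectrum. Because the pair score is assumed additive, only the
    best chain per mass bin can take part in the best pair, so one pass over
    the chains plus one pass over the bins is enough.
    """
    # Pass 1: keep the highest-scoring chain in each mass bin.
    best_in_bin = {}
    for name, mass, score in chains:
        b = round(mass / bin_width)
        if b not in best_in_bin or score > best_in_bin[b][0]:
            best_in_bin[b] = (score, name)

    # Pass 2: for each bin, look up the complementary bin(s) directly.
    target = round((precursor_mass - linker_mass) / bin_width)
    best = None
    for b, (score_a, name_a) in best_in_bin.items():
        for d in range(-tol_bins, tol_bins + 1):
            partner = best_in_bin.get(target - b + d)
            if partner is None:
                continue
            total = score_a + partner[0]
            if best is None or total > best[0]:
                best = (total, name_a, partner[1])
    return best


# Made-up chains: (name, mass in Da, per-chain score for one spectrum).
chains = [("A", 500.00, 2.0), ("B", 700.00, 3.5), ("C", 700.00, 1.0),
          ("D", 900.00, 4.0), ("E", 300.00, 0.5)]
result = best_pair_linear(chains, precursor_mass=1338.07, linker_mass=138.07)
print(result)
```

The sketch lets a chain pair with itself and keeps only one chain per bin, which a real tool would need to handle more carefully. The work per spectrum grows with the number of chains and occupied bins rather than with the number of pairs.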

2021 ◽  
Author(s):  
Yuzhou Zhang ◽  
Yi Mei ◽  
Ke Tang ◽  
Keqin Jiang

In this paper, the Periodic Capacitated Arc Routing Problem (PCARP) is investigated. PCARP is an extension of the well-known CARP from a single period to a multi-period horizon. In PCARP, two objectives are to be minimized: the number of required vehicles (nv) and the total cost (tc). Because of its multi-period nature, PCARP can have a much larger solution space than its single-period CARP counterpart on the same graph or road network. Furthermore, PCARP includes an additional allocation sub-problem (choosing the days on which to serve the arcs), which is interdependent with the routing sub-problem. Although some attempts have been made to solve PCARP, further work is needed to improve performance, especially on large-scale problem instances. It has been shown that optimizing nv and tc separately (hierarchically) is a good way of dealing with the two objectives. In this paper, we further improve this strategy and propose a new Route Decomposition (RD) operator based on it. The RD operator is then integrated into a Memetic Algorithm (MA) framework for PCARP, in which novel crossover and local search operators are designed accordingly. In addition, to improve the search efficiency, a hybridized initialization is employed to generate an initial population consisting of both heuristic and random individuals. The MA with RD (MARD) was evaluated and compared with state-of-the-art approaches on two benchmark sets of PCARP instances and a large data set based on a real-world road network. The experimental results suggest that MARD outperforms the compared state-of-the-art algorithms and improves most of the best-known solutions. The advantage of MARD becomes more pronounced as the problem size increases. Thus, MARD is particularly effective in solving large-scale PCARP instances. Moreover, the efficacy of the proposed RD operator in MARD has been empirically verified.
Graphical abstract https://ars.els-cdn.com/content/image/1-s2.0-S1568494616304768-fx1_lrg.jpg © This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/
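As a small illustration of what handling two objectives hierarchically can mean, the sketch below compares candidate solutions lexicographically. It assumes nv is the primary objective and tc the tie-breaker, which the abstract does not state explicitly. The `Solution` type and the example values are hypothetical and are not taken from MARD.

```python
from typing import NamedTuple


class Solution(NamedTuple):
    nv: int    # number of required vehicles (assumed primary objective)
    tc: float  # total cost (assumed secondary objective)


def is_better(a: Solution, b: Solution) -> bool:
    """Return True if a beats b: fewer vehicles first, then lower total cost."""
    return (a.nv, a.tc) < (b.nv, b.tc)


# Made-up candidate solutions.
candidates = [Solution(4, 910.0), Solution(3, 1020.0), Solution(3, 985.5)]
best = min(candidates, key=lambda s: (s.nv, s.tc))
print(best)
```

Under this ordering a solution with fewer vehicles always wins, regardless of cost, and cost only separates solutions that use the same number of vehicles.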


Metabolites ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 803
Author(s):  
Iuliana Popa ◽  
Audrey Solgadi ◽  
Didier Pin ◽  
Adrian L. Watson ◽  
Marek Haftek ◽  
...  

Golden Retrievers may suffer from Pnpl1-related inherited ichthyosis. Our study shows that in the stratum corneum (SC) of ichthyotic dogs, linoleic acid (LA) is also present in the form of 9-keto-octadecadienoic acid (9-KODE) instead of the acylacid form found in normal dogs. The fatty acids purified from SC strips (LA, acylacids) were characterized by liquid chromatography-tandem mass spectrometry (LC-MS) and atmospheric pressure chemical ionization (APCI). Electrospray ionization (ESI) with MS2 (MS/MS tandem mass spectra) and MS3 (MS/MS/MS tandem mass spectra) fragmentation indicated the positions of the double bonds in 9-KODE. We showed that ichthyotic dogs have a threefold lower LA content in the form of acylacids. In some peaks, the MS2 fragmentation of acylacids showed an ion at m/z 293 instead of the ion at m/z 279 that is characteristic of LA. The detected variant was identified upon MS3 fragmentation as 9-KODE, and the level of this keto derivative was increased in ichthyotic dogs. We showed by APCI that such keto forms of LA are produced from hydroperoxy-octadecadienoic acids (HpODE) upon dehydration. In conclusion, the free form of 9-KODE was detected in ichthyotic SC at levels up to fivefold higher than in unaffected dogs, and analyses by HPLC (high-performance liquid chromatography) and ESI-MS (electrospray ionization-mass spectrometry) indicated that it is produced by dehydration of native 9-HpODE.
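The m/z values and the dehydration step can be checked with a short mass calculation. This is a sketch under two assumptions not stated in the abstract: the ions are deprotonated [M-H]- species (negative-ion mode), and the molecular formulas are the standard ones, C18H32O2 for LA, C18H30O3 for 9-KODE, and C18H32O4 for 9-HpODE.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass.
ATOM = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646


def mono_mass(c: int, h: int, o: int) -> float:
    """Monoisotopic mass of a CcHhOo molecule."""
    return c * ATOM["C"] + h * ATOM["H"] + o * ATOM["O"]


def mz_deprotonated(mass: float) -> float:
    """m/z of the singly charged [M-H]- ion (assumed ion type)."""
    return mass - PROTON


LA = mono_mass(18, 32, 2)      # linoleic acid
KODE = mono_mass(18, 30, 3)    # 9-keto-octadecadienoic acid
HPODE = mono_mass(18, 32, 4)   # 9-hydroperoxy-octadecadienoic acid
WATER = mono_mass(0, 2, 1)

print(f"LA     [M-H]-: {mz_deprotonated(LA):.2f}")
print(f"9-KODE [M-H]-: {mz_deprotonated(KODE):.2f}")
print(f"9-HpODE minus H2O: {HPODE - WATER:.4f}  vs 9-KODE: {KODE:.4f}")
```

Under these assumptions, deprotonated LA falls near m/z 279 and deprotonated 9-KODE near m/z 293, and removing one water from 9-HpODE gives the mass of 9-KODE, which is consistent with formation by dehydration.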


Author(s):  
Amandip Sangha

We train a machine learning model on a large data set to predict property values in the Norwegian real estate market. Our model is a gradient boosted regression tree. The data set is the largest market data set of Norwegian properties considered in the research literature. We achieve state-of-the-art accuracy. A large-scale market data set of real estate properties is collected from sales and rental ads on publicly accessible internet sites. The property advertisements show property features and appraisal values made by real estate brokers. We train a gradient boosted regression tree model on selected features of the data set. This is a multivariate regression model built with supervised learning. We use 5-fold cross-validation to assess the accuracy and robustness of the model. Gradient boosted regression tree models are already known to give the best prediction accuracy on real estate price valuations. We achieve state-of-the-art prediction accuracy using a minimal feature set and only publicly and freely available sales advertisement data. The novelty of our work is threefold: we use a minimal feature set in our model, we have the largest data set in the research literature, and we use only freely and publicly accessible data that are simple to obtain. This shows that useful estimation models with high accuracy can be built with quite simple resources.
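The sketch below shows the two techniques named in the abstract, gradient boosting of regression trees and 5-fold cross-validation, in plain Python with depth-1 trees (stumps) and squared-error loss. It is a toy illustration, not the author's model: the features (floor area and room count), the synthetic prices, and every function name and hyperparameter are assumptions made for the example.

```python
import random


def fit_stump(X, r):
    """Best one-split tree on residuals r (assumes at least one valid split)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = (sum((ri - ml) ** 2 for ri in left)
                   + sum((ri - mr) ** 2 for ri in right))
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    return best[1:]


def fit_gbrt(X, y, n_rounds=30, lr=0.3):
    """Boosting: each stump is fit to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        j, t, ml, mr = fit_stump(X, resid)
        stumps.append((j, t, ml, mr))
        pred = [p + lr * (ml if row[j] <= t else mr)
                for row, p in zip(X, pred)]
    return base, lr, stumps


def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(ml if row[j] <= t else mr
                           for j, t, ml, mr in stumps)


def cross_val_mae(X, y, k=5, seed=0):
    """Mean absolute error on each of k held-out folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_gbrt([X[i] for i in train], [y[i] for i in train])
        scores.append(sum(abs(predict(model, X[i]) - y[i]) for i in fold)
                      / len(fold))
    return scores


# Synthetic data: [floor area in m^2, number of rooms] -> made-up price.
rng = random.Random(1)
X = [[rng.uniform(30, 200), rng.randint(1, 6)] for _ in range(100)]
y = [30.0 * area + 100.0 * rooms + rng.gauss(0, 50) for area, rooms in X]

fold_mae = cross_val_mae(X, y)
print([round(m, 1) for m in fold_mae])
```

Reporting one error per fold, rather than a single pooled number, makes it visible how much the accuracy varies across held-out subsets, which is the robustness check the abstract refers to.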

