Open-pFind enables precise, comprehensive and rapid peptide identification in shotgun proteomics

2018 ◽  
Author(s):  
Hao Chi ◽  
Chao Liu ◽  
Hao Yang ◽  
Wen-Feng Zeng ◽  
Long Wu ◽  
...  

Shotgun proteomics has grown rapidly in recent decades, but a large fraction of tandem mass spectrometry (MS/MS) data in shotgun proteomics remains unidentified. We have developed a novel database search algorithm, Open-pFind, to efficiently identify peptides even in an ultra-large search space that takes into account unexpected modifications, amino acid mutations, semi- or non-specific digestion and co-eluting peptides. Tested on two metabolically labeled MS/MS datasets, Open-pFind reported 50.5‒117.0% more peptide-spectrum matches (PSMs) than seven other advanced algorithms. More importantly, the Open-pFind results were more credible, as judged by verification experiments using stable isotopic labeling. Tested on four additional large-scale datasets, 70‒85% of the spectra were confidently identified, and high-quality spectra were nearly completely interpreted by Open-pFind. Further, Open-pFind was over 40 times faster than the other three open search algorithms and 2‒3 times faster than three restricted search algorithms. Re-analysis of an entire human proteome dataset consisting of ∼25 million spectra using Open-pFind identified a total of 14,064 proteins encoded by 12,723 genes, requiring at least two uniquely identified peptides per protein. In these search results, Open-pFind also excelled in an independent test for false positives based on the presence or absence of olfactory receptors. Thus, Open-pFind realizes a practical use of the open search strategy for the truly global-scale proteomics experiments of today and the future.
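The core idea of an open search can be sketched in a few lines: candidates are admitted even when their mass differs from the precursor mass, and the difference is reported as a putative modification. The toy residue masses, peptide index and window below are illustrative assumptions, not Open-pFind's actual implementation:

```python
# Toy sketch of an open (wide-tolerance) database search: instead of requiring
# a candidate peptide mass to match the precursor mass exactly, any mass
# difference within a wide window is kept and reported as a putative
# modification mass. Illustrative only; not Open-pFind's algorithm.

# Hypothetical monoisotopic residue masses (Da), abbreviated for the example.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276}
WATER = 18.01056

def peptide_mass(seq: str) -> float:
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER

def open_search(precursor_mass: float, peptide_index: list,
                open_window: float = 500.0) -> list:
    """Return (peptide, mass_shift) pairs whose shift lies in the open window."""
    hits = []
    for pep in peptide_index:
        shift = precursor_mass - peptide_mass(pep)
        if abs(shift) <= open_window:   # wide window admits unexpected mods
            hits.append((pep, shift))
    # In a real engine, each hit would then be scored against the MS/MS spectrum.
    return sorted(hits, key=lambda h: abs(h[1]))

if __name__ == "__main__":
    index = ["GASP", "GGSA", "PASS"]
    for pep, shift in open_search(530.25, index):
        print(f"{pep}: candidate modification mass {shift:+.4f} Da")
```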

2017 ◽  
Vol 59 ◽  
pp. 463-494 ◽  
Author(s):  
Shaowei Cai ◽  
Jinkun Lin ◽  
Chuan Luo

The problem of finding a minimum vertex cover (MinVC) in a graph is a well-known NP-hard combinatorial optimization problem of great importance in theory and practice. Due to its NP-hardness, there has been much interest in developing heuristic algorithms for finding a small vertex cover in reasonable time. Previous heuristic algorithms for MinVC have focused on solving graphs of relatively small size, and they are not suitable for massive graphs because they usually rely on high-complexity heuristics. This paper explores techniques for solving MinVC in very large-scale real-world graphs, including a construction algorithm, a local search algorithm and a preprocessing algorithm. Both the construction and search algorithms are based on low-complexity heuristics, and we combine them to develop a heuristic algorithm for MinVC called FastVC. Experimental results on a broad range of real-world massive graphs show that our algorithms are very fast and outperform previous heuristic algorithms for MinVC. We also develop a preprocessing algorithm to simplify graphs for MinVC algorithms. By applying the preprocessing algorithm to local search algorithms, we obtain two efficient MinVC solvers, NuMVC2+p and FastVC2+p, which show further improvement on the massive graphs.
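A minimal sketch of the two-phase idea follows: a low-complexity greedy construction of an initial cover, then a cheap local improvement pass. This illustrates the general construct-then-search approach, not the FastVC heuristics themselves:

```python
# Sketch: greedy construction of a vertex cover, then a cheap local search
# that drops vertices whose incident edges are all covered by other vertices.
# Illustrative only; FastVC uses more refined low-complexity heuristics.

def greedy_cover(edges):
    """Build an initial cover: for each uncovered edge, take both endpoints."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

def remove_redundant(edges, cover):
    """Local improvement: delete any vertex whose removal keeps all edges covered."""
    for v in list(cover):
        cover.discard(v)
        if any(a not in cover and b not in cover for a, b in edges):
            cover.add(v)  # removal broke coverage; put it back
    return cover

if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]
    cover = remove_redundant(edges, greedy_cover(edges))
    print("vertex cover:", sorted(cover))  # e.g. {2, 4} covers every edge
```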


Author(s):  
Jeremy Mayeres ◽  
Charles Newton ◽  
Helena Arpudaraj

This paper introduces a lock-free version of a pairing heap. Dijkstra's algorithm is a search algorithm for the single-source shortest-path problem. Its performance improves when threads can perform work concurrently (in particular, when decreaseKey calls occur concurrently). However, current implementations of decreaseKey on popular backing data structures such as pairing heaps and Fibonacci heaps severely limit concurrency. Lock-free techniques can improve the concurrency of search structures such as heaps. In this paper we introduce decreaseKey and insert operators for pairing heaps that provide lock-free guarantees while still running in constant time.
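To see why insert and decreaseKey are constant-time on a pairing heap, consider the sequential sketch below: both operations reduce to a single meld of two roots. This single-threaded toy only illustrates the structure; the paper's lock-free versions of these operators are not attempted here, and production heaps use a doubly linked child list so the cut in decreaseKey is also O(1), whereas this sketch walks a singly linked sibling list for brevity:

```python
# Minimal sequential pairing-heap sketch: insert and decreaseKey both reduce
# to a single O(1) meld of two roots. Single-threaded; not the paper's
# lock-free construction.

class Node:
    __slots__ = ("key", "child", "sibling", "parent")
    def __init__(self, key):
        self.key = key
        self.child = self.sibling = self.parent = None

def meld(a, b):
    """Make the larger-keyed root the first child of the smaller-keyed one."""
    if a is None or b is None:
        return a or b
    if b.key < a.key:
        a, b = b, a
    b.sibling, b.parent, a.child = a.child, a, b
    return a

def decrease_key(root, node, new_key):
    """Lower node's key, cut it from its parent, and meld it with the root."""
    node.key = new_key
    if node.parent is None:          # already the root
        return root
    c = node.parent.child
    if c is node:
        node.parent.child = node.sibling
    else:
        while c.sibling is not node:  # O(1) with a doubly linked child list
            c = c.sibling
        c.sibling = node.sibling
    node.sibling = node.parent = None
    return meld(root, node)

if __name__ == "__main__":
    root = None
    nodes = [Node(k) for k in (9, 4, 7)]
    for n in nodes:
        root = meld(root, n)                    # insert == meld with a one-node heap
    root = decrease_key(root, nodes[0], 1)      # 9 -> 1 becomes the new minimum
    print(root.key)                             # 1
```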


Algorithms ◽  
2020 ◽  
Vol 13 (9) ◽  
pp. 230
Author(s):  
Majid Almarashi ◽  
Wael Deabes ◽  
Hesham H. Amin ◽  
Abdel-Rahman Hedar

Simulated annealing is a well-known search algorithm that has been used successfully in many search problems. However, the random walk of simulated annealing does not benefit from a memory of visited states, leading to excessive random search with no record of where the search has already been. Unlike memory-based search algorithms such as tabu search, the search in simulated annealing depends on the choice of the initial temperature to explore the search space, which gives little indication of how much exploration has actually been carried out. This lack of an explicit view of exploration can degrade the quality of the solutions found, since the nature of the search in simulated annealing is mainly local. In this work, a two-phase methodology with automatic diversification and intensification based on memory and sensing tools is proposed, called Simulated Annealing with Exploratory Sensing. The computational experiments show the efficiency of the proposed method in ensuring good exploration while finding good solutions within a similar number of iterations.
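A minimal sketch of the underlying idea follows: simulated annealing augmented with a memory of visited states that biases moves toward unexplored regions. The memory here is a simple hypothetical visited-set penalty, standing in for the paper's sensing tools:

```python
# Sketch: simulated annealing over bit strings with a memory of visited
# states. A move to an already-visited state is penalized, nudging the walk
# toward unexplored regions. Illustrative only; the paper's exploratory
# sensing mechanism is more elaborate than this visited-set heuristic.
import math
import random

def onemax(x):                       # toy objective: maximize the number of 1s
    return sum(x)

def neighbor(x):
    y = list(x)
    i = random.randrange(len(y))
    y[i] ^= 1                        # flip one bit
    return tuple(y)

def sa_with_memory(n=20, t0=2.0, cooling=0.995, steps=5000):
    x = tuple(random.randint(0, 1) for _ in range(n))
    visited = {x}
    best, t = x, t0
    for _ in range(steps):
        y = neighbor(x)
        delta = onemax(y) - onemax(x)
        if y in visited:
            delta -= 1.0             # memory penalty: discourage re-visits
        if delta >= 0 or random.random() < math.exp(delta / t):
            x = y
            visited.add(x)
            if onemax(x) > onemax(best):
                best = x
        t *= cooling
    return best

if __name__ == "__main__":
    random.seed(0)
    print(onemax(sa_with_memory()), "ones out of 20")
```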


Author(s):  
Carlos Hernandez ◽  
Adi Botea ◽  
Jorge A. Baier ◽  
Vadim Bulitko

Real-time search algorithms are relevant to time-sensitive decision-making domains such as video games and robotics. In such settings, the agent is required to decide on each action under a constant time bound, regardless of the search space size. Despite recent progress, poor-quality solutions can be produced, mainly due to state re-visitation. Different techniques have been developed to reduce such re-visitation, with state pruning showing promise. In this paper, we propose a novel pruning approach applicable to a wide class of real-time search algorithms. Given a local search space of arbitrary size, our technique aggressively prunes away all states in its interior, possibly adding new edges to maintain the connectivity of the search space frontier. An experimental evaluation shows that our pruning often improves the performance of a base real-time search algorithm by over an order of magnitude. This allows our implemented system to outperform the state-of-the-art real-time search algorithms used in the evaluation.
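For context, here is a minimal sketch of the kind of base real-time search algorithm (an LRTA*-style agent) that such pruning techniques accelerate; the pruning itself is not reproduced. The tiny graph is a made-up example whose local minimum forces exactly the re-visitation the paper attacks:

```python
# Minimal LRTA*-style real-time search on an explicit graph: at each step the
# agent looks one move ahead, updates the heuristic of its current state, and
# moves to the apparently best neighbor. Repeated visits to the same states
# show up as redundant updates.

def lrta_star(graph, h, start, goal, max_steps=1000):
    """graph: {state: [(neighbor, edge_cost), ...]}, h: mutable heuristic dict."""
    path, s = [start], start
    for _ in range(max_steps):
        if s == goal:
            return path
        # one-step lookahead: f(s') = c(s, s') + h(s')
        best_next, best_f = min(((n, c + h[n]) for n, c in graph[s]),
                                key=lambda t: t[1])
        h[s] = max(h[s], best_f)     # learning step: raise h toward the truth
        s = best_next
        path.append(s)
    return path                      # step budget exhausted before the goal

if __name__ == "__main__":
    # Tiny graph with a local minimum at 'b' that forces re-visitation of 'a'.
    graph = {"a": [("b", 1), ("c", 2)], "b": [("a", 1)],
             "c": [("a", 2), ("g", 1)], "g": [("c", 1)]}
    h = {"a": 2, "b": 0, "c": 1, "g": 0}
    print(lrta_star(graph, h, "a", "g"))   # ['a', 'b', 'a', 'c', 'g']
```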


2014 ◽  
Vol 23 (04) ◽  
pp. 1460017
Author(s):  
Jinsong Guo ◽  
Hongbo Li ◽  
Zhanshan Li ◽  
Yonggang Zhang ◽  
Xianghua Jia

Maintaining local consistencies can improve the efficiency of search algorithms for solving constraint satisfaction problems (CSPs). Compared with arc consistency, the most widely used local consistency, stronger local consistencies can make the search space smaller but require higher computational cost. In this paper, we attempt a compromise between pruning ability and computational cost. A new local consistency called singleton strong bound consistency (SSBC) and its light version, light SSBC, are proposed. The search algorithm maintaining light SSBC can outperform MAC on a considerable number of problems.
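The singleton pattern that SSBC instantiates can be sketched generically: for each value v of each variable x, tentatively assign x = v, run a propagator, and delete v if propagation wipes out some domain. In the sketch below the propagator is a plain AC-3-style filter, standing in for the paper's strong bound consistency:

```python
# Sketch of singleton consistency: a value is deleted if assigning it and
# propagating causes a domain wipeout. The propagator is AC-3-style filtering,
# not the paper's strong bound consistency.

def ac3(domains, constraints):
    """constraints: {(x, y): predicate(vx, vy)}. Returns False on wipeout."""
    queue = list(constraints)
    while queue:
        x, y = queue.pop()
        pruned = {vx for vx in domains[x]
                  if not any(constraints[(x, y)](vx, vy) for vy in domains[y])}
        if pruned:
            domains[x] -= pruned
            if not domains[x]:
                return False
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True

def singleton_filter(domains, constraints):
    """Remove values whose tentative assignment leads to a wipeout."""
    for x in domains:
        for v in list(domains[x]):
            trial = {y: set(d) for y, d in domains.items()}
            trial[x] = {v}
            if not ac3(trial, constraints):
                domains[x].discard(v)
    return domains

if __name__ == "__main__":
    domains = {"x": {1, 2, 3}, "y": {1, 2, 3}}
    constraints = {("x", "y"): lambda a, b: a < b,   # encode x < y
                   ("y", "x"): lambda a, b: a > b}
    print(singleton_filter(domains, constraints))    # x loses 3, y loses 1
```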


2012 ◽  
Vol 43 ◽  
pp. 523-570 ◽  
Author(s):  
C. Hernandez ◽  
J. A. Baier

Heuristics used for solving hard real-time search problems often contain depressions: bounded regions of the search space in which the heuristic function is inaccurate compared with the actual cost to reach a solution. Early real-time search algorithms, like LRTA*, easily become trapped in those regions since the heuristic values of their states may need to be updated multiple times, which results in costly solutions. State-of-the-art real-time search algorithms, like LSS-LRTA* or LRTA*(k), improve LRTA*'s mechanism for updating the heuristic, resulting in improved performance. Those algorithms, however, do not guide search towards avoiding depressed regions. This paper presents depression avoidance, a simple real-time search principle that guides search away from states that have been marked as part of a heuristic depression. We propose two ways in which depression avoidance can be implemented: mark-and-avoid and move-to-border. We implement these strategies on top of LSS-LRTA* and RTAA*, producing four new real-time heuristic search algorithms: aLSS-LRTA*, daLSS-LRTA*, aRTAA*, and daRTAA*. When the objective is to find a single solution by running the real-time search algorithm once, we show that daLSS-LRTA* and daRTAA* outperform their predecessors, sometimes by an order of magnitude. Of the four new algorithms, daRTAA* produces the best solutions given a fixed deadline on the average time allowed per planning episode. We prove all our algorithms have good theoretical properties: in finite search spaces, they find a solution if one exists, and converge to an optimal solution after a number of trials.
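A minimal sketch of the mark-and-avoid idea on top of the basic LRTA* update: a state whose heuristic gets raised is flagged as part of a depression, and action selection then prefers unflagged successors. This toy rendition abstracts away the LSS-LRTA*/RTAA* machinery the paper actually builds on:

```python
# Sketch of mark-and-avoid: states whose heuristic is raised are marked as
# depressed, and the move preference breaks toward unmarked successors.
# Toy rendition on plain LRTA*, not the paper's algorithms.

def da_lrta_star(graph, h, start, goal, max_steps=1000):
    depressed = set()
    path, s = [start], start
    for _ in range(max_steps):
        if s == goal:
            return path
        min_f = min(c + h[n] for n, c in graph[s])
        if min_f > h[s]:
            h[s] = min_f
            depressed.add(s)         # heuristic was too low here: mark it
        # move: prefer successors not marked as depressed, then lowest f
        s = min(graph[s],
                key=lambda nc: (nc[0] in depressed, nc[1] + h[nc[0]]))[0]
        path.append(s)
    return path

if __name__ == "__main__":
    graph = {"a": [("b", 1), ("c", 2)], "b": [("a", 1)],
             "c": [("a", 2), ("g", 1)], "g": [("c", 1)]}
    h = {"a": 2, "b": 0, "c": 1, "g": 0}
    print(da_lrta_star(graph, h, "a", "g"))   # avoids re-entering marked 'b'
```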


2020 ◽  
Vol 19 (12) ◽  
pp. 2157-2167
Author(s):  
Eugen Netz ◽  
Tjeerd M. H. Dijkstra ◽  
Timo Sachsenberg ◽  
Lukas Zimmermann ◽  
Mathias Walzer ◽  
...  

Cross-linking MS (XL-MS) has been recognized as an effective source of information about protein structures and interactions. In contrast to regular peptide identification, XL-MS has to deal with a quadratic search space, where peptides from every protein could potentially be cross-linked to any other protein. To cope with this search space, most tools apply different heuristics for search space reduction. We introduce a new open-source XL-MS database search algorithm, OpenPepXL, which offers increased sensitivity compared with other tools. OpenPepXL searches the full search space of an XL-MS experiment without using heuristics to reduce it. Because of efficient data structures and built-in parallelization, OpenPepXL achieves excellent runtimes and can also be deployed on large compute clusters and cloud services while maintaining a slim memory footprint. We compared OpenPepXL to several other commonly used tools for identification of noncleavable labeled and label-free cross-linkers on a diverse set of XL-MS experiments. In our first comparison, we used a data set from a fraction of a cell lysate with a protein database of 128 targets and 128 decoys. At 5% FDR, OpenPepXL finds from 7% to over 50% more unique residue pairs (URPs) than other tools. On data sets with available high-resolution structures for cross-link validation, OpenPepXL reports from 7% to over 40% more structurally validated URPs than other tools. Additionally, we used a synthetic peptide data set that allows objective validation of cross-links without relying on structural information and found that OpenPepXL reports at least 12% more validated URPs than other tools. OpenPepXL has been built as part of the OpenMS suite of tools and supports Windows, macOS, and Linux operating systems. It also supports the mzIdentML 1.2 format for XL-MS identification results and is freely available under a three-clause BSD license at https://openms.org/openpepxl.
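The quadratic search space can be made concrete with a sketch: candidate cross-linked pairs are all peptide pairs whose summed mass plus the linker mass matches the precursor within a tolerance. The peptide masses and linker value below are hypothetical, and this is nothing like OpenPepXL's actual indexing or scoring:

```python
# Sketch of the quadratic XL-MS candidate space: every peptide pair (i, j)
# whose combined mass plus the cross-linker mass matches the precursor is a
# candidate. Sorting one list and scanning with bisect avoids materializing
# all O(n^2) pairs. Toy data; not OpenPepXL's data structures.
from bisect import bisect_left, bisect_right

def crosslink_candidates(precursor_mass, peptides, linker_mass, tol=0.01):
    """peptides: list of (name, mass). Returns candidate cross-linked pairs."""
    by_mass = sorted(peptides, key=lambda p: p[1])
    masses = [m for _, m in by_mass]
    hits = []
    for i, (name_a, mass_a) in enumerate(by_mass):
        target = precursor_mass - linker_mass - mass_a
        lo = bisect_left(masses, target - tol, i)   # j >= i avoids duplicates
        hi = bisect_right(masses, target + tol, i)
        hits.extend((name_a, by_mass[j][0]) for j in range(lo, hi))
    return hits

if __name__ == "__main__":
    peptides = [("PEP1", 800.40), ("PEP2", 1200.60), ("PEP3", 900.45)]
    linker = 138.068                  # hypothetical noncleavable linker mass
    print(crosslink_candidates(800.40 + 1200.60 + linker, peptides, linker))
```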


MENDEL ◽  
2020 ◽  
Vol 26 (2) ◽  
pp. 1-8
Author(s):  
Tarek El-Mihoub ◽  
Christoph Tholen ◽  
Lars Nolle

Localisation errors have a great impact on Autonomous Underwater Vehicles (AUVs) as search agents. Different approaches for solving the localisation problem can be used and combined for greater accuracy in estimating AUVs' locations. The effect of localisation errors on locating a target can be mitigated by designing a search algorithm that avoids extensive use of exact location information. In this paper, two cooperative search algorithms are proposed and evaluated. In these algorithms, a high-level mechanism is employed to build a global view of the search space using the minimum possible search information. These algorithms rely on low-level search algorithms that play an exploratory role. Particle Swarm Optimisation (PSO) and all-to-one Self-Organising Migrating Algorithm (SOMA) are selected as the high-level mechanisms. The conducted experiments demonstrate that both algorithms behave robustly within a range of localisation errors.
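For reference, a minimal sketch of the generic global-best PSO update that one of the high-level mechanisms builds on; the cooperative, localisation-aware layers of the paper are not reproduced, and the objective and parameters are toy assumptions:

```python
# Minimal global-best PSO sketch: each particle is pulled toward its own best
# position and the swarm's best. Generic textbook update only, without the
# AUV/localisation layers described in the paper.
import random

def pso(f, dim=2, n=20, iters=200, w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    xs = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    vs = [[0.0] * dim for _ in range(n)]
    pbest = [list(x) for x in xs]
    gbest = min(pbest, key=f)
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                vs[i][d] = (w * vs[i][d]
                            + c1 * random.random() * (pbest[i][d] - xs[i][d])
                            + c2 * random.random() * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = list(xs[i])
                if f(pbest[i]) < f(gbest):
                    gbest = list(pbest[i])
    return gbest

if __name__ == "__main__":
    random.seed(1)
    sphere = lambda x: sum(v * v for v in x)   # toy objective
    print(pso(sphere))                          # converges near the origin
```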


Algorithms ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 200
Author(s):  
Suleiman Sa’ad ◽  
Abdullah Muhammed ◽  
Mohammed Abdullahi ◽  
Azizol Abdullah ◽  
Fahrul Hakim Ayob

Recently, cloud computing has begun to experience tremendous growth because government agencies and private organisations are migrating to the cloud environment. Hence, an efficient task scheduling strategy is paramount for improving the prospects of cloud computing. Typically, a number of tasks are scheduled onto diverse resources (virtual machines) to minimise the makespan and achieve optimum utilisation of the system by reducing the response time within the cloud environment. The task scheduling problem is NP-complete; as such, obtaining a precise solution is difficult, particularly for large-scale task sets. Therefore, in this paper, we propose a metaheuristic enhanced discrete symbiotic organism search (eDSOS) algorithm for optimal task scheduling in the cloud computing setting. Our proposed algorithm is an extension of the standard symbiotic organism search (SOS), a nature-inspired algorithm that has been applied to various numerical optimisation problems. This algorithm imitates the symbiotic associations (mutualism, commensalism, and parasitism) displayed by organisms in an ecosystem. Despite the improvements made by the discrete symbiotic organism search (DSOS) algorithm, it still becomes trapped in local optima because of the large makespan and response time values. We diversify the local search space of the DSOS by substituting the best value with a random candidate from the population in the mutualism phase, which makes the algorithm suitable for task scheduling problems in the cloud. Thus, the eDSOS strategy converges faster in larger search spaces due to this diversification. The CloudSim simulator was used to conduct the experiments, and the simulation results show that the proposed eDSOS produced higher-quality solutions than the DSOS. Lastly, we analysed the proposed strategy using a two-sample t-test, which revealed that the performance of eDSOS was significantly better than that of the benchmark strategy (DSOS), particularly for large search spaces. The percentage improvements were 26.23% for the makespan and 63.34% for the response time.
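A minimal sketch of the objective such schedulers optimise, the makespan of a task-to-VM assignment, together with the kind of diversification step described above (using a random population member in place of the best organism during mutualism). The task lengths and VM speeds are hypothetical, and this is not the full eDSOS algorithm:

```python
# Sketch: makespan of a task-to-VM assignment, plus the diversification move
# described in the abstract (a random organism stands in for the best one in
# the mutualism phase). Toy data; not the full eDSOS algorithm.
import random

def makespan(assignment, task_len, vm_speed):
    """assignment[i] = VM index for task i; makespan = busiest VM's finish time."""
    load = [0.0] * len(vm_speed)
    for task, vm in enumerate(assignment):
        load[vm] += task_len[task] / vm_speed[vm]
    return max(load)

def diversified_partner(population, best_idx):
    """eDSOS-style diversification: deliberately ignore best_idx and pick a
    random organism as the mutualism partner instead of the best one."""
    return population[random.randrange(len(population))]

if __name__ == "__main__":
    random.seed(42)
    task_len = [400, 250, 900, 120, 600]      # hypothetical task lengths (MI)
    vm_speed = [100, 250]                      # hypothetical VM speeds (MIPS)
    population = [[random.randrange(len(vm_speed)) for _ in task_len]
                  for _ in range(6)]
    best_idx = min(range(len(population)),
                   key=lambda i: makespan(population[i], task_len, vm_speed))
    print("best makespan:", makespan(population[best_idx], task_len, vm_speed))
    print("diversified partner:", diversified_partner(population, best_idx))
```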


2016 ◽  
Vol 113 (42) ◽  
pp. 11889-11894 ◽  
Author(s):  
Roland A. Knapp ◽  
Gary M. Fellers ◽  
Patrick M. Kleeman ◽  
David A. W. Miller ◽  
Vance T. Vredenburg ◽  
...  

Amphibians are one of the most threatened animal groups, with 32% of species at risk for extinction. Given this imperiled status, is the disappearance of a large fraction of the Earth’s amphibians inevitable, or are some declining species more resilient than is generally assumed? We address this question in a species that is emblematic of many declining amphibians, the endangered Sierra Nevada yellow-legged frog (Rana sierrae). Based on >7,000 frog surveys conducted across Yosemite National Park over a 20-y period, we show that, after decades of decline and despite ongoing exposure to multiple stressors, including introduced fish, the recently emerged disease chytridiomycosis, and pesticides, R. sierrae abundance increased sevenfold during the study and at a rate of 11% per year. These increases occurred in hundreds of populations throughout Yosemite, providing a rare example of amphibian recovery at an ecologically relevant spatial scale. Results from a laboratory experiment indicate that these increases may be in part because of reduced frog susceptibility to chytridiomycosis. The disappearance of nonnative fish from numerous water bodies after cessation of stocking also contributed to the recovery. The large-scale increases in R. sierrae abundance that we document suggest that, when habitats are relatively intact and stressors are reduced in their importance by active management or species’ adaptive responses, declines of some amphibians may be partially reversible, at least at a regional scale. Other studies conducted over similarly large temporal and spatial scales are critically needed to provide insight and generality about the reversibility of amphibian declines at a global scale.

