Rule Learning for Disease-Specific Biomarker Discovery from Clinical Proteomic Mass Spectra

Author(s):  
Vanathi Gopalakrishnan ◽  
Philip Ganchev ◽  
Srikanth Ranganathan ◽  
Robert Bowser
2005 ◽  
Vol 51 (10) ◽  
pp. 1946-1954 ◽  
Author(s):  
Mary F Lopez ◽  
Alvydas Mikulskis ◽  
Scott Kuzdzal ◽  
David A Bennett ◽  
Jeremiah Kelly ◽  
...  

Abstract Background: Researchers typically search for disease markers using a “targeted” approach in which a hypothesis about the disease mechanism is tested and experimental results either confirm or disprove the involvement of a particular gene or protein in the disease. Recently, there has been interest in developing disease diagnostics based on unbiased quantification of differences in global patterns of protein and peptide masses, typically in blood from individuals with and without disease. We combined a suite of methods and technologies, including novel sample preparation based on carrier-protein capture and biomarker enrichment, high-resolution mass spectrometry, a unique cohort of well-characterized persons with and without Alzheimer disease (AD), and powerful bioinformatic analysis, that add statistical and procedural robustness to biomarker discovery from blood. Methods: Carrier-protein–bound peptides were isolated from serum samples by affinity chromatography, and peptide mass spectra were acquired by a matrix-assisted laser desorption/ionization (MALDI) orthogonal time-of-flight (O-TOF) mass spectrometer capable of collecting data over a broad mass range (100 to >300 000 Da) in a single acquisition. Discriminatory analysis of mass spectra was used to process and analyze the raw mass spectral data. Results: Coupled with the biomarker enrichment protocol, the high-resolution MALDI O-TOF mass spectra provided informative, reproducible peptide signatures. The raw mass spectra were analyzed and used to build discriminant disease models that were challenged with blinded samples for classification. Conclusions: Carrier-protein enrichment of disease biomarkers coupled with high-resolution mass spectrometry and discriminant pattern analysis is a powerful technology for diagnostics and population screening. The mass fingerprint model successfully classified blinded AD patient and control samples with high sensitivity and specificity.
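The abstract above reports classification of blinded samples in terms of sensitivity and specificity but does not define them. As a minimal sketch (not the authors' pipeline), the two figures can be computed from true and predicted labels as follows. The function name and the label encoding (1 = AD, 0 = control) are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding: 1 = disease (e.g. AD), 0 = control.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, called diseased
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # control, called control
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # control, false alarm

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one control sample")

    sensitivity = tp / (tp + fn)  # fraction of diseased samples detected
    specificity = tn / (tn + fp)  # fraction of controls correctly cleared
    return sensitivity, specificity
```

Sensitivity is the fraction of true disease samples the model flags, and specificity is the fraction of controls it correctly leaves unflagged; both need to be high for a screening test to be useful.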


PROTEOMICS ◽  
2004 ◽  
Vol 4 (8) ◽  
pp. 2320-2332 ◽  
Author(s):  
Julien Prados ◽  
Alexandros Kalousis ◽  
Jean-Charles Sanchez ◽  
Laure Allard ◽  
Odile Carrette ◽  
...  

2004 ◽  
Vol 19 (4-5) ◽  
pp. 169-183 ◽  
Author(s):  
Que N. Van ◽  
John R. Klose ◽  
David A. Lucas ◽  
DaRue A. Prieto ◽  
Brian Luke ◽  
...  

The advent of systems biology approaches that have stemmed from the sequencing of the human genome has led to the search for new methods to diagnose diseases. While much effort has been focused on the identification of disease-specific biomarkers, recent efforts are underway toward the use of proteomic and metabonomic patterns to indicate disease. We have developed and contrasted the use of both proteomic and metabonomic patterns in urine for the detection of interstitial cystitis (IC). The methodology relies on advanced bioinformatics to scrutinize information contained within mass spectrometry (MS) and high-resolution proton nuclear magnetic resonance (1H-NMR) spectral patterns to distinguish IC-affected from non-affected individuals as well as those suffering from bacterial cystitis (BC). We have applied a novel pattern recognition tool that employs an unsupervised system (self-organizing-type cluster mapping) as a fitness test for a supervised system (a genetic algorithm). With this approach, a training set comprised of mass spectra and 1H-NMR spectra from urine derived from either unaffected individuals or patients with IC is employed so that the most fit combination of relative, normalized intensity features defined at precise m/z or chemical shift values plotted in n-space can reliably distinguish the cohorts used in training. Using this bioinformatic approach, we were able to discriminate spectral patterns associated with IC-affected, BC-affected, and unaffected patients with a success rate of approximately 84%.


Author(s):  
Henry A. Ogoe ◽  
Mahbaneh Eshaghzadeh Torbati ◽  
Vanathi Gopalakrishnan

Background: Ongoing molecular profiling studies enabled by advances in biomedical technologies are producing vast amounts of ‘omic’ data for early detection, monitoring, and prognosis of diverse diseases. A major common limitation is the scarcity of biological samples, necessitating integrative modeling frameworks that can make optimal use of available data for disease classification tasks. Related data sets are often available from different studies, but may have been generated using different technology platforms. Thus, there is a critical need for flexible modeling methods that can handle data from diverse sources to facilitate the discovery of robust biomarkers that underlie disease regulatory processes. Results: In this paper, we introduce a novel framework called Knowledge Augmented Rule Learning (KARL), which incorporates two sources of knowledge, domain and data, for pattern discovery from small and high-dimensional datasets, such as transcriptomic data. We propose KARL as a transfer rule learning framework in which knowledge of the domain is transferred to the learning process on data in order to 1) improve the reliability of the discovered patterns, and 2) study the knowledge of the domain when used along with data for modeling. In this work, we generated KARL models on gene expression datasets for five types of cancer: brain, breast, colon, lung, and prostate. As our knowledge of the domain, we used the Ingenuity Knowledge Base (IKB) to extract genes related to hallmarks of cancer and annotated these prior relationships before learning classifiers from these datasets. Conclusions: Our results show that KARL produces, on average, rule models that are more robust classifiers than the baseline without such background knowledge, for our tasks of cancer prediction using 25 publicly available gene expression datasets. Moreover, KARL helped us gain insights into previously known relationships in these gene expression datasets, along with new relationships not input as known, to enable informed biomarker discovery for cancer prediction tasks. KARL can be applied to modeling similar data from any other domain and classification task. Future work would involve extensions to KARL to handle hierarchical knowledge to derive more general hypotheses to drive biomedicine.


Author(s):  
Lily C. Wong-Kisiel ◽  
Andrew McKeon ◽  
Elaine C. Wirrell

Recognition of autoimmune encephalopathies and epilepsies in children and teenagers with acute or subacute onset of central nervous system dysfunction, through detection of the pertinent antibody in serum or cerebrospinal fluid, or through a response to immunotherapy, may lead to an early diagnosis, and thus expedited implementation of immunotherapy and improved neurological outcome. The epidemiology of pediatric autoimmune encephalopathy and epilepsy is not well established, but advances in disease-specific biomarker discovery have led to identification of disorders with either a cytotoxic T cell-mediated pathogenesis or (more recently) possible autoantibody-mediated disorders. This review summarizes the clinical presentations and recommended evaluations and treatment of pediatric epileptic encephalopathy suspected to be of autoimmune etiology.


2019 ◽  
Author(s):  
Jeya Balaji Balasubramanian ◽  
Kevin E. Kip ◽  
Steven E. Reis ◽  
Vanathi Gopalakrishnan

Abstract: Biomarker discovery is critical both for biomedical research and for clinical diagnostic, prognostic, and therapeutic decision-making. Biomarkers help improve our understanding of the underlying physiological processes within an individual. Discovery of biomarkers from complex biomedical datasets is done using data mining algorithms. Hundreds of thousands of biomarkers have been discovered and reported in the literature, but only a few dozen have been found to be clinically useful. This discrepancy arises because statistical significance is not the same as clinical relevance. Statistical significance only accounts for the correctness of the learned associations. Clinical relevance, in addition to statistical significance, also accounts for clinical utility such as cost-effectiveness, non-invasiveness, efficacy, and safety of the proposed biomarkers. We need models that are statistically significant and clinically relevant, while also remaining interpretable. Interpretable classifiers are more actionable in medicine because they offer human-readable explanations for their predictions. Traditional data mining methods cannot account for clinical relevance. We formulate this as a knowledge discovery problem. In computer science, knowledge discovery in databases is “a non-trivial process of the extraction of valid, novel, potentially useful, and ultimately understandable patterns in data”. Bayesian Rule Learning (BRL) finds an optimal Bayesian network to explain the training data and translates that into an interpretable rule model. In this paper, we extend BRL for knowledge discovery (BRL-KD) to enable BRL to incorporate a clinical utility function to learn models that are clinically more relevant. We demonstrate this using a real-world dataset to predict cardiovascular disease outcome. We evaluate predictive performance with the area under the receiver operating characteristic curve (AUROC) and clinical utility with the cost of the model. We show that BRL-KD successfully generates a set of models offering different trade-offs between AUROC and cost. Based on the clinical standard, a model with an acceptable trade-off can then be chosen.
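The final step described above, choosing among models that trade AUROC against cost, can be illustrated with a simple Pareto filter. This is only a sketch under stated assumptions and is not the BRL-KD algorithm: the `Candidate` type, the function names, and the idea of applying a minimum-AUROC threshold as the "clinical standard" are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    """A hypothetical learned model summarized by its AUROC and its cost."""
    name: str
    auroc: float  # higher is better
    cost: float   # lower is better


def pareto_front(models: List[Candidate]) -> List[Candidate]:
    """Keep models that no other model beats on both AUROC and cost."""
    front = []
    for m in models:
        dominated = any(
            o.auroc >= m.auroc
            and o.cost <= m.cost
            and (o.auroc > m.auroc or o.cost < m.cost)
            for o in models
        )
        if not dominated:
            front.append(m)
    return sorted(front, key=lambda c: c.cost)


def cheapest_acceptable(
    models: List[Candidate], min_auroc: float
) -> Optional[Candidate]:
    """Pick the lowest-cost non-dominated model meeting an assumed AUROC floor."""
    eligible = [m for m in pareto_front(models) if m.auroc >= min_auroc]
    return min(eligible, key=lambda c: c.cost) if eligible else None
```

A model is dropped only when another model is at least as good on both criteria and strictly better on one, so the surviving set is exactly the range of trade-offs a clinician would need to weigh.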


2009 ◽  
Vol 75A (9) ◽  
pp. 727-733 ◽  
Author(s):  
Henning Ulrich ◽  
Carsten Wrenger

2004 ◽  
Vol 2 (5) ◽  
pp. 214-222 ◽  
Author(s):  
Melanie Hilario ◽  
Alexandros Kalousis ◽  
Julien Prados ◽  
Pierre-Alain Binz
