Rule Learning for Disease-Specific Biomarker Discovery from Clinical Proteomic Mass Spectra

High-Resolution Serum Proteomic Profiling of Alzheimer Disease Samples Reveals Disease-Specific, Carrier-Protein–Bound Mass Signatures

Clinical Chemistry ◽

10.1373/clinchem.2005.053090 ◽

2005 ◽

Vol 51 (10) ◽

pp. 1946-1954 ◽

Cited By ~ 105

Author(s):

Mary F Lopez ◽

Alvydas Mikulskis ◽

Scott Kuzdzal ◽

David A Bennett ◽

Jeremiah Kelly ◽

...

Keyword(s):

Mass Spectrometry ◽

High Resolution ◽

Alzheimer Disease ◽

Mass Spectra ◽

Biomarker Discovery ◽

High Resolution Mass Spectrometry ◽

Carrier Protein ◽

High Resolution Mass ◽

Peptide Mass ◽

Resolution Mass

Abstract Background: Researchers typically search for disease markers using a “targeted” approach in which a hypothesis about the disease mechanism is tested and experimental results either confirm or disprove the involvement of a particular gene or protein in the disease. Recently, there has been interest in developing disease diagnostics based on unbiased quantification of differences in global patterns of protein and peptide masses, typically in blood from individuals with and without disease. We combined a suite of methods and technologies, including novel sample preparation based on carrier-protein capture and biomarker enrichment, high-resolution mass spectrometry, a unique cohort of well-characterized persons with and without Alzheimer disease (AD), and powerful bioinformatic analysis, that add statistical and procedural robustness to biomarker discovery from blood. Methods: Carrier-protein–bound peptides were isolated from serum samples by affinity chromatography, and peptide mass spectra were acquired by a matrix-assisted laser desorption/ionization (MALDI) orthogonal time-of-flight (O-TOF) mass spectrometer capable of collecting data over a broad mass range (100 to >300 000 Da) in a single acquisition. Discriminatory analysis of mass spectra was used to process and analyze the raw mass spectral data. Results: Coupled with the biomarker enrichment protocol, the high-resolution MALDI O-TOF mass spectra provided informative, reproducible peptide signatures. The raw mass spectra were analyzed and used to build discriminant disease models that were challenged with blinded samples for classification. Conclusions: Carrier-protein enrichment of disease biomarkers coupled with high-resolution mass spectrometry and discriminant pattern analysis is a powerful technology for diagnostics and population screening. The mass fingerprint model successfully classified blinded AD patient and control samples with high sensitivity and specificity.
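The abstract reports performance on blinded samples as sensitivity and specificity but gives no counts. As a minimal sketch of how those two figures are derived from a blinded classification run, assuming hypothetical labels (1 = AD, 0 = control) that are invented here and are not data from the paper:

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels: 1 = disease, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: fraction of true disease samples that were called disease.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true controls that were called control.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical blinded set, for illustration only.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, predicted)
print(sens, spec)
```

Both values need to be reported together, since a model that labels every sample as disease reaches perfect sensitivity with zero specificity.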

Mining mass spectra for diagnosis and biomarker discovery of cerebral accidents

PROTEOMICS ◽

10.1002/pmic.200400857 ◽

2004 ◽

Vol 4 (8) ◽

pp. 2320-2332 ◽

Cited By ~ 51

Author(s):

Julien Prados ◽

Alexandros Kalousis ◽

Jean-Charles Sanchez ◽

Laure Allard ◽

Odile Carrette ◽

...

Keyword(s):

Mass Spectra ◽

Biomarker Discovery

The Use of Urine Proteomic and Metabonomic Patterns for the Diagnosis of Interstitial Cystitis and Bacterial Cystitis

Disease Markers ◽

10.1155/2004/530647 ◽

2004 ◽

Vol 19 (4-5) ◽

pp. 169-183 ◽

Cited By ~ 25

Author(s):

Que N. Van ◽

John R. Klose ◽

David A. Lucas ◽

DaRue A. Prieto ◽

Brian Luke ◽

...

Keyword(s):

Interstitial Cystitis ◽

Mass Spectra ◽

Proton Nuclear Magnetic Resonance ◽

Training Set ◽

H Nmr ◽

New Methods ◽

System A ◽

Spectral Patterns ◽

Bioinformatic Approach ◽

Disease Specific

The advent of systems biology approaches that have stemmed from the sequencing of the human genome has led to the search for new methods to diagnose diseases. While much effort has been focused on the identification of disease-specific biomarkers, recent efforts are underway toward the use of proteomic and metabonomic patterns to indicate disease. We have developed and contrasted the use of both proteomic and metabonomic patterns in urine for the detection of interstitial cystitis (IC). The methodology relies on advanced bioinformatics to scrutinize information contained within mass spectrometry (MS) and high-resolution proton nuclear magnetic resonance (1H-NMR) spectral patterns to distinguish IC-affected from non-affected individuals as well as those suffering from bacterial cystitis (BC). We have applied a novel pattern recognition tool that employs an unsupervised system (self-organizing-type cluster mapping) as a fitness test for a supervised system (a genetic algorithm). With this approach, a training set comprised of mass spectra and 1H-NMR spectra from urine derived from either unaffected individuals or patients with IC is employed so that the most fit combination of relative, normalized intensity features defined at precise m/z or chemical shift values plotted in n-space can reliably distinguish the cohorts used in training. Using this bioinformatic approach, we were able to discriminate spectral patterns associated with IC-affected, BC-affected, and unaffected patients with a success rate of approximately 84%.
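The abstract describes each sample as a point in n-space whose coordinates are relative, normalized intensities at chosen m/z (or chemical shift) values. The sketch below illustrates only that representation, with a nearest-centroid rule swapped in as a simple stand-in classifier; it does not reproduce the genetic algorithm or the self-organizing cluster map used by the authors. The spectrum, the feature positions, and the centroids are hypothetical values chosen for illustration.

```python
import math


def normalize(spectrum):
    """Scale intensities so the largest peak is 1.0 (relative intensity)."""
    top = max(intensity for _, intensity in spectrum)
    return [(mz, intensity / top) for mz, intensity in spectrum]


def feature_vector(spectrum, feature_mz):
    """Take the normalized intensity of the peak nearest each chosen m/z value."""
    norm = normalize(spectrum)
    return [min(norm, key=lambda peak: abs(peak[0] - target))[1]
            for target in feature_mz]


def nearest_centroid(vector, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


# Hypothetical spectrum as (m/z, raw intensity) pairs.
spectrum = [(1000.2, 50.0), (1500.7, 200.0), (2100.1, 100.0)]
feature_mz = [1000.0, 2100.0]          # assumed feature positions (n = 2)
centroids = {"IC": [0.3, 0.5], "unaffected": [0.8, 0.1]}  # assumed class centers

vector = feature_vector(spectrum, feature_mz)
label = nearest_centroid(vector, centroids)
print(vector, label)
```

In the approach the abstract outlines, the choice of `feature_mz` is what the search procedure optimizes; here it is fixed by hand.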

Tunable structure priors for Bayesian rule learning for knowledge integrated biomarker discovery

World Journal of Clinical Oncology ◽

10.5306/wjco.v9.i5.98 ◽

2018 ◽

Vol 9 (5) ◽

pp. 98-109 ◽

Cited By ~ 2

Author(s):

Jeya Balaji Balasubramanian ◽

Vanathi Gopalakrishnan

Keyword(s):

Biomarker Discovery ◽

Rule Learning ◽

Tunable Structure

KARL: Knowledge Augmented Rule Learning for Informed Biomarker Discovery

10.20944/preprints201908.0100.v1 ◽

2019 ◽

Author(s):

Henry A. Ogoe ◽

Mahbaneh Eshaghzadeh Torbati ◽

Vanathi Gopalakrishnan

Keyword(s):

Gene Expression ◽

Biomarker Discovery ◽

Rule Learning ◽

Disease Classification ◽

Similar Data ◽

Knowledge Domain ◽

Cancer Prediction ◽

Related Data ◽

Regulatory Processes ◽

Average Rule

Background: Ongoing molecular profiling studies enabled by advances in biomedical technologies are producing vast amounts of ‘omic’ data for early detection, monitoring, and prognosis of diverse diseases. A major common limitation is the scarcity of biological samples, necessitating integrative modeling frameworks that can make optimal use of available data for disease classification tasks. Related data sets are often available from different studies, but may have been generated using different technology platforms. Thus, there is a critical need for flexible modeling methods that can handle data from diverse sources to facilitate the discovery of robust biomarkers that underlie disease regulatory processes. Results: In this paper, we introduce a novel framework called Knowledge Augmented Rule Learning (KARL), which incorporates two sources of knowledge, domain and data, for pattern discovery from small and high-dimensional datasets, such as transcriptomic data. We propose KARL as a transfer rule learning framework in which knowledge of the domain is transferred to the learning process on data in order to 1) improve the reliability of the discovered patterns, and 2) study the knowledge of the domain when used along with data for modeling. In this work, we generated KARL models on gene expression datasets for five types of cancer: brain, breast, colon, lung, and prostate. As our knowledge of the domain, we used the Ingenuity Knowledge Base (IKB) to extract genes related to hallmarks of cancer and annotated these prior relationships before learning classifiers from these datasets. Conclusions: Our results show that KARL produces, on average, rule models that are more robust classifiers than the baseline without such background knowledge, for our tasks of cancer prediction using 25 publicly available gene expression datasets. Moreover, KARL helped us gain insights about previously known relationships in these gene expression datasets, along with new relationships not input as known, to enable informed biomarker discovery for cancer prediction tasks. KARL can be applied to modeling similar data from any other domain and classification task. Future work would involve extensions to KARL to handle hierarchical knowledge to derive more general hypotheses to drive biomedicine.

Autoimmune Encephalopathies and Epilepsies in Children and Teenagers

Canadian Journal of Neurological Sciences / Journal Canadien des Sciences Neurologiques ◽

10.1017/s0317167100013147 ◽

2012 ◽

Vol 39 (2) ◽

pp. 134-144 ◽

Cited By ~ 15

Author(s):

Lily C. Wong-Kisiel ◽

Andrew McKeon ◽

Elaine C. Wirrell

Keyword(s):

Central Nervous System ◽

Cerebral Spinal Fluid ◽

Biomarker Discovery ◽

Epileptic Encephalopathy ◽

Cytotoxic T Cell ◽

Central Nervous System Dysfunction ◽

Clinical Presentations ◽

Autoimmune Encephalopathy ◽

Disease Specific ◽

Autoimmune Etiology

Recognition of autoimmune encephalopathies and epilepsies in children and teenagers with acute or subacute onset of central nervous system dysfunction, through detection of the pertinent antibody in serum or cerebral spinal fluid, or through a response to immunotherapy, may lead to an early diagnosis, and thus expedited implementation of immunotherapy and improved neurological outcome. The epidemiology of pediatric autoimmune encephalopathy and epilepsy is not well established, but advances in disease-specific biomarker discovery have led to identification of disorders with either a cytotoxic T cell mediated pathogenesis or (more recently) possible autoantibody mediated disorders. This review summarizes the clinical presentations and recommended evaluations and treatment of pediatric epileptic encephalopathy suspected to be of autoimmune etiology.

Knowledge discovery with Bayesian Rule Learning for actionable biomedicine

10.1101/785279 ◽

2019 ◽

Author(s):

Jeya Balaji Balasubramanian ◽

Kevin E. Kip ◽

Steven E. Reis ◽

Vanathi Gopalakrishnan

Keyword(s):

Data Mining ◽

Knowledge Discovery ◽

Clinical Relevance ◽

Clinical Utility ◽

Biomarker Discovery ◽

Statistical Significance ◽

Rule Learning ◽

Training Data ◽

Knowledge Discovery In Databases ◽

Trade Offs

Abstract: Biomarker discovery is critical for both biomedical research and clinical diagnostic, prognostic, and therapeutic decision-making. Biomarkers help improve our understanding of the underlying physiological processes within an individual. Discovery of biomarkers from complex biomedical datasets is done using data mining algorithms. Hundreds of thousands of biomarkers have been discovered and reported in the literature, but only a few dozen have been found to be clinically useful. This discrepancy arises because statistical significance is not the same as clinical relevance. Statistical significance only accounts for the correctness of the learned associations. Clinical relevance, in addition to statistical significance, also accounts for clinical utility such as cost-effectiveness, non-invasiveness, efficacy, and safety of the proposed biomarkers. We need models that are statistically significant and clinically relevant while remaining interpretable. Interpretable classifiers are more actionable in medicine because they offer human-readable explanations for their predictions. Traditional data mining methods cannot account for clinical relevance. We formulate this as a knowledge discovery problem. In computer science, knowledge discovery in databases is “a non-trivial process of the extraction of valid, novel, potentially useful, and ultimately understandable patterns in data”. Bayesian Rule Learning (BRL) finds an optimal Bayesian network to explain the training data and translates that into an interpretable rule model. In this paper, we extend BRL for knowledge discovery (BRL-KD) to enable BRL to incorporate a clinical utility function to learn models that are clinically more relevant. We demonstrate this using a real-world dataset to predict cardiovascular disease outcome. We evaluate predictive performance with the area under the receiver operating characteristic curve (AUROC) and clinical utility with the cost of the model. We show that BRL-KD successfully generates a set of models offering different trade-offs between AUROC and cost. Based on the clinical standard, a model with an acceptable trade-off can then be chosen.
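The abstract evaluates models on two axes, AUROC and cost, and then chooses among the resulting trade-offs. The sketch below shows one common way to make that comparison concrete: a pairwise-ranking AUROC and a Pareto filter that discards any model beaten on both axes. It is an illustration under stated assumptions, not the BRL-KD procedure; the function names, model names, scores, and costs are all hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pareto_front(models):
    """Return names of models not dominated on (higher AUROC, lower cost)."""
    front = []
    for name, auc, cost in models:
        dominated = any(
            a >= auc and c <= cost and (a > auc or c < cost)
            for _, a, c in models
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical predictions for four patients (1 = event, 0 = no event).
example_auc = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Hypothetical (name, AUROC, cost) triples for four candidate models.
models = [("A", 0.80, 10.0), ("B", 0.75, 4.0), ("C", 0.70, 6.0), ("D", 0.85, 25.0)]
front = pareto_front(models)
print(example_auc, front)
```

Model C drops out because B is both more accurate and cheaper; the remaining models each represent a distinct trade-off, and picking one requires an external clinical standard for acceptable cost.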

Proteomic profile analysis and biomarker discovery from mass spectra using independent component analysis combined with uncorrelated linear discriminant analysis

Chemometrics and Intelligent Laboratory Systems ◽

10.1016/j.chemolab.2011.01.007 ◽

2011 ◽

Vol 105 (2) ◽

pp. 207-214 ◽

Cited By ~ 9

Author(s):

Mingjin Zhang ◽

Peijin Tong ◽

Wenming Wang ◽

Jinpei Geng ◽

Yiping Du

Keyword(s):

Discriminant Analysis ◽

Independent Component Analysis ◽

Linear Discriminant Analysis ◽

Mass Spectra ◽

Biomarker Discovery ◽

Profile Analysis ◽

Component Analysis ◽

Independent Component ◽

Proteomic Profile ◽

Linear Discriminant

Disease-specific biomarker discovery by aptamers

Cytometry Part A ◽

10.1002/cyto.a.20766 ◽

2009 ◽

Vol 75A (9) ◽

pp. 727-733 ◽

Cited By ~ 47

Author(s):

Henning Ulrich ◽

Carsten Wrenger

Keyword(s):

Biomarker Discovery ◽

Disease Specific

Data mining for mass-spectra based diagnosis and biomarker discovery

Drug Discovery Today BIOSILICO ◽

10.1016/s1741-8364(04)02416-3 ◽

2004 ◽

Vol 2 (5) ◽

pp. 214-222 ◽

Cited By ~ 7

Author(s):

Melanie Hilario ◽

Alexandros Kalousis ◽

Julien Prados ◽

Pierre-Alain Binz

Keyword(s):

Data Mining ◽

Mass Spectra ◽

Biomarker Discovery
