New interpretable machine learning method for single-cell data reveals correlates of clinical response to cancer immunotherapy

Mapping Intimacies ◽

10.1101/702118 ◽

2019 ◽

Cited By ~ 3

Author(s):

Evan Greene ◽

Greg Finak ◽

Leonard A. D’Amico ◽

Nina Bhardwaj ◽

Candice D. Church ◽

...

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

T Cell ◽

Single Cell ◽

Cancer Immunotherapy ◽

Effector Memory ◽

Machine Learning Method ◽

Learning Method ◽

Modeling Framework ◽

Interpretable Machine Learning

Abstract. High-dimensional single-cell cytometry is routinely used to characterize patient responses to cancer immunotherapy and other treatments. This has produced a wealth of datasets ripe for exploration but whose biological and technical heterogeneity make them difficult to analyze with current tools. We introduce a new interpretable machine learning method for single-cell mass and flow cytometry studies, FAUST, that robustly performs unbiased cell population discovery and annotation. FAUST processes data on a per-sample basis and returns biologically interpretable cell phenotypes that can be compared across studies, making it well-suited for the analysis and integration of complex datasets. We demonstrate how FAUST can be used for candidate biomarker discovery and validation by applying it to a flow cytometry dataset from a Merkel cell carcinoma anti-PD-1 trial and discover new CD4+ and CD8+ effector-memory T cell correlates of outcome co-expressing PD-1, HLA-DR, and CD28. We then use FAUST to validate these correlates in an independent CyTOF dataset from a published metastatic melanoma trial. Importantly, existing state-of-the-art computational discovery approaches as well as prior manual analysis did not detect these or any other statistically significant T cell sub-populations associated with anti-PD-1 treatment in either dataset. We further validate our methodology by using FAUST to replicate the discovery of a previously reported myeloid correlate in a different published melanoma trial, and validate the correlate by identifying it de novo in two additional independent trials. FAUST’s phenotypic annotations can be used to perform cross-study data integration in the presence of heterogeneous data and diverse immunophenotyping staining panels, enabling hypothesis-driven inference about cell sub-population abundance through a multivariate modeling framework we call Phenotypic and Functional Differential Abundance (PFDA). We demonstrate this approach on data from myeloid and T cell panels across multiple trials. Together, these results establish FAUST as a powerful and versatile new approach for unbiased discovery in single-cell cytometry.

New interpretable machine-learning method for single-cell data reveals correlates of clinical response to cancer immunotherapy

Patterns ◽

10.1016/j.patter.2021.100372 ◽

2021 ◽

pp. 100372

Author(s):

Evan Greene ◽

Greg Finak ◽

Leonard A. D'Amico ◽

Nina Bhardwaj ◽

Candice D. Church ◽

...

Keyword(s):

Machine Learning ◽

Single Cell ◽

Clinical Response ◽

Cancer Immunotherapy ◽

Machine Learning Method ◽

Learning Method ◽

Interpretable Machine Learning ◽

Cell Data

Correction: Predicting Antituberculosis Drug–Induced Liver Injury Using an Interpretable Machine Learning Method: Model Development and Validation Study

JMIR Medical Informatics ◽

10.2196/32415 ◽

2021 ◽

Vol 9 (8) ◽

pp. e32415

Author(s):

Tao Zhong ◽

Zian Zhuang ◽

Xiaoli Dong ◽

Ka Hing Wong ◽

Wing Tak Wong ◽

...

Keyword(s):

Machine Learning ◽

Model Development ◽

Machine Learning Method ◽

Learning Method ◽

Antituberculosis Drug ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

Interpretable Machine Learning ◽

Development And Validation ◽

Method Model

UrineCART, a machine learning method for establishment of review rules based on UF-1000i flow cytometry and dipstick or reflectance photometer

Clinical Chemistry and Laboratory Medicine (CCLM) ◽

10.1515/cclm-2012-0272 ◽

2012 ◽

Vol 50 (12) ◽

Cited By ~ 2

Author(s):

Cao Yuan ◽

Cheng Ming ◽

Hu Chengjin

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Machine Learning Method ◽

Learning Method

An interpretable machine learning method for detecting novel pathogens

10.21203/rs.2.20477/v1 ◽

2020 ◽

Author(s):

Xiaoyong Zhao ◽

Ningning Wang

Keyword(s):

Machine Learning ◽

Infectious Diseases ◽

Bacterial Pathogen ◽

World Health ◽

Machine Learning Method ◽

Learning Method ◽

Human Pathogens ◽

Healthcare Applications ◽

Interpretable Machine Learning ◽

Shapley Values

Abstract. Background: According to the World Health Organization (WHO), infectious diseases continue to be one of the leading causes of death worldwide. Because the core human microbiota is highly diverse and subject to horizontal gene transfer (HGT), it is very challenging to determine whether a particular bacterial strain is commensal or pathogenic to humans. With the latest advances in next-generation sequencing (NGS) technology, bioinformatics tools and techniques using NGS data have increasingly been used for the diagnosis and monitoring of infectious diseases. Even when the biological background is not available, machine learning methods can still infer the pathogenic phenotype from NGS reads, independently of any database of known organisms, and such methods are being studied intensively. However, previous methods have considered neither opportunistic pathogenicity nor the interpretability of black-box models, so they are not well suited to clinical requirements. Results: In this study, we propose a novel interpretable machine learning approach (IMLA) that classifies bacterial genomes as human pathogens (HP), opportunistic pathogens (OHP), or non-pathogens (NHP), and then applies the following model-agnostic interpretation methods to explain the model: feature importance, accumulated local effects, and Shapley values, because model interpretability is essential for healthcare applications. To our knowledge, our paper is the first attempt to infer opportunistic pathogenicity and explain the model. Conclusions: According to the simulation results, IMLA can be a valuable addition to the tools for detecting novel pathogens. Keywords: interpretable; machine learning; bacterial pathogen
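
Of the model-agnostic interpretation methods named in this abstract, Shapley values are the easiest to show end to end. The sketch below computes exact Shapley values by averaging each feature's marginal contribution over all feature orderings. It is only an illustration of the general technique, not the IMLA implementation: the scoring function `toy_pathogenicity_score`, its three features, and the all-zero baseline are hypothetical choices made for this example.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance`, relative to `baseline`.

    Each feature's value is its marginal contribution to the prediction,
    averaged over every order in which features can be switched from the
    baseline value to the instance value. Cost grows as n!, so this is
    only practical for a handful of features.
    """
    n = len(instance)
    totals = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for feature in order:
            current[feature] = instance[feature]
            value = predict(current)
            totals[feature] += value - previous
            previous = value
    return [total / factorial(n) for total in totals]


def toy_pathogenicity_score(x):
    # Hypothetical model (an assumption for illustration only): two additive
    # terms plus an interaction between feature 0 and feature 2.
    return 2.0 * x[0] + 1.0 * x[1] + 4.0 * x[0] * x[2]


instance = [1.0, 1.0, 1.0]  # hypothetical genome features, all present
baseline = [0.0, 0.0, 0.0]  # hypothetical reference: all features absent

phi = shapley_values(toy_pathogenicity_score, instance, baseline)
print(phi)  # [4.0, 1.0, 2.0]
```

The interaction term (worth 4.0) is split evenly between features 0 and 2, and the three values sum to the difference between the prediction at the instance (7.0) and at the baseline (0.0). Practical tools approximate this average by sampling, since exact enumeration is infeasible for genome-scale feature sets.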

An interpretable machine learning method for detecting novel pathogens

10.21203/rs.2.20477/v2 ◽

2020 ◽

Author(s):

Xiaoyong Zhao ◽

Ningning Wang

Keyword(s):

Machine Learning ◽

Infectious Diseases ◽

World Health ◽

Machine Learning Method ◽

Learning Method ◽

Human Pathogens ◽

Healthcare Applications ◽

Interpretable Machine Learning ◽

Shapley Values ◽

Opportunistic Pathogenic

Abstract. Background: According to the World Health Organization (WHO), infectious diseases continue to be one of the leading causes of death worldwide. Because the core human microbiota is highly diverse and subject to horizontal gene transfer (HGT), it is very challenging to determine whether a particular bacterial strain is commensal or pathogenic to humans. With the latest advances in next-generation sequencing (NGS) technology, bioinformatics tools and techniques using NGS data have increasingly been used for the diagnosis and monitoring of infectious diseases. Even when the biological background is not available, machine learning methods can still infer the pathogenic phenotype from NGS reads, independently of any database of known organisms, and such methods are being studied intensively. However, previous methods have considered neither opportunistic pathogenicity nor the interpretability of black-box models, so they are not well suited to clinical requirements. Results: In this study, we propose a novel interpretable machine learning approach (IMLA) that classifies bacterial genomes as human pathogens (HP), opportunistic pathogens (OHP), or non-pathogens (NHP), and then applies the following model-agnostic interpretation methods to explain the model: feature importance, accumulated local effects, and Shapley values, because model interpretability is essential for healthcare applications. To our knowledge, our paper is the first attempt to infer opportunistic pathogenicity and explain the model. Conclusions: According to the simulation results, IMLA can be a valuable addition to the tools for detecting novel pathogens.

Taxonomy and Survey of Interpretable Machine Learning Method

2020 IEEE Symposium Series on Computational Intelligence (SSCI) ◽

10.1109/ssci47803.2020.9308404 ◽

2020 ◽

Author(s):

Saikat Das ◽

Namita Agarwal ◽

Deepak Venugopal ◽

Frederick T. Sheldon ◽

Sajjan Shiva

Keyword(s):

Machine Learning ◽

Machine Learning Method ◽

Learning Method ◽

Interpretable Machine Learning

An interpretable machine learning method for forecasting the SYM-H index

10.1002/essoar.10508063.2 ◽

2021 ◽

Author(s):

Daniel Iong ◽

Yang Chen ◽

Gabor Toth ◽

Shasha Zou ◽

Tuija I. Pulkkinen ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Method ◽

Learning Method ◽

H Index ◽

Interpretable Machine Learning

An interpretable machine learning method for supporting ecosystem management: Application to species distribution models of freshwater macroinvertebrates

Journal of Environmental Management ◽

10.1016/j.jenvman.2021.112719 ◽

2021 ◽

Vol 291 ◽

pp. 112719

Author(s):

YoonKyung Cha ◽

Jihoon Shin ◽

ByeongGeon Go ◽

Dae-Seong Lee ◽

YoungWoo Kim ◽

...

Keyword(s):

Machine Learning ◽

Ecosystem Management ◽

Species Distribution ◽

Species Distribution Models ◽

Machine Learning Method ◽

Learning Method ◽

Freshwater Macroinvertebrates ◽

Distribution Models ◽

Interpretable Machine Learning ◽

Management Application

A machine learning method for the discovery of minimum marker gene combinations for cell-type identification from single-cell RNA sequencing

Genome Research ◽

10.1101/gr.275569.121 ◽

2021 ◽

pp. gr.275569.121

Author(s):

Brian D Aevermann ◽

Yun Zhang ◽

Mark Novotny ◽

Mohamed Keshk ◽

Trygve E Bakken ◽

...

Keyword(s):

Machine Learning ◽

Single Cell ◽

Rna Sequencing ◽

Marker Gene ◽

Machine Learning Method ◽

Learning Method ◽

Cell Type ◽

Single Cell Rna Sequencing

Explainable t-SNE for single-cell RNA-seq data analysis

10.1101/2022.01.12.476084 ◽

2022 ◽

Author(s):

Henry Han ◽

Tianyu Zhang ◽

Mary Lauren Benton ◽

Chun Li ◽

Juan Wang ◽

...

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Data Analysis ◽

Dimension Reduction ◽

Single Cell ◽

Method Development ◽

Robustness Analysis ◽

High Dimensional ◽

Machine Learning Method ◽

Learning Method

Single-cell RNA sequencing (scRNA-seq) technologies enable the study of gene expression in individual cells and reveal the diversity within cell populations. To measure cell-to-cell similarity based on transcription and gene expression, many dimension reduction methods are employed to obtain low-dimensional embeddings of the input scRNA-seq data for clustering. However, these methods lack explainability and may not perform well with scRNA-seq data because they are often migrated from other fields and not customized for high-dimensional sparse scRNA-seq data. In this study, we propose an explainable t-SNE: cell-driven t-SNE (c-TSNE), which fuses the cell differences reflected by biologically meaningful distance metrics for the input scRNA-seq data. Our study shows that the proposed method not only enhances the interpretation of the original t-SNE visualization for scRNA-seq data but also demonstrates favorable single-cell segregation performance on benchmark datasets compared to state-of-the-art peers. The robustness analysis shows that the proposed cell-driven t-SNE is robust to dropout and noise in dimension reduction and clustering. It provides a novel and practical way to investigate the interpretability of t-SNE in scRNA-seq data analysis. Contrary to the general assumption that the explainability of a machine learning method must be traded off against learning efficiency, the proposed explainable t-SNE improves both clustering efficiency and explainability in scRNA-seq analysis. More importantly, our work suggests that the widely used t-SNE can easily be misused in existing scRNA-seq analysis, because its default Euclidean distance can introduce biases or produce meaningless results when evaluating cell differences in high-dimensional sparse scRNA-seq data. To the best of our knowledge, it is the first explainable t-SNE proposed for scRNA-seq analysis, and it will inspire the development of other explainable machine learning methods in the field.
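
The abstract's point about Euclidean distance on sparse count data can be illustrated with a small, self-contained example. The abstract does not say which distance metrics c-TSNE uses, so cosine distance stands in here purely as an assumption, and the three count vectors are made-up values chosen to expose one known weakness: Euclidean distance is dominated by sequencing depth rather than by which genes are expressed.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length count vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; insensitive to overall scaling of a or b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical gene counts for three cells (illustrative values only).
cell_a = [0, 4, 0, 0, 2, 0, 0, 0]
cell_b = [0, 40, 0, 0, 20, 0, 0, 0]  # same expression profile as cell_a, 10x depth
cell_c = [3, 0, 0, 1, 0, 0, 2, 0]    # entirely different genes, depth similar to cell_a

# Euclidean distance ranks the unrelated cell_c as closer to cell_a
# than cell_b, which has an identical profile at a higher depth.
print(euclidean(cell_a, cell_b) > euclidean(cell_a, cell_c))  # True

# Cosine distance ranks them the other way round.
print(cosine_distance(cell_a, cell_b) < cosine_distance(cell_a, cell_c))  # True
```

An embedding method that accepts a precomputed distance matrix could be given distances of this kind instead of Euclidean ones; whether that matches what c-TSNE does is not something the abstract states.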
