Single-cell tumor phylogeny inference with copy-number constrained mutation losses

Mapping Intimacies ◽

10.1101/840355 ◽

2019 ◽

Cited By ~ 1

Author(s):

Gryte Satas ◽

Simone Zaccaria ◽

Geoffrey Mon ◽

Benjamin J. Raphael

Keyword(s):

Single Cell ◽

Copy Number ◽

Phylogenetic Trees ◽

Colorectal Cancer Patient ◽

Simulated Data ◽

Cell Tumor ◽

Tumor Evolution ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Single Cell Sequencing

Abstract

Motivation: Single-cell DNA sequencing enables the measurement of somatic mutations in individual tumor cells, and provides data to reconstruct the evolutionary history of the tumor. Nearly all existing methods to construct phylogenetic trees from single-cell sequencing data use single-nucleotide variants (SNVs) as markers. However, most solid tumors contain copy-number aberrations (CNAs), which can overlap loci containing SNVs. Particularly problematic are CNAs that delete an SNV, thus returning the SNV locus to the unmutated state. Such mutation losses are allowed in some models of SNV evolution, but these models are generally too permissive, allowing mutation losses without evidence of a CNA overlapping the locus.

Results: We introduce a novel loss-supported evolutionary model, a generalization of the infinite sites and Dollo models, that constrains mutation losses to loci with evidence of a decrease in copy number. We design a new algorithm, Single-Cell Algorithm for Reconstructing the Loss-supported Evolution of Tumors (Scarlet), that infers phylogenies from single-cell tumor sequencing data using the loss-supported model and a probabilistic model of sequencing errors and allele dropout. On simulated data, we show that Scarlet outperforms current single-cell phylogeny methods, recovering more accurate trees and correcting errors in SNV data. On single-cell sequencing data from a metastatic colorectal cancer patient, Scarlet constructs a phylogeny that is more consistent with the observed copy-number data and reveals a simpler monoclonal seeding of the metastasis, contrasting with published reports of polyclonal seeding in this patient. Scarlet substantially improves single-cell phylogeny inference in tumors with CNAs, yielding new insights into the analysis of tumor evolution.

Availability: Software is available at github.com/raphael-group/[email protected]
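The core constraint of the loss-supported model described in this abstract can be illustrated with a small check: a mutation may be lost along a tree edge only if the copy number at that locus decreases along the same edge. The sketch below is a simplified illustration under that reading, not Scarlet's implementation. The function name `unsupported_losses`, the dictionary layout, and the toy tree are all assumptions made for the example, and the probabilistic error model is left out entirely.

```python
def unsupported_losses(edges, has_mutation, copy_number):
    """Return (parent, child, locus) triples where a mutation is lost
    without a decrease in copy number at that locus (hypothetical helper)."""
    violations = []
    for parent, child in edges:
        for locus, parent_has in has_mutation[parent].items():
            lost = parent_has and not has_mutation[child][locus]
            cn_decreased = copy_number[child][locus] < copy_number[parent][locus]
            if lost and not cn_decreased:
                violations.append((parent, child, locus))
    return violations


# Toy tree: root -> A -> B (illustrative data only).
edges = [("root", "A"), ("A", "B")]
has_mutation = {
    "root": {"snv1": False, "snv2": False},
    "A": {"snv1": True, "snv2": True},
    "B": {"snv1": False, "snv2": False},
}
copy_number = {
    "root": {"snv1": 2, "snv2": 2},
    "A": {"snv1": 2, "snv2": 2},
    "B": {"snv1": 1, "snv2": 2},
}

print(unsupported_losses(edges, has_mutation, copy_number))
```

In this toy input both SNVs are lost on the edge from A to B, but only `snv1` sits at a locus whose copy number drops, so the loss of `snv2` is the one flagged as unsupported.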

Tumor Copy Number Deconvolution Integrating Bulk and Single-Cell Sequencing Data

10.1101/519892 ◽

2019 ◽

Cited By ~ 2

Author(s):

Haoyun Lei ◽

Bochuan Lyu ◽

E. Michael Gertz ◽

Alejandro A. Schäffer ◽

Xulian Shi ◽

...

Keyword(s):

Single Cell ◽

Copy Number ◽

Simulated Data ◽

Mixed Integer ◽

Intratumor Heterogeneity ◽

Tumor Evolution ◽

Sequencing Data ◽

Minimum Evolution ◽

Promising Alternative ◽

Single Cell Sequencing

Abstract: Characterizing intratumor heterogeneity (ITH) is crucial to understanding cancer development, but it is hampered by the limits of available data sources. Bulk DNA sequencing is the most common technology to assess ITH, but it mixes many genetically distinct cells in each sample, which must then be computationally deconvolved. Single-cell sequencing (SCS) is a promising alternative, but its limitations (e.g., high noise, difficulty scaling to large populations, technical artifacts, and large data sets) have so far made it impractical for studying cohorts of sufficient size to identify statistically robust features of tumor evolution. We have developed strategies for deconvolution and tumor phylogenetics that combine limited amounts of bulk and single-cell data to gain some advantages of single-cell resolution at much lower cost, with a specific focus on deconvolving genomic copy number data. We developed a mixed membership model for clonal deconvolution via non-negative matrix factorization (NMF), balancing deconvolution quality with similarity to single-cell samples via an associated efficient coordinate descent algorithm. We then improve on that algorithm by integrating deconvolution with clonal phylogeny inference, using a mixed integer linear programming (MILP) model to incorporate a minimum evolution phylogenetic tree cost in the problem objective. We demonstrate the effectiveness of these methods on semi-simulated data of known ground truth, showing improved deconvolution accuracy relative to bulk data alone.
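As a rough illustration of the NMF step mentioned in this abstract, bulk copy-number data can be modeled as a product of a clone-by-region copy-number matrix and a clone-by-sample mixture matrix. The sketch below uses plain Lee-Seung multiplicative updates in NumPy, which is a substitute for (not a reproduction of) the authors' coordinate descent algorithm, and it omits the single-cell similarity term and the MILP phylogeny cost. The function name `nmf_deconvolve` and the synthetic data are assumptions for the example.

```python
import numpy as np


def nmf_deconvolve(bulk, n_clones, n_iter=500, seed=0, eps=1e-9):
    """Factor bulk (regions x samples) into C (regions x clones) and
    F (clones x samples), both non-negative, via multiplicative updates."""
    rng = np.random.default_rng(seed)
    n_regions, n_samples = bulk.shape
    C = rng.random((n_regions, n_clones)) + eps
    F = rng.random((n_clones, n_samples)) + eps
    for _ in range(n_iter):
        # Standard Lee-Seung updates for the Frobenius-norm objective.
        F *= (C.T @ bulk) / (C.T @ C @ F + eps)
        C *= (bulk @ F.T) / (C @ F @ F.T + eps)
    return C, F


# Synthetic example: 2 clones, 20 genomic regions, 6 bulk samples.
rng = np.random.default_rng(1)
true_C = rng.integers(1, 5, size=(20, 2)).astype(float)  # copy numbers 1..4
true_F = rng.dirichlet(np.ones(2), size=6).T             # mixture fractions, shape (2, 6)
bulk = true_C @ true_F

C_hat, F_hat = nmf_deconvolve(bulk, n_clones=2)
rel_err = np.linalg.norm(bulk - C_hat @ F_hat) / np.linalg.norm(bulk)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Without extra constraints, the scale and ordering of the recovered factors are not identifiable, which is one reason a method of this kind would add information such as single-cell profiles to the objective.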

Decomposing the subclonal structure of tumors with two-way mixture models on copy number aberrations

10.1101/278887 ◽

2018 ◽

Author(s):

An-Shun Tai ◽

Chien-Hua Peng ◽

Shih-Chi Peng ◽

Wen-Ping Hsieh

Keyword(s):

Head And Neck Cancer ◽

Head And Neck ◽

Neck Cancer ◽

Copy Number ◽

Tumor Heterogeneity ◽

Tumor Evolution ◽

Depth Information ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Copy Number Aberrations

Abstract: Multistage tumorigenesis is a dynamic process characterized by the accumulation of mutations. Thus, a tumor mass is composed of genetically divergent cell subclones. With the advancement of next-generation sequencing (NGS), mathematical models have recently been developed to decompose tumor subclonal architecture from collective genome sequencing data. Most of these methods focus on single-nucleotide variants (SNVs). However, somatic copy number aberrations (CNAs) also play critical roles in carcinogenesis. Therefore, modeling subclonal CNA composition holds promise for improving the analysis of tumor heterogeneity and cancer evolution. To address this issue, we developed a two-way mixture Poisson model, named CloneDeMix, for the deconvolution of read-depth information. It can infer the subclonal copy number, mutational cellular prevalence (MCP), subclone composition, and the order in which mutations occurred in the evolutionary hierarchy. The performance of CloneDeMix was systematically assessed in simulations: the accuracy of CNA inference was nearly 93%, and the MCP was also accurately recovered. We also demonstrated its applicability using head and neck cancer samples from TCGA. Our results show the extent of subclonal CNA diversity and identify a group of candidate genes that probably initiate lymph node metastasis during tumor evolution. Most importantly, these driver genes are located at 11q13.3, which is highly susceptible to copy number change in head and neck cancer genomes. This study successfully estimates subclonal CNAs and exhibits the evolutionary relationships of mutation events. By doing so, we can track tumor heterogeneity and identify crucial mutations during the evolutionary process. Hence, it facilitates not only understanding cancer development but also finding potential therapeutic targets. Briefly, this framework has implications for improved modeling of tumor evolution and highlights the importance of including subclonal CNAs.

Tumor Copy Number Deconvolution Integrating Bulk and Single-Cell Sequencing Data

Journal of Computational Biology ◽

10.1089/cmb.2019.0302 ◽

2020 ◽

Vol 27 (4) ◽

pp. 565-598 ◽

Cited By ~ 1

Author(s):

Haoyun Lei ◽

Bochuan Lyu ◽

E. Michael Gertz ◽

Alejandro A. Schäffer ◽

Xulian Shi ◽

...

Keyword(s):

Single Cell ◽

Copy Number ◽

Sequencing Data ◽

Single Cell Sequencing

nbCNV: a multi-constrained optimization model for discovering copy number variants in single-cell sequencing data

BMC Bioinformatics ◽

10.1186/s12859-016-1239-7 ◽

2016 ◽

Vol 17 (1) ◽

Cited By ~ 9

Author(s):

Changsheng Zhang ◽

Hongmin Cai ◽

Jingying Huang ◽

Yan Song

Keyword(s):

Constrained Optimization ◽

Single Cell ◽

Optimization Model ◽

Copy Number ◽

Copy Number Variants ◽

Sequencing Data ◽

Single Cell Sequencing

A statistical test on single-cell data reveals widespread recurrent mutations in tumor evolution

10.1101/094722 ◽

2016 ◽

Cited By ~ 3

Author(s):

Jack Kuipers ◽

Katharina Jahn ◽

Benjamin J. Raphael ◽

Niko Beerenwinkel

Keyword(s):

Single Cell ◽

Large Scale ◽

Tumor Evolution ◽

Sequencing Data ◽

General Validity ◽

Genomic Deletions ◽

Single Cell Sequencing ◽

Statistical Framework ◽

Recurrent Mutations ◽

Complex Models

The infinite sites assumption, which states that every genomic position mutates at most once over the lifetime of a tumor, is central to current approaches for reconstructing mutation histories of tumors, but has never been tested explicitly. We developed a rigorous statistical framework to test the assumption with single-cell sequencing data. The framework accounts for the high noise and contamination present in such data. We found strong evidence for recurrent mutations at the same site in 8 out of 9 single-cell sequencing datasets from human tumors. Six cases involved the loss of earlier mutations, five of which occurred at sites unaffected by large scale genomic deletions. Two cases exhibited parallel mutation, including the dataset with the strongest evidence of recurrence. Our results refute the general validity of the infinite sites assumption and indicate that more complex models are needed to adequately quantify intra-tumor heterogeneity.
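For intuition about what a violation of the infinite sites assumption looks like, consider noise-free binary genotypes with an unmutated root. If every site mutates at most once and is never lost, no pair of sites can show all three patterns (1,1), (1,0), and (0,1) across cells. The sketch below implements only this deterministic pairwise check. It is not the statistical framework of the paper, which accounts for noise and contamination, and the function name `conflicting_site_pairs` and the toy matrices are assumptions for the example.

```python
from itertools import combinations


def conflicting_site_pairs(cells):
    """cells: list of equal-length 0/1 lists (rows = cells, columns = sites).
    Return site pairs that cannot fit a tree in which each site mutates
    once and is never lost, assuming error-free data."""
    n_sites = len(cells[0])
    conflicts = []
    for i, j in combinations(range(n_sites), 2):
        patterns = {(row[i], row[j]) for row in cells}
        if {(1, 1), (1, 0), (0, 1)} <= patterns:
            conflicts.append((i, j))
    return conflicts


# Nested mutations: consistent with the infinite sites assumption.
nested = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(conflicting_site_pairs(nested))

# Adding a cell with site 1 but not site 0 creates a conflict between them.
with_conflict = nested + [[0, 1, 0]]
print(conflicting_site_pairs(with_conflict))
```

On real single-cell data, false positives, false negatives, and allele dropout would produce such conflicts even when the assumption holds, which is why a noise-aware statistical test is needed rather than this direct check.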

Single-cell copy number calling and event history reconstruction

10.1101/2020.04.28.065755 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jack Kuipers ◽

Mustafa Anıl Tuncel ◽

Pedro Ferreira ◽

Katharina Jahn ◽

Niko Beerenwinkel

Keyword(s):

Dna Sequencing ◽

Single Cell ◽

Copy Number ◽

Driving Forces ◽

Simulated Data ◽

Read Depth ◽

Cancer Diagnostics ◽

Whole Genome ◽

Copy Number Alterations ◽

Sequencing Data

Copy number alterations are driving forces of tumour development and the emergence of intra-tumour heterogeneity. A comprehensive picture of these genomic aberrations is therefore essential for the development of personalised and precise cancer diagnostics and therapies. Single-cell sequencing offers the highest resolution for copy number profiling down to the level of individual cells. Recent high-throughput protocols allow for the processing of hundreds of cells through shallow whole-genome DNA sequencing. The resulting low read-depth data poses substantial statistical and computational challenges to the identification of copy number alterations. We developed SCICoNE, a statistical model and MCMC algorithm tailored to single-cell copy number profiling from shallow whole-genome DNA sequencing data. SCICoNE reconstructs the history of copy number events in the tumour and uses these evolutionary relationships to identify the copy number profiles of the individual cells. We show the accuracy of this approach in evaluations on simulated data and demonstrate its practicability in applications to a xenograft breast cancer sample.
