Inferring Cancer Progression from Single-cell Sequencing while Allowing Mutation Losses

Mapping Intimacies ◽

10.1101/268243 ◽

2018 ◽

Cited By ~ 11

Author(s):

Simone Ciccolella ◽

Mauricio Soto Gomez ◽

Murray Patterson ◽

Gianluca Della Vedova ◽

Iman Hajirasouliha ◽

...

Keyword(s):

Simulated Annealing ◽

Single Cell ◽

Cancer Progression ◽

Evolutionary History ◽

Real Data ◽

Simple Extension ◽

Data Sets ◽

Fundamental Feature ◽

Single Cell Sequencing ◽

History Of

Abstract Motivation: In recent years, the well-known Infinite Sites Assumption (ISA) has been a fundamental feature of computational methods devised for reconstructing tumor phylogenies and inferring cancer progressions seen as an accumulation of mutations. However, recent studies (Kuipers et al., 2017) leveraging Single-cell Sequencing (SCS) techniques have shown evidence of the widespread recurrence and, especially, loss of mutations in several tumor samples. Still, established methods that can infer phylogenies with mutation losses are lacking. Results: We present the SASC (Simulated Annealing Single-Cell inference) tool, which is a new and robust approach based on simulated annealing for the inference of cancer progression from SCS data. More precisely, we introduce a simple extension of the model of evolution where mutations are only accumulated, by also allowing a limited number of back mutations in the evolutionary history of the tumor: the Dollo-k model. We demonstrate that SASC achieves high levels of accuracy when tested on both simulated and real data sets and in comparison with some other available methods. Availability: The Simulated Annealing Single-cell inference (SASC) tool is open source and available at https://github.com/sciccolella/[email protected]


Inferring cancer progression from Single-Cell Sequencing while allowing mutation losses

Bioinformatics ◽

10.1093/bioinformatics/btaa722 ◽

2020 ◽

Author(s):

Simone Ciccolella ◽

Camir Ricketts ◽

Mauricio Soto Gomez ◽

Murray Patterson ◽

Dana Silverbush ◽

...

Keyword(s):

Simulated Annealing ◽

Single Cell ◽

Computational Methods ◽

Cancer Progression ◽

Evolutionary History ◽

Supplementary Information ◽

Fundamental Feature ◽

Robust Approach ◽

Single Cell Sequencing ◽

History Of

Abstract Motivation: In recent years, the well-known Infinite Sites Assumption has been a fundamental feature of computational methods devised for reconstructing tumor phylogenies and inferring cancer progressions. However, recent studies leveraging single-cell sequencing (SCS) techniques have shown evidence of the widespread recurrence and, especially, loss of mutations in several tumor samples. While there exist established computational methods that infer phylogenies with mutation losses, there remain some advancements to be made. Results: We present Simulated Annealing Single-Cell inference (SASC): a new and robust approach based on simulated annealing for the inference of cancer progression from SCS datasets. In particular, we introduce an extension of the model of evolution where mutations are only accumulated, by also allowing a limited amount of mutation loss in the evolutionary history of the tumor: the Dollo-k model. We demonstrate that SASC achieves high levels of accuracy when tested on both simulated and real datasets and in comparison with some other available methods. Availability and implementation: The SASC tool is open source and available at https://github.com/sciccolella/sasc. Supplementary information: Supplementary data are available at Bioinformatics online.
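
The abstract above names simulated annealing as the search strategy. As a rough illustration of that general technique only (not the SASC implementation), the sketch below shows a generic annealing loop with the usual Metropolis acceptance rule. The function name `simulated_annealing`, its parameters, the geometric cooling schedule, and the toy integer objective are all assumptions made for this example; a tree-inference tool would instead propose tree edits and score them with a likelihood.

```python
import math
import random


def simulated_annealing(initial, propose, score,
                        t_start=1.0, t_end=1e-3, cooling=0.99, seed=0):
    """Maximize `score` by local search with Metropolis acceptance.

    Hypothetical helper for illustration; all parameter names are assumptions.
    """
    rng = random.Random(seed)
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    temperature = t_start
    while temperature > t_end:
        candidate = propose(current, rng)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept a worse state with
        # probability exp(delta / T), which shrinks as T cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


# Toy usage: find the integer x that maximizes -(x - 3)^2.
def toy_score(x):
    return -(x - 3) ** 2


def toy_propose(x, rng):
    return x + rng.choice((-1, 1))


best_x, best_value = simulated_annealing(0, toy_propose, toy_score)
```

Accepting some worse states early on is what lets the search leave local optima; as the temperature drops, the loop behaves more and more like plain hill climbing.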


Efficient and scalable integration of single-cell data using domain-adversarial and variational approximation

10.1101/2021.04.06.438733 ◽

2021 ◽

Author(s):

Jialu Hu ◽

Yuanke Zhong ◽

Xuequn Shang

Keyword(s):

Single Cell ◽

Latent Variable ◽

Real Data ◽

Variational Approximation ◽

Data Sets ◽

Single Cell Sequencing ◽

Sequencing Technologies ◽

Non Linear ◽

Post Hoc ◽

Cell Data

Single-cell data provides new ways of discovering biological truth at the level of individual cells, such as the identification of cellular sub-populations and cell development. With the development of single-cell sequencing technologies, a key analytical challenge is to integrate these data sets to uncover biological insights. Here, we developed a domain-adversarial and variational approximation framework, DAVAE, to integrate multiple single-cell data sets across samples, technologies and modalities without any post hoc data processing. We fit normalized gene expression into a non-linear model, which transforms a lower-dimensional latent variable into expression space with a non-linear function, a KL regularizer and a domain-adversarial regularizer. Results on five real data integration applications demonstrated the effectiveness and scalability of DAVAE in batch-effect removal, transfer learning, and cell type prediction for multiple single-cell data sets across samples, technologies and modalities. DAVAE was implemented in the toolkit package scbean in the PyPI repository, and the source code is also freely accessible at https://github.com/jhu99/scbean.


Studying the history of tumor evolution from single-cell sequencing data by exploring the space of binary matrices

10.1101/2020.07.15.204081 ◽

2020 ◽

Cited By ~ 1

Author(s):

Salem Malikić ◽

Farid Rashidi Mehrabadi ◽

Erfan Sadeqi Azer ◽

Mohammad Haghir Ebrahimabadi ◽

S. Cenk Sahinalp

Keyword(s):

Single Cell ◽

Evolutionary History ◽

Tumor Evolution ◽

Sequencing Data ◽

Single Cell Sequencing ◽

Linear Programming Formulation ◽

History Of ◽

Binary Matrices ◽

Constraint Satisfaction Programming ◽

Integer Linear Programming Formulation

Abstract Single-cell sequencing data has great potential in reconstructing the evolutionary history of tumors. Rapid advances in single-cell sequencing technology in the past decade were followed by the design of various computational methods for inferring trees of tumor evolution. Some of the earliest of these methods were based on a direct search in the space of trees. However, it can be shown that instead of this tree search strategy we can perform a search in the space of binary matrices and obtain the most likely tree directly from the most likely among the candidate binary matrices. The search in the space of binary matrices can be expressed as an instance of integer linear or constraint satisfaction programming and solved by some of the available solvers, which typically provide a guarantee of optimality of the reported solution. In this review, we first describe one convenient tree representation of tumor evolutionary history and present the tree scoring model that is most commonly used in the available methods. We then provide a proof showing that the most likely tree of tumor evolution can be obtained directly from the most likely matrix from the space of candidate binary matrices. Next, we provide an integer linear programming formulation to search for such a matrix and summarize the existing methods based on this formulation or its extensions. Lastly, we present one use-case which illustrates how binary matrices can be used as a basis for developing a fast deep learning method for inferring some topological properties of the most likely tree of tumor evolution.
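
To make the idea of searching over binary matrices more concrete, here is a minimal sketch under stated assumptions; it is not taken from the review. It assumes that the candidate matrices are the conflict-free ones, meaning no pair of mutation columns contains all three row patterns (1,0), (0,1) and (1,1), and that a candidate is scored against the observed cell-by-mutation matrix with independent false-positive and false-negative rates. The function names and the parameters `alpha` and `beta` are hypothetical, and missing entries are not handled.

```python
from itertools import combinations
from math import log


def is_conflict_free(matrix):
    """True if no two columns show the row patterns (1,0), (0,1) and (1,1)."""
    n_cols = len(matrix[0]) if matrix else 0
    for a, b in combinations(range(n_cols), 2):
        patterns = {(row[a], row[b]) for row in matrix}
        if {(1, 0), (0, 1), (1, 1)} <= patterns:
            return False
    return True


def log_likelihood(observed, candidate, alpha, beta):
    """Log-probability of `observed` given the true genotypes in `candidate`.

    alpha: assumed false-positive rate (true 0 observed as 1)
    beta:  assumed false-negative rate (true 1 observed as 0)
    """
    total = 0.0
    for obs_row, cand_row in zip(observed, candidate):
        for obs, true in zip(obs_row, cand_row):
            if true == 1:
                p = 1 - beta if obs == 1 else beta
            else:
                p = alpha if obs == 1 else 1 - alpha
            total += log(p)
    return total


# Toy example: 3 cells (rows) by 2 mutations (columns).
observed = [[1, 0], [0, 1], [1, 1]]    # contains a conflict
candidate = [[1, 0], [1, 1], [1, 1]]   # one entry flipped from 0 to 1
observed_ok = is_conflict_free(observed)
candidate_ok = is_conflict_free(candidate)
candidate_score = log_likelihood(observed, candidate, alpha=0.01, beta=0.2)
```

An exhaustive or solver-based search would look for the conflict-free matrix with the highest such score; the enumeration itself is left out here because the formulation used by each method differs.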


Ignoring errors causes inaccurate timing of single-cell phylogenies

10.1101/2021.03.17.435906 ◽

2021 ◽

Author(s):

Kylie Chen ◽

David Welch ◽

Alexei J. Drummond

Keyword(s):

Single Cell ◽

Evolutionary History ◽

Dynamic Models ◽

Genetic Material ◽

Sequencing Error ◽

Molecular Clocks ◽

Single Cell Sequencing ◽

Evolutionary Inference ◽

History Of ◽

Cell Data

Single-cell sequencing provides a new way to explore the evolutionary history of cancers. Compared to traditional bulk sequencing, which samples multiple heterogeneous cells, single-cell sequencing isolates and amplifies genetic material from a single cell. The ability to isolate a single cell makes it ideal for evolutionary inference. However, single-cell data is more error-prone due to the limited genomic material available per cell. Previous work using single-cell data to reconstruct the evolutionary history of cancers has not been integrated with standard evolutionary models. Here, we present error and mutation models for evolutionary inference of single-cell data within a mature and extensible Bayesian framework, BEAST2. Our framework enables integration with biologically informative models such as relaxed molecular clocks and population dynamic models. We reconstruct the phylogenetic history for a myeloproliferative cancer patient and two colorectal cancer patients. We find that the estimated times of terminal splitting events are shifted forward in time compared to models which ignore errors. Furthermore, we estimate 50% - 70% of the evolutionary distance between samples can be explained by sequencing error. Our simulation studies show that ignoring errors leads to inaccurate estimates of divergence times, mutation parameters and population parameters. Our work opens the potential for integrative Bayesian models capable of combining multiple sources of data.


gpps: An ILP-based approach for inferring cancer progression with mutation losses from single cell data

10.1101/365635 ◽

2018 ◽

Cited By ~ 1

Author(s):

Simone Ciccolella ◽

Mauricio Soto Gomez ◽

Murray Patterson ◽

Gianluca Della Vedova ◽

Iman Hajirasouliha ◽

...

Keyword(s):

Single Cell ◽

Open Source ◽

Computational Methods ◽

Cancer Progression ◽

Fixed Number ◽

Inference Problem ◽

Fundamental Feature ◽

Single Cell Sequencing ◽

Cell Data ◽

Tumor Phylogeny


Using single cell sequencing data to model the evolutionary history of a tumor

BMC Bioinformatics ◽

10.1186/1471-2105-15-27 ◽

2014 ◽

Vol 15 (1) ◽

Cited By ~ 43

Author(s):

Kyung In Kim ◽

Richard Simon

Keyword(s):

Single Cell ◽

Evolutionary History ◽

Sequencing Data ◽

Single Cell Sequencing ◽

History Of


Identification Of Gene Signature For Renal Cell Carcinoma-Associated Fibroblasts Mediating Cancer Progression And Affecting Prognosis

10.21203/rs.3.rs-49601/v1 ◽

2020 ◽

Author(s):

Bitian Liu ◽

Xiaonan Chen ◽

Yunhong Zhan ◽

Bin Wu ◽

Shen Pan

Keyword(s):

Renal Cell Carcinoma ◽

Cell Carcinoma ◽

Single Cell ◽

Clinical Significance ◽

Cell Lines ◽

Cancer Progression ◽

Renal Cell ◽

Gene Signature ◽

Pathological Grade ◽

Single Cell Sequencing

Abstract Background: Cancer-associated fibroblasts (CAFs) are most abundant in stroma and are critically involved in cancer progression. However, the specific signature of CAFs and related clinicopathological parameters in renal cell carcinoma (RCC) remain unclear. Methods: In this work, methods using recognized gene signatures were employed to roughly assess the infiltration level of the stroma and CAFs in RCC based on the data in The Cancer Genome Atlas. Weighted gene co-expression network analysis (WGCNA) was used to cluster transcriptomes and correlate with CAFs to identify specific markers. A comparison of fibroblast versus urothelial carcinoma cell lines and correlation with previously reported CAF markers were performed to demonstrate the specific expression of the gene signature. The gene signature was used to compare fibroblast infiltration of each sample through single sample gene set enrichment analysis, and the clinical significance of fibroblasts was analyzed via Cox risk assessment and the chi-square test. Finally, we used validation data to verify the clinical significance of the fibroblast gene signature in RCC. Results: Roughly calculated tumor matrix and CAF levels were significantly higher in kidney cancer than in normal tissues. More than 85% of fibroblast-specific markers identified by WGCNA were consistent with markers obtained via single-cell sequencing. These markers were more highly expressed in fibroblast cell lines and were significantly correlated with canonical CAF markers. Data validation also showed that CAFs were significantly correlated with survival and pathological grade. Conclusions: In summary, our findings indicate that the gene signature potentially serves as a biomarker of CAFs in RCC and that infiltration of fibroblasts in RCC is an independent prognostic factor associated with pathological grade and stage of tumor. The ability to recognize specific CAF markers using WGCNA is comparable to single-cell sequencing.


Single-cell sequencing unveils the lifestyle and CRISPR-based population history of Hydrotalea sp. in acid mine drainage

Molecular Ecology ◽

10.1111/mec.14294 ◽

2017 ◽

Vol 26 (20) ◽

pp. 5541-5551 ◽

Cited By ~ 3

Author(s):

J. D. Medeiros ◽

L. R. Leite ◽

V. S. Pylro ◽

F. S. Oliveira ◽

V. M. Almeida ◽

...

Keyword(s):

Acid Mine Drainage ◽

Single Cell ◽

Mine Drainage ◽

Population History ◽

Single Cell Sequencing ◽

Acid Mine ◽

History Of


A transcriptome-based phylogenetic study of hard ticks (Ixodidae)

Scientific Reports ◽

10.1038/s41598-019-49641-9 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 6

Author(s):

N. Pierre Charrier ◽

Axelle Hermouet ◽

Caroline Hervet ◽

Albert Agoulon ◽

Stephen C. Barker ◽

...

Keyword(s):

Evolutionary History ◽

Nuclear Genome ◽

Single Copy ◽

Phylogenetic Study ◽

Data Sets ◽

Hard Tick ◽

Rna Seq ◽

Hard Ticks ◽

History Of ◽

Mitochondrial Sequences

Abstract Hard ticks are widely distributed across temperate regions, show strong variation in host associations, and are potential vectors of a diversity of medically important zoonoses, such as Lyme disease. To address unresolved issues with respect to the evolutionary relationships among certain species or genera, we produced novel RNA-Seq data sets for nine different Ixodes species. We combined these new data with 18 data sets obtained from public databases, both for Ixodes and non-Ixodes hard tick species, using soft ticks as an outgroup. We assembled transcriptomes (for 27 species in total), predicted coding sequences and identified single copy orthologues (SCO). Using maximum-likelihood and Bayesian frameworks, we reconstructed a hard tick phylogeny for the nuclear genome. We also obtained a mitochondrial DNA-based phylogeny using published genome sequences and mitochondrial sequences derived from the new transcriptomes. Our results confirm previous studies showing that the Ixodes genus is monophyletic and clarify the relationships among Ixodes sub-genera. This work provides a baseline for studying the evolutionary history of ticks: we found an unexpected acceleration of substitutions for mitochondrial sequences of Prostriata, and for nuclear and mitochondrial genes of two species of Rhipicephalus, which we relate to patterns of genome architecture and changes of life-cycle, respectively.


Cellsnp-lite: an efficient tool for genotyping single cells

10.1101/2020.12.31.424913 ◽

2021 ◽

Author(s):

Xianjie Huang ◽

Yuanhua Huang

Keyword(s):

Single Cell ◽

Single Cells ◽

Basic Research ◽

Substantial Improvement ◽

Data Sets ◽

Sequencing Data ◽

Single Cell Sequencing ◽

Memory Efficiency ◽

Computational Speed ◽

Cell Data

Abstract Summary: Single-cell sequencing is an increasingly used technology and has promising applications in basic research and clinical translations. However, genotyping methods developed for bulk sequencing data have not been well adapted for single-cell data, in terms of both computational parallelization and simplified user interface. Here we introduce a software tool, cellsnp-lite, implemented in C/C++ and based on the well-supported package htslib, for genotyping in single-cell sequencing data for both droplet and well based platforms. On various experimental data sets, it shows substantial improvement in computational speed and memory efficiency while retaining highly concordant results compared to existing methods. Cellsnp-lite therefore lightens the genetic analysis for increasingly large single-cell data. Availability: The source code is freely available at https://github.com/single-cell-genetics/[email protected]
