sCNAphase: using haplotype resolved read depth to genotype somatic copy number alterations from low cellularity aneuploid tumors

Mapping Intimacies ◽

10.1101/038828 ◽

2016 ◽

Author(s):

Wenhan Chen ◽

Alan J. Robertson ◽

Devika Ganesamoorthy ◽

Lachlan J.M. Coin

Keyword(s):

Cell Line ◽

Tumor Cells ◽

Copy Number ◽

High Throughput Sequencing ◽

Haplotype Frequency ◽

Read Depth ◽

Breast Cancer Cell Lines ◽

Copy Number Alterations ◽

Sequencing Data ◽

Accurate Identification

Accurate identification of copy number alterations is an essential step in understanding the events driving tumor progression. While a variety of algorithms have been developed to profile copy number changes from high-throughput sequencing data, no tool is able to reliably characterize ploidy and genotype absolute copy number from tumor samples that contain less than 40% tumor cells. To increase our power to resolve the copy number profile of low-cellularity tumor samples, we developed a novel approach that pre-phases heterozygous germline SNPs in order to replace the commonly used ‘B-allele frequency’ with a more powerful ‘parental-haplotype frequency’. We apply our tool, sCNAphase, to characterize the copy number and loss-of-heterozygosity profiles of four publicly available breast cancer cell lines. Comparisons to previous spectral karyotyping and microarray studies revealed that sCNAphase reliably identified overall ploidy as well as the individual copy number mutations from each cell line. Analysis of artificial cell-line mixtures demonstrated the capacity of this method to determine the level of tumor cellularity, consistently identify sCNAs and characterize ploidy in samples with as little as 10% tumor cells. This novel methodology has the potential to bring sCNA profiling to low-cellularity tumors, which current methods cannot study accurately.
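The arithmetic behind a haplotype-level frequency can be illustrated with a minimal sketch. It is not sCNAphase's actual model: the function names, the input layout (per-SNP read counts plus a phase flag) and the assumption that normal cells carry exactly one copy of each parental haplotype are all hypothetical choices made for illustration. The first function gives the expected read fraction from one parental haplotype in a tumor/normal mixture; the second pools reads across phased heterozygous SNPs in a segment, which is what makes a haplotype frequency less noisy than a single-SNP allele frequency.

```python
from typing import List, Tuple


def expected_haplotype_fraction(cellularity: float, cn_a: int, cn_b: int) -> float:
    """Expected fraction of reads from parental haplotype A in a mixture.

    Assumption (illustrative): normal cells carry one copy of each haplotype,
    tumor cells carry cn_a copies of haplotype A and cn_b copies of haplotype B,
    and `cellularity` is the fraction of tumor cells in the sample.
    """
    if not 0.0 <= cellularity <= 1.0:
        raise ValueError("cellularity must lie in [0, 1]")
    hap_a = cellularity * cn_a + (1.0 - cellularity) * 1
    total = cellularity * (cn_a + cn_b) + (1.0 - cellularity) * 2
    if total == 0:
        raise ValueError("segment has zero total copies")
    return hap_a / total


def pooled_haplotype_frequency(snps: List[Tuple[int, int, bool]]) -> float:
    """Pool reads over phased heterozygous SNPs in one segment.

    Each SNP is (ref_reads, alt_reads, alt_on_hap_a); the phase flag says
    whether the alternate allele sits on haplotype A (hypothetical layout).
    """
    hap_a = sum(alt if alt_on_a else ref for ref, alt, alt_on_a in snps)
    total = sum(ref + alt for ref, alt, _ in snps)
    if total == 0:
        raise ValueError("no reads covering the segment")
    return hap_a / total


# A single-copy gain of haplotype A at 10% cellularity shifts the expected
# fraction only slightly away from 0.5, which is why pooling many SNPs helps.
shift = expected_haplotype_fraction(0.10, cn_a=2, cn_b=1)
observed = pooled_haplotype_frequency([(10, 12, True), (13, 9, False), (8, 11, True)])
```

Under these assumptions, the expected signal at low cellularity is small per SNP, so combining reads that are known to come from the same haplotype is one plausible way to gain power.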


Single-cell copy number calling and event history reconstruction

10.1101/2020.04.28.065755 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jack Kuipers ◽

Mustafa Anıl Tuncel ◽

Pedro Ferreira ◽

Katharina Jahn ◽

Niko Beerenwinkel

Keyword(s):

Dna Sequencing ◽

Single Cell ◽

Copy Number ◽

Driving Forces ◽

Simulated Data ◽

Read Depth ◽

Cancer Diagnostics ◽

Whole Genome ◽

Copy Number Alterations ◽

Sequencing Data

Copy number alterations are driving forces of tumour development and the emergence of intra-tumour heterogeneity. A comprehensive picture of these genomic aberrations is therefore essential for the development of personalised and precise cancer diagnostics and therapies. Single-cell sequencing offers the highest resolution for copy number profiling down to the level of individual cells. Recent high-throughput protocols allow for the processing of hundreds of cells through shallow whole-genome DNA sequencing. The resulting low read-depth data poses substantial statistical and computational challenges to the identification of copy number alterations. We developed SCICoNE, a statistical model and MCMC algorithm tailored to single-cell copy number profiling from shallow whole-genome DNA sequencing data. SCICoNE reconstructs the history of copy number events in the tumour and uses these evolutionary relationships to identify the copy number profiles of the individual cells. We show the accuracy of this approach in evaluations on simulated data and demonstrate its practicability in applications to a xenograft breast cancer sample.


DEFOR: depth- and frequency-based somatic copy number alteration detector

Bioinformatics ◽

10.1093/bioinformatics/btz170 ◽

2019 ◽

Vol 35 (19) ◽

pp. 3824-3825 ◽

Cited By ~ 1

Author(s):

He Zhang ◽

Xiaowei Zhan ◽

James Brugarolas ◽

Yang Xie

Keyword(s):

Exome Sequencing ◽

Copy Number ◽

Copy Number Alteration ◽

High Throughput Sequencing ◽

Supplementary Information ◽

Copy Number Alterations ◽

Sequencing Data ◽

Sequencing Technology ◽

Somatic Copy Number Alterations ◽

Somatic Copy Number Alteration

Motivation: Detection of somatic copy number alterations (SCNAs) using high-throughput sequencing has become popular because of rapid developments in sequencing technology. Existing methods do not perform well in calling SCNAs for unstable tumor genomes. Results: We developed a new method, DEFOR, to detect SCNAs in tumor samples from exome-sequencing data. The evaluation showed that DEFOR has a higher accuracy for SCNA detection from exome sequencing compared with the five existing tools. This advantage is especially apparent in unstable tumor genomes with a large proportion of SCNAs. Availability and implementation: DEFOR is available at https://github.com/drzh/defor. Supplementary information: Supplementary data are available at Bioinformatics online.


BIC-seq: a fast algorithm for detection of copy number alterations based on high-throughput sequencing data

Genome Biology ◽

10.1186/1465-6906-11-s1-o10 ◽

2010 ◽

Vol 11 (Suppl 1) ◽

pp. O10 ◽

Cited By ~ 21

Author(s):

Ruibin Xi ◽

Joe Luquette ◽

Angela Hadjipanayis ◽

Tae-Min Kim ◽

Peter J Park

Keyword(s):

High Throughput ◽

Fast Algorithm ◽

Copy Number ◽

High Throughput Sequencing ◽

Copy Number Alterations ◽

Sequencing Data ◽

High Throughput Sequencing Data


Accucopy: accurate and fast inference of allele-specific copy number alterations from low-coverage low-purity tumor sequencing data

BMC Bioinformatics ◽

10.1186/s12859-020-03924-5 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Xinping Fan ◽

Guanghao Luo ◽

Yu S. Huang

Keyword(s):

Copy Number ◽

Bayesian Learning ◽

Kernel Smoothing ◽

Gaussian Mixture ◽

Copy Number Alterations ◽

Sequencing Data ◽

Copy Numbers ◽

Allele Specific ◽

Tumor Sequencing ◽

Low Coverage

Background: Copy number alterations (CNAs), due to their large impact on the genome, have been an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from the shallow-sequencing data of a low-purity tumor sample remains a challenging task. Results: We introduce Accucopy, a method to infer total copy numbers (TCNs) and allele-specific copy numbers (ASCNs) from challenging low-purity and low-coverage tumor samples. Accucopy adopts many robust statistical techniques such as kernel smoothing of coverage differentiation information to discern signals from noise and combines ideas from time-series analysis and the signal-processing field to derive a range of estimates for the period in a histogram of coverage differentiation information. Statistical learning models such as the tiered Gaussian mixture model, the expectation–maximization algorithm, and sparse Bayesian learning were customized and built into the model. Accucopy is implemented in C++/Rust, packaged in a Docker image, and supports non-human samples; more at http://www.yfish.org/software/. Conclusions: We describe Accucopy, a method that can predict both TCNs and ASCNs from low-coverage low-purity tumor sequencing data. Through comparative analyses in both simulated and real-sequencing samples, we demonstrate that Accucopy is more accurate than Sclust, ABSOLUTE, and Sequenza.


Use of whole genome amplification and comparative genomic hybridisation to detect chromosomal copy number alterations in cell line material and tumour tissue

Cytogenetic and Genome Research ◽

10.1159/000078004 ◽

2004 ◽

Vol 105 (1) ◽

pp. 18-24 ◽

Cited By ~ 21

Author(s):

S. Hughes ◽

G. Lim ◽

B. Beheshti ◽

J. Bayani ◽

P. Marrano ◽

...

Keyword(s):

Cell Line ◽

Copy Number ◽

Whole Genome Amplification ◽

Tumour Tissue ◽

Comparative Genomic Hybridisation ◽

Comparative Genomic ◽

Whole Genome ◽

Copy Number Alterations ◽

Line Material ◽

Chromosomal Copy Number


CNV-P: a machine-learning framework for predicting high confident copy number variations

PeerJ ◽

10.7717/peerj.12564 ◽

2021 ◽

Vol 9 ◽

pp. e12564

Author(s):

Taifu Wang ◽

Jinghua Sun ◽

Xiuqing Zhang ◽

Wen-Jing Wang ◽

Qing Zhou

Keyword(s):

Machine Learning ◽

False Positive ◽

Copy Number ◽

Genetic Disorders ◽

Genetic Diseases ◽

Basic Research ◽

Read Depth ◽

Copy Number Variations ◽

Sequencing Data ◽

Learning Framework

Background: Copy-number variants (CNVs) have been recognized as one of the major causes of genetic disorders. Reliable detection of CNVs from genome sequencing data is in strong demand for disease research. However, current software for detecting CNVs has high false-positive rates and needs further improvement. Methods: Here, we propose a novel post-processing approach for CNV prediction (CNV-P), a machine-learning framework that can efficiently remove false-positive fragments from the results of CNV detection tools. A series of CNV signals, such as read depth (RD), split reads (SR) and read pairs (RP) around the putative CNV fragments, were defined as features to train a classifier. Results: The prediction results on several real biological datasets showed that our models could accurately classify CNVs with over 90% precision and 85% recall, which greatly improves the performance of state-of-the-art algorithms. Furthermore, our results indicate that CNV-P is robust to different CNV sizes and sequencing platforms. Conclusions: Our framework for classifying high-confidence CNVs could improve both basic research and clinical diagnosis of genetic diseases.
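For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how precision and recall could be computed for a set of filtered CNV calls. It is not code from CNV-P; the function name and the boolean-list input format are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def precision_recall(predicted: List[bool], actual: List[bool]) -> Tuple[float, float]:
    """Return (precision, recall) for binary CNV calls.

    predicted[i] is True when the filter keeps candidate i as a real CNV;
    actual[i] is True when candidate i is a real CNV in the truth set.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)       # kept and real
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)   # kept but false
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)   # real but dropped
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Five candidate calls: two true positives, one false positive, one false negative.
precision, recall = precision_recall(
    [True, True, True, False, False],
    [True, True, False, True, False],
)
```

A false-positive filter of this kind trades the two quantities against each other: discarding more candidates tends to raise precision and lower recall.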


SEG - A Software Program for Finding Somatic Copy Number Alterations in Whole Genome Sequencing Data of Cancer

Computational and Structural Biotechnology Journal ◽

10.1016/j.csbj.2018.09.001 ◽

2018 ◽

Vol 16 ◽

pp. 335-341 ◽

Cited By ~ 2

Author(s):

Mucheng Zhang ◽

Deli Liu ◽

Jie Tang ◽

Yuan Feng ◽

Tianfang Wang ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Copy Number Alterations ◽

Sequencing Data ◽

Software Program ◽

Somatic Copy Number Alterations


Combinatorial Detection Algorithm for Copy Number Variations Using High-throughput Sequencing Reads

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001419500228 ◽

2019 ◽

Vol 33 (14) ◽

pp. 1950022

Author(s):

Hai Yang ◽

Daming Zhu

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

High Throughput ◽

Copy Number ◽

High Throughput Sequencing ◽

Hidden Markov ◽

Read Depth ◽

Detection Algorithm ◽

Gc Bias ◽

Unmapped Reads

Copy number variation (CNV) is a prevalent kind of genetic structural variation that leads to an abnormal number of copies of large genomic regions, such as gain or loss of DNA segments larger than 1 kb. CNV exists not only in the human genome but also in plant genomes. Current research has shown that CNV is associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability and their effect on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial detection algorithm for CNV using high-throughput sequencing reads based on a hidden Markov model (CNV-HMM) are proposed. The corrected read depth signals have lower correlation with GC content, mappability of reads and the width of the analysis window. Then we create a hidden Markov model which maps the reads onto the reference genome and records the unmapped reads. The unmapped reads are counted and normalized. The CNV-HMM detects the abnormal signal of read count and obtains candidate CNVs using the expectation maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of our algorithm. The experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms.
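The abstract does not describe the new GC correction in detail, so the sketch below shows a different, widely used baseline instead: median-based normalization, where each window's read depth is rescaled by the ratio of the overall median depth to the median depth of windows with the same GC percentage. The function name and input layout are hypothetical, and this is not the CNV-HMM correction itself.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List


def gc_correct(read_depth: List[float], gc_percent: List[int]) -> List[float]:
    """Median-based GC correction of per-window read depth.

    read_depth[i] is the read count of window i and gc_percent[i] is its GC
    content rounded to an integer percentage (illustrative input layout).
    """
    if len(read_depth) != len(gc_percent):
        raise ValueError("read_depth and gc_percent must have the same length")
    if not read_depth:
        return []

    # Group window depths by GC percentage.
    by_gc: Dict[int, List[float]] = defaultdict(list)
    for depth, gc in zip(read_depth, gc_percent):
        by_gc[gc].append(depth)

    overall = median(read_depth)
    gc_median = {gc: median(depths) for gc, depths in by_gc.items()}

    corrected = []
    for depth, gc in zip(read_depth, gc_percent):
        m = gc_median[gc]
        # Windows in a GC bin with zero median depth cannot be rescaled.
        corrected.append(depth * overall / m if m > 0 else 0.0)
    return corrected


# Two GC bins with very different raw depth end up centred on the same median.
corrected = gc_correct([100, 110, 50, 60], [40, 40, 60, 60])
```

After this rescaling every GC bin shares the same median depth, which removes the bulk of the GC trend while leaving within-bin deviations, the signal a CNV caller looks for, intact.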


Abstract 2517: Validation of a targeted sequencing workflow for sequence variants and focal copy number alterations (CNAs) in single circulating tumor cells (CTCs)

10.1158/1538-7445.am2019-2517 ◽

2019 ◽

Author(s):

Paola Tononi ◽

Valentina del Monaco ◽

Alberto Ferrarini ◽

Genny Buson ◽

Marianna Garonzi ◽

...

Keyword(s):

Tumor Cells ◽

Circulating Tumor Cells ◽

Copy Number ◽

Targeted Sequencing ◽

Sequence Variants ◽

Copy Number Alterations


Estimation of Copy Number Alterations from Exome Sequencing Data

PLoS ONE ◽

10.1371/journal.pone.0051422 ◽

2012 ◽

Vol 7 (12) ◽

pp. e51422 ◽

Cited By ~ 11

Author(s):

Rafael Valdés-Mas ◽

Silvia Bea ◽

Diana A. Puente ◽

Carlos López-Otín ◽

Xose S. Puente

Keyword(s):

Exome Sequencing ◽

Copy Number ◽

Copy Number Alterations ◽

Sequencing Data ◽

Exome Sequencing Data
