Hierarchical Discovery of Large-scale and Focal Copy Number Alterations in Low-coverage Cancer Genomes

10.1101/639294 ◽

2019 ◽

Author(s):

Ahmed Ibrahim Samir Khalil ◽

Costerwell Khyriem ◽

Anupam Chattopadhyay ◽

Amartya Sanyal

Keyword(s):

Copy Number ◽

Large Scale ◽

Simulated Data ◽

Change Points ◽

Copy Number Alterations ◽

Biological Origin ◽

Robust Detection ◽

Cancer Data ◽

Cancer Genomes ◽

Low Coverage

Abstract Motivation Detection of copy number alterations (CNAs) is critical to understanding genetic diversity, genome evolution and pathological conditions such as cancer. Cancer genomes are plagued with widespread multi-level structural aberrations of chromosomes, which make it challenging to discover CNAs of different length scales with distinct biological origin and function. Although several tools are available to identify CNAs using read depth (RD) of coverage, they fail to distinguish between large-scale and focal alterations due to inaccurate modeling of the RD signal of cancer genomes. These tools are also affected by RD signal variations, pronounced in low-coverage data, which significantly inflate false detection of change points and lead to inaccurate CNA calling. Results We have developed CNAtra to hierarchically discover and classify 'large-scale' and 'focal' copy number gain/loss from whole-genome sequencing (WGS) data. CNAtra provides an analytical and visualization framework for CNV profiling using a single sequencing sample. CNAtra first utilizes a multimodal distribution to estimate the copy number (CN) reference from the complex RD profile of the cancer genome. We utilized the Savitzky-Golay filter and Modified Varri segmentation to capture the change points. We then developed a CN state-driven merging algorithm to identify the large segments with distinct copy number. Next, focal alterations were identified in each large segment using coverage-based thresholding to mitigate the adverse effects of signal variations. We tested CNAtra calls using experimentally verified segmental aneuploidies and focal alterations, which confirmed CNAtra's ability to detect and distinguish the two alteration phenomena. For benchmarking the performance of CNAtra against other detection tools, we used realistic simulated data in which we artificially spiked in CNAs in the original cancer profiles. We found that CNAtra is superior in terms of precision, recall, and F-measure. CNAtra shows the highest sensitivity, 93% and 97%, for detecting focal and large-scale alterations respectively. Visual inspection of CNAs showed that CNAtra is the most robust detection tool for low-coverage cancer data. Availability and implementation CNAtra is open-source software implemented in MATLAB, and is available at https://github.com/AISKhalil/CNAtra
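The smoothing step named in the abstract can be illustrated with a minimal sketch. This is not CNAtra's code (CNAtra is implemented in MATLAB); it is a standard-library Python example of a 5-point quadratic Savitzky-Golay filter applied to a toy binned read-depth signal. The function name `savgol5`, the window length, and the toy values are assumptions made for illustration only, and the Modified Varri segmentation step is not shown.

```python
# Illustrative sketch only: a 5-point quadratic Savitzky-Golay smoother.
# The kernel (-3, 12, 17, 12, -3) / 35 is the standard one for this window
# and polynomial order. The window size is an assumption, not CNAtra's setting.
SG5_QUADRATIC = (-3, 12, 17, 12, -3)


def savgol5(signal):
    """Smooth a sequence of read-depth values; the two edge points on each side are left unsmoothed."""
    out = [float(x) for x in signal]
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        out[i] = sum(c * x for c, x in zip(SG5_QUADRATIC, window)) / 35.0
    return out


# Toy read-depth bins (hypothetical values): a flat copy-number-2 region with one noisy bin.
toy_rd = [2, 2, 2, 5, 2, 2, 2]
smoothed = savgol5(toy_rd)
print(smoothed)
```

A flat signal passes through unchanged because the kernel weights sum to 35, while the isolated spike is damped, which is the property that makes this kind of filter useful before change-point detection.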


Hierarchical discovery of large-scale and focal copy number alterations in low-coverage cancer genomes

BMC Bioinformatics ◽

10.1186/s12859-020-3480-3 ◽

2020 ◽

Vol 21 (1) ◽

Cited By ~ 3

Author(s):

Ahmed Ibrahim Samir Khalil ◽

Costerwell Khyriem ◽

Anupam Chattopadhyay ◽

Amartya Sanyal

Keyword(s):

Copy Number ◽

Large Scale ◽

Copy Number Alterations ◽

Cancer Genomes ◽

Low Coverage


EPCO-29. EPIGENOMICS OF THE GLIOMA LONGITUDINAL ANALYSIS (GLASS) CONSORTIUM

Neuro-Oncology ◽

10.1093/neuonc/noaa215.308 ◽

2020 ◽

Vol 22 (Supplement_2) ◽

pp. ii75-ii75

Author(s):

Thais Sabedot ◽

Michael Wells ◽

Indrani Datta ◽

Tathiane Malta ◽

Ana Valeria Castro ◽

...

Keyword(s):

Dna Methylation ◽

Longitudinal Analysis ◽

Copy Number ◽

Large Scale ◽

Molecular Classification ◽

Tumor Burden ◽

Copy Number Alterations ◽

Unsupervised Analysis ◽

Diffuse Gliomas ◽

New Biomarkers

Abstract Adult diffuse gliomas are central nervous system (CNS) tumors that arise from the malignant transformation of glial cells. Nearly all gliomas will recur despite standard treatment; however, current histopathological grading fails to predict which of them will relapse and/or progress. The Glioma Longitudinal AnalySiS (GLASS) consortium is a large-scale collaboration that aims to investigate the molecular profiling of matched primary and recurrent glioma samples from multiple institutions in order to better understand the dynamic evolution of these tumors. At this time, the cohort comprises 946 samples across 11 institutions and, among those, 864 have DNA methylation data available. The current molecular classification based on 7 subtypes published by TCGA in 2016 was applied to the dataset. Among the IDH wildtype tumors, 33% (16/49) of the patients showed a change of subtype upon recurrence; most of them (9/16) were Classic-like at the primary stage but changed to either Mesenchymal-like or PA-like at recurrence. Among the IDH mutant tumors, 15% (22/142) showed a change of subtype at the recurrent stage, of which 16 out of 22 progressed from G-CIMP-high to G-CIMP-low. Although some tumors progressed to a different subtype upon recurrence, an unsupervised analysis showed that the samples tend to cluster by patient instead of by subtype. Copy number alterations estimated from the DNA methylation data showed that the overall copy number profile of the recurrent samples remains similar to that of their primary counterparts. From this initial analysis using epigenomic data, we were able to characterize some aspects of glioma evolution and how DNA methylation is associated with the progression of these tumors to different subtypes. These findings corroborate the importance of epigenetics in gliomas and can potentially lead to the identification of new biomarkers that can reflect tumor burden and predict its development.


Accucopy: accurate and fast inference of allele-specific copy number alterations from low-coverage low-purity tumor sequencing data

BMC Bioinformatics ◽

10.1186/s12859-020-03924-5 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Xinping Fan ◽

Guanghao Luo ◽

Yu S. Huang

Keyword(s):

Copy Number ◽

Bayesian Learning ◽

Kernel Smoothing ◽

Gaussian Mixture ◽

Copy Number Alterations ◽

Sequencing Data ◽

Copy Numbers ◽

Allele Specific ◽

Tumor Sequencing ◽

Low Coverage

Abstract Background Copy number alterations (CNAs), due to their large impact on the genome, have been an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from the shallow-sequencing data of a low-purity tumor sample remains a challenging task. Results We introduce Accucopy, a method to infer total copy numbers (TCNs) and allele-specific copy numbers (ASCNs) from challenging low-purity and low-coverage tumor samples. Accucopy adopts many robust statistical techniques such as kernel smoothing of coverage differentiation information to discern signals from noise and combines ideas from time-series analysis and the signal-processing field to derive a range of estimates for the period in a histogram of coverage differentiation information. Statistical learning models such as the tiered Gaussian mixture model, the expectation–maximization algorithm, and sparse Bayesian learning were customized and built into the model. Accucopy is implemented in C++ /Rust, packaged in a docker image, and supports non-human samples, more at http://www.yfish.org/software/. Conclusions We describe Accucopy, a method that can predict both TCNs and ASCNs from low-coverage low-purity tumor sequencing data. Through comparative analyses in both simulated and real-sequencing samples, we demonstrate that Accucopy is more accurate than Sclust, ABSOLUTE, and Sequenza.


Using low-coverage whole genome sequencing technique to analyze the chromosomal copy number alterations in the exfoliative cells of cervical cancer

Journal of Gynecologic Oncology ◽

10.3802/jgo.2018.29.e78 ◽

2018 ◽

Vol 29 (5) ◽

Author(s):

Tong Ren ◽

Jing Suo ◽

Shikai Liu ◽

Shu Wang ◽

Shan Shu ◽

...

Keyword(s):

Cervical Cancer ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Whole Genome ◽

Copy Number Alterations ◽

Sequencing Technique ◽

Chromosomal Copy Number ◽

Chromosomal Copy ◽

Low Coverage


Cell-free tumour DNA analysis detects copy number alterations in gastro-oesophageal cancer patients

PLoS ONE ◽

10.1371/journal.pone.0245488 ◽

2021 ◽

Vol 16 (2) ◽

pp. e0245488

Author(s):

Karin Wallander ◽

Jesper Eisfeldt ◽

Mats Lindblad ◽

Daniel Nilsson ◽

Kenny Billiau ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Oesophageal Cancer ◽

Copy Number ◽

Dna Analysis ◽

Tissue Sample ◽

Whole Genome ◽

Copy Number Alterations ◽

Plasma Dna ◽

Low Coverage

Background Analysis of cell-free tumour DNA, a liquid biopsy, is a promising biomarker for cancer. We have performed a proof-of principle study to test the applicability in the clinical setting, analysing copy number alterations (CNAs) in plasma and tumour tissue from 44 patients with gastro-oesophageal cancer. Methods DNA was isolated from blood plasma and a tissue sample from each patient. Array-CGH was applied to the tissue DNA. The cell-free plasma DNA was sequenced by low-coverage whole-genome sequencing using a clinical pipeline for non-invasive prenatal testing. WISECONDOR and ichorCNA, two bioinformatic tools, were used to process the output data and were compared to each other. Results Cancer-associated CNAs could be seen in 59% (26/44) of the tissue biopsies. In the plasma samples, a targeted approach analysing 61 regions of special interest in gastro-oesophageal cancer detected cancer-associated CNAs with a z-score >5 in 11 patients. Broadening the analysis to a whole-genome view, 17/44 patients (39%) had cancer-associated CNAs using WISECONDOR and 13 (30%) using ichorCNA. Of the 26 patients with tissue-verified cancer-associated CNAs, 14 (54%) had corresponding CNAs in plasma. Potentially clinically actionable amplifications overlapping the genes VEGFA, EGFR and FGFR2 were detected in the plasma from three patients. Conclusions We conclude that low-coverage whole-genome sequencing without prior knowledge of the tumour alterations could become a useful tool for cell-free tumour DNA analysis of total CNAs in plasma from patients with gastro-oesophageal cancer.
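The z-score thresholding mentioned in the abstract can be sketched as follows. This is a generic illustration and not the WISECONDOR or ichorCNA implementation: each region's normalized coverage in a plasma sample is compared with the same region in a panel of control samples, and regions whose z-score exceeds a cutoff are flagged. The function names, the use of the absolute value (so that both gains and losses are flagged), the region labels, and all coverage values are assumptions made for the example; only the cutoff of 5 comes from the abstract.

```python
from statistics import mean, stdev


def z_score(value, reference):
    """Z-score of one observation against a list of control observations."""
    return (value - mean(reference)) / stdev(reference)


def flag_regions(sample, controls, threshold=5.0):
    """Return {region: z} for regions whose |z| exceeds the threshold.

    sample:   {region: normalized coverage in the test sample}
    controls: {region: [normalized coverage in each control sample]}
    """
    flagged = {}
    for region, value in sample.items():
        z = z_score(value, controls[region])
        if abs(z) > threshold:
            flagged[region] = z
    return flagged


# Toy data (hypothetical): two regions, five control samples each.
controls = {
    "region_A": [1.00, 1.02, 0.98, 1.01, 0.99],
    "region_B": [1.00, 1.02, 0.98, 1.01, 0.99],
}
sample = {"region_A": 1.10, "region_B": 1.03}
flagged = flag_regions(sample, controls)
print(flagged)
```

With these toy values only `region_A` is flagged; `region_B` deviates from the controls but stays well under the cutoff, which shows how a strict threshold trades sensitivity for fewer false calls in noisy low-coverage data.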


Single-cell copy number calling and event history reconstruction

10.1101/2020.04.28.065755 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jack Kuipers ◽

Mustafa Anıl Tuncel ◽

Pedro Ferreira ◽

Katharina Jahn ◽

Niko Beerenwinkel

Keyword(s):

Dna Sequencing ◽

Single Cell ◽

Copy Number ◽

Driving Forces ◽

Simulated Data ◽

Read Depth ◽

Cancer Diagnostics ◽

Whole Genome ◽

Copy Number Alterations ◽

Sequencing Data

Copy number alterations are driving forces of tumour development and the emergence of intra-tumour heterogeneity. A comprehensive picture of these genomic aberrations is therefore essential for the development of personalised and precise cancer diagnostics and therapies. Single-cell sequencing offers the highest resolution for copy number profiling down to the level of individual cells. Recent high-throughput protocols allow for the processing of hundreds of cells through shallow whole-genome DNA sequencing. The resulting low read-depth data poses substantial statistical and computational challenges to the identification of copy number alterations. We developed SCICoNE, a statistical model and MCMC algorithm tailored to single-cell copy number profiling from shallow whole-genome DNA sequencing data. SCICoNE reconstructs the history of copy number events in the tumour and uses these evolutionary relationships to identify the copy number profiles of the individual cells. We show the accuracy of this approach in evaluations on simulated data and demonstrate its practicability in applications to a xenograft breast cancer sample.


Accucopy: Accurate and Fast Inference of Allele-specific Copy Number Alterations from Low-coverage Low-purity Tumor Sequencing Data

10.1101/2020.01.02.892364 ◽

2020 ◽

Author(s):

Xinping Fan ◽

Guanghao Luo ◽

Yu S. Huang

Keyword(s):

Copy Number ◽

Bayesian Learning ◽

Kernel Smoothing ◽

Gaussian Mixture ◽

Copy Number Alterations ◽

Sequencing Data ◽

Copy Numbers ◽

Allele Specific ◽

Tumor Sequencing ◽

Low Coverage

Abstract Background Copy number alterations (CNAs), due to their large impact on the genome, have been an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from the shallow-sequencing data of a low-purity tumor sample remains a challenging task. Results We introduce Accucopy, a method to infer total copy numbers (TCNs) and allele-specific copy numbers (ASCNs) from challenging low-purity and low-coverage tumor samples. Accucopy adopts many robust statistical techniques such as kernel smoothing of coverage differentiation information to discern signals from noise and combines ideas from time-series analysis and the signal-processing field to derive a range of estimates for the period in a histogram of coverage differentiation information. Statistical learning models such as the tiered Gaussian mixture model, the Expectation-Maximization (EM) algorithm, and Sparse Bayesian Learning (SBL) were customized and built into the model. Accucopy is implemented in C++/Rust, packaged in a docker image, and supports non-human samples, more at http://www.yfish.org/software/. Conclusions We describe Accucopy, a method that can predict both TCNs and ASCNs from low-coverage low-purity tumor sequencing data. Through comparative analyses in both simulated and real-sequencing samples, we demonstrate that Accucopy is more accurate than Sclust, ABSOLUTE, and Sequenza.


Pathogenesis of Penile Squamous Cell Carcinoma: Molecular Update and Systematic Review

International Journal of Molecular Sciences ◽

10.3390/ijms23010251 ◽

2021 ◽

Vol 23 (1) ◽

pp. 251

Author(s):

Inmaculada Ribera-Cortada ◽

José Guerrero-Pineda ◽

Isabel Trias ◽

Luis Veloza ◽

Adriana Garcia ◽

...

Keyword(s):

Squamous Cell Carcinoma ◽

Cell Carcinoma ◽

Squamous Cell ◽

Copy Number ◽

Large Scale ◽

Somatic Mutations ◽

Genetic Alterations ◽

Copy Number Alterations ◽

Hpv Status ◽

Penile Squamous Cell Carcinoma

Penile squamous cell carcinoma (PSCC) is a rare but aggressive neoplasm with dual pathogenesis (human papillomavirus (HPV)-associated and HPV-independent). The development of targeted treatment is hindered by poor knowledge of the molecular landscape of PSCC. We performed a thorough review of genetic alterations of PSCC focused on somatic mutations and/or copy number alterations. A total of seven articles have been identified which, overall, include 268 PSCC. However, the series are heterogeneous regarding methodologies employed for DNA sequencing and HPV detection together with HPV prevalence, and include, in general, a limited number of cases, which results in markedly different findings. Reported top-ranked mutations involve TP53, CDKN2A, FAT1, NOTCH-1 and PIK3CA. Numerical alterations involve gains in MYC and EGFR, as well as amplifications in HPV integration loci. A few genes including TP53, CDKN2A, PIK3CA and CCND1 harbor both somatic mutations and copy number alterations. Notch, RTK-RAS and Hippo pathways are frequently deregulated. Nevertheless, the relevance of the identified alterations, their role in signaling pathways or their association with HPV status remain elusive. Combined targeting of different pathways might represent a valid therapeutic approach in PSCC. This work calls for large-scale sequencing studies with robust HPV testing to improve the genomic understanding of PSCC.


Genetic alterations in the 3q26.31-32 locus confer an aggressive prostate cancer phenotype

Communications Biology ◽

10.1038/s42003-020-01175-x ◽

2020 ◽

Vol 3 (1) ◽

Author(s):

Benjamin S. Simpson ◽

Niedzica Camacho ◽

Hayley J. Luxton ◽

Hayley Pye ◽

Ron Finn ◽

...

Keyword(s):

Prostate Cancer ◽

Copy Number ◽

Large Scale ◽

Genetic Alterations ◽

Gleason Grade ◽

Copy Number Alterations ◽

Aggressive Prostate Cancer ◽

Prostate Cancer Development ◽

Transcriptional Changes ◽

Cancer Phenotype

Abstract Large-scale genetic aberrations that underpin prostate cancer development and progression, such as copy-number alterations (CNAs), have been described, but understanding of the consequences of specific changes in many identified loci is limited. Germline SNPs in the 3q26.31 locus are associated with aggressive prostate cancer, and the locus is the location of NAALADL2, a gene overexpressed in aggressive disease. The closest gene to NAALADL2 is TBL1XR1, which is implicated in tumour development and progression. Using publicly available cancer genomic data, we report that NAALADL2 and TBL1XR1 gains/amplifications are more prevalent in aggressive sub-types of prostate cancer when compared to primary cohorts. In primary disease, gains/amplifications occurred in 15.99% (95% CI: 13.02–18.95%) and 14.96% (95% CI: 12.08–17.84%) for NAALADL2 and TBL1XR1 respectively, increasing in frequency in higher Gleason grade and stage tumours. Gains/amplifications result in transcriptional changes and the development of a pro-proliferative and aggressive phenotype. These results support a pivotal role for copy-number gains in this genetic region.


Genome-Wide Analysis of Copy Number Analysis of Myelodysplastic Syndromes Using High-Density SNP-Genotyping Microarrays.

Blood ◽

10.1182/blood.v106.11.3420.3420 ◽

2005 ◽

Vol 106 (11) ◽

pp. 3420-3420

Author(s):

Masashi Sanada ◽

Yasuhito Nannya ◽

Kumi Nakazaki ◽

Go Yamamoto ◽

Lili Wang ◽

...

Keyword(s):

Myelodysplastic Syndromes ◽

Copy Number ◽

Large Scale ◽

Target Genes ◽

Chromosomal Abnormalities ◽

High Density ◽

Copy Number Alterations ◽

Genome Wide Analysis ◽

Conventional Cytogenetic Analysis ◽

Genome Wide

Abstract Myelodysplastic syndromes (MDS) are clonal disorders of hematopoietic progenitors characterized by impaired blood cell production due to ineffective hematopoiesis and a high propensity to progress to acute myeloid leukemia. One of the prominent features of MDS is the high frequency of unbalanced chromosomal abnormalities that result in genetic imbalances and copy number alterations. Although the chromosomal segments involved in these abnormalities are thought to contain genes relevant to the pathogenesis of MDS, conventional analyses including FISH have failed to identify critical regions small enough to pinpoint their target genes. Affymetrix® GeneChip® 100K/500K mapping arrays were originally developed for large-scale genotyping of more than 100,000/500,000 SNPs in two separate arrays, but the quantitative nature of the preparative whole-genome amplification and subsequent array hybridization also allows for accurate copy number estimation of the genome using these platforms at resolutions of 21.3 kb and 5.4 kb with 116,204 and 520,000 oligonucleotide probes, respectively. Here we developed robust algorithms (CNAG) for copy number detection using 100K and/or 500K arrays and analyzed 88 MDS samples on these platforms in order to identify genes relevant to the development of MDS. With these huge numbers of uniformly distributed SNP probes, numerous copy number alterations were sensitively detected in cases with MDS, with more abnormalities found in advanced disease (RAEB and RAEB-t). In addition to large-scale alterations of various chromosomal segments previously reported in these syndromes, a number of small cryptic chromosomal abnormalities were identified that would escape conventional cytogenetic analysis or array CGH analysis. Minimum overlapping deletions in 5q, 7q, 12p, 13q, and 20q were precisely defined, although no pinpoint homozygous deletions were detected within these regions. A common 20q deletion spans a 400 kb segment harboring five transcriptomes and the common 12p deletion defines a 1.3 Mb region that contains the ETV6 gene. Other common overlapping abnormalities include deletions in 21q22, 17q13, and gains of 11q25. Genome-wide analysis of copy number changes using high-density oligonucleotide arrays provides valuable information about genetic abnormalities in MDS.
