VCF2CNA: A tool for efficiently detecting copy-number alterations in VCF genotype data

2017 ◽  
Author(s):  
Daniel K. Putnam ◽  
Ma Xiaotu ◽  
Stephen V. Rice ◽  
Yu Liu ◽  
Jinghui Zhang ◽  
...  

Abstract VCF2CNA is a web-interface tool for copy-number alteration (CNA) analysis of VCF and other variant file formats. We applied it to 46 adult glioblastoma and 146 pediatric neuroblastoma samples sequenced on the Illumina and Complete Genomics (CGI) platforms, respectively. In high-quality glioblastoma samples, VCF2CNA was highly consistent with a state-of-the-art algorithm that uses raw sequencing data (mean F1-score = 0.994), and it was robust to uneven coverage introduced by library artifacts. In the neuroblastoma set, VCF2CNA identified MYCN high-level amplifications in 31 of 32 clinically validated samples, compared with 15 found by CGI’s HMM-based CNA model. The findings suggest that VCF2CNA is an accurate, efficient and platform-independent tool for CNA analyses that does not require access to raw sequence data.
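For readers unfamiliar with the metric, the F1-score quoted above is the harmonic mean of precision and recall. The sketch below is illustrative only: the function name and the counts are hypothetical, and it does not reproduce how the VCF2CNA authors matched their calls against the reference algorithm.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall for a set of calls.

    The counts are assumed to come from comparing one caller's CNA
    calls against a reference call set (hypothetical inputs).
    """
    if true_pos == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 90 concordant calls, 10 extra calls, 10 missed calls.
example_f1 = f1_score(90, 10, 10)
```

A value close to 1 means that the two call sets agree in both directions: few extra calls and few missed calls.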

2019 ◽  
Vol 35 (19) ◽  
pp. 3824-3825 ◽  
Author(s):  
He Zhang ◽  
Xiaowei Zhan ◽  
James Brugarolas ◽  
Yang Xie

Abstract Motivation: Detection of somatic copy number alterations (SCNAs) using high-throughput sequencing has become popular because of rapid developments in sequencing technology. Existing methods do not perform well when calling SCNAs in unstable tumor genomes. Results: We developed a new method, DEFOR, to detect SCNAs in tumor samples from exome-sequencing data. The evaluation showed that DEFOR detects SCNAs from exome sequencing more accurately than five existing tools. This advantage is especially apparent in unstable tumor genomes with a large proportion of SCNAs. Availability and implementation: DEFOR is available at https://github.com/drzh/defor. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Xinping Fan ◽  
Guanghao Luo ◽  
Yu S. Huang

Abstract Background: Copy number alterations (CNAs), due to their large impact on the genome, are an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from the shallow-sequencing data of a low-purity tumor sample remains a challenging task. Results: We introduce Accucopy, a method to infer total copy numbers (TCNs) and allele-specific copy numbers (ASCNs) from challenging low-purity and low-coverage tumor samples. Accucopy adopts robust statistical techniques, such as kernel smoothing of coverage differentiation information, to discern signals from noise, and it combines ideas from time-series analysis and signal processing to derive a range of estimates for the period in a histogram of coverage differentiation information. Statistical learning models such as the tiered Gaussian mixture model, the expectation–maximization algorithm, and sparse Bayesian learning were customized and built into the model. Accucopy is implemented in C++/Rust, packaged in a Docker image, and supports non-human samples; more information is available at http://www.yfish.org/software/. Conclusions: We describe Accucopy, a method that can predict both TCNs and ASCNs from low-coverage, low-purity tumor sequencing data. Through comparative analyses of both simulated and real sequencing samples, we demonstrate that Accucopy is more accurate than Sclust, ABSOLUTE, and Sequenza.


2020 ◽  
Author(s):  
Andrew J. Page ◽  
Nabil-Fareed Alikhan ◽  
Michael Strinden ◽  
Thanh Le Viet ◽  
Timofey Skvortsov

Abstract Spoligotyping of Mycobacterium tuberculosis provides a subspecies classification of this major human pathogen. Spoligotypes can be predicted from short-read genome sequencing data; however, no methods exist for long-read sequence data such as those from Nanopore or PacBio. We present a novel software package, Galru, which can rapidly detect the spoligotype of a Mycobacterium tuberculosis sample from as little as a single uncorrected long read. It allows near real-time spoligotyping of long-read data while sequencing is in progress, giving rapid sample typing. We compare it with the existing state-of-the-art software and find that its results are identical to those obtained from short-read sequencing data. Galru is freely available from https://github.com/quadram-institute-bioscience/galru under the GPLv3 open source licence.


Biomedicines ◽  
2020 ◽  
Vol 8 (12) ◽  
pp. 574
Author(s):  
Ege Ülgen ◽  
Sıla Karacan ◽  
Umut Gerlevik ◽  
Özge Can ◽  
Kaya Bilguvar ◽  
...  

Little is known about the mutational processes that shape the genetic landscape of gliomas. Numerous mutational processes leave marks on the genome in the form of mutations, copy number alterations, rearrangements or their combinations. To explore gliomagenesis, we hypothesized that gliomas with different underlying oncogenic mechanisms would differ in the burden of these various forms of genomic alteration. We analyzed adult diffuse gliomas, excluding IDH-mutant gliomas and H3-K27M diffuse midline gliomas, to search for the possible presence of new entities among the very heterogeneous group of IDH-WT glioblastomas. The cohort was divided into two molecular subsets: (1) molecularly defined GBM (mGBM), comprising tumors that carried molecular features of glioblastoma (including TERT promoter mutations, the 7/10 pattern, or EGFR amplification), and (2) those that did not (“Others”). Whole exome sequencing was performed for 37 primary tumors and matched blood samples as well as 8 recurrences. Single nucleotide variations (SNVs), short insertions or deletions (indels) and copy number alterations (CNAs) were quantified using 5 quantitative metrics (SNV burden, indel burden, copy number alteration frequency [wGII], chromosomal arm event ratio [CAER], and copy number amplitude) as well as 4 parameters that explored underlying oncogenic mechanisms (chromothripsis, double minutes, microsatellite instability and mutational signatures). Findings were validated in the TCGA pan-glioma cohort. mGBM and “Others” differed significantly in their SNV (only in the TCGA cohort) and CNA metrics but not in indel burden. SNV burden increased with increasing age at diagnosis and at recurrence and was driven by mismatch repair deficiency. In contrast, indel and CNA metrics remained stable with increasing age at diagnosis and at recurrence. Copy number alteration frequency (wGII) correlated significantly with chromothripsis, while CAER and CN amplitude correlated significantly with the presence of double minutes, suggesting separate underlying mechanisms for different forms of CNA.


PLoS ONE ◽  
2012 ◽  
Vol 7 (12) ◽  
pp. e51422 ◽  
Author(s):  
Rafael Valdés-Mas ◽  
Silvia Bea ◽  
Diana A. Puente ◽  
Carlos López-Otín ◽  
Xose S. Puente

2017 ◽  
Vol 35 (6_suppl) ◽  
pp. 296-296 ◽  
Author(s):  
Daniel H. Hovelson ◽  
Lorena Lazo De La Vega ◽  
Andrew McDaniel ◽  
Aaron Udager ◽  
Rohit Mehra ◽  
...  

Background: Expression-based molecular subtypes thought to be intrinsic in bladder cancer have been widely reported and carry important potential implications for clinical treatment. Histologically, bladder cancers are also heterogeneous diseases, with a large portion of urothelial carcinomas exhibiting divergent differentiation. Previous subtyping efforts have been carried out using predominantly fresh frozen tissue samples, potentially obscuring this known differentiation heterogeneity. Methods: Here we performed targeted multiplexed, amplicon-based DNA and RNA sequencing on 100 formalin-fixed paraffin-embedded (FFPE) bladder cancer samples (including 12 paired urothelial / squamous lesions). High-confidence somatic point mutations, short insertions/deletions (indels), and copy number alterations were detected using the DNA component of the Oncomine Comprehensive Assay (OCP). Targeted RNA sequencing was carried out using a custom Ampliseq panel comprising 8 housekeeping genes and 103 target genes assessing major transcriptional programs as identified from publicly available data. Results: By DNA analysis, we observe frequent TP53 (35%) and activating hotspot PIK3CA (23%) somatic mutations across the cohort, as well as targetable high-level (log2 copy number ratio ≥ 1.5) focal amplifications of ERBB2 (3%) or EGFR (3%) in a subset of samples. We report a novel approach for detecting sub-gene copy-number alterations, and confirm several detectable multi-exon losses using whole transcriptome RNA sequencing. Pairing targeted RNA expression analysis with DNA-based alterations, we show high-level expression of EGFR and ERBB2 in focally amplified samples. Most importantly, we show that, despite identical prioritized somatic genomic alterations, expression-based profiles diverge in 3 of 12 (25%) paired urothelial and squamous samples.
Conclusions: Taken together, these results highlight the importance of molecular heterogeneity in bladder cancer and suggest important considerations for using existing expression-based clustering approaches to guide clinical treatment decisions.
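The abstract above calls an amplification high-level when the log2 copy number ratio is at least 1.5. The following is a minimal sketch of that arithmetic, assuming a diploid (two-copy) reference and ignoring tumor purity. The function names are hypothetical, and the assay's actual ratio is derived from amplicon coverage rather than from known copy numbers.

```python
import math


def log2_copy_ratio(tumor_copies: float, reference_copies: float = 2.0) -> float:
    """log2 of tumor copy number over reference copy number.

    Assumes a diploid reference by default (an illustrative assumption).
    """
    return math.log2(tumor_copies / reference_copies)


def is_high_level_amplification(log2_ratio: float, threshold: float = 1.5) -> bool:
    """Apply the 'log2 ratio >= 1.5' cutoff quoted in the abstract."""
    return log2_ratio >= threshold


# Hypothetical examples: 8 copies versus 4 copies of a gene in a pure tumor.
eight_copies = is_high_level_amplification(log2_copy_ratio(8))  # log2(8/2) = 2.0
four_copies = is_high_level_amplification(log2_copy_ratio(4))   # log2(4/2) = 1.0
```

Under these assumptions, a log2 ratio of 1.5 corresponds to roughly 5.7 copies, so a gene present at 8 copies would pass the cutoff and one at 4 copies would not.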


2015 ◽  
Vol 31 (16) ◽  
pp. 2713-2720 ◽  
Author(s):  
Arief Gusnanto ◽  
Peter Tcherveniakov ◽  
Farag Shuweihdi ◽  
Manar Samman ◽  
Pamela Rabbitts ◽  
...  
