Canvas: versatile and scalable detection of copy number variants


10.1101/036194 ◽

2016 ◽

Author(s):

Eric Roller ◽

Sergii Ivakhno ◽

Steve Lee ◽

Thomas Royce ◽

Stephen Tanner

Keyword(s):

Copy Number ◽

Large Scale ◽

Copy Number Variants ◽

Variant Calling ◽

Experimental Designs ◽

Genome Wide ◽

Whole Exome ◽

Sequencing Studies ◽

Copy Number Changes ◽

Robust Variant

Motivation: Increased throughput and diverse experimental designs of large-scale sequencing studies necessitate versatile, scalable and robust variant calling tools. In particular, identification of copy number changes remains a challenging task due to their complexity, susceptibility to sequencing biases, variation in coverage data and dependence on genome-wide sample properties, such as tumor polyploidy or polyclonality in cancer samples. Results: We have developed a new tool, Canvas, for identification of copy number changes from diverse sequencing experiments including whole-genome matched tumor-normal and single-sample normal re-sequencing, as well as whole-exome matched and unmatched tumor-normal studies. In addition to variant calling, Canvas infers genome-wide parameters such as cancer ploidy, purity and heterogeneity. It provides fast and simple to execute workflows that can scale to thousands of samples and can be easily incorporated into existing variant calling pipelines. Availability: Canvas is distributed under an open source license and can be downloaded from https://github.com/Illumina/canvas.


tHapMix: simulating tumour samples through haplotype mixtures

10.1101/057414 ◽

2016 ◽

Author(s):

Sergii Ivakhno ◽

Camilla Colombo ◽

Stephen Tanner ◽

Philip Tedder ◽

Stefano Berri ◽

...

Keyword(s):

Copy Number ◽

Large Scale ◽

Variant Calling ◽

Copy Number Variant ◽

Supplementary Information ◽

Genome Diversity ◽

Simulation Framework ◽

Somatic Genome ◽

Copy Number Changes ◽

Sequencing Platforms

Motivation: Large-scale rearrangements and copy number changes combined with different modes of clonal evolution create extensive somatic genome diversity, making it difficult to develop versatile and scalable variant calling tools and create well-calibrated benchmarks. Results: We developed a new simulation framework, tHapMix, that enables the creation of tumour samples with different ploidy, purity and polyclonality features. It easily scales to simulation of hundreds of somatic genomes, while re-use of real read data preserves noise and biases present in sequencing platforms. We further demonstrate tHapMix utility by creating a simulated set of 140 somatic genomes and showing how it can be used in training and testing of somatic copy number variant calling tools. Availability and implementation: tHapMix is distributed under an open source license and can be downloaded from https://github.com/Illumina/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


RefCNV: Identification of Gene-Based Copy Number Variants Using Whole Exome Sequencing

Cancer Informatics ◽

10.4137/cin.s36612 ◽

2016 ◽

Vol 15 ◽

pp. CIN.S36612 ◽

Cited By ~ 3

Author(s):

Lun-Ching Chang ◽

Biswajit Das ◽

Chih-Jian Lih ◽

Han Si ◽

Corinne E. Camalier ◽

...

Keyword(s):

Exome Sequencing ◽

Whole Exome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Estimation Methods ◽

Single Nucleotide Variants ◽

False Positive Error ◽

Reference Set ◽

Genome Wide ◽

Whole Exome

With rapid advances in DNA sequencing technologies, whole exome sequencing (WES) has become a popular approach for detecting somatic mutations in oncology studies. The initial intent of WES was to characterize single nucleotide variants, but it was observed that the number of sequencing reads mapped to a genomic region correlates with the DNA copy number of that region, which makes WES usable for detecting copy number variants (CNVs). We propose a method, RefCNV, that uses a reference set to estimate the distribution of the coverage for each exon. The construction of the reference set includes an evaluation of the sources of variability in the coverage distribution. We observed that the processing steps had an impact on the coverage distribution. For each exon, we compared the observed coverage with the expected normal coverage. Thresholds for determining CNVs were selected to control the false-positive error rate. RefCNV prediction correlated significantly (r = 0.96–0.86) with CNV measured by digital polymerase chain reaction for MET (7q31), EGFR (7p12), or ERBB2 (17q12) in 13 tumor cell lines. The genome-wide CNV analysis showed a good overall correlation (Spearman's coefficient = 0.82) between RefCNV estimation and publicly available CNV data in the Cancer Cell Line Encyclopedia. RefCNV also showed better performance than three other CNV estimation methods in genome-wide CNV analysis.
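
The reference-set idea in this abstract can be illustrated with a minimal sketch. It is not RefCNV's implementation: the abstract does not give the coverage model, so the code assumes that per-exon log2 normalized coverage in the reference samples is roughly normal, and the function name, arguments and default alpha are hypothetical. It compares one exon's observed coverage with the reference distribution and picks the cutoff from a chosen two-sided false-positive rate.

```python
from statistics import NormalDist, mean, stdev


def call_exon_cnv(reference_log_cov, sample_log_cov, alpha=0.01):
    """Classify one exon as gain, loss or neutral (illustrative only).

    reference_log_cov: log2 normalized coverage of this exon in each
        reference (copy-neutral) sample; at least two values needed.
    sample_log_cov: log2 normalized coverage of the exon in the test sample.
    alpha: two-sided false-positive rate under the assumed normal model.
    """
    mu = mean(reference_log_cov)       # expected normal coverage
    sigma = stdev(reference_log_cov)   # exon-specific variability
    z = (sample_log_cov - mu) / sigma  # deviation from the reference set

    # Cutoff chosen so that a copy-neutral exon exceeds it with
    # probability alpha (assumption: standard normal z-scores).
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)

    if z > cutoff:
        return "gain", z
    if z < -cutoff:
        return "loss", z
    return "neutral", z


if __name__ == "__main__":
    reference = [0.0, 0.1, -0.1, 0.05, -0.05]
    for observed in (1.0, 0.02, -1.0):
        print(observed, call_exon_cnv(reference, observed))
```

Estimating a separate mean and spread per exon is what lets noisy exons tolerate larger deviations than well-behaved ones before a call is made.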


P3-081: POST-VARIANT CALLING QUALITY CONTROL (QC) PIPELINE AND MULTI-PIPELINE GENOTYPE CONSENSUS CALLER FOR LARGE-SCALE WHOLE GENOME AND WHOLE EXOME SEQUENCING STUDIES

Alzheimer's & Dementia ◽

10.1016/j.jalz.2018.06.1437 ◽

2018 ◽

Vol 14 (7S_Part_20) ◽

pp. P1096-P1097

Author(s):

John Stephen Malamon ◽

Adam C. Naj

Keyword(s):

Quality Control ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Large Scale ◽

Variant Calling ◽

Whole Genome ◽

Whole Exome ◽

Sequencing Studies


POST-VARIANT CALLING QUALITY CONTROL (QC) PIPELINE AND MULTI-PIPELINE GENOTYPE CONSENSUS CALLER FOR LARGE-SCALE WHOLE GENOME AND WHOLE EXOME SEQUENCING STUDIES

Alzheimer's & Dementia ◽

10.1016/j.jalz.2017.06.1275 ◽

2017 ◽

Vol 13 (7) ◽

pp. P956-P957

Author(s):

John Stephen Malamon ◽

Adam C. Naj

Keyword(s):

Quality Control ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Large Scale ◽

Variant Calling ◽

Whole Genome ◽

Whole Exome ◽

Sequencing Studies


Novel genomic findings in multiple myeloma identified through routine diagnostic sequencing

Journal of Clinical Pathology ◽

10.1136/jclinpath-2018-205195 ◽

2018 ◽

Vol 71 (10) ◽

pp. 895-899 ◽

Cited By ~ 13

Author(s):

Georgina L Ryland ◽

Kate Jones ◽

Melody Chin ◽

John Markham ◽

Elle Aydogan ◽

...

Keyword(s):

Multiple Myeloma ◽

Copy Number ◽

Large Scale ◽

Haematological Malignancy ◽

Bioinformatics Pipeline ◽

Genome Wide ◽

Risk Patients ◽

Copy Number Changes ◽

Therapeutic Decision Making ◽

Generation Sequencing

Aims: Multiple myeloma is a genomically complex haematological malignancy with many genomic alterations recognised as important in diagnosis, prognosis and therapeutic decision making. Here, we provide a summary of genomic findings identified through routine diagnostic next-generation sequencing at our centre. Methods: A cohort of 86 patients with multiple myeloma underwent diagnostic sequencing using a custom hybridisation-based panel targeting 104 genes. Sequence variants, genome-wide copy number changes and structural rearrangements were detected using an in-house-developed bioinformatics pipeline. Results: At least one mutation was found in 69 (80%) patients. Frequently mutated genes included TP53 (36%), KRAS (22.1%), NRAS (15.1%), FAM46C/DIS3 (8.1%) and TET2/FGFR3 (5.8%), including multiple mutations not previously described in myeloma. Importantly, we observed TP53 mutations in the absence of a 17p deletion in 8% of the cohort, highlighting the need for sequencing-based assessment in addition to cytogenetics to identify these high-risk patients. Multiple novel copy number changes and immunoglobulin heavy chain translocations are also discussed. Conclusions: Our results demonstrate that many clinically relevant genomic findings remain in multiple myeloma which have not yet been identified through large-scale sequencing efforts, and provide important mechanistic insights into plasma cell pathobiology.


0306 Exploring the feasibility of using copy number variants as genetic markers through large-scale whole genome sequencing experiments

Journal of Animal Science ◽

10.2527/jam2016-0306 ◽

2016 ◽

Vol 94 (suppl_5) ◽

pp. 146-146

Author(s):

D. M. Bickhart ◽

L. Xu ◽

J. L. Hutchison ◽

J. B. Cole ◽

D. J. Null ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genetic Markers ◽

Genome Sequencing ◽

Copy Number ◽

Large Scale ◽

Copy Number Variants ◽

Whole Genome


Assessing the Role of Copy Number Variants in Prostate Cancer Risk and Progression using a Novel Genome-Wide Screening Method

10.21236/ada568305 ◽

2012 ◽

Author(s):

Donna Lehman ◽

August Blackburn ◽

Robin Leach

Keyword(s):

Prostate Cancer ◽

Cancer Risk ◽

Copy Number ◽

Screening Method ◽

Copy Number Variants ◽

Prostate Cancer Risk ◽

Genome Wide


Genome-Wide Association and Whole Exome Sequencing Studies reveal a Novel Candidate Locus for Restless Legs Syndrome

European Journal of Medical Genetics ◽

10.1016/j.ejmg.2021.104186 ◽

2021 ◽

pp. 104186

Author(s):

Ufuk Ergun ◽

Bahar Say ◽

Sezen Guntekin Ergun ◽

Ferda Emriye Percin ◽

Levent Inan ◽

...

Keyword(s):

Exome Sequencing ◽

Whole Exome Sequencing ◽

Restless Legs Syndrome ◽

Genome Wide Association ◽

Candidate Locus ◽

Restless Legs ◽

Genome Wide ◽

Whole Exome ◽

Sequencing Studies


GWASpro: a high-performance genome-wide association analysis server

Bioinformatics ◽

10.1093/bioinformatics/bty989 ◽

2018 ◽

Vol 35 (14) ◽

pp. 2512-2514 ◽

Cited By ~ 4

Author(s):

Bongsong Kim ◽

Xinbin Dai ◽

Wenchao Zhang ◽

Zhaohong Zhuang ◽

Darlene L Sanchez ◽

...

Keyword(s):

High Performance ◽

Large Scale ◽

Linear Mixed Model ◽

Association Studies ◽

Learning Curves ◽

Experimental Designs ◽

Genome Wide Association ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Genome Wide

Summary: We present GWASpro, a high-performance web server for the analyses of large-scale genome-wide association studies (GWAS). GWASpro was developed to provide data analyses for large-scale molecular genetic data, coupled with complex replicated experimental designs such as found in plant science investigations and to overcome the steep learning curves of existing GWAS software tools. GWASpro supports building complex design matrices, by which complex experimental designs that may include replications, treatments, locations and times, can be accounted for in the linear mixed model. GWASpro is optimized to handle GWAS data that may consist of up to 10 million markers and 10 000 samples from replicable lines or hybrids. GWASpro provides an interface that significantly reduces the learning curve for new GWAS investigators. Availability and implementation: GWASpro is freely available at https://bioinfo.noble.org/GWASPRO. Supplementary information: Supplementary data are available at Bioinformatics online.


Detection and characterization of copy number variants based on whole-genome sequencing by DNBSEQ platforms

10.1101/786962 ◽

2019 ◽

Author(s):

Junhua Rao ◽

Lihua Peng ◽

Fang Chen ◽

Hui Jiang ◽

Chunyu Geng ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Copy Number Variant ◽

Whole Genome ◽

Genome Wide ◽

Wide Range ◽

Distribution Sensitivity ◽

Cnv Detection

Background: Next-generation sequencing (NGS) has developed rapidly in recent years, making whole-genome sequencing (WGS) a more cost- and time-efficient choice for a wide range of biological research. WGS data are commonly used to detect variants such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels) and copy number variants (CNVs), which play an important role in many human diseases. However, the feasibility of CNV detection based on WGS by DNBSEQ™ platforms was unclear. We systematically analysed the genome-wide CNV detection power of DNBSEQ™ platforms and Illumina platforms on NA12878 with five commonly used tools. Results: DNBSEQ™ platforms consistently detected slightly more CNVs genome-wide (on average 1.24-fold the number detected by Illumina platforms). CNVs called from DNBSEQ™ and Illumina data were then evaluated against two public benchmarks of NA12878. DNBSEQ™ and Illumina platforms showed similar sensitivity and precision on both benchmarks. Further, differences between CNV detection tools were analysed, indicating that the choice of tool could affect CNV detection results such as count, distribution, sensitivity and precision. Conclusion: The major contribution of this paper is to provide, for the first time, a comprehensive guide to CNV detection based on WGS by DNBSEQ™ platforms.
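
As an illustration of how sensitivity and precision against a benchmark might be computed, here is a minimal sketch. The abstract does not state the matching criterion, so the 50% reciprocal-overlap rule, the half-open interval convention and all function names are assumptions made for this example, not the authors' procedure.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumption here.
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval, min_frac: float = 0.5) -> bool:
    """True if a and b overlap by at least min_frac of each one's length."""
    if a[0] != b[0]:
        return False
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return False
    return (overlap >= min_frac * (a[2] - a[1])
            and overlap >= min_frac * (b[2] - b[1]))


def sensitivity_precision(calls: List[Interval], truth: List[Interval],
                          min_frac: float = 0.5) -> Tuple[float, float]:
    """Sensitivity = matched benchmark CNVs / all benchmark CNVs;
    precision = matched calls / all calls."""
    truth_hit = sum(
        any(reciprocal_overlap(t, c, min_frac) for c in calls) for t in truth
    )
    calls_hit = sum(
        any(reciprocal_overlap(c, t, min_frac) for t in truth) for c in calls
    )
    sensitivity = truth_hit / len(truth) if truth else 0.0
    precision = calls_hit / len(calls) if calls else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    benchmark = [("chr1", 100, 200), ("chr1", 500, 600)]
    called = [("chr1", 110, 210), ("chr2", 100, 200),
              ("chr1", 1000, 1100), ("chr1", 590, 700)]
    print(sensitivity_precision(called, benchmark))
```

The pairwise comparison is quadratic in the number of intervals, which is acceptable for a sketch; a real evaluation over genome-wide call sets would more likely use an interval index or sorted sweep.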
