Copy Number Variant Detection with Low-Coverage Whole-Genome Sequencing Represents a Viable Alternative to the Conventional Array-CGH

Diagnostics ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 708
Author(s):  
Marcel Kucharík ◽  
Jaroslav Budiš ◽  
Michaela Hýblová ◽  
Gabriel Minárik ◽  
Tomáš Szemes

Copy number variations (CNVs) represent a type of structural variant involving alterations in the number of copies of specific regions of DNA that can either be deleted or duplicated. CNVs contribute substantially to normal population variability; however, abnormal CNVs cause numerous genetic disorders. At present, several methods for CNV detection are applied, ranging from conventional cytogenetic analysis, through microarray-based methods (aCGH), to next-generation sequencing (NGS). In this paper, we present GenomeScreen, an NGS-based CNV detection method for low-coverage, whole-genome sequencing. We determined the theoretical limits of its accuracy and obtained confirmation in an extensive in silico study and in real patient samples with known genotypes. In theory, at least 6 M uniquely mapped reads are required to detect a CNV with a length of 100 kilobases (kb) or more with high confidence (Z-score > 7). In practice, the in silico analysis required at least 8 M reads to obtain >99% accuracy (for 100 kb deviations). We compared GenomeScreen with one of the aCGH methods currently used in diagnostic laboratories, which has a mean resolution of 200 kb. GenomeScreen and aCGH both detected 59 deviations, while GenomeScreen additionally detected 134 other (usually smaller) variations. When compared to aCGH, the overall performance of the proposed GenomeScreen tool is comparable or superior in terms of accuracy, turn-around time, and cost-effectiveness, thus providing reasonable benefits, particularly in a prenatal diagnosis setting.
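The relationship between read count, CNV length, and Z-score quoted in this abstract can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' derivation. It assumes Poisson-distributed read counts, a genome size of 3.0e9 bp, and a heterozygous deletion or duplication in a pure diploid sample (copy ratio 0.5 or 1.5). The function names and default values are hypothetical.

```python
import math


def expected_z_score(n_reads, cnv_length_bp, genome_size_bp=3.0e9, copy_ratio=0.5):
    """Approximate Z-score of a read-depth deviation under a Poisson model.

    Assumptions (not taken from the paper): reads are spread uniformly over
    the genome, counts in the region are Poisson distributed, and the CNV
    changes the local read count by the factor copy_ratio.
    """
    # Expected number of reads falling in a normal (two-copy) region.
    expected = n_reads * cnv_length_bp / genome_size_bp
    # Absolute change in read count caused by the CNV.
    deviation = abs(1.0 - copy_ratio) * expected
    # Poisson standard deviation equals the square root of the mean.
    return deviation / math.sqrt(expected)


def min_reads_for_z(z_threshold, cnv_length_bp, genome_size_bp=3.0e9, copy_ratio=0.5):
    """Smallest read count for which expected_z_score reaches z_threshold."""
    per_read_factor = (z_threshold / abs(1.0 - copy_ratio)) ** 2
    return math.ceil(per_read_factor * genome_size_bp / cnv_length_bp)


if __name__ == "__main__":
    print(expected_z_score(6e6, 100_000))
    print(min_reads_for_z(7, 100_000))
```

Under these simplifying assumptions, 6 M reads and a 100 kb heterozygous deletion give a Z-score of roughly 7.07, which is in line with the theoretical limit stated in the abstract. Real data carry extra variance (GC bias, mappability), which is one plausible reason the in silico study needed more reads than the theoretical bound.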

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Johannes Smolander ◽  
Sofia Khan ◽  
Kalaimathy Singaravelu ◽  
Leni Kauko ◽  
Riikka J. Lund ◽  
...  

Background: Detection of copy number variations (CNVs) from high-throughput next-generation whole-genome sequencing (WGS) data has become a widely used research method in recent years. However, little is known about the applicability of the developed algorithms to ultra-low-coverage (0.0005–0.8×) data, which is used in various research and clinical applications, such as digital karyotyping and single-cell CNV detection. Results: Here, the performance of six popular read-depth based CNV detection algorithms (BIC-seq2, Canvas, CNVnator, FREEC, HMMcopy, and QDNAseq) was studied using ultra-low-coverage WGS data. Real-world array- and karyotyping kit-based validation was used as a benchmark in the evaluation. Additionally, ultra-low-coverage WGS data was simulated to investigate the ability of the algorithms to identify CNVs in the sex chromosomes and the theoretical minimum coverage at which these tools can accurately function. Our results suggest that while all the methods were able to detect large CNVs, many methods were susceptible to producing false positives when smaller CNVs (< 2 Mbp) were detected. There was also significant variability in their ability to identify CNVs in the sex chromosomes. Overall, BIC-seq2 was found to be the best method in terms of statistical performance. However, its significant drawback was its runtime, by far the slowest among the methods (> 3 h), compared with FREEC (~3 min), which we considered the second-best method. Conclusions: Our comparative analysis demonstrates that CNV detection from ultra-low-coverage WGS data can be a highly accurate method for the detection of large copy number variations when their length is in millions of base pairs. These findings facilitate applications that utilize ultra-low-coverage CNV detection.
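To give a sense of scale for the ultra-low-coverage range quoted above (0.0005–0.8×), coverage can be converted to an approximate read count using the usual relation coverage = reads × read length / genome size. The sketch below assumes a 3.0e9 bp genome and 150 bp single-end reads. Both values are illustrative assumptions, not figures from the abstract, and the function names are hypothetical.

```python
def reads_for_coverage(coverage, read_length_bp=150, genome_size_bp=3.0e9):
    """Approximate number of reads needed to reach a given mean coverage.

    The read length and genome size defaults are illustrative assumptions.
    """
    return round(coverage * genome_size_bp / read_length_bp)


def coverage_from_reads(n_reads, read_length_bp=150, genome_size_bp=3.0e9):
    """Mean coverage obtained from a given number of reads."""
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    for cov in (0.0005, 0.8):
        print(cov, reads_for_coverage(cov))
```

With these assumed values, the lower end of the range corresponds to about ten thousand reads and the upper end to about sixteen million.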


2019 ◽  
Author(s):  
Junhua Rao ◽  
Lihua Peng ◽  
Fang Chen ◽  
Hui Jiang ◽  
Chunyu Geng ◽  
...  

Background: Next-generation sequencing (NGS) has developed rapidly in recent years, making whole-genome sequencing (WGS) a more cost- and time-efficient choice in a wide range of biological research. WGS data are commonly used for variant detection, such as detection of single nucleotide polymorphisms (SNPs), insertions and deletions (indels), and copy number variants (CNVs), which play an important role in many human diseases. However, the feasibility of CNV detection based on WGS on DNBSEQ™ platforms was unclear. We systematically analysed the genome-wide CNV detection power of DNBSEQ™ platforms and Illumina platforms on NA12878 with five commonly used tools. Results: DNBSEQ™ platforms showed a stable ability to detect slightly more CNVs genome-wide (on average 1.24-fold the number detected on Illumina platforms). CNVs called on DNBSEQ™ and Illumina platforms were then evaluated against two public benchmarks of NA12878. DNBSEQ™ and Illumina platforms showed similar sensitivity and precision on both benchmarks. Further, the differences between CNV detection tools were analysed, indicating that the choice of tool could affect CNV detection results, such as count, distribution, sensitivity and precision. Conclusion: The major contribution of this paper is to provide, for the first time, a comprehensive guide for CNV detection based on WGS on DNBSEQ™ platforms.


2020 ◽  
Author(s):  
Marcel Kucharik ◽  
Jaroslav Budis ◽  
Michaela Hyblova ◽  
Gabriel Minarik ◽  
Tomas Szemes

Copy number variations (CNVs) are a type of structural variant involving alterations in the number of copies of specific regions of DNA, which can either be deleted or duplicated. CNVs contribute substantially to normal population variability; however, abnormal CNVs cause numerous genetic disorders. At present, several methods for CNV detection are used, from conventional cytogenetic analysis through microarray-based methods (aCGH) to next-generation sequencing (NGS). We present GenomeScreen, an NGS-based CNV detection method built on a previously described CNV detection algorithm used for non-invasive prenatal testing (NIPT). We determined the theoretical limits of its accuracy and confirmed them with an extensive in silico study and already genotyped samples. Theoretically, at least 6M uniquely mapped reads are required to detect a CNV with a length of 100 kilobases (kb) or more with high confidence (Z-score > 7). In practice, the in silico analysis showed that at least 8M reads are required to obtain >99% accuracy (for 100 kb deviations). We compared GenomeScreen with one of the aCGH methods currently used in diagnostic laboratories, which has a 200 kb mean resolution. GenomeScreen and aCGH both detected 59 deviations; GenomeScreen additionally detected 134 other (usually smaller) variations. Furthermore, the overall cost per sample is about 2-3x lower in the case of GenomeScreen.


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0245488
Author(s):  
Karin Wallander ◽  
Jesper Eisfeldt ◽  
Mats Lindblad ◽  
Daniel Nilsson ◽  
Kenny Billiau ◽  
...  

Background: Analysis of cell-free tumour DNA, a liquid biopsy, is a promising biomarker for cancer. We have performed a proof-of-principle study to test the applicability in the clinical setting, analysing copy number alterations (CNAs) in plasma and tumour tissue from 44 patients with gastro-oesophageal cancer. Methods: DNA was isolated from blood plasma and a tissue sample from each patient. Array-CGH was applied to the tissue DNA. The cell-free plasma DNA was sequenced by low-coverage whole-genome sequencing using a clinical pipeline for non-invasive prenatal testing. Two bioinformatic tools, WISECONDOR and ichorCNA, were used to process the output data and were compared to each other. Results: Cancer-associated CNAs could be seen in 59% (26/44) of the tissue biopsies. In the plasma samples, a targeted approach analysing 61 regions of special interest in gastro-oesophageal cancer detected cancer-associated CNAs with a z-score >5 in 11 patients. Broadening the analysis to a whole-genome view, 17/44 patients (39%) had cancer-associated CNAs using WISECONDOR and 13 (30%) using ichorCNA. Of the 26 patients with tissue-verified cancer-associated CNAs, 14 (54%) had corresponding CNAs in plasma. Potentially clinically actionable amplifications overlapping the genes VEGFA, EGFR and FGFR2 were detected in the plasma from three patients. Conclusions: We conclude that low-coverage whole-genome sequencing without prior knowledge of the tumour alterations could become a useful tool for cell-free tumour DNA analysis of total CNAs in plasma from patients with gastro-oesophageal cancer.


Cancers ◽  
2021 ◽  
Vol 13 (24) ◽  
pp. 6283
Author(s):  
Migle Gabrielaite ◽  
Mathias Husted Torp ◽  
Malthe Sebro Rasmussen ◽  
Sergio Andreu-Sánchez ◽  
Filipe Garrett Vieira ◽  
...  

Copy-number variations (CNVs) have important clinical implications for several diseases and cancers. Relevant CNVs are hard to detect because common structural variations define large parts of the human genome. CNV calling from short-read sequencing would allow single-protocol full genomic profiling. We reviewed 50 popular CNV calling tools and included 11 tools for benchmarking in a reference cohort encompassing 39 whole genome sequencing (WGS) samples paired with the current clinical standard, SNP-array based CNV calling. Additionally, for nine samples we also performed whole exome sequencing (WES) to address the effect of sequencing protocol on CNV calling. Furthermore, we included the Gold Standard reference sample NA12878 and tested 12 samples with CNVs confirmed by multiplex ligation-dependent probe amplification (MLPA). Tool performance varied greatly in the number of called CNVs and bias for CNV lengths. Some tools had near-perfect recall of CNVs from arrays for some samples, but poor precision. Several tools had better performance for NA12878, which could be a result of overfitting. We suggest combining the best tools that are also based on different methodologies: GATK gCNV, Lumpy, DELLY, and cn.MOPS. Reducing the total number of called variants could potentially be assisted by the use of background panels for filtering of frequently called variants.
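The recall and precision figures mentioned in this abstract depend on how a called CNV is matched to a reference CNV. The sketch below shows one common convention, a 50% reciprocal overlap between intervals of the same type on the same chromosome. This matching rule, the tuple layout, and the function names are assumptions for illustration. The abstract does not state which criterion the authors used.

```python
from typing import List, Tuple

# Hypothetical record layout: (chromosome, start, end, type), type "DEL" or "DUP".
Cnv = Tuple[str, int, int, str]


def reciprocal_overlap(a: Cnv, b: Cnv) -> float:
    """Smaller of the two overlap fractions, or 0.0 if the CNVs cannot match."""
    if a[0] != b[0] or a[3] != b[3]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def precision_recall(calls: List[Cnv], truth: List[Cnv],
                     min_overlap: float = 0.5) -> Tuple[float, float]:
    """Precision and recall of calls against a reference set.

    A call counts as a true positive if it reciprocally overlaps any
    reference CNV by at least min_overlap; a reference CNV counts as
    recovered if any call overlaps it in the same way.
    """
    matched_calls = sum(
        1 for c in calls
        if any(reciprocal_overlap(c, t) >= min_overlap for t in truth)
    )
    matched_truth = sum(
        1 for t in truth
        if any(reciprocal_overlap(c, t) >= min_overlap for c in calls)
    )
    precision = matched_calls / len(calls) if calls else 0.0
    recall = matched_truth / len(truth) if truth else 0.0
    return precision, recall
```

A caller that recovers every array CNV but also reports many unmatched calls would score a recall of 1.0 with a low precision under this scheme, which is the pattern the abstract describes for some tools.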


2018 ◽  
Vol 115 (42) ◽  
pp. 10804-10809 ◽  
Author(s):  
Suzanne Rohrback ◽  
Craig April ◽  
Fiona Kaper ◽  
Richard R. Rivera ◽  
Christine S. Liu ◽  
...  

Somatic copy number variations (CNVs) exist in the brain, but their genesis, prevalence, forms, and biological impact remain unclear, even within experimentally tractable animal models. We combined a transposase-based amplification (TbA) methodology for single-cell whole-genome sequencing with a bioinformatic approach for filtering unreliable CNVs (FUnC), developed from machine learning trained on lymphocyte V(D)J recombination. TbA–FUnC offered superior genomic coverage and removed >90% of false-positive CNV calls, allowing extensive examination of submegabase CNVs from over 500 cells throughout the neurogenic period of cerebral cortical development in Mus musculus. Thousands of previously undocumented CNVs were identified. Half were less than 1 Mb in size, with deletions 4× more common than amplification events, and were randomly distributed throughout the genome. However, CNV prevalence during embryonic cortical development was nonrandom, peaking at midneurogenesis with levels triple those found at younger ages before falling to intermediate quantities. These data identify pervasive small and large CNVs as early contributors to neural genomic mosaicism, producing genomically diverse cellular building blocks that form the highly organized, mature brain.

