Long-read genome sequencing identifies causal structural variation in a Mendelian disease

2017 ◽  
Vol 20 (1) ◽  
pp. 159-163 ◽  
Author(s):  
Jason D Merker ◽  
Aaron M Wenger ◽  
Tam Sneddon ◽  
Megan Grove ◽  
Zachary Zappala ◽  
...  
Author(s):  
Xuefang Zhao ◽  
Ryan L. Collins ◽  
Wan-Ping Lee ◽  
Alexandra M. Weber ◽  
Yukyung Jun ◽  
...  

Abstract: Virtually all genome sequencing efforts in national biobanks, complex and Mendelian disease programs, and emerging clinical diagnostic approaches utilize short-read sequencing (srWGS), which presents constraints for genome-wide discovery of structural variants (SVs). Alternative long-read single molecule technologies (lrWGS) offer significant advantages for genome assembly and SV detection, but these technologies are currently cost prohibitive for large-scale disease studies and clinical diagnostics (∼5-12X higher cost than comparable coverage srWGS). Moreover, only dozens of such genomes are currently publicly accessible, compared with millions of srWGS genomes that have been commissioned for international initiatives. Given this ubiquitous reliance on srWGS in human genetics and genomics, we sought to characterize and quantify the properties of SVs accessible to both srWGS and lrWGS to establish benchmarks and expectations in ongoing medical and population genetic studies, and to project the added value of SVs uniquely accessible to each technology. In analyses of three trios with matched srWGS and lrWGS from the Human Genome Structural Variation Consortium (HGSVC), srWGS captured ∼11,000 SVs per genome using reference-based algorithms, while haplotype-resolved assembly from lrWGS identified ∼25,000 SVs per genome. Detection power and precision for SV discovery varied dramatically by genomic context and variant class: 9.7% of the current GRCh38 reference is defined by segmental duplications (SD) and simple repeats (SR), yet 91.4% of deletions that were specifically discovered by lrWGS localized to these regions. Across the remaining 90.3% of the human reference, we observed extremely high concordance (93.8%) for deletions discovered by srWGS and lrWGS after error correction using the raw lrWGS reads. Conversely, lrWGS was superior for detection of insertions across all genomic contexts.
Given that the non-SD/SR sequences span 90.3% of the GRCh38 reference, and encompass 95.9% of coding exons in currently annotated disease associated genes, improved sensitivity from lrWGS to discover novel and interpretable pathogenic deletions not already accessible to srWGS is likely to be incremental. However, these analyses highlight the added value of assembly-based lrWGS to create new catalogues of functional insertions and transposable elements, as well as disease associated repeat expansions in genomic regions previously recalcitrant to routine assessment.
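The localization figures quoted in this abstract can be turned into a rough enrichment estimate. The following is a minimal sketch that uses only the two percentages stated above (9.7% of GRCh38 in SD/SR regions, 91.4% of lrWGS-specific deletions in those regions). The function names are hypothetical, and the calculation assumes a simple null in which deletions would otherwise fall uniformly along the reference; it is not a reproduction of the authors' analysis.

```python
# Figures quoted in the abstract above.
SD_SR_GENOME_FRACTION = 0.097    # share of GRCh38 defined by SD and SR
SD_SR_DELETION_FRACTION = 0.914  # share of lrWGS-specific deletions in SD/SR


def fold_enrichment(event_fraction: float, genome_fraction: float) -> float:
    """Observed share of events divided by the share expected under a uniform null."""
    return event_fraction / genome_fraction


def density_ratio(event_fraction: float, genome_fraction: float) -> float:
    """Per-base event density inside the regions relative to outside them."""
    inside = event_fraction / genome_fraction
    outside = (1.0 - event_fraction) / (1.0 - genome_fraction)
    return inside / outside


fold = fold_enrichment(SD_SR_DELETION_FRACTION, SD_SR_GENOME_FRACTION)
ratio = density_ratio(SD_SR_DELETION_FRACTION, SD_SR_GENOME_FRACTION)

print(f"Fold enrichment in SD/SR regions: {fold:.1f}x")
print(f"Per-base density, SD/SR vs. rest of reference: {ratio:.0f}x")
```

Under that assumption, lrWGS-specific deletions are roughly nine times more concentrated in SD/SR sequence than a uniform distribution would predict, which is consistent with the abstract's conclusion that the added deletion sensitivity of lrWGS lies mostly in repetitive regions.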


2019 ◽  
Vol 64 (5) ◽  
pp. 359-368 ◽  
Author(s):  
Takeshi Mizuguchi ◽  
Takeshi Suzuki ◽  
Chihiro Abe ◽  
Ayako Umemura ◽  
Katsushi Tokunaga ◽  
...  

2018 ◽  
Author(s):  
Alba Sanchis-Juan ◽  
Jonathan Stephens ◽  
Courtney E French ◽  
Nicholas Gleadall ◽  
Karyn Mégy ◽  
...  

Abstract: Complex structural variants (cxSVs) are genomic rearrangements comprising multiple structural variants, typically involving three or more breakpoint junctions. They contribute to human genomic variation and can cause Mendelian disease; however, they are not typically considered during genetic testing. Here, we investigate the role of cxSVs in Mendelian disease using short-read whole genome sequencing (WGS) data from 1,324 individuals with neurodevelopmental or retinal disorders from the NIHR BioResource project. We present four cases of individuals with a cxSV affecting Mendelian disease-associated genes. Three of the cxSVs are pathogenic: a de novo duplication-inversion-inversion-deletion affecting ARID1B in an individual with Coffin-Siris syndrome, a deletion-inversion-duplication affecting HNRNPU in an individual with intellectual disability and seizures, and a homozygous deletion-inversion-deletion affecting CEP78 in an individual with cone-rod dystrophy. Additionally, we identified a de novo duplication-inversion-duplication overlapping CDKL5 in an individual with neonatal hypoxic-ischaemic encephalopathy. Long-read sequencing technology used to resolve the breakpoints demonstrated the presence of both a disrupted and an intact copy of CDKL5 on the same allele; therefore, it was classified as a variant of uncertain significance. Analysis of sequence flanking all breakpoint junctions in all the cxSVs revealed both microhomology and longer repetitive sequences, suggesting both replication-based and homology-based processes. Accurate resolution of cxSVs is essential for clinical interpretation, and here we demonstrate that long-read WGS is a powerful technology by which to achieve this. Our results show cxSVs are an important, although rare, cause of Mendelian disease, and we therefore recommend their consideration during research and clinical investigations.


2016 ◽  
Author(s):  
Jason D. Merker ◽  
Aaron M. Wenger ◽  
Tam Sneddon ◽  
Megan Grove ◽  
Daryl Waggott ◽  
...  

Abstract: Current clinical genomics assays primarily utilize short-read sequencing (SRS), which offers high throughput, high base accuracy, and low cost per base. SRS has, however, limited ability to evaluate tandem repeats, regions with high [GC] or [AT] content, highly polymorphic regions, highly paralogous regions, and large-scale structural variants. Long-read sequencing (LRS) has complementary strengths and offers a means to discover overlooked genetic variation in patients undiagnosed by SRS. To evaluate LRS, we selected a patient who presented with multiple neoplasia and cardiac myxomata suggestive of Carney complex for whom targeted clinical gene testing and whole genome SRS were negative. Low coverage whole genome LRS was performed on the PacBio Sequel system and structural variants were called, yielding 6,971 deletions and 6,821 insertions >50 bp. Filtering for variants that are absent in an unrelated control and that overlap a coding exon of a disease gene identified three deletions and three insertions. One of these, a heterozygous 2,184 bp deletion, overlaps the first coding exon of PRKAR1A, which is implicated in autosomal dominant Carney complex. This variant was confirmed by Sanger sequencing and was classified as pathogenic using standard criteria for the interpretation of sequence variants. This first successful application of whole genome LRS to identify a pathogenic variant suggests that LRS has significant potential to identify disease-causing structural variation. We recommend larger studies to evaluate the diagnostic yield of LRS, and the development of a comprehensive catalog of common human structural variation to support future studies.
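The filtering step described in this abstract (keep variants absent from an unrelated control that overlap a coding exon of a disease gene) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `SV` and `Exon` types, the function names, the toy coordinates, and the gene labels are all hypothetical, and the 50% reciprocal-overlap rule used to decide that a call is "present in the control" is an assumption that the abstract does not specify. Insertions are represented here as a 1 bp span at the insertion site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    svtype: str  # "DEL" or "INS"


@dataclass(frozen=True)
class Exon:
    chrom: str
    start: int
    end: int
    gene: str


def overlaps(sv: SV, exon: Exon) -> bool:
    """True if the half-open intervals share at least one base."""
    return sv.chrom == exon.chrom and sv.start < exon.end and exon.start < sv.end


def matches_control(sv: SV, control_calls: list, min_reciprocal: float = 0.5) -> bool:
    """Assumed rule: same type and at least 50% reciprocal overlap with a control call."""
    for c in control_calls:
        if c.chrom != sv.chrom or c.svtype != sv.svtype:
            continue
        shared = min(sv.end, c.end) - max(sv.start, c.start)
        if shared <= 0:
            continue
        if (shared >= min_reciprocal * (sv.end - sv.start)
                and shared >= min_reciprocal * (c.end - c.start)):
            return True
    return False


def candidate_svs(patient_calls: list, control_calls: list, disease_exons: list) -> list:
    """Keep patient SVs absent from the control that hit a disease-gene coding exon."""
    hits = []
    for sv in patient_calls:
        if matches_control(sv, control_calls):
            continue
        genes = sorted({e.gene for e in disease_exons if overlaps(sv, e)})
        if genes:
            hits.append((sv, genes))
    return hits


# Toy data only; coordinates and gene labels are invented for illustration.
patient = [
    SV("chr1", 1000, 3184, "DEL"),    # overlaps a GENE_A exon, not in control
    SV("chr1", 50000, 50100, "DEL"),  # overlaps a GENE_B exon, but also in control
    SV("chr2", 500, 501, "INS"),      # not in control, but no exon overlap
]
control = [SV("chr1", 50000, 50100, "DEL")]
exons = [
    Exon("chr1", 2000, 2200, "GENE_A"),
    Exon("chr1", 50050, 50080, "GENE_B"),
]

hits = candidate_svs(patient, control, exons)
for sv, genes in hits:
    print(sv.svtype, sv.chrom, sv.start, sv.end, ",".join(genes))
```

In practice the call sets would come from an SV caller's VCF output and the exons from a gene annotation restricted to a disease-gene list; an interval index would replace the nested loops for genome-scale inputs.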


2021 ◽  
Vol 2 (2) ◽  
pp. 100023
Author(s):  
Susan M. Hiatt ◽  
James M.J. Lawlor ◽  
Lori H. Handley ◽  
Ryne C. Ramaker ◽  
Brianne B. Rogers ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hannah E. Roberts ◽  
Maria Lopopolo ◽  
Alistair T. Pagnamenta ◽  
Eshita Sharma ◽  
Duncan Parkes ◽  
...  

Abstract: Recent advances in throughput and accuracy mean that the Oxford Nanopore Technologies PromethION platform is now a viable solution for genome sequencing. Much of the validation of bioinformatic tools for this long-read data has focussed on calling germline variants (including structural variants). Somatic variants are outnumbered many-fold by germline variants and their detection is further complicated by the effects of tumour purity/subclonality. Here, we evaluate the extent to which Nanopore sequencing enables detection and analysis of somatic variation. We do this through sequencing tumour and germline genomes for a patient with diffuse B-cell lymphoma and comparing results with 150 bp short-read sequencing of the same samples. Calling germline single nucleotide variants (SNVs) from specific chromosomes of the long-read data achieved good specificity and sensitivity. However, results of somatic SNV calling highlight the need for the development of specialised joint calling algorithms. We find the comparative genome-wide performance of different tools varies significantly between structural variant types, and suggest long reads are especially advantageous for calling large somatic deletions and duplications. Finally, we highlight the utility of long reads for phasing clinically relevant variants, confirming that a somatic 1.6 Mb deletion and a p.(Arg249Met) mutation involving TP53 are oriented in trans.


2018 ◽  
Vol 64 (3) ◽  
pp. 191-197 ◽  
Author(s):  
Takeshi Mizuguchi ◽  
Tomoko Toyota ◽  
Hiroaki Adachi ◽  
Noriko Miyake ◽  
Naomichi Matsumoto ◽  
...  
