Genome sequencing identifies rare tandem repeat expansions and copy number variants in Lennox–Gastaut syndrome

Farah Qaiser; Tara Sadoway; Yue Yin; Quratulain Zulfiqar Ali; Charlotte M Nguyen; Natalie Shum; Ian Backstrom; Paula T Marques; Sepideh Tabarestani; Renato P Munhoz; Timo Krings; Christopher E Pearson; Ryan K C Yuen; Danielle M Andrade

doi:10.1093/braincomms/fcab207

Genome sequencing identifies rare tandem repeat expansions and copy number variants in Lennox–Gastaut syndrome

Brain Communications ◽

10.1093/braincomms/fcab207 ◽

2021 ◽

Vol 3 (3) ◽

Author(s):

Farah Qaiser ◽

Tara Sadoway ◽

Yue Yin ◽

Quratulain Zulfiqar Ali ◽

Charlotte M Nguyen ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Tandem Repeat ◽

Copy Number ◽

Copy Number Variants ◽

Spinocerebellar Ataxia Type ◽

Whole Genome ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Repeat Expansions

Abstract Epilepsies are a group of common neurological disorders with a substantial genetic basis. Despite this, the molecular diagnosis of epilepsies remains challenging due to their heterogeneity. Studies utilizing whole-genome sequencing may provide additional insights into genetic causes of epilepsies of unknown aetiology. Whole-genome sequencing was used to evaluate a cohort of adults with unexplained developmental and epileptic encephalopathies (n = 30), for whom prior genetic tests, including whole-exome sequencing in some cases, were negative or inconclusive. Rare single nucleotide variants, insertions/deletions, copy number variants and tandem repeat expansions were analysed. Seven pathogenic or likely pathogenic single nucleotide variants, and two pathogenic deleterious copy number variants were identified in nine patients (32.1% of the cohort). One of the copy number variants, identified in a patient with Lennox–Gastaut syndrome, was too small to be detected by chromosomal microarray techniques. We also identified two tandem repeat expansions with clinical implications in two other patients with Lennox–Gastaut syndrome: a CGG repeat expansion in the 5′ untranslated region of DIP2B, and a CTG expansion in ATXN8OS (previously implicated in spinocerebellar ataxia type 8). Three patients had KCNA2 pathogenic variants. One of them died of sudden unexpected death in epilepsy. The other two patients had, in addition to a KCNA2 variant, a second de novo variant impacting potential epilepsy-relevant genes (KCNIP4 and UBR5). Overall, whole-genome sequencing provided a genetic explanation in 32.1% of the total cohort. This is also the first report of coding and non-coding tandem repeat expansions identified in patients with Lennox–Gastaut syndrome. This study demonstrates that using whole-genome sequencing, the examination of multiple types of rare genetic variation, including those found in the non-coding region of the genome, can help resolve unexplained epilepsies.

Performance of copy number variants detection based on whole-genome sequencing by DNBSEQ platforms

BMC Bioinformatics ◽

10.1186/s12859-020-03859-x ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Junhua Rao ◽

Lihua Peng ◽

Xinming Liang ◽

Hui Jiang ◽

Chunyu Geng ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Technology Use ◽

Copy Number ◽

Massively Parallel Sequencing ◽

Copy Number Variants ◽

Whole Genome ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Cnv Detection

Abstract Background DNBSEQ™ platforms are new massively parallel sequencing (MPS) platforms that use DNA nanoball technology. Use of data generated from DNBSEQ™ platforms to detect single nucleotide variants (SNVs) and small insertions and deletions (indels) has proven to be quite effective, while the feasibility of copy number variants (CNVs) detection is unclear. Results Here, we first benchmarked different CNV detection tools based on Illumina whole-genome sequencing (WGS) data of NA12878 and then assessed these tools in CNV detection based on DNBSEQ™ sequencing data from the same sample. When the same tool was used, the CNVs detected based on DNBSEQ™ and Illumina data were similar in quantity, length and distribution, while great differences existed within results from different tools and even based on data from a single platform. We further estimated the CNV detection power based on available CNV benchmarks of NA12878 and found similar precision and sensitivity between the DNBSEQ™ and Illumina platforms. We also found higher precision of CNVs shorter than 1 kbp based on DNBSEQ™ platforms than those based on Illumina platforms by using Pindel, DELLY and LUMPY. We carefully compared these two available benchmarks and found a large proportion of specific CNVs between them. Thus, we constructed a more complete CNV benchmark of NA12878 containing 3512 CNV regions. Conclusions We assessed and benchmarked CNV detections based on WGS with DNBSEQ™ platforms and provide guidelines for future studies.
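
The precision and sensitivity comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the 50% reciprocal-overlap matching rule, the function names and the toy coordinates are all assumptions chosen for illustration, since the abstract does not state how calls were matched to the NA12878 benchmarks.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; a hypothetical layout.
Region = Tuple[str, int, int]


def reciprocal_overlap(a: Region, b: Region) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def precision_sensitivity(
    calls: List[Region], benchmark: List[Region], min_overlap: float = 0.5
) -> Tuple[float, float]:
    """Score calls against a benchmark set.

    A call counts as correct if it matches any benchmark region; a benchmark
    region counts as recovered if any call matches it. The 0.5 threshold is
    an assumed convention, not a value taken from the abstract.
    """
    matched_calls = sum(
        1 for c in calls
        if any(reciprocal_overlap(c, b) >= min_overlap for b in benchmark)
    )
    matched_truth = sum(
        1 for b in benchmark
        if any(reciprocal_overlap(c, b) >= min_overlap for c in calls)
    )
    precision = matched_calls / len(calls) if calls else 0.0
    sensitivity = matched_truth / len(benchmark) if benchmark else 0.0
    return precision, sensitivity


# Toy data: one of three calls matches one of two benchmark regions.
toy_calls = [("chr1", 100, 200), ("chr1", 1000, 2000), ("chr2", 50, 150)]
toy_benchmark = [("chr1", 120, 220), ("chr1", 5000, 6000)]
toy_precision, toy_sensitivity = precision_sensitivity(toy_calls, toy_benchmark)
```

Running the same scoring function on call sets from two platforms, with the tool held fixed, is one way to make a platform comparison of this kind concrete.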

0306 Exploring the feasibility of using copy number variants as genetic markers through large-scale whole genome sequencing experiments

Journal of Animal Science ◽

10.2527/jam2016-0306 ◽

2016 ◽

Vol 94 (suppl_5) ◽

pp. 146-146

Author(s):

D. M. Bickhart ◽

L. Xu ◽

J. L. Hutchison ◽

J. B. Cole ◽

D. J. Null ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genetic Markers ◽

Genome Sequencing ◽

Copy Number ◽

Large Scale ◽

Copy Number Variants ◽

Whole Genome

Detection and characterization of copy number variants based on whole-genome sequencing by DNBSEQ platforms

10.1101/786962 ◽

2019 ◽

Author(s):

Junhua Rao ◽

Lihua Peng ◽

Fang Chen ◽

Hui Jiang ◽

Chunyu Geng ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Copy Number Variant ◽

Whole Genome ◽

Genome Wide ◽

Wide Range ◽

Distribution Sensitivity ◽

Cnv Detection

Abstract Background Next-generation sequencing (NGS) has developed rapidly in recent years, making whole-genome sequencing (WGS) a more cost- and time-efficient choice in a wide range of biological research. WGS data are commonly used to detect variants such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels) and copy number variants (CNVs), which play an important role in many human diseases. However, the feasibility of CNV detection based on WGS by DNBSEQ™ platforms was unclear. We systematically analysed the genome-wide CNV detection power of DNBSEQ™ and Illumina platforms on NA12878 with five commonly used tools. Results DNBSEQ™ platforms consistently detected slightly more CNVs genome-wide (on average 1.24-fold as many as Illumina platforms). CNVs detected on DNBSEQ™ and Illumina platforms were then evaluated against two public benchmarks of NA12878. DNBSEQ™ and Illumina platforms showed similar sensitivity and precision on both benchmarks. Further, differences between CNV detection tools were analysed, indicating that the choice of tool could affect CNV detection performance, such as count, distribution, sensitivity and precision. Conclusion The major contribution of this paper is providing, for the first time, a comprehensive guide for CNV detection based on WGS by DNBSEQ™ platforms.

Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants

10.1101/316976 ◽

2018 ◽

Cited By ~ 4

Author(s):

Maxime Garcia ◽

Szilveszter Juhos ◽

Malin Larsson ◽

Pall I. Olason ◽

Marcel Martin ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Open Source ◽

Genome Sequencing ◽

Development Project ◽

Whole Genome ◽

Sequencing Analysis ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Insertion And Deletion ◽

Sample Heterogeneity

Abstract Summary Whole-genome sequencing (WGS) is a cornerstone of precision medicine, but portable and reproducible open-source workflows for WGS analyses of germline and somatic variants are lacking. We present Sarek, a modular, comprehensive, and easy-to-install workflow, combining a range of software for the identification and annotation of single-nucleotide variants (SNVs), insertion and deletion variants (indels), structural variants, tumor sample heterogeneity, and karyotyping from germline or paired tumor/normal samples. Sarek is implemented in a bioinformatics workflow language (Nextflow) with Docker and Singularity compatible containers, ensuring easy deployment and full reproducibility at any Linux-based compute cluster or cloud computing environment. Sarek supports the human reference genomes GRCh37 and GRCh38, and can readily be used both as a core production workflow at sequencing facilities and as a powerful stand-alone tool for individual research groups. Availability Source code and instructions for local installation are available at GitHub (https://github.com/SciLifeLab/Sarek) under the MIT open-source license, and we invite the research community to contribute additional functionality as a collaborative open-source development project.

Copy‐Number Variants Detection by Low‐Pass Whole‐Genome Sequencing

Current Protocols in Human Genetics ◽

10.1002/cphg.43 ◽

2017 ◽

Vol 94 (1) ◽

Cited By ~ 4

Author(s):

Zirui Dong ◽

Weiwei Xie ◽

Haixiao Chen ◽

Jinjin Xu ◽

Huilin Wang ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Whole Genome ◽

Low Pass

SECNVs: A Simulator of Copy Number Variants and Whole-Exome Sequences from Reference Genomes

10.1101/824128 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yue Xing ◽

Alan R. Dabney ◽

Xiao Li ◽

Guosong Wang ◽

Clare A. Gill ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Whole Genome ◽

Sequencing Data ◽

Software Applications ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data

Abstract Copy number variants are insertions and deletions of 1 kb or larger in a genome that play an important role in phenotypic changes and human disease. Many software applications have been developed to detect copy number variants using either whole-genome sequencing or whole-exome sequencing data. However, there is poor agreement in the results from these applications. Simulated datasets containing copy number variants allow comprehensive comparisons of the operating characteristics of existing and novel copy number variant detection methods. Several software applications have been developed to simulate copy number variants and other structural variants in whole-genome sequencing data. However, none of the applications reliably simulate copy number variants in whole-exome sequencing data. We have developed and tested SECNVs (Simulator of Exome Copy Number Variants), a fast, robust and customizable software application for simulating copy number variants and whole-exome sequences from a reference genome. SECNVs is easy to install, implements a wide range of commands to customize simulations, can output multiple samples at once, and incorporates a pipeline to output rearranged genomes, short reads and BAM files in a single command. Variants generated by SECNVs are detected with high sensitivity and precision by tools commonly used to detect copy number variants. SECNVs is publicly available at https://github.com/YJulyXing/SECNVs.

Combining callers improves the detection of copy number variants from whole-genome sequencing

European Journal of Human Genetics ◽

10.1038/s41431-021-00983-x ◽

2021 ◽

Author(s):

Marie Coutelier ◽

Manuel Holtgrewe ◽

Marten Jäger ◽

Ricarda Flöttman ◽

Martin A. Mensah ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Computation Time ◽

Comparative Genomic ◽

Whole Genome ◽

Base Pairs ◽

Whole Exome ◽

Human Pathology

Abstract Copy Number Variants (CNVs) are deletions, duplications or insertions larger than 50 base pairs. They account for a large percentage of the normal genome variation and play major roles in human pathology. While array-based approaches have long been used to detect them in clinical practice, whole-genome sequencing (WGS) bears the promise to allow concomitant exploration of CNVs and smaller variants. However, accurately calling CNVs from WGS remains a difficult computational task, for which a consensus is still lacking. In this paper, we explore practical calling options to reach the best compromise between sensitivity and sensibility. We show that callers based on different signals (paired-end reads, split reads, coverage depth) yield complementary results. We suggest approaches combining four selected callers (Manta, Delly, ERDS, CNVnator) and a regenotyping tool (SV2), and show that this is applicable in everyday practice in terms of computation time and further interpretation. We demonstrate the superiority of these approaches over array-based Comparative Genomic Hybridization (aCGH), specifically regarding the lack of resolution in breakpoint definition and the detection of potentially relevant CNVs. Finally, we confirm our results on the NA12878 benchmark genome, as well as one clinically validated sample. In conclusion, we suggest that WGS constitutes a timely and economically valid alternative to the combination of aCGH and whole-exome sequencing.
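
The idea of combining complementary callers can be sketched in Python as a simple consensus filter. This is an illustrative assumption, not the pipeline from the paper: the abstract does not describe the merging rule, so the 50% reciprocal-overlap criterion, the two-caller support threshold, the function names and the toy calls below are all hypothetical, and the SV2 regenotyping step is not modelled.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end, type), e.g. ("chr1", 1000, 2000, "DEL").
Call = Tuple[str, int, int, str]


def same_event(a: Call, b: Call, min_frac: float = 0.5) -> bool:
    """Treat two calls as one event if chromosome and type agree and they
    overlap reciprocally by at least min_frac (an assumed threshold)."""
    if a[0] != b[0] or a[3] != b[3]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return False
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1])) >= min_frac


def consensus_calls(
    calls_by_caller: Dict[str, List[Call]],
    min_callers: int = 2,
    min_frac: float = 0.5,
) -> List[Tuple[Call, List[str]]]:
    """Keep each event reported by at least min_callers callers.

    Returns (representative call, sorted supporting caller names). The first
    call seen for an event is kept as its representative.
    """
    accepted: List[Tuple[Call, List[str]]] = []
    for calls in calls_by_caller.values():
        for call in calls:
            # Skip calls that belong to an event already accepted.
            if any(same_event(call, kept, min_frac) for kept, _ in accepted):
                continue
            supporters = sorted(
                name
                for name, others in calls_by_caller.items()
                if any(same_event(call, other, min_frac) for other in others)
            )
            if len(supporters) >= min_callers:
                accepted.append((call, supporters))
    return accepted


# Toy input keyed by caller name; the coordinates are invented.
toy_calls = {
    "manta": [("chr1", 1000, 2000, "DEL")],
    "cnvnator": [("chr1", 1100, 2100, "DEL"), ("chr3", 500, 900, "DUP")],
    "delly": [("chr1", 1000, 2000, "DUP")],
}
toy_consensus = consensus_calls(toy_calls)
```

Requiring support from more than one caller trades some sensitivity for precision; lowering `min_callers` to 1 gives the plain union of all calls instead.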

Genome‐wide detection of copy number variants in European autochthonous and commercial pig breeds by whole‐genome sequencing of DNA pools identified breed‐characterising copy number states

Animal Genetics ◽

10.1111/age.12954 ◽

2020 ◽

Vol 51 (4) ◽

pp. 541-556

Author(s):

S. Bovo ◽

A. Ribani ◽

M. Muñoz ◽

E. Alves ◽

J. P. Araujo ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Whole Genome ◽

Pig Breeds ◽

Genome Wide ◽

Dna Pools

Straglr: discovering and genotyping tandem repeat expansions using whole genome long-read sequences

Genome Biology ◽

10.1186/s13059-021-02447-3 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Readman Chiu ◽

Indhu-Shree Rajan-Babu ◽

Jan M. Friedman ◽

Inanc Birol

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Tandem Repeat ◽

Neurological Disorders ◽

Software Tool ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Long Read ◽

Repeat Expansions

Abstract Tandem repeat (TR) expansion is the underlying cause of over 40 neurological disorders. Long-read sequencing offers an exciting avenue over conventional technologies for detecting TR expansions. Here, we present Straglr, a robust software tool for both targeted genotyping and novel expansion detection from long-read alignments. We benchmark Straglr using various simulations, targeted genotyping data of cell lines carrying expansions of known diseases, and whole genome sequencing data with chromosome-scale assembly. Our results suggest that Straglr may be useful for investigating disease-associated TR expansions using long-read sequencing.

Identification of single nucleotide variants in the Moroccan population by whole-genome sequencing

BMC Genetics ◽

10.1186/s12863-020-00917-4 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Lucy Crooks ◽

Johnathan Cooper-Knock ◽

Paul R. Heath ◽

Ahmed Bouhouche ◽

Mostafa Elfahime ◽

...

Keyword(s):

Genetic Diversity ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Large Population ◽

European Ancestry ◽

Whole Genome ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Moroccan Population ◽

African Populations

Abstract Background Large-scale human sequencing projects have described around a hundred million single nucleotide variants (SNVs). These studies have predominantly involved individuals with European ancestry despite the fact that genetic diversity is expected to be highest in Africa, where Homo sapiens evolved and has maintained a large population for the longest time. The African Genome Variation Project examined several African populations but these were all located south of the Sahara. Morocco is on the northwest coast of Africa and mostly lies north of the Sahara, which makes it very attractive for studying genetic diversity. The ancestry of present-day Moroccans is unknown and may be substantially different from Africans found south of the Sahara desert. Recent genomic data of Taforalt individuals in Eastern Morocco revealed 15,000-year-old modern humans and suggested that North African individuals may be genetically distinct from previously studied African populations. Results We present SNVs discovered by whole genome sequencing (WGS) of three Moroccans. From a total of 5.9 million SNVs detected, over 200,000 were not identified by 1000G and were not in the extensive gnomAD database. We summarise the SNVs by genomic position, type of sequence gene context and effect on proteins encoded by the sequence. Comparison of the overall genomic information of the Moroccan individuals with that of individuals from 1000G supports the Moroccan population being distinct from both sub-Saharan African and European populations. Conclusions We conclude that Moroccan samples are genetically distinct and lie in the middle of the previously observed cline between populations of European and African ancestry. WGS of Moroccan individuals can identify a large number of novel SNVs and aid in functional characterisation of the genome.
