Canvas SPW: calling de novo copy number variants in pedigrees

Mapping Intimacies ◽

10.1101/121939 ◽

2017 ◽

Author(s):

Sergii Ivakhno ◽

Eric Roller ◽

Camilla Colombo ◽

Philip Tedder ◽

Anthony J. Cox

Keyword(s):

Copy Number ◽

De Novo ◽

Late Onset ◽

Genetic Diseases ◽

Copy Number Variants ◽

Variant Calling ◽

Supplementary Information ◽

Sequencing Data ◽

Pedigree Structure ◽

Wide Range

Abstract Motivation: Whole genome sequencing is becoming a diagnostic of choice for the identification of rare inherited and de novo copy number variants in families with various pediatric and late-onset genetic diseases. However, joint variant calling in pedigrees is hampered by the complexity of consensus breakpoint alignment across samples within an arbitrary pedigree structure. Results: We have developed a new tool, Canvas SPW, for the identification of inherited and de novo copy number variants from pedigree sequencing data. Canvas SPW supports a number of family structures and provides a wide range of scoring and filtering options to automate and streamline identification of de novo variants. Availability: Canvas SPW is available for download from https://github.com/Illumina/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Understanding Schizophrenia: Genetic Causes and Treatment

Current Neuropsychiatry and Clinical Neuroscience Reports ◽

10.33702/cncnr.2019.1.1.2 ◽

2019 ◽

Vol 1 (1) ◽

pp. 6-12

Author(s):

Fatima Javeria ◽

Shazma Altaf ◽

Alishah Zair ◽

Rana Khalid Iqbal

Keyword(s):

Copy Number ◽

Drug Targets ◽

Disease Risk ◽

Mental Disease ◽

Copy Number Variants ◽

Aminobutyric Acid ◽

Epigenetic Mechanisms ◽

Gamma Aminobutyric Acid ◽

Wide Range ◽

Severe Mental Disease

Schizophrenia is a severe mental disease. The word schizophrenia literally means split mind. Its symptoms fall into three major categories: positive, negative and cognitive. The disease is characterized by hallucinations, delusions, and disorganized thinking and speech. Schizophrenia is associated with many other mental and psychological problems, such as suicide, depression and hallucinations. In addition, it places a burden on the patient’s family and caregivers. The cause of the disease is not clearly established, but advances in molecular genetics indicate that certain epigenetic mechanisms are involved in its pathophysiology. The epigenetic mechanisms mainly involved are DNA methylation and copy number variants. With the advent of GWAS, a wide range of SNPs has been linked to the etiology of schizophrenia. These SNPs serve as ‘hubs’ because they interact with one another in conferring schizophrenia risk. To date, no treatment is available to cure the disease, but antipsychotics can reduce the disease risk by minimizing its symptoms. Dopamine, serotonin and gamma-aminobutyric acid are the neurotransmitters that serve as drug targets in the treatment of schizophrenia. Because genetic and epigenetic mechanisms are involved, the available drugs already target certain genes involved in the etiology of the disease.

Faculty Opinions recommendation of rDNA Copy Number Variants Are Frequent Passenger Mutations in Saccharomyces cerevisiae Deletion Collections and de Novo Transformants.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726580404.793522980 ◽

2016 ◽

Author(s):

Judith Berman

Keyword(s):

Saccharomyces Cerevisiae ◽

Copy Number ◽

De Novo ◽

Copy Number Variants ◽

Passenger Mutations ◽

Rdna Copy ◽

Rdna Copy Number

An integrative approach to investigate the respective roles of single-nucleotide variants and copy-number variants in Attention-Deficit/Hyperactivity Disorder

Scientific Reports ◽

10.1038/srep22851 ◽

2016 ◽

Vol 6 (1) ◽

Cited By ~ 9

Author(s):

Leandro de Araújo Lima ◽

Ana Cecília Feio-dos-Santos ◽

Sintia Iole Belangero ◽

Ary Gadelha ◽

Rodrigo Affonseca Bressan ◽

...

Keyword(s):

Attention Deficit Hyperactivity Disorder ◽

Attention Deficit ◽

Copy Number ◽

De Novo ◽

Copy Number Variants ◽

Integrative Approach ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Hyperactivity Disorder ◽

New Genes

Abstract Many studies have attempted to investigate the genetic susceptibility of Attention-Deficit/Hyperactivity Disorder (ADHD), but without much success. The present study aimed to analyze both single-nucleotide and copy-number variants contributing to the genetic architecture of ADHD. We generated exome data from 30 Brazilian trios with sporadic ADHD. We also analyzed a Brazilian sample of 503 child/adolescent controls from a High Risk Cohort Study for the Development of Childhood Psychiatric Disorders, as well as previously published results of five CNV studies and one GWAS meta-analysis of ADHD involving children/adolescents. The results from the Brazilian trios showed that cases with de novo SNVs tend not to have de novo CNVs and vice versa. Although the sample size is small, we could also see that various comorbidities are more frequent in cases with only inherited variants. Moreover, using only genes expressed in brain, we constructed two “in silico” protein-protein interaction networks, one with genes from any analysis, and another with genes with hits in two analyses. Topological and functional analyses of genes in this network uncovered genes related to synapse, cell adhesion, glutamatergic and serotoninergic pathways, both confirming findings of previous studies and capturing new genes and genetic variants in these pathways.

PocaCNV: A Tool to Detect Copy Number Variants from Population-Scale Genome Sequencing Data

10.1109/bibm52615.2021.9669405 ◽

2021 ◽

Author(s):

Zhendong Zhang ◽

Yongzhuang Liu ◽

Gaoyang Li ◽

Yadong Wang

Keyword(s):

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Sequencing Data ◽

Population Scale

Abstract 15169: De Novo Variants of USP10 in Early Onset Bicuspid Aortic Valve Disease

Circulation ◽

10.1161/circ.142.suppl_3.15169 ◽

2020 ◽

Vol 142 (Suppl_3) ◽

Author(s):

Siddharth K Prakash ◽

Angela T Yetman ◽

Hector I Michelena ◽

Malenka M Bissell ◽

Yuli Y Kim ◽

...

Keyword(s):

Aortic Valve ◽

Bicuspid Aortic Valve ◽

Early Onset ◽

Rare Variants ◽

De Novo ◽

Late Onset ◽

Copy Number Variants ◽

Aortic Aneurysms ◽

Thoracic Aortic Aneurysms ◽

Aortic Dissections

Introduction: Bicuspid Aortic Valve (BAV), the most common congenital heart defect, is a major cause of aortic regurgitation or stenosis requiring valve replacement and of thoracic aortic aneurysms predisposing to acute aortic dissections (TAD). The spectrum of BAV ranges from severe early onset valve and aortic complications to sporadic late onset disease. Hypothesis: Early onset BAV (EBAV) cases with valve or aortic complications that require intervention prior to age 30 are enriched for rare genetic variants that cause BAV and TAD. Methods: We performed whole exome sequencing of 147 EBAV cases in 141 families who were enrolled in the UTHealth Bicuspid Aortic Valve Research Registry. Candidate variants in the EBAV cohort (26% female, mean age 18, 44% with TAD) were compared to unselected controls from the Genome Aggregation Database (gnomAD) and the Database of Genotypes and Phenotypes (dbGaP). We considered variants with minor allele frequencies (MAF) < 1%, Combined Annotation Dependent Depletion (CADD) scores > 25, and damaging (PolyPhen-2) or deleterious (SIFT) functional prediction scores. Genomic copy number variants (CNVs) were detected using CoNIFER and prioritized when deletions involved genes with probability of loss intolerance (pLI) > 0.9. Variants were validated using quantitative PCR or Sanger sequencing. Results: We identified 6 rare variants of USP10 in 6 EBAV families (4% of cohort): 4 CNVs (2 duplications and 2 deletions) that are rare in dbGaP controls (4 in 15,414) and 2 deleterious rare missense variants (MAF < 5×10⁻⁵ in gnomAD). Two of the 4 CNVs were de novo events in trios. In contrast, rare deleterious variants of the known causal BAV genes NOTCH1 (1), ROBO4 (1), GATA4 (1), GATA5 (1), and SMAD6 (4) were found in 7 total families. USP10 encodes a ubiquitin peptidase that is required for endothelial Notch signaling during vascular development. Conclusions: We identified rare and de novo variants of USP10 that implicate USP10 as a new candidate gene for BAV.
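
The variant selection criteria stated in the Methods (MAF < 1%, CADD > 25, and a damaging PolyPhen-2 or deleterious SIFT prediction) can be read as a simple conjunctive filter. The following is a minimal sketch of that reading, not the authors' pipeline: the `Variant` record, its field names, and the prediction labels `"damaging"` and `"deleterious"` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical record; field names are illustrative, not from the abstract.
    gene: str
    maf: float        # minor allele frequency as a fraction (0.01 == 1%)
    cadd: float       # CADD score
    polyphen2: str    # assumed label, e.g. "damaging" or "benign"
    sift: str         # assumed label, e.g. "deleterious" or "tolerated"


def passes_filters(v: Variant) -> bool:
    """Apply the thresholds quoted in the abstract: MAF < 1%, CADD > 25,
    and a harmful call from PolyPhen-2 or SIFT."""
    predicted_harmful = v.polyphen2 == "damaging" or v.sift == "deleterious"
    return v.maf < 0.01 and v.cadd > 25 and predicted_harmful


candidates = [
    Variant("GENE_A", maf=0.00004, cadd=28.1, polyphen2="damaging", sift="tolerated"),
    Variant("GENE_B", maf=0.02, cadd=30.0, polyphen2="damaging", sift="deleterious"),
    Variant("GENE_C", maf=0.001, cadd=12.0, polyphen2="benign", sift="deleterious"),
]
kept = [v.gene for v in candidates if passes_filters(v)]
```

Here only the first hypothetical variant survives: the second fails the frequency threshold and the third fails the CADD threshold.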

AStrap: identification of alternative splicing from transcript sequences without a reference genome

Bioinformatics ◽

10.1093/bioinformatics/bty1008 ◽

2018 ◽

Vol 35 (15) ◽

pp. 2654-2656 ◽

Cited By ~ 5

Author(s):

Guoli Ji ◽

Wenbin Ye ◽

Yaru Su ◽

Moliang Chen ◽

Guangzao Huang ◽

...

Keyword(s):

Machine Learning ◽

Alternative Splicing ◽

Single Molecule ◽

Reference Genome ◽

De Novo ◽

Supplementary Information ◽

Model Organisms ◽

Sequencing Data ◽

Extensive Evaluation ◽

Reference Genomes

Abstract Summary: Alternative splicing (AS) is a well-established mechanism for increasing transcriptome and proteome diversity; however, detecting AS events and distinguishing among AS types in organisms without available reference genomes remains challenging. We developed a de novo approach called AStrap for AS analysis without using a reference genome. AStrap identifies AS events by extensive pair-wise alignments of transcript sequences and predicts AS types by a machine-learning model integrating more than 500 assembled features. We evaluated AStrap using collected AS events from reference genomes of rice and human as well as single-molecule real-time sequencing data from Amborella trichopoda. Results show that AStrap can identify many more AS events with comparable or higher accuracy than the competing method. AStrap also possesses a unique feature of predicting AS types, which achieves an overall accuracy of ∼0.87 for different species. Extensive evaluation of AStrap using different parameters, sample sizes and machine-learning models on different species also demonstrates the robustness and flexibility of AStrap. AStrap could be a valuable addition to the community for the study of AS in non-model organisms with limited genetic resources. Availability and implementation: AStrap is available for download at https://github.com/BMILAB/AStrap. Supplementary information: Supplementary data are available at Bioinformatics online.

Detection and characterization of copy number variants based on whole-genome sequencing by DNBSEQ platforms

10.1101/786962 ◽

2019 ◽

Author(s):

Junhua Rao ◽

Lihua Peng ◽

Fang Chen ◽

Hui Jiang ◽

Chunyu Geng ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Copy Number Variant ◽

Whole Genome ◽

Genome Wide ◽

Wide Range ◽

Distribution Sensitivity ◽

Cnv Detection

Abstract Background: Next-generation sequencing (NGS) has developed rapidly in recent years, making whole-genome sequencing (WGS) a more cost- and time-efficient choice for a wide range of biological research. WGS data are commonly used to detect variants such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels) and copy number variants (CNVs), which play an important role in many human diseases. However, the feasibility of CNV detection based on WGS by DNBSEQ™ platforms was unclear. We systematically analysed the genome-wide CNV detection power of DNBSEQ™ platforms and Illumina platforms on NA12878 with five commonly used tools. Results: DNBSEQ™ platforms showed a stable ability to detect slightly more CNVs genome-wide (on average 1.24-fold as many as Illumina platforms). CNVs detected on DNBSEQ™ and Illumina platforms were then evaluated against two public benchmarks of NA12878. DNBSEQ™ and Illumina platforms showed similar sensitivity and precision on both benchmarks. Further, the differences between CNV detection tools were analysed, indicating that the choice of tool could affect CNV detection results such as count, distribution, sensitivity and precision. Conclusion: The major contribution of this paper is to provide, for the first time, a comprehensive guide for CNV detection based on WGS by DNBSEQ™ platforms.
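
Sensitivity and precision against a benchmark, as reported above, depend on a rule for deciding when a call matches a benchmark CNV. The abstract does not state the rule used, so the sketch below assumes a 50% reciprocal-overlap criterion and half-open `(chrom, start, end)` intervals; the function names are hypothetical.

```python
def reciprocal_overlap(a, b):
    """Overlap length divided by the longer interval's length.
    Intervals are (chrom, start, end), half-open; 0.0 if they do not overlap."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return overlap / max(a[2] - a[1], b[2] - b[1])


def sensitivity_precision(calls, truth, min_overlap=0.5):
    """Sensitivity: fraction of benchmark CNVs matched by at least one call.
    Precision: fraction of calls matched by at least one benchmark CNV."""
    matched_truth = sum(
        any(reciprocal_overlap(t, c) >= min_overlap for c in calls) for t in truth
    )
    matched_calls = sum(
        any(reciprocal_overlap(c, t) >= min_overlap for t in truth) for c in calls
    )
    sensitivity = matched_truth / len(truth) if truth else 0.0
    precision = matched_calls / len(calls) if calls else 0.0
    return sensitivity, precision


calls = [("chr1", 100, 200), ("chr1", 500, 600)]
truth = [("chr1", 110, 210), ("chr2", 100, 200)]
result = sensitivity_precision(calls, truth)
```

Dividing by the longer interval makes the check equivalent to requiring that the overlap cover at least `min_overlap` of both intervals, which is the usual meaning of reciprocal overlap.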

tHapMix: simulating tumour samples through haplotype mixtures

10.1101/057414 ◽

2016 ◽

Author(s):

Sergii Ivakhno ◽

Camilla Colombo ◽

Stephen Tanner ◽

Philip Tedder ◽

Stefano Berri ◽

...

Keyword(s):

Copy Number ◽

Large Scale ◽

Variant Calling ◽

Copy Number Variant ◽

Supplementary Information ◽

Genome Diversity ◽

Simulation Framework ◽

Somatic Genome ◽

Copy Number Changes ◽

Sequencing Platforms

Abstract Motivation: Large-scale rearrangements and copy number changes combined with different modes of clonal evolution create extensive somatic genome diversity, making it difficult to develop versatile and scalable variant calling tools and create well-calibrated benchmarks. Results: We developed a new simulation framework, tHapMix, that enables the creation of tumour samples with different ploidy, purity and polyclonality features. It easily scales to simulation of hundreds of somatic genomes, while re-use of real read data preserves noise and biases present in sequencing platforms. We further demonstrate tHapMix utility by creating a simulated set of 140 somatic genomes and showing how it can be used in training and testing of somatic copy number variant calling tools. Availability and implementation: tHapMix is distributed under an open source license and can be downloaded from https://github.com/Illumina/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
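
To make the purity and polyclonality terms concrete, the sketch below computes the average copy number expected at one locus in a bulk sample that mixes several tumour clones with normal cells. It illustrates the standard mixture relationship only and is not tHapMix code; the function name and the `(fraction, copy_number)` clone representation are assumptions.

```python
def mixture_copy_number(clones, normal_cn=2):
    """Expected average copy number at one locus in a mixed sample.

    clones: list of (cell_fraction, copy_number) pairs, one per tumour clone.
    The fractions sum to the sample purity; the remainder is assumed to be
    normal cells carrying `normal_cn` copies.
    """
    purity = sum(fraction for fraction, _ in clones)
    if not 0.0 <= purity <= 1.0:
        raise ValueError("clone fractions must sum to a purity between 0 and 1")
    tumour_part = sum(fraction * cn for fraction, cn in clones)
    return tumour_part + (1.0 - purity) * normal_cn


# One clone at 50% purity carrying a 4-copy amplification.
monoclonal = mixture_copy_number([(0.5, 4)])
# Two clones at 25% each, one with a loss and one with a gain.
polyclonal = mixture_copy_number([(0.25, 1), (0.25, 3)])
```

The second case shows why polyclonal samples are hard to call: a loss in one clone and a gain in another can average out to the normal copy number.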

Integrative DNA copy number detection and genotyping from sequencing and array-based platforms

10.1101/172700 ◽

2017 ◽

Cited By ~ 2

Author(s):

Zilu Zhou ◽

Weixin Wang ◽

Li-San Wang ◽

Nancy Ruonan Zhang

Keyword(s):

Copy Number ◽

Association Studies ◽

Snp Array ◽

Supplementary Information ◽

Detection Accuracy ◽

Sequencing Data ◽

Array Data ◽

Combining Data ◽

Allele Specific ◽

Cnv Detection

Abstract Motivation: Copy number variations (CNVs) are gains and losses of DNA segments and have been associated with disease. Many large-scale genetic association studies are performing CNV analysis using whole exome sequencing (WES) and whole genome sequencing (WGS). In many of these studies, previous SNP-array data are available. An integrated cross-platform analysis is expected to improve resolution and accuracy, yet there is no tool for effectively combining data from sequencing and array platforms. The detection of CNVs using sequencing data alone can also be further improved by the utilization of allele-specific reads. Results: We propose a statistical framework, integrated Copy Number Variation detection algorithm (iCNV), which can be applied to multiple study designs: WES only, WGS only, SNP array only, or any combination of SNP and sequencing data. iCNV applies platform-specific normalization, utilizes allele-specific reads from sequencing and integrates matched NGS and SNP-array data by a Hidden Markov Model (HMM). We compare integrated two-platform CNV detection using iCNV to naive intersection or union of platforms and show that iCNV increases sensitivity and robustness. We also assess the accuracy of iCNV on WGS data only, and show that the utilization of allele-specific reads improves CNV detection accuracy compared to existing methods. Availability: https://github.com/zhouzilu/[email protected], [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Insights into dispersed duplications and complex structural mutations from whole genome sequencing 706 families

10.1101/2020.08.03.235358 ◽

2020 ◽

Author(s):

Christopher W. Whelan ◽

Robert E. Handsaker ◽

Giulio Genovese ◽

Seva Kashin ◽

Monkol Lek ◽

...

Keyword(s):

Gene Expression ◽

Copy Number Variation ◽

Copy Number ◽

De Novo ◽

Whole Genome ◽

Sequencing Data ◽

Number Variation ◽

Structural Mutations ◽

Or Gene ◽

Genomic Locations

Abstract Two intriguing forms of genome structural variation (SV) – dispersed duplications, and de novo rearrangements of complex, multi-allelic loci – have long escaped genomic analysis. We describe a new way to find and characterize such variation by utilizing identity-by-descent (IBD) relationships between siblings together with high-precision measurements of segmental copy number. Analyzing whole-genome sequence data from 706 families, we find hundreds of “IBD-discordant” (IBDD) CNVs: loci at which siblings’ CNV measurements and IBD states are mathematically inconsistent. We found that commonly-IBDD CNVs identify dispersed duplications; we mapped 95 of these common dispersed duplications to their true genomic locations through family-based linkage and population linkage disequilibrium (LD), and found several to be in strong LD with genome-wide association (GWAS) signals for common diseases or gene expression variation at their revealed genomic locations. Other CNVs that were IBDD in a single family appear to involve de novo mutations in complex and multi-allelic loci; we identified 26 de novo structural mutations that had not been previously detected in earlier analyses of the same families by diverse SV analysis methods. These included a de novo mutation of the amylase gene locus and multiple de novo mutations at chromosome 15q14. Combining these complex mutations with more-conventional CNVs, we estimate that segmental mutations larger than 1kb arise in about one per 22 human meioses. These methods are complementary to previous techniques in that they interrogate genomic regions that are home to segmental duplication, high CNV allele frequencies, and multi-allelic CNVs.

Author Summary: Copy number variation is an important form of genetic variation in which individuals differ in the number of copies of segments of their genomes. Certain aspects of copy number variation have traditionally been difficult to study using short-read sequencing data.
For example, standard analyses often cannot tell whether the duplicated copies of a segment are located near the original copy or are dispersed to other regions of the genome. Another aspect of copy number variation that has been difficult to study is the detection of mutations in the copy number of DNA segments passed down from parents to their children, particularly when the mutations affect genome segments which already display common copy number variation in the population. We develop an analytical approach to solving these problems when sequencing data is available for all members of families with at least two children. This method is based on determining the number of parental haplotypes the two siblings share at each location in their genome, and using that information to determine the possible inheritance patterns that might explain the copy numbers we observe in each family member. We show that dispersed duplications and mutations can be identified by looking for copy number variants that do not follow these expected inheritance patterns. We use this approach to determine the location of 95 common duplications which are dispersed to distant regions of the genome, and demonstrate that these duplications are linked to genetic variants that affect disease risk or gene expression levels. We also identify a set of copy number mutations not detected by previous analyses of sequencing data from a large cohort of families, and show that repetitive and complex regions of the genome undergo frequent mutations in copy number.
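
The inheritance-consistency idea described above can be sketched as a brute-force check. The code assumes a deliberately simplified model that is not taken from the paper: integer copy numbers, all copies located at a single locus, no de novo mutation, and each parent's total copy number split arbitrarily across two haplotypes. The function name and arguments are hypothetical.

```python
from itertools import product


def consistent_with_ibd(father_cn, mother_cn, sib1_cn, sib2_cn, ibd_state):
    """Return True if some assignment of copies to parental haplotypes explains
    both siblings' copy numbers given how many parental haplotypes they share
    (ibd_state is 0, 1 or 2) at this locus."""
    for a in range(father_cn + 1):
        for b in range(mother_cn + 1):
            paternal = (a, father_cn - a)   # copies on each paternal haplotype
            maternal = (b, mother_cn - b)   # copies on each maternal haplotype
            # p1/m1: haplotypes inherited by sibling 1; p2/m2: by sibling 2.
            for p1, m1, p2, m2 in product((0, 1), repeat=4):
                shared = (p1 == p2) + (m1 == m2)
                if shared != ibd_state:
                    continue
                if (paternal[p1] + maternal[m1] == sib1_cn
                        and paternal[p2] + maternal[m2] == sib2_cn):
                    return True
    return False


# Siblings sharing both haplotypes must have equal copy number, so this
# locus would be flagged as IBD-discordant under the simplified model.
discordant = not consistent_with_ibd(3, 2, 3, 2, ibd_state=2)
# The same copy numbers are explainable if the siblings share no haplotypes.
explainable = consistent_with_ibd(3, 2, 3, 2, ibd_state=0)
```

A locus failing this check in many families would suggest a duplicated copy sitting elsewhere in the genome, while a failure in a single family would be a candidate de novo event, which mirrors the two cases the summary distinguishes.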
