Canvas SPW: calling de novo copy number variants in pedigrees

2017
Author(s):
Sergii Ivakhno
Eric Roller
Camilla Colombo
Philip Tedder
Anthony J. Cox

Abstract Motivation: Whole genome sequencing is becoming a diagnostic of choice for the identification of rare inherited and de novo copy number variants in families with various pediatric and late-onset genetic diseases. However, joint variant calling in pedigrees is hampered by the complexity of consensus breakpoint alignment across samples within an arbitrary pedigree structure. Results: We have developed a new tool, Canvas SPW, for the identification of inherited and de novo copy number variants from pedigree sequencing data. Canvas SPW supports a number of family structures and provides a wide range of scoring and filtering options to automate and streamline identification of de novo variants. Availability: Canvas SPW is available for download from https://github.com/Illumina/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

2019
Vol 1 (1)
pp. 6-12
Author(s):
Fatima Javeria
Shazma Altaf
Alishah Zair
Rana Khalid Iqbal

Schizophrenia is a severe mental disease. The word schizophrenia literally means split mind. There are three major categories of symptoms: positive, negative and cognitive. The disease is characterized by hallucinations, delusions, and disorganized thinking and speech. Schizophrenia is associated with other mental and psychological problems such as suicide, depression and hallucinations, and it also places a burden on the patient's family and caregivers. The cause of the disease is not clear, but advances in molecular genetics indicate that certain epigenetic mechanisms are involved in its pathophysiology. The mechanisms mainly implicated are DNA methylation and copy number variants. With the advent of GWAS, a wide range of SNPs has been linked with the etiology of schizophrenia. These SNPs serve as 'hubs' because they interact with one another in conferring schizophrenia risk. To date, no treatment is available to cure the disease, but antipsychotics can reduce its impact by minimizing symptoms. Dopamine, serotonin and gamma-aminobutyric acid are the neurotransmitters that serve as drug targets in the treatment of schizophrenia. Because genetic and epigenetic mechanisms are involved, available drugs already target certain genes implicated in the etiology of the disease.


2016
Vol 6 (1)
Author(s):
Leandro de Araújo Lima
Ana Cecília Feio-dos-Santos
Sintia Iole Belangero
Ary Gadelha
Rodrigo Affonseca Bressan
...

Abstract Many studies have attempted to investigate the genetic susceptibility of Attention-Deficit/Hyperactivity Disorder (ADHD), but without much success. The present study aimed to analyze both single-nucleotide and copy-number variants contributing to the genetic architecture of ADHD. We generated exome data from 30 Brazilian trios with sporadic ADHD. We also analyzed a Brazilian sample of 503 child and adolescent controls from a High Risk Cohort Study for the Development of Childhood Psychiatric Disorders, as well as previously published results of five CNV studies and one GWAS meta-analysis of ADHD involving children and adolescents. The results from the Brazilian trios showed that cases with de novo SNVs tend not to have de novo CNVs and vice versa. Although the sample size is small, we also observed that various comorbidities are more frequent in cases with only inherited variants. Moreover, using only genes expressed in brain, we constructed two "in silico" protein-protein interaction networks, one with genes from any analysis and the other with genes with hits in two analyses. Topological and functional analyses of genes in these networks uncovered genes related to synapse, cell adhesion, glutamatergic and serotoninergic pathways, both confirming findings of previous studies and capturing new genes and genetic variants in these pathways.


Circulation
2020
Vol 142 (Suppl_3)
Author(s):
Siddharth K Prakash
Angela T Yetman
Hector I Michelena
Malenka M Bissell
Yuli Y Kim
...

Introduction: Bicuspid Aortic Valve (BAV), the most common congenital heart defect, is a major cause of aortic regurgitation or stenosis requiring valve replacement and of thoracic aortic aneurysms predisposing to acute aortic dissections (TAD). The spectrum of BAV ranges from severe early onset valve and aortic complications to sporadic late onset disease. Hypothesis: Early onset BAV (EBAV) cases with valve or aortic complications that require intervention prior to age 30 are enriched for rare genetic variants that cause BAV and TAD. Methods: We performed whole exome sequencing of 147 EBAV cases in 141 families who were enrolled in the UTHealth Bicuspid Aortic Valve Research Registry. Candidate variants in the EBAV cohort (26% female, mean age 18, 44% with TAD) were compared to unselected controls from the Genome Aggregation Database (gnomAD) and the Database of Genotypes and Phenotypes (dbGaP). We considered variants with minor allele frequencies (MAF) < 1%, Combined Annotation Dependent Depletion (CADD) scores > 25, and damaging (PolyPhen-2) or deleterious (SIFT) functional prediction scores. Genomic copy number variants (CNVs) were detected using CoNIFER and prioritized when deletions involved genes with probability of loss-of-function intolerance (pLI) > 0.9. Variants were validated using quantitative PCR or Sanger sequencing. Results: We identified 6 rare variants of USP10 in 6 EBAV families (4% of cohort): 4 CNVs (2 duplications and 2 deletions) that are rare in dbGaP controls (4 in 15,414) and 2 deleterious rare missense variants (MAF < 5 × 10^-5 in gnomAD). Two of the 4 CNVs were de novo events in trios. In contrast, rare deleterious variants of the known causal BAV genes NOTCH1 (1), ROBO4 (1), GATA4 (1), GATA5 (1), and SMAD6 (4) were found in 7 total families. USP10 encodes a ubiquitin peptidase that is required for endothelial Notch signaling during vascular development.
Conclusions: We identified rare and de novo variants of USP10 that implicate USP10 as a new candidate gene for BAV.
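
The variant-prioritization thresholds quoted in the Methods (MAF < 1%, CADD > 25, and a damaging PolyPhen-2 or deleterious SIFT prediction) can be illustrated with a minimal sketch. The `Variant` record, the `passes_filter` helper, the placeholder gene names and all values below are hypothetical and not taken from the study; the sketch also assumes the two functional predictions are combined with a logical OR, which is one reading of the abstract.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    gene: str
    maf: float                # minor allele frequency in the control database
    cadd: float               # CADD phred-scaled score
    polyphen_damaging: bool   # PolyPhen-2 predicts "damaging"
    sift_deleterious: bool    # SIFT predicts "deleterious"


def passes_filter(v: Variant, max_maf: float = 0.01, min_cadd: float = 25.0) -> bool:
    """Apply the three criteria from the abstract to one variant."""
    predicted_harmful = v.polyphen_damaging or v.sift_deleterious
    return v.maf < max_maf and v.cadd > min_cadd and predicted_harmful


# Made-up example records; the gene names are placeholders, not study results.
variants = [
    Variant("GENE_A", 0.00004, 31.0, True, False),
    Variant("GENE_B", 0.02, 33.0, True, True),      # fails: too common
    Variant("GENE_C", 0.0005, 12.0, False, True),   # fails: CADD too low
    Variant("GENE_D", 0.0001, 27.5, False, False),  # fails: no harmful prediction
]

candidates = [v.gene for v in variants if passes_filter(v)]
print(candidates)
```

Only the first record satisfies all three criteria. A real pipeline would read these annotations from an annotated VCF rather than hard-coded records.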


2018
Vol 35 (15)
pp. 2654-2656
Author(s):
Guoli Ji
Wenbin Ye
Yaru Su
Moliang Chen
Guangzao Huang
...

Abstract Summary: Alternative splicing (AS) is a well-established mechanism for increasing transcriptome and proteome diversity; however, detecting AS events and distinguishing among AS types in organisms without available reference genomes remains challenging. We developed a de novo approach called AStrap for AS analysis without using a reference genome. AStrap identifies AS events by extensive pair-wise alignments of transcript sequences and predicts AS types by a machine-learning model integrating more than 500 assembled features. We evaluated AStrap using collected AS events from reference genomes of rice and human as well as single-molecule real-time sequencing data from Amborella trichopoda. Results show that AStrap can identify many more AS events with comparable or higher accuracy than the competing method. AStrap also possesses a unique feature of predicting AS types, which achieves an overall accuracy of ∼0.87 for different species. Extensive evaluation of AStrap using different parameters, sample sizes and machine-learning models on different species also demonstrates the robustness and flexibility of AStrap. AStrap could be a valuable addition to the community for the study of AS in non-model organisms with limited genetic resources. Availability and implementation: AStrap is available for download at https://github.com/BMILAB/AStrap. Supplementary information: Supplementary data are available at Bioinformatics online.


2019
Author(s):
Junhua Rao
Lihua Peng
Fang Chen
Hui Jiang
Chunyu Geng
...

Abstract Background: Next-generation sequencing (NGS) has developed rapidly in recent years, making whole-genome sequencing (WGS) a more cost- and time-efficient choice across a wide range of biological research. WGS data are typically used to detect variants such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels) and copy number variants (CNVs), which play an important role in many human diseases. However, the feasibility of CNV detection from WGS data on DNBSEQ™ platforms was unclear. We systematically analysed the genome-wide CNV detection power of DNBSEQ™ platforms and Illumina platforms on NA12878 with five commonly used tools. Results: DNBSEQ™ platforms consistently detected slightly more CNVs genome-wide (on average 1.24-fold the number detected on Illumina platforms). CNVs from both platforms were then evaluated against two public NA12878 benchmarks; DNBSEQ™ and Illumina platforms showed similar sensitivity and precision on both. Further, analysis of the differences between tools indicated that the choice of tool could affect CNV detection results, such as count, distribution, sensitivity and precision. Conclusion: The major contribution of this paper is to provide, for the first time, a comprehensive guide for CNV detection based on WGS on DNBSEQ™ platforms.
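
As an illustration of how sensitivity and precision against a benchmark can be computed for CNV calls, here is a minimal sketch. The abstract does not state the matching rule, so the 50% reciprocal-overlap criterion, the function names and the toy intervals are all assumptions; CNV type (deletion versus duplication) is ignored for brevity.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions for intervals (chrom, start, end)."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def sensitivity_precision(calls, truth, min_ro=0.5):
    """Sensitivity = matched benchmark CNVs / all benchmark CNVs;
    precision = matched calls / all calls."""
    matched_truth = sum(
        any(reciprocal_overlap(t, c) >= min_ro for c in calls) for t in truth
    )
    matched_calls = sum(
        any(reciprocal_overlap(c, t) >= min_ro for t in truth) for c in calls
    )
    sensitivity = matched_truth / len(truth) if truth else 0.0
    precision = matched_calls / len(calls) if calls else 0.0
    return sensitivity, precision


# Toy data, not real NA12878 calls or benchmark entries.
truth = [("chr1", 1000, 2000), ("chr1", 5000, 9000), ("chr2", 100, 600)]
calls = [
    ("chr1", 1100, 2100),  # matches the first benchmark CNV
    ("chr1", 5000, 6000),  # covers only 25% of the second benchmark CNV
    ("chr3", 100, 600),    # no benchmark CNV on this chromosome
    ("chr2", 150, 650),    # matches the third benchmark CNV
]

sens, prec = sensitivity_precision(calls, truth)
print(f"sensitivity={sens:.3f} precision={prec:.3f}")
```

The reciprocal-overlap threshold strongly affects both metrics, which is one reason results from different tools and benchmarks are hard to compare directly.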


2016
Author(s):
Sergii Ivakhno
Camilla Colombo
Stephen Tanner
Philip Tedder
Stefano Berri
...

Abstract Motivation: Large-scale rearrangements and copy number changes combined with different modes of clonal evolution create extensive somatic genome diversity, making it difficult to develop versatile and scalable variant calling tools and create well-calibrated benchmarks. Results: We developed a new simulation framework, tHapMix, that enables the creation of tumour samples with different ploidy, purity and polyclonality features. It easily scales to simulation of hundreds of somatic genomes, while re-use of real read data preserves noise and biases present in sequencing platforms. We further demonstrate the utility of tHapMix by creating a simulated set of 140 somatic genomes and showing how it can be used in training and testing of somatic copy number variant calling tools. Availability and implementation: tHapMix is distributed under an open source license and can be downloaded from https://github.com/Illumina/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2017
Author(s):
Zilu Zhou
Weixin Wang
Li-San Wang
Nancy Ruonan Zhang

Abstract Motivation: Copy number variations (CNVs) are gains and losses of DNA segments and have been associated with disease. Many large-scale genetic association studies are performing CNV analysis using whole exome sequencing (WES) and whole genome sequencing (WGS). In many of these studies, previous SNP-array data are available. An integrated cross-platform analysis is expected to improve resolution and accuracy, yet there is no tool for effectively combining data from sequencing and array platforms. The detection of CNVs using sequencing data alone can also be further improved by the utilization of allele-specific reads. Results: We propose a statistical framework, integrated Copy Number Variation detection algorithm (iCNV), which can be applied to multiple study designs: WES only, WGS only, SNP array only, or any combination of SNP and sequencing data. iCNV applies platform-specific normalization, utilizes allele-specific reads from sequencing and integrates matched NGS and SNP-array data by a Hidden Markov Model (HMM). We compare integrated two-platform CNV detection using iCNV to naive intersection or union of platforms and show that iCNV increases sensitivity and robustness. We also assess the accuracy of iCNV on WGS data only, and show that the utilization of allele-specific reads improves CNV detection accuracy compared to existing methods. Availability: https://github.com/zhouzilu/[email protected], [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
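
To give a feel for how a Hidden Markov Model segments copy number along the genome, the following toy sketch runs Viterbi decoding over per-bin log2 depth ratios. It is not iCNV's model: the three states, the Gaussian emissions centred on log2(CN/2), the standard deviation and the transition probabilities are all assumptions chosen for illustration, and no array or allele-specific data is used.

```python
import math

STATES = [1, 2, 3]  # assumed copy-number states: deletion, diploid, duplication


def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(log2_ratios, sd=0.2, p_stay=0.95):
    """Most likely copy-number path for a sequence of per-bin log2 depth ratios."""
    n = len(STATES)
    p_switch = (1.0 - p_stay) / (n - 1)
    log_trans = [
        [math.log(p_stay if i == j else p_switch) for j in range(n)]
        for i in range(n)
    ]
    means = [math.log2(cn / 2.0) for cn in STATES]

    # Initialise with a uniform prior over states.
    score = [math.log(1.0 / n) + log_gauss(log2_ratios[0], m, sd) for m in means]
    backpointers = []
    for x in log2_ratios[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(
                score[best_i] + log_trans[best_i][j] + log_gauss(x, means[j], sd)
            )
        score = new_score
        backpointers.append(pointers)

    # Trace the best path back from the final bin.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


# Toy signal: diploid, a deletion, diploid, a duplication, diploid.
ratios = [0.02, -0.05, 0.01, -0.95, -1.05, -1.0, 0.03, -0.01,
          0.6, 0.55, 0.62, -0.02, 0.04]
segments = viterbi(ratios)
print(segments)
```

The `p_stay` parameter controls how readily the path changes state: values closer to 1 suppress short spurious segments at the cost of missing small CNVs.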


2020
Author(s):
Christopher W. Whelan
Robert E. Handsaker
Giulio Genovese
Seva Kashin
Monkol Lek
...

Abstract Two intriguing forms of genome structural variation (SV), dispersed duplications and de novo rearrangements of complex, multi-allelic loci, have long escaped genomic analysis. We describe a new way to find and characterize such variation by utilizing identity-by-descent (IBD) relationships between siblings together with high-precision measurements of segmental copy number. Analyzing whole-genome sequence data from 706 families, we find hundreds of "IBD-discordant" (IBDD) CNVs: loci at which siblings' CNV measurements and IBD states are mathematically inconsistent. We found that commonly-IBDD CNVs identify dispersed duplications; we mapped 95 of these common dispersed duplications to their true genomic locations through family-based linkage and population linkage disequilibrium (LD), and found several to be in strong LD with genome-wide association (GWAS) signals for common diseases or gene expression variation at their revealed genomic locations. Other CNVs that were IBDD in a single family appear to involve de novo mutations in complex and multi-allelic loci; we identified 26 de novo structural mutations that had not been previously detected in earlier analyses of the same families by diverse SV analysis methods. These included a de novo mutation of the amylase gene locus and multiple de novo mutations at chromosome 15q14. Combining these complex mutations with more-conventional CNVs, we estimate that segmental mutations larger than 1 kb arise in about one in 22 human meioses. These methods are complementary to previous techniques in that they interrogate genomic regions that are home to segmental duplication, high CNV allele frequencies, and multi-allelic CNVs. Author Summary: Copy number variation is an important form of genetic variation in which individuals differ in the number of copies of segments of their genomes. Certain aspects of copy number variation have traditionally been difficult to study using short-read sequencing data.
For example, standard analyses often cannot tell whether the duplicated copies of a segment are located near the original copy or are dispersed to other regions of the genome. Another aspect of copy number variation that has been difficult to study is the detection of mutations in the copy number of DNA segments passed down from parents to their children, particularly when the mutations affect genome segments which already display common copy number variation in the population. We develop an analytical approach to solving these problems when sequencing data is available for all members of families with at least two children. This method is based on determining the number of parental haplotypes the two siblings share at each location in their genome, and using that information to determine the possible inheritance patterns that might explain the copy numbers we observe in each family member. We show that dispersed duplications and mutations can be identified by looking for copy number variants that do not follow these expected inheritance patterns. We use this approach to determine the location of 95 common duplications which are dispersed to distant regions of the genome, and demonstrate that these duplications are linked to genetic variants that affect disease risk or gene expression levels. We also identify a set of copy number mutations not detected by previous analyses of sequencing data from a large cohort of families, and show that repetitive and complex regions of the genome undergo frequent mutations in copy number.
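
The inheritance-pattern check described in the author summary can be sketched as a small enumeration. This is a simplified illustration under stated assumptions, not the authors' implementation: copy numbers are exact integers, each parent's total is split across two haplotypes, there is no measurement noise, and the function name `ibd_consistent` is hypothetical.

```python
def ibd_consistent(father_cn, mother_cn, sib1_cn, sib2_cn, ibd):
    """Return True if some split of parental copies across haplotypes explains
    both siblings' copy numbers, given how many parental haplotypes (0, 1 or 2)
    the siblings share at this locus."""
    if ibd not in (0, 1, 2):
        raise ValueError("ibd must be 0, 1 or 2")
    for f1 in range(father_cn + 1):          # copies on paternal haplotype 1
        f2 = father_cn - f1                  # copies on paternal haplotype 2
        for m1 in range(mother_cn + 1):      # copies on maternal haplotype 1
            m2 = mother_cn - m1              # copies on maternal haplotype 2
            # Label the haplotypes so that sibling 1 inherits f1 and m1.
            if f1 + m1 != sib1_cn:
                continue
            if ibd == 2:
                sib2_options = [f1 + m1]            # same two haplotypes
            elif ibd == 1:
                sib2_options = [f1 + m2, f2 + m1]   # shares one haplotype
            else:
                sib2_options = [f2 + m2]            # shares neither
            if sib2_cn in sib2_options:
                return True
    return False


# Siblings sharing both haplotypes must have equal copy number.
print(ibd_consistent(2, 2, 2, 3, ibd=2))
# Siblings sharing no haplotype together carry every parental copy.
print(ibd_consistent(3, 2, 3, 2, ibd=0))
```

A locus where this check fails in many families is a candidate dispersed duplication, since the extra copy segregates with a different genomic location; a failure confined to a single family is instead a candidate de novo copy number mutation.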

