Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes

Science ◽  
2012 ◽  
Vol 337 (6090) ◽  
pp. 64-69 ◽  
Author(s):  
Jacob A. Tennessen ◽  
Abigail W. Bigham ◽  
Timothy D. O’Connor ◽  
Wenqing Fu ◽  
Eimear E. Kenny ◽  
...  

As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
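The rarity criterion used in this abstract (minor allele frequency below 0.5%) can be illustrated with a minimal sketch. The function names and the allele counts in the usage note are hypothetical and are not taken from the paper; the only value drawn from the abstract is the 0.5% threshold.

```python
# Minimal sketch: classify a variant as "rare" by minor allele frequency (MAF).
# Function names and example counts are illustrative assumptions; only the
# 0.5% threshold comes from the abstract.

RARE_MAF_THRESHOLD = 0.005  # 0.5%, as stated in the abstract


def minor_allele_frequency(alt_allele_count: int, total_alleles: int) -> float:
    """Return the frequency of the less common allele at a biallelic site."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 <= alt_allele_count <= total_alleles:
        raise ValueError("alt_allele_count must lie between 0 and total_alleles")
    alt_frequency = alt_allele_count / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_frequency, 1.0 - alt_frequency)


def is_rare(alt_allele_count: int, total_alleles: int,
            threshold: float = RARE_MAF_THRESHOLD) -> bool:
    """True when the MAF falls strictly below the threshold."""
    return minor_allele_frequency(alt_allele_count, total_alleles) < threshold
```

For example, in a sample of 2440 diploid individuals (4880 chromosomes), a variant observed on 10 chromosomes would count as rare under this rule, while one observed on 100 chromosomes would not.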

2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Yoshihiro Nawa ◽  
Hiroki Kimura ◽  
Daisuke Mori ◽  
Hidekazu Kato ◽  
Miho Toyama ◽  
...  

Abstract Disabled 1 (DAB1) is an intracellular adaptor protein in the Reelin signaling pathway and plays an essential role in correct neuronal migration and layer formation in the developing brain. DAB1 has been repeatedly reported to be associated with neurodevelopmental disorders including schizophrenia (SCZ) and autism spectrum disorders (ASD) in genetic, animal, and postmortem studies. Recently, increasing attention has been given to rare single-nucleotide variants (SNVs) found by deep sequencing of candidate genes. In this study, we performed exon-targeted resequencing of DAB1 in 370 SCZ and 192 ASD patients using next-generation sequencing technology to identify rare SNVs with a minor allele frequency <1%. We detected two rare missense mutations (G382C, V129I) and then performed a genetic association study in a sample comprising 1763 SCZ, 380 ASD, and 2190 healthy control subjects. Although no statistically significant association with the detected mutations was observed for either SCZ or ASD, G382C was found only in the case group, and in silico analyses and in vitro functional assays suggested that G382C alters the function of the DAB1 protein. The rare variants of DAB1 found in the present study should be studied further to elucidate their potential functional relevance to the pathophysiology of SCZ and ASD.


2020 ◽  
Vol 21 (11) ◽  
pp. 1068-1077
Author(s):  
Xiaochao Sun ◽  
Bin Yang ◽  
Qunye Zhang

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. We also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases and the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings have added new insights into our understanding of the spatial distribution of genome and disease-related genes across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.


2021 ◽  
Vol 22 (4) ◽  
pp. 1876
Author(s):  
Frida Belinky ◽  
Ishan Ganguly ◽  
Eugenia Poliakov ◽  
Vyacheslav Yurchenko ◽  
Igor B. Rogozin

Nonsense mutations turn a coding (sense) codon into an in-frame stop codon that is assumed to result in a truncated protein product. Thus, nonsense substitutions are the hallmark of pseudogenes and are used to identify them. Here we show that in-frame stop codons within bacterial protein-coding genes are widespread. Their evolutionary conservation suggests that many of them are not pseudogenes, since they maintain dN/dS values (ratios of substitution rates at non-synonymous and synonymous sites) significantly lower than 1 (this is a signature of purifying selection in protein-coding regions). We also found that double substitutions in codons—where an intermediate step is a nonsense substitution—show a higher rate of evolution compared to null models, indicating that a stop codon was introduced and then changed back to sense via positive selection. This further supports the notion that nonsense substitutions in bacteria are relatively common and do not necessarily cause pseudogenization. In-frame stop codons may be an important mechanism of regulation: Such codons are likely to cause a substantial decrease of protein expression levels.
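The dN/dS ratio mentioned in this abstract can be sketched as a ratio of two per-site substitution proportions. This is a deliberately simplified illustration, not the estimation procedure used in the paper: it applies no correction for multiple substitutions at the same site, and the function name and the counts in the usage note are hypothetical.

```python
# Minimal sketch: uncorrected dN/dS from substitution and site counts.
# This is a simplification for illustration (no multiple-hit correction);
# the function name and inputs are assumptions, not the paper's method.


def dn_ds_ratio(nonsyn_substitutions: float, nonsyn_sites: float,
                syn_substitutions: float, syn_sites: float) -> float:
    """Ratio of non-synonymous to synonymous substitutions per site."""
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    dn = nonsyn_substitutions / nonsyn_sites  # substitutions per non-synonymous site
    ds = syn_substitutions / syn_sites        # substitutions per synonymous site
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds
```

With 5 non-synonymous substitutions over 300 non-synonymous sites and 20 synonymous substitutions over 100 synonymous sites, the ratio is well below 1, the pattern the abstract describes as a signature of purifying selection.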


2020 ◽  
Author(s):  
Celine Charon ◽  
Rodrigue Allodji ◽  
Vincent Meyer ◽  
Jean-François Deleuze

Abstract Quality control methods for genome-wide association studies and fine mapping are commonly used for imputation; however, they result in the loss of many single nucleotide polymorphisms (SNPs). To investigate the consequences of filtration on imputation, we studied the direct effects on the number of markers, their allele frequencies, imputation quality scores and post-filtration events. We pre-phased 1,031 genotyped individuals from diverse ethnicities and compared the imputed variants to 1,089 NCBI-recorded individuals for additional validation. Without variant pre-filtration based on quality control (QC), we observed no impairment in the imputation of SNPs that failed QC, whereas with pre-filtration there was an overall loss of information. Significant differences between frequencies with and without pre-filtration were found only in the range of very rare (5E-04-1E-03) and rare variants (1E-03-5E-03) (p < 1E-04). Increasing the post-filtration imputation quality score from 0.3 to 0.8 reduced the number of single nucleotide variants (SNVs) with a frequency <0.001 2.5-fold with or without QC pre-filtration and halved the number of very rare variants (5E-04). As a result, to maintain confidence and enough SNVs, we propose here a 2-step post-filtration approach to increase the number of very rare and rare variants compared to conservative post-filtration methods.
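The effect of the post-filtration quality threshold described in this abstract can be illustrated with a small sketch that counts how many imputed variants survive a given cutoff in each frequency bin. The frequency bins and the two thresholds (0.3 and 0.8) come from the abstract; the data structure, function name, and toy variants are hypothetical, and the sketch does not implement the authors' 2-step approach, whose details the abstract does not give.

```python
# Minimal sketch: count imputed variants that pass a quality-score cutoff,
# grouped by the frequency bins named in the abstract. The toy data and all
# names here are illustrative assumptions, not the authors' pipeline.
from typing import Dict, Iterable, NamedTuple


class ImputedVariant(NamedTuple):
    maf: float      # minor allele frequency
    quality: float  # imputation quality score, assumed to lie in [0, 1]


# Frequency bins as given in the abstract (lower bound inclusive, upper exclusive).
MAF_BINS = {
    "very_rare": (5e-4, 1e-3),
    "rare": (1e-3, 5e-3),
}


def count_passing_by_bin(variants: Iterable[ImputedVariant],
                         min_quality: float) -> Dict[str, int]:
    """Count variants per MAF bin whose quality score is at least min_quality."""
    counts = {name: 0 for name in MAF_BINS}
    for variant in variants:
        if variant.quality < min_quality:
            continue  # filtered out by the post-imputation quality cutoff
        for name, (lower, upper) in MAF_BINS.items():
            if lower <= variant.maf < upper:
                counts[name] += 1
                break
    return counts


# Toy input, invented purely for illustration.
toy_variants = [
    ImputedVariant(maf=6e-4, quality=0.35),
    ImputedVariant(maf=8e-4, quality=0.85),
    ImputedVariant(maf=2e-3, quality=0.50),
    ImputedVariant(maf=4e-3, quality=0.90),
    ImputedVariant(maf=2e-2, quality=0.95),  # common variant, outside both bins
]
```

Calling `count_passing_by_bin(toy_variants, 0.3)` and then `count_passing_by_bin(toy_variants, 0.8)` on the toy input shows the stricter cutoff discarding low-frequency variants that the lenient one keeps.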


GigaScience ◽  
2019 ◽  
Vol 8 (10) ◽  
Author(s):  
Bo Song ◽  
Yue Song ◽  
Yuan Fu ◽  
Elizabeth Balyejusa Kizito ◽  
Sandra Ndagire Kamenya ◽  
...  

Abstract Background The African eggplant (Solanum aethiopicum) is a nutritious traditional vegetable used in many African countries, including Uganda and Nigeria. It is thought to have been domesticated in Africa from its wild relative, Solanum anguivi. S. aethiopicum has been routinely used as a source of disease resistance genes for several Solanaceae crops, including Solanum melongena. A lack of genomic resources has meant that breeding of S. aethiopicum has lagged behind other vegetable crops. Results We assembled a 1.02-Gb draft genome of S. aethiopicum, which contained predominantly repetitive sequences (78.9%). We annotated 37,681 gene models, including 34,906 protein-coding genes. Expansion of disease resistance genes was observed via 2 rounds of amplification of long terminal repeat retrotransposons, which may have occurred ∼1.25 and 3.5 million years ago, respectively. By resequencing 65 S. aethiopicum and S. anguivi genotypes, 18,614,838 single-nucleotide polymorphisms were identified, of which 34,171 were located within disease resistance genes. Analysis of domestication and demographic history revealed active selection for genes involved in drought tolerance in both “Gilo” and “Shum” groups. A pan-genome of S. aethiopicum was assembled, containing 51,351 protein-coding genes; 7,069 of these genes were missing from the reference genome. Conclusions The genome sequence of S. aethiopicum enhances our understanding of its biotic and abiotic resistance. The single-nucleotide polymorphisms identified are immediately available for use by breeders. The information provided here will accelerate selection and breeding of the African eggplant, as well as other crops within the Solanaceae family.


eLife ◽  
2013 ◽  
Vol 2 ◽  
Author(s):  
Hume Stroud ◽  
Bo Ding ◽  
Stacey A Simon ◽  
Suhua Feng ◽  
Maria Bellizzi ◽  
...  

Most transgenic crops are produced through tissue culture. The impact of utilizing such methods on the plant epigenome is poorly understood. Here we generated whole-genome, single-nucleotide resolution maps of DNA methylation in several regenerated rice lines. We found that all tested regenerated plants had significant losses of methylation compared to non-regenerated plants. Loss of methylation was largely stable across generations, and certain sites in the genome were particularly susceptible to loss of methylation. Loss of methylation at promoters was associated with deregulated expression of protein-coding genes. Analyses of callus and untransformed plants regenerated from callus indicated that loss of methylation is stochastically induced at the tissue culture step. These changes in methylation may explain a component of somaclonal variation, a phenomenon in which plants derived from tissue culture manifest phenotypic variability.


2017 ◽  
Vol 114 (34) ◽  
pp. 9158-9163 ◽  
Author(s):  
Steven Timmermans ◽  
Marc Van Montagu ◽  
Claude Libert

Mouse inbred strains remain essential in science. We have analyzed the publicly available genome sequences of 36 popular inbred strains and provide lists for each strain of protein-coding genes that acquired sequence variations that cause premature STOP codons, loss of STOP codons and single nucleotide polymorphisms, and short in-frame insertions and deletions. Our data give an overview of predicted defective proteins, including predicted impact scores, of all these strains compared with the reference mouse genome of C57BL/6J. These data can also be retrieved via a searchable website (mousepost.be) and allow a global, better interpretation of genetic background effects and a source of naturally defective alleles in these 36 sequenced classical and high-priority mouse inbred strains.


2019 ◽  
Author(s):  
Chang Li ◽  
Michael D. Swartz ◽  
Bing Yu ◽  
Yongsheng Bai ◽  
Xiaoming Liu

Abstract microRNAs (miRNAs) are short non-coding RNAs that can repress the expression of protein-coding messenger RNAs (mRNAs) by binding to the 3’UTR of the target. Genetic mutations such as single nucleotide variants (SNVs) in the 3’UTR of the mRNAs can disrupt this regulatory effect. In this study, we present dbMTS, the database for miRNA target site (MTS) SNVs, which includes all potential MTS SNVs in the 3’UTR of the human genome along with hundreds of functional annotations. This database can help studies easily identify putative SNVs that affect miRNA targeting and facilitate the prioritization of their functional importance. dbMTS is freely available at: https://sites.google.com/site/jpopgen/dbNSFP.


2018 ◽  
Author(s):  
Leandro Radusky ◽  
Carlos Modenutti ◽  
Javier Delgado ◽  
Juan P. Bustamante ◽  
Sebastian Vishnopolska ◽  
...  

Abstract Understanding the functional effect of Single Amino acid Substitutions (SAS), derived from the occurrence of single nucleotide variants (SNVs), and their relation to disease development is a major issue in clinical genomics. Even though there are several bioinformatic algorithms and servers that predict whether a SAS can be pathogenic or not, they give little or no information on the actual effect on the protein function. Moreover, many of these algorithms are able to predict an effect that does not necessarily translate directly into pathogenicity. VarQ Web Server is an online tool that, given a UniProt ID, automatically analyzes known and user-provided SAS for their effect on protein activity, folding, aggregation and protein interactions, among others. VarQ assessment was performed over a set of previously manually curated variants, showing its ability to correctly predict the phenotypic outcome and its underlying cause. This resource is available online at http://varq.qb.fcen.uba.ar/. Contact: [email protected]. Information & Tutorials may be found on the webpage of the tool.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Liubov O. Skorodumova ◽  
Alexandra V. Belodedova ◽  
Elena I. Sharova ◽  
Elena S. Zakharova ◽  
Liliia N. Iulmetova ◽  
...  

Abstract Background Keratoconus is a chronic degenerative disorder of the cornea characterized by thinning and cone-shaped protrusions. Although genetic factors play a key role in keratoconus development, the etiology is still under investigation. The occurrence of single-nucleotide polymorphisms (SNPs) associated with keratoconus in Russian patients is poorly studied. The purpose of this study was to validate whether three reported keratoconus-associated SNPs (rs1536482 near the COL5A1 gene, rs2721051 near the FOXO1 gene, rs1324183 near the MPDZ gene) are also relevant in a Russian cohort of patients. Additionally, we investigated the COL5A1 promoter sequence for single-nucleotide variants (SNVs) in a subgroup of keratoconus patients with at least one rs1536482 minor allele (rs1536482+) to assess the role of these SNVs in keratoconus susceptibility associated with rs1536482. Methods This case-control study included 150 keratoconus patients and two control groups (main and additional, 205 and 474 participants, respectively). We performed PCR targeting regions flanking SNVs and the COL5A1 promoter, followed by Sanger sequencing of amplicons. The additional control group was genotyped using an SNP array. Results The minor allele frequency was significantly different between the keratoconus and control cohorts (main and combined) for rs1536482, rs2721051, and rs1324183 (p-value < 0.05). The rare variants rs1043208782 and rs569248712 were found in the COL5A1 promoter in two out of 94 rs1536482+ keratoconus patients. Conclusion rs1536482, rs2721051, and rs1324183 were associated with keratoconus in a Russian cohort. SNVs in the COL5A1 promoter do not play a major role in keratoconus susceptibility associated with rs1536482.

