Allele-specific binding of RNA-binding proteins reveals functional genetic variants in the RNA

2018 ◽  
Author(s):  
Ei-Wen Yang ◽  
Jae Hoon Bahn ◽  
Esther Yun-Hua Hsiao ◽  
Boon Xin Tan ◽  
Yiwei Sun ◽  
...  

Abstract Allele-specific protein-RNA binding is an essential aspect that may reveal functional genetic variants influencing RNA processing and gene expression phenotypes. Recently, genome-wide detection of in vivo binding sites of RNA-binding proteins (RBPs) has been greatly facilitated by the enhanced UV crosslinking and immunoprecipitation (eCLIP) protocol. Hundreds of eCLIP-Seq data sets were generated from HepG2 and K562 cells during the ENCODE3 phase. These data afford a valuable opportunity to examine allele-specific binding (ASB) of RBPs. To this end, we developed a new computational algorithm called BEAPR (Binding Estimation of Allele-specific Protein-RNA interaction). In identifying statistically significant ASB sites, BEAPR takes into account the sequence propensity induced by UV crosslinking and the technical variation between replicated experiments. Using simulated data and actual eCLIP-Seq data, we show that BEAPR largely outperforms two commonly used methods, the chi-squared test and Fisher's exact test. Importantly, BEAPR overcomes the inherent over-dispersion problem of the other methods. Complemented by experimental validations, we demonstrate that ASB events are significantly associated with genetic regulation of splicing and mRNA abundance, supporting the use of this method to pinpoint functional genetic variants in post-transcriptional gene regulation. Many variants with ASB patterns of RBPs were found to be genetic variants with relevance to cancer or other diseases. About 38% of ASB variants were in linkage disequilibrium with single nucleotide polymorphisms from genome-wide association studies. Overall, our results suggest that BEAPR is an effective method to reveal ASB patterns in eCLIP data and can inform functional interpretation of disease-related genetic variants.
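For orientation, one of the baseline tests named in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not BEAPR itself: it assumes the baseline is a chi-squared goodness-of-fit test of pooled reference and alternative read counts at one variant against a 50:50 expectation, and the function name and the read counts are hypothetical. Because it treats every read as an independent draw and ignores replicate-to-replicate variation, it is the kind of test the abstract describes as affected by over-dispersion.

```python
import math


def chi2_allelic_test(ref_reads: int, alt_reads: int) -> tuple[float, float]:
    """Chi-squared goodness-of-fit test of allelic read counts against 50:50.

    Hypothetical helper for illustration; the abstract does not specify how
    the baseline tests were set up.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this variant")
    expected = total / 2.0
    stat = ((ref_reads - expected) ** 2 + (alt_reads - expected) ** 2) / expected
    # Survival function of a chi-squared variable with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up counts: 60 reads carry the reference allele, 40 the alternative.
stat, p = chi2_allelic_test(60, 40)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

With deep coverage, even a small imbalance yields a small p-value under this test, which is one reason a model that accounts for replicate variation may be preferable.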

2018 ◽  
Author(s):  
Emad Bahrami-Samani ◽  
Yi Xing

Abstract Gene expression is tightly regulated at the post-transcriptional level through splicing, transport, translation, and decay. RNA-binding proteins (RBPs) play key roles in post-transcriptional gene regulation, and genetic variants that alter RBP-RNA interactions can affect gene products and functions. We developed a computational method, ASPRIN (Allele-Specific Protein-RNA Interaction), that uses a joint analysis of CLIP-seq (cross-linking and immunoprecipitation followed by high-throughput sequencing) and RNA-seq data to identify genetic variants that alter RBP-RNA interactions by directly observing the allelic preference of an RBP in CLIP-seq experiments as compared to RNA-seq. We used ASPRIN to systematically analyze CLIP-seq and RNA-seq data for 166 RBPs in two ENCODE (Encyclopedia of DNA Elements) cell lines. ASPRIN identified genetic variants that alter RBP-RNA interactions by modifying RBP binding motifs within RNA. Moreover, through an integrative ASPRIN analysis with population-scale RNA-seq data, we showed that ASPRIN can help reveal potential causal variants that affect alternative splicing via allele-specific protein-RNA interactions.
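The idea of comparing allelic preference in CLIP-seq against RNA-seq can be illustrated with a 2×2 contingency table. The sketch below is an assumption for illustration only and should not be read as the ASPRIN model: it applies a two-sided Fisher's exact test to reference and alternative read counts from the two assays, using only the Python standard library. The function names and all counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def allelic_preference_pvalue(clip_ref: int, clip_alt: int,
                              rna_ref: int, rna_alt: int) -> float:
    """Hypothetical wrapper: rows are assays, columns are alleles."""
    return fisher_exact_two_sided(clip_ref, clip_alt, rna_ref, rna_alt)


# Made-up counts: CLIP-seq favours the reference allele, RNA-seq is balanced.
print(allelic_preference_pvalue(30, 10, 20, 20))
```

Using RNA-seq as the second row means that an allelic imbalance already present at the expression level is not, by itself, reported as a binding preference.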


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Ei-Wen Yang ◽  
Jae Hoon Bahn ◽  
Esther Yun-Hua Hsiao ◽  
Boon Xin Tan ◽  
Yiwei Sun ◽  
...  

2020 ◽  
Author(s):  
Min Zhao ◽  
Hong Qu

Abstract Background: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. Results: We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. Conclusion: We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at http://soft.bioinfo-minzhao.org/circvar .


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Min Zhao ◽  
Hong Qu

Abstract Background: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. Results: We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. Conclusion: We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at http://soft.bioinfo-minzhao.org/circvar.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Jordy Homing Lam ◽  
Yu Li ◽  
Lizhe Zhu ◽  
Ramzan Umarov ◽  
Hanlun Jiang ◽  
...  

Abstract Protein-RNA interactions play important roles in post-transcriptional regulation. However, predicting these interactions from a protein structure is difficult. Here we show that, by leveraging a deep learning model, NucleicNet, attributes such as the binding preference of RNA backbone constituents and different bases can be predicted from local physicochemical characteristics of the protein structure surface. On a diverse set of challenging RNA-binding proteins, including Fem-3-binding-factor 2, Argonaute 2 and Ribonuclease III, NucleicNet can accurately recover interaction modes discovered by structural biology experiments. Furthermore, we show that, without seeing any in vitro or in vivo assay data, NucleicNet can still achieve consistency with experiments, including RNAcompete, Immunoprecipitation Assay, and siRNA Knockdown Benchmark. NucleicNet can thus serve to provide quantitative fitness of RNA sequences for given binding pockets or to predict potential binding pockets and binding RNAs for previously unknown RNA binding proteins.


2012 ◽  
Vol 3 (5) ◽  
pp. 403-414 ◽  
Author(s):  
Jochen Imig ◽  
Alexander Kanitz ◽  
André P. Gerber

Abstract The development of genome-wide analysis tools has prompted global investigation of the gene expression program, revealing highly coordinated control mechanisms that ensure proper spatiotemporal activity of a cell’s macromolecular components. With respect to the regulation of RNA transcripts, the concept of RNA regulons, which – by analogy with DNA regulons in bacteria – refers to the coordinated control of functionally related RNA molecules, has emerged as a unifying theory that describes the logic of regulatory RNA-protein interactions in eukaryotes. Hundreds of RNA-binding proteins and small non-coding RNAs, such as microRNAs, bind to distinct elements in target RNAs, thereby exerting specific and concerted control over posttranscriptional events. In this review, we discuss recent reports that systematically explore the RNA-protein interaction network and outline some of the principles and recurring features of RNA regulons: the coordination of functionally related mRNAs through RNA-binding proteins or non-coding RNAs, the modular structure of its components, and the dynamic rewiring of RNA-protein interactions upon exposure to internal or external stimuli. We also summarize evidence for robust combinatorial control of mRNAs, which could determine the ultimate fate of each mRNA molecule in a cell. Finally, the compilation and integration of global protein-RNA interaction data have yielded first insights into network structures and provided the hypothesis that RNA regulons may, in part, constitute noise ‘buffers’ to handle stochasticity in cellular transcription.


2016 ◽  
Vol 12 (2) ◽  
pp. 532-540 ◽  
Author(s):  
Pritha Ghosh ◽  
R. Sowdhamini

We have classified the existing RNA-binding protein (RBP) structures into different structural families. Here, we report ∼2600 proteins with RBP signatures in humans.


2021 ◽  
Author(s):  
Alexander Kitaygorodsky ◽  
Emily Jin ◽  
Yufeng Shen

RNA-binding proteins (RBPs) are important regulators of transcriptional and post-transcriptional processes. Computational prediction of localized RBP binding affinity with transcripts is important for interpreting genetic variation, especially variants outside protein-coding regions. Here we describe POLARIS (Prediction Of Localized Affinity for RBPs In Sequence), a new deep-learning method for fast, site-specific prediction of the binding affinity of RBPs to the transcribed genome. POLARIS has two modules: (1) a convolutional neural network (CNN) that predicts overall RBP binding within a region based on transcript sequence content and expression level; and (2) a Gradient-weighted Class Activation Mapping (GradCAM) implementation for efficient signal backpropagation to individual sequence positions. We trained the model using enhanced crosslinking and immunoprecipitation (eCLIP) data from ENCODE. POLARIS performs well, with a median AUC of about 0.96 for 160 RBPs across three different cell lines, substantially higher than selected popular published methods trained and tested on the same data sets. When tested on data from a different cell line with the same RBPs, the overall performance is maintained, supporting its ability to make cell-type-specific affinity predictions. Finally, the GradCAM module allows the model to identify the informative sites in a region that drive prediction. The localized prediction facilitates interpretation of the results and provides a basis for inferring the functional impact of noncoding variants.
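The second module can be pictured with the generic Grad-CAM recipe applied to a one-dimensional sequence. The sketch below is not the POLARIS code: it assumes that activations of the last convolutional layer and the gradients of the predicted binding score with respect to them are already available as plain lists, and the function name and all numbers are hypothetical. Each channel is weighted by its position-averaged gradient, the weighted channels are summed at each position, and negative values are clipped to zero.

```python
def grad_cam_1d(activations: list[list[float]],
                gradients: list[list[float]]) -> list[float]:
    """Generic 1D Grad-CAM: one importance score per sequence position.

    activations[k][i] and gradients[k][i] refer to channel k at position i.
    """
    n_pos = len(activations[0])
    # Channel weight = gradient averaged over all positions
    weights = [sum(g) / len(g) for g in gradients]
    cam = []
    for i in range(n_pos):
        score = sum(w * a[i] for w, a in zip(weights, activations))
        cam.append(max(0.0, score))  # ReLU keeps only positive evidence
    return cam


# Toy example with two channels and three sequence positions (made-up values).
activations = [[1.0, 0.0, 2.0],
               [0.0, 3.0, 1.0]]
gradients = [[0.5, 0.5, 0.5],
             [-1.0, -1.0, -1.0]]
print(grad_cam_1d(activations, gradients))  # [0.5, 0.0, 0.0]
```

In this toy case only the first position receives a positive score, so it would be reported as the site driving the prediction.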

