SliceIt: A genome-wide resource and visualization tool to design CRISPR/Cas9 screens for editing protein-RNA interaction sites in the human genome

Mapping Intimacies ◽

10.1101/654640 ◽

2019 ◽

Author(s):

Sasank Vemuri ◽

Rajneesh Srivastava ◽

Quoseena Mir ◽

Seyedsasan Hashemikhabir ◽

X. Charlie Dong ◽

...

Keyword(s):

Binding Site ◽

High Throughput ◽

Binding Sites ◽

Genome Browser ◽

Expression Levels ◽

Guide Rna ◽

Exon Expression ◽

Interaction Sites ◽

Gene Search ◽

Rna Interaction

Abstract:

Several protein-RNA cross-linking protocols have been established in recent years to delineate the molecular interaction of an RNA-binding protein (RBP) and its target RNAs. However, functional dissection of the role of RBP binding sites in modulating the post-transcriptional fate of the target RNA remains challenging. The CRISPR/Cas9 genome editing system is commonly employed to perturb both coding and noncoding regions in the genome. With advancements in genome-scale CRISPR/Cas9 screens, it is now possible not only to perturb specific binding sites but also to probe the global impact of protein-RNA interaction sites across cell types. Here, we present SliceIt (http://sliceit.soic.iupui.edu/), a database of an in silico sgRNA (single guide RNA) library to facilitate such high-throughput screens. SliceIt comprises ~4.8 million unique sgRNAs, with an estimated range of 2–8 sgRNAs designed per RBP binding site, for eCLIP experiments of >100 RBPs in HepG2 and K562 cell lines from the ENCODE project. SliceIt provides a user-friendly environment, developed using the Elasticsearch search engine framework. It is available in both table and genome browser views, facilitating easy navigation of RBP binding sites, designed sgRNAs, and exon expression levels across 53 human tissues, along with the prevalence of SNPs and GWAS hits on binding sites. Exon expression profiles enable examination of locus-specific changes proximal to the binding sites. Users can also upload custom tracks in various file formats directly onto the genome browser to navigate additional genomic features and compare with other types of omics profiles. All the binding site-centric information is dynamically accessible via “search by gene”, “search by coordinates” and “search by RBP” options and is readily available to download. Validation of the sgRNA library in SliceIt was performed by selecting RBP binding sites in the Lipt1 gene and designing sgRNAs. The effect of CRISPR/Cas9 perturbations on the selected binding sites in the HepG2 cell line was confirmed based on altered proximal exon expression levels using qPCR, further supporting the utility of the resource for designing experiments that perturb protein-RNA interaction networks. Thus, SliceIt provides a one-stop repertoire of guide RNAs to perturb RBP binding sites, along with several layers of functional information to design both low- and high-throughput CRISPR/Cas9 screens for studying the phenotypes and diseases associated with RBP binding sites.
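To make the idea of designing several sgRNAs per binding site concrete, the following is a minimal sketch of how candidate guides might be enumerated within a binding-site sequence. It is not SliceIt's pipeline, which the abstract does not describe. The assumptions are ours: an SpCas9-style NGG PAM, a 20-nt protospacer, and the hypothetical function names `revcomp` and `find_candidate_guides`. Off-target scoring and efficiency filtering are left out.

```python
import re

# Complement table for the reverse strand (assumes an unambiguous ACGT sequence).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidate_guides(seq, guide_len=20):
    """List (strand, start, protospacer, pam) for every NGG PAM site.

    Illustrative assumption: SpCas9 with an NGG PAM directly 3' of the
    protospacer. Start positions on the "-" strand are offsets into the
    reverse-complemented sequence, not into the original one.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    # Toy sequence standing in for a binding-site window.
    for hit in find_candidate_guides("A" * 20 + "TGG"):
        print(hit)
```

In practice the window would be a binding-site interval plus flanking bases taken from a reference genome, and the raw candidates would then be filtered before use.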

Overcoming the design, build, test bottleneck for synthesis of nonrepetitive protein-RNA cassettes

Nature Communications ◽

10.1038/s41467-021-21578-6 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Noa Katz ◽

Eitamar Tripto ◽

Naor Granik ◽

Sarah Goldberg ◽

Orna Atar ◽

...

Keyword(s):

Binding Site ◽

Binding Sites ◽

Mammalian Cells ◽

Coat Proteins ◽

Structural Determinants ◽

Design Build ◽

Machine Learning Approach ◽

Successful Design ◽

Experimental Findings ◽

Rna Interaction

Abstract:

We apply an oligo-library and machine-learning approach to characterize the sequence and structural determinants of binding of the phage coat proteins (CPs) of bacteriophages MS2 (MCP), PP7 (PCP), and Qβ (QCP) to RNA. Using the oligo library, we generate thousands of candidate binding sites for each CP, and screen for binding using a high-throughput dose-response Sort-seq assay (iSort-seq). We then apply a neural network to expand this space of binding sites, which allows us to identify the critical structural and sequence features for binding of each CP. To verify our model and experimental findings, we design several non-repetitive binding site cassettes and validate their functionality in mammalian cells. We find that the binding of each CP to RNA is characterized by a unique space of sequence and structural determinants, thus providing a more complete description of CP-RNA interaction as compared with previous low-throughput findings. Finally, based on the binding spaces, we demonstrate a computational tool for the successful design and rapid synthesis of functional non-repetitive binding-site cassettes.

PRIME-3D2D is a 3D2D model to predict binding sites of protein–RNA interaction

Communications Biology ◽

10.1038/s42003-020-1114-y ◽

2020 ◽

Vol 3 (1) ◽

Author(s):

Juan Xie ◽

Jinfang Zheng ◽

Xu Hong ◽

Xiaoxue Tong ◽

Shiyong Liu

Keyword(s):

Full Advantage ◽

High Throughput ◽

Binding Sites ◽

High Throughput Sequencing ◽

Secondary Structures ◽

Biological Processes ◽

Genome Wide ◽

Rna Interaction ◽

Rna Complexes ◽

Better Than

Abstract:

Protein–RNA interaction participates in many biological processes, so studying it can help us understand the function of proteins and RNA. Although the protein–RNA 3D3D model, like PRIME, is useful in building 3D structural complexes, it cannot be used genome-wide because RNA 3D structures are lacking. To take full advantage of RNA secondary structures revealed by high-throughput sequencing, we present PRIME-3D2D to predict binding sites of protein–RNA interaction. PRIME-3D2D is almost as good as PRIME at modeling protein–RNA complexes. PRIME-3D2D can be used to predict binding sites on PDB data (MCC = 0.75/0.70 for binding sites in protein/RNA) and transcriptome-wide (MCC = 0.285 for binding sites in RNA). Testing on PDB and yeast transcriptome-wide data shows that PRIME-3D2D performs better than other binding-site predictors. PRIME-3D2D can therefore be used to predict binding sites both on PDB data and genome-wide, and it is freely available.
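The MCC values quoted above are Matthews correlation coefficients, which summarize a binary prediction (binding site or not) in a single number between -1 and 1. As a minimal sketch of how the metric is computed from confusion-matrix counts, assuming the standard definition and a hypothetical helper name `mcc` (the abstract does not say how the authors computed it):

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives. Returns 0.0 when the
    denominator vanishes, which is a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Toy counts, not taken from the paper.
    print(mcc(tp=5, tn=5, fp=0, fn=0))  # perfect prediction
    print(mcc(tp=5, tn=5, fp=5, fn=5))  # no better than chance
```

Unlike plain accuracy, this score stays informative when binding residues are a small minority of all positions, which is the usual situation in binding-site prediction.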

Determination of Protein−RNA Interaction Sites in the Cbf5-H/ACA Guide RNA Complex by Mass Spectrometric Protein Footprinting

Biochemistry ◽

10.1021/bi701606m ◽

2008 ◽

Vol 47 (6) ◽

pp. 1500-1510 ◽

Cited By ~ 11

Author(s):

Daniel L. Baker ◽

Nicholas T. Seyfried ◽

Hong Li ◽

Ron Orlando ◽

Rebecca M. Terns ◽

...

Keyword(s):

Mass Spectrometric ◽

Protein Footprinting ◽

Guide Rna ◽

Interaction Sites ◽

Rna Interaction ◽

Rna Complex

Overcoming the design, build, test (DBT) bottleneck for synthesis of nonrepetitive protein-RNA binding cassettes for RNA applications

10.1101/2019.12.24.886168 ◽

2019 ◽

Cited By ~ 1

Author(s):

Noa Katz ◽

Eitamar Tripto ◽

Sarah Goldberg ◽

Orna Atar ◽

Zohar Yakhini ◽

...

Keyword(s):

Binding Site ◽

High Throughput ◽

Binding Sites ◽

Mammalian Cells ◽

Rna Binding ◽

Coat Proteins ◽

Binding Affinities ◽

Predictive Capability ◽

Design Build ◽

Major Bottleneck

Abstract:

The design-build-test (DBT) cycle in synthetic biology is considered to be a major bottleneck for progress in the field. The emergence of high-throughput experimental techniques, such as oligo libraries (OLs), combined with machine learning (ML) algorithms, provides the ingredients for a potential “big-data” solution that can generate sufficient predictive capability to overcome the DBT bottleneck. In this work, we apply the OL-ML approach to the design of RNA cassettes used in gene editing and RNA tracking systems. RNA cassettes are typically made of repetitive hairpins, which hinders their retention, synthesis, and functionality. Here, we carried out a high-throughput OL-based experiment to generate thousands of new binding sites for the phage coat proteins of bacteriophages MS2 (MCP), PP7 (PCP), and Qβ (QCP). We then applied a neural network to vastly expand this space of binding sites to millions of additional predicted sites, which allowed us to identify the structural and sequence features that are critical for the binding of each RBP. To verify our approach, we designed new non-repetitive binding site cassettes and tested their functionality in U2OS mammalian cells. We found that all our cassettes exhibited multiple trackable puncta. Additionally, we designed and verified two further cassettes, the first containing sites that can bind both PCP and QCP, and the second with sites that can bind either MCP or QCP, allowing for an additional orthogonal channel. Consequently, we provide the scientific community with a novel resource for rapidly creating functional non-repetitive binding site cassettes using one or more of three phage coat proteins with a variety of binding affinities for any application spanning bacteria to mammalian cells.

Transcriptome-wide high-throughput mapping of protein–RNA occupancy profiles using POP-seq

Scientific Reports ◽

10.1038/s41598-020-80846-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Mansi Srivastava ◽

Rajneesh Srivastava ◽

Sarath Chandra Janga

Keyword(s):

High Throughput ◽

Binding Proteins ◽

Rna Binding ◽

Rna Binding Proteins ◽

Interaction Network ◽

K562 Cells ◽

Genomic Variation ◽

Interaction Sites ◽

Protein Occupancy ◽

Rna Interaction

Abstract:

Interaction between proteins and RNA is critical for post-transcriptional regulatory processes. Existing high-throughput methods based on crosslinking of the protein–RNA complexes and poly-A pulldown are reported to contribute to biases and are not readily amenable to identifying interaction sites on non-poly-A RNAs. We present Protein Occupancy Profile-Sequencing (POP-seq), a phase separation based method in three versions, one of which does not require crosslinking, thus providing unbiased protein occupancy profiles on the whole-cell transcriptome without the requirement of poly-A pulldown. Our study demonstrates that ~68% of the total POP-seq peaks exhibited an overlap with publicly available protein–RNA interaction profiles of 97 RNA-binding proteins (RBPs) in K562 cells. We show that POP-seq variants consistently capture protein–RNA interaction sites across a broad range of genes, including on transcripts encoding transcription factors (TFs), RBPs and long non-coding RNAs (lncRNAs). POP-seq identified peaks exhibited a significant enrichment (p value < 2.2e−16) for GWAS SNPs and for phenotypic, clinically relevant germline as well as somatic variants reported in cancer genomes, suggesting the prevalence of uncharacterized genomic variation in protein-occupied sites on RNA. We demonstrate that the abundance of POP-seq peaks increases with an increase in expression of lncRNAs, suggesting that highly expressed lncRNAs are likely to act as sponges for RBPs, contributing to the rewiring of the protein–RNA interaction network in cancer cells. Overall, our data support POP-seq as a robust and cost-effective method that could be applied to primary tissues for mapping global protein occupancies.

Transcriptome-wide high-throughput mapping of protein-RNA occupancy profiles using POP-seq

10.1101/2020.12.28.424570 ◽

2020 ◽

Author(s):

Mansi Srivastava ◽

Rajneesh Srivastava ◽

Sarath Chandra Janga

Keyword(s):

High Throughput ◽

Binding Proteins ◽

Rna Binding ◽

Rna Binding Proteins ◽

Interaction Network ◽

Genomic Variation ◽

P Value ◽

Interaction Sites ◽

Protein Occupancy ◽

Rna Interaction

Abstract:

Interaction between proteins and RNA is critical for post-transcriptional regulatory processes. Existing high-throughput methods based on crosslinking of the protein-RNA complexes and polyA pulldown are reported to contribute to biases and are not readily amenable to identifying interaction sites on non-polyA RNAs. We present Protein Occupancy Profile-Sequencing (POP-seq), a phase separation based method in three versions, one of which does not require crosslinking, thus providing unbiased protein occupancy profiles on the whole-cell transcriptome without the requirement of polyA pulldown. Our study demonstrates that ~68% of the total POP-seq peaks exhibited an overlap with publicly available protein-RNA interaction profiles of 97 RNA-binding proteins (RBPs) in K562 cells. We show that POP-seq variants consistently capture protein-RNA interaction sites across a broad range of genes, including on transcripts encoding transcription factors (TFs), RBPs and long non-coding RNAs (lncRNAs). POP-seq identified peaks exhibited a significant enrichment (p value < 2.2e-16) for GWAS SNPs and for phenotypic, clinically relevant germline as well as somatic variants reported in cancer genomes, suggesting the prevalence of uncharacterized genomic variation in protein-occupied sites on RNA. We demonstrate that the abundance of POP-seq peaks increases with an increase in expression of lncRNAs, suggesting that highly expressed lncRNAs are likely to act as sponges for RBPs, contributing to the rewiring of the protein-RNA interaction network in cancer cells. Overall, our data support POP-seq as a robust and cost-effective method that could be applied to primary tissues for mapping global protein occupancies.

Covalent-Fragment Screening of Brd4 Identifies a Ligandable Site Orthogonal to the Acetyl-Lysine Binding Sites

10.26434/chemrxiv.8859098 ◽

2019 ◽

Author(s):

Michael Olp ◽

Daniel Sprague ◽

Stefan Kathman ◽

Ziyang Xu ◽

Alexandar Statsyuk ◽

...

Keyword(s):

Mass Spectrometry ◽

Binding Site ◽

Binding Sites ◽

Computational Prediction ◽

Chemical Probes ◽

Computational Docking ◽

Fragment Screening ◽

Covalent Inhibitors ◽

Bet Protein ◽

Proof Of Principle

Brd4, a member of the bromodomain and extraterminal domain (BET) family, has emerged as a promising epigenetic target in cancer and inflammatory disorders. All reported BET family ligands bind within the bromodomain acetyl-lysine binding sites and competitively inhibit BET protein interaction with acetylated chromatin. Alternative chemical probes that act orthogonally to the highly-conserved acetyl-lysine binding sites may exhibit selectivity within the BET family and avoid recently reported toxicity in clinical trials of BET bromodomain inhibitors. Here, we report the first identification of a ligandable site on a bromodomain outside the acetyl-lysine binding site. Inspired by our computational prediction of hotspots adjacent to non-homologous cysteine residues within the C-terminal Brd4 bromodomain (Brd4-BD2), we performed a mid-throughput mass spectrometry screen to identify cysteine-reactive fragments that covalently and selectively modify Brd4. Subsequent mass spectrometry, NMR and computational docking analyses of electrophilic fragment hits revealed a novel ligandable site near Cys356 that is unique to Brd4 among all human bromodomains. This site is orthogonal to the Brd4-BD2 acetyl-lysine binding site as Cys356 modification did not impact binding of the pan-BET bromodomain inhibitor JQ1 in fluorescence polarization assays. Finally, we tethered covalent fragments to JQ1 and performed NanoBRET assays to provide proof of principle that this orthogonal site can be covalently targeted in intact human cells. Overall, we demonstrate the potential of targeting sites orthogonal to bromodomain acetyl-lysine binding sites to develop bivalent and covalent inhibitors that displace Brd4 from chromatin.

Effects of Multiple Binding Sites on Studies of Hydrogen Bonding between Nitroxide Radicals and Solvent Molecules

Collection of Czechoslovak Chemical Communications ◽

10.1135/cccc19930047 ◽

1993 ◽

Vol 58 (1) ◽

pp. 47-52 ◽

Cited By ~ 2

Author(s):

Imad Al-Bala'a ◽

Richard D. Bates

Keyword(s):

Hydrogen Bonding ◽

Magnetic Resonance ◽

Binding Site ◽

Binding Sites ◽

Hyperfine Coupling Constant ◽

Hydrogen Donor ◽

Complex Formation Constants ◽

Multiple Binding Sites ◽

Multiple Binding ◽

Solvent Molecules

The role of more than one binding site on a nitroxide free radical in magnetic resonance determinations of the properties of the complex formed with a hydrogen donor is examined. The expression that relates observed hyperfine couplings in EPR spectra to complex formation constants and concentrations of each species in solution becomes much more complex when multiple binding sites are present, but reduces to a simpler form when binding at the two sites occurs independently and the binding at the non-nitroxide site does not produce significant differences in the hyperfine coupling constant in the complexed radical. Effects on studies of hydrogen bonding between multiple binding site nitroxides and hydrogen donor solvent molecules by other magnetic resonance methods are potentially more extreme.

Tuning of Electronic Properties in Conducting Polymers

Collection of Czechoslovak Chemical Communications ◽

10.1135/cccc20011208 ◽

2001 ◽

Vol 66 (8) ◽

pp. 1208-1218 ◽

Cited By ~ 3

Author(s):

Guofeng Li ◽

Mira Josowicz ◽

Jiří Janata

Keyword(s):

Hydrogen Bonding ◽

Binding Site ◽

Binding Sites ◽

Polymer Chains ◽

Electronic Transitions ◽

Weakly Bound ◽

Ordered Structures ◽

Doped Polymer ◽

Positively Charged ◽

Repeated Cycling

Structural and electronic transitions in poly(thiophenyleneiminophenylene), usually referred to as poly(phenylenesulfidephenyleneamine) (PPSA), upon electrochemical doping with LiClO4 have been investigated. The unusual electrochemical behavior of PPSA indicates that the dopant anions are bound in two energetically different sites. In the so-called "binding site", the ClO4- anion is Coulombically attracted to the positively charged S or N sites on one chain and simultaneously hydrogen-bonded with the N-H group on a neighboring polymer chain. This strong interaction causes a re-organization of the polymer chains, resulting in the formation of a networked structure linked together by these ClO4- Coulombic/hydrogen bonding "bridges". In the "non-binding site", however, the ClO4- anion is very weakly bound through electrostatic interaction only and can be reversibly exchanged when the doped polymer is reduced. During repeated cycling, the continuous and alternating influx and expulsion of ClO4- ions serve as a self-organizing process for such networked structures, giving rise to a diminishing number of available "non-binding" sites. The occurrence of these ordered structures has a major impact on the electrochemical activity and the morphology of the doped polymer. Also, owing to stabilization of the dopant ions, the doped polymer can be kept in a stable and desirable oxidation state, so both the work function and the conductivity of the polymer can be electrochemically controlled.

Genes encoding components of the olfactory signal transduction cascade contain a DNA binding site that may direct neuronal expression.

Molecular and Cellular Biology ◽

10.1128/mcb.13.9.5805 ◽

1993 ◽

Vol 13 (9) ◽

pp. 5805-5813 ◽

Cited By ~ 52

Author(s):

M M Wang ◽

R Y Tsai ◽

K A Schrader ◽

R R Reed

Keyword(s):

Signal Transduction ◽

Dna Binding ◽

Binding Site ◽

Binding Sites ◽

Consensus Sequence ◽

Initiation Site ◽

Olfactory Neuron ◽

Promoter Regions ◽

Transcriptional Initiation ◽

Genes Encoding

Genes that mediate odorant signal transduction are expressed at high levels in neurons of the olfactory epithelium. The molecular mechanism governing the restricted expression of these genes likely involves tissue-specific DNA binding proteins that coordinately activate transcription through sequence-specific interactions with olfactory promoter regions. We have identified binding sites for the olfactory neuron-specific transcription factor, Olf-1, in the sequences surrounding the transcriptional initiation site of five olfactory neuron-specific genes. The Olf-1 binding sites described define the consensus sequence YTCCCYRGGGAR. In addition, we have identified a second binding site, the U site, in the olfactory cyclic nucleotide gated channel and type III cyclase promoters, which binds factors present in all tissues examined. These experiments support a model in which expression of Olf-1 in the sensory neurons coordinately activates a set of olfactory neuron-specific genes. Furthermore, expression of a subset of these genes may be modulated by additional binding factors.
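The consensus YTCCCYRGGGAR is written in IUPAC nucleotide codes, where Y stands for C or T and R for A or G. A minimal sketch of how such a consensus can be expanded into a regular expression and scanned against a sequence is given here. The function names `consensus_to_regex` and `find_sites` and the toy sequence are illustrative assumptions, not part of the original study, and only the given strand is scanned.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Expand an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(consensus, seq):
    """Return (start, matched_sequence) for every match, overlaps included."""
    # Wrapping the pattern in a lookahead lets overlapping matches be found.
    pattern = re.compile("(?=(%s))" % consensus_to_regex(consensus))
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    olf1_consensus = "YTCCCYRGGGAR"
    toy_promoter = "AATTCCCTAGGGAGCC"  # made-up sequence for illustration
    print(consensus_to_regex(olf1_consensus))
    print(find_sites(olf1_consensus, toy_promoter))
```

A real search would also scan the reverse complement and would usually tolerate mismatches, for example with a position weight matrix rather than an exact pattern.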
