Genome-wide discovery of local RNA structural elements in Zika virus

10.7287/peerj.preprints.27101v1 ◽

2018 ◽

Author(s):

Ryan J Andrews ◽

Julien Roche ◽

Walter N Moss

Keyword(s):

Zika Virus ◽

Genome Replication ◽

RNA Structures ◽

Coding Region ◽

Step Size ◽

Base Pairs ◽

Local Structures ◽

Genome Wide ◽

Viral Polyprotein ◽

Functional RNA

In addition to encoding RNA primary structures, genomes also encode RNA secondary and tertiary structures that play roles in gene regulation and, in the case of RNA viruses, genome replication. Methods for identifying functional RNA structures in genomes typically rely on scanning analysis windows: multiple partially overlapping windows are used to predict RNA structures and folding metrics, from which regions likely to form functional structure are deduced. A separate structural model is produced for each window, and the step size can greatly affect the returned model. This makes deducing unique local structures challenging, as the same nucleotides can be base paired differently in each window. In the presented approach, all base pairs from all analysis windows are considered and weighted by favorable folding metrics across all windows. This results in unique base pairing throughout the genome and generates local regions/structures that can be ranked by their propensity to form unusually thermodynamically stable folds. This approach was applied to the Zika virus (ZIKV) genome. ZIKV is linked to a variety of neurological ailments, including microcephaly and Guillain-Barré syndrome, and its (+)-sense RNA genome encodes two previously described, functionally essential structured RNA regions. Our approach successfully identifies and models the structures of these regions, while also finding additional regions likely to form functional RNA structures throughout the viral polyprotein coding region. All data for the ZIKV genome have been archived at the RNAStructuromeDB, a repository of RNA folding data for humans and their pathogens.
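As a rough illustration of the scanning-window idea in the abstract above, the following Python sketch enumerates partially overlapping windows and lists which of them cover a given nucleotide. The function names, the 0-based coordinates, and the rule of adding a final window flush with the 3' end are assumptions made for this example; they are not taken from the authors' software.

```python
def window_starts(genome_length, window_size, step_size):
    """Return 0-based start positions of partially overlapping windows.

    Hypothetical helper: a final window flush with the 3' end is added
    so that every nucleotide is covered at least once.
    """
    if window_size >= genome_length:
        return [0]
    last = genome_length - window_size
    starts = list(range(0, last + 1, step_size))
    if starts[-1] != last:
        starts.append(last)
    return starts


def windows_covering(position, genome_length, window_size, step_size):
    """Return the start positions of all windows that contain `position`."""
    return [
        start
        for start in window_starts(genome_length, window_size, step_size)
        if start <= position < start + window_size
    ]
```

With a small step size, a nucleotide falls inside many windows and can therefore be assigned a different pairing partner in each of them; with a large step size it may be seen only once. This is one way to picture why the step size can change the returned model.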

ScanFold: an approach for genome-wide discovery of local RNA structural elements—applications to Zika virus and HIV

PeerJ ◽

10.7717/peerj.6136 ◽

2018 ◽

Vol 6 ◽

pp. e6136 ◽

Cited By ~ 15

Author(s):

Ryan J. Andrews ◽

Julien Roche ◽

Walter N. Moss

Keyword(s):

Zika Virus ◽

Genome Replication ◽

RNA Structures ◽

Step Size ◽

Base Pairs ◽

RNA Motifs ◽

Tertiary Structures ◽

Local Structures ◽

Genome Wide ◽

Functional RNA

In addition to encoding RNA primary structures, genomes also encode RNA secondary and tertiary structures that play roles in gene regulation and, in the case of RNA viruses, genome replication. Methods for identifying functional RNA structures in genomes typically rely on scanning analysis windows: multiple partially overlapping windows are used to predict RNA structures and folding metrics, from which regions likely to form functional structure are deduced. A separate structural model is produced for each window, and the step size can greatly affect the returned model. This makes deducing unique local structures challenging, as the same nucleotides can be base paired differently in each window. Here we present a new approach in which all base pairs from the analysis windows are considered and weighted by favorable folding. This results in unique base pairing throughout the genome and generates local regions/structures that can be ranked by their propensity to form unusually thermodynamically stable folds. We applied this approach to the Zika virus (ZIKV) and HIV-1 genomes. ZIKV is linked to a variety of neurological ailments, including microcephaly and Guillain–Barré syndrome, and its (+)-sense RNA genome encodes two previously described, functionally essential structured RNA regions. HIV, the cause of AIDS, contains multiple functional RNA motifs in its genome, which have been extensively studied. Our approach successfully identifies and models the structures of known functional motifs in both viruses, while also finding additional regions likely to form functional structures. All data have been archived at the RNAStructuromeDB (www.structurome.bb.iastate.edu), a repository of RNA folding data for humans and their pathogens.
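The idea of pooling base pairs from all windows and keeping a unique partner for each nucleotide can be sketched as follows. This is only a minimal illustration under stated assumptions, not the published ScanFold procedure: it assumes each window reports pairs as `(i, j, z)` tuples, that `z` is a folding metric for which more negative values are more favorable, and that summing `z` across windows and choosing pairs greedily is an acceptable weighting. The function name `consensus_pairs` is hypothetical.

```python
from collections import defaultdict


def consensus_pairs(window_pairs):
    """Pick at most one partner per nucleotide from pairs pooled over windows.

    window_pairs: iterable of (i, j, z) tuples, one per pair per window.
    Assumption: more negative z means a more favorable fold.
    """
    # Accumulate the metric for each distinct pair over every window
    # in which that pair was predicted.
    totals = defaultdict(float)
    for i, j, z in window_pairs:
        totals[(min(i, j), max(i, j))] += z

    # Greedily accept pairs from most to least favorable, skipping any
    # pair that reuses an already paired nucleotide.
    chosen, used = [], set()
    for (i, j), _total in sorted(totals.items(), key=lambda kv: (kv[1], kv[0])):
        if i in used or j in used:
            continue
        chosen.append((i, j))
        used.update((i, j))
    return sorted(chosen)
```

In this toy scheme, a pair that recurs in several favorable windows accumulates a more negative total and therefore wins over a competing pair seen in only one window, which yields a single pairing per nucleotide.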

Comparative analysis of protein evolution and RNA structural changes in the genome of pre-epidemic and epidemic Zika virus

10.1101/050278 ◽

2016 ◽

Author(s):

Arunachalam Ramaiah ◽

Lei Dai ◽

Deisy Contreras ◽

Sanjeev Sinha ◽

Ren Sun ◽

...

Keyword(s):

Virus Replication ◽

Protein Evolution ◽

Zika Virus ◽

Structural Changes ◽

Yellow Fever Virus ◽

RNA Structures ◽

Stem Loop ◽

Human Host ◽

Genome Wide ◽

Poor Pregnancy Outcome

Zika virus (ZIKV) infection is associated with microcephaly, neurological disorders and poor pregnancy outcome [1-3], and no vaccine is available. Although ZIKV was first discovered in 1947, the exact mechanism of virus replication and pathogenesis remains unknown. Recent outbreaks of Zika virus in the Americas clearly suggest a better adaptation of viral strains to the human host. Understanding the conserved and adaptive features in the evolution of the ZIKV genome will reveal the molecular mechanism of virus replication and host adaptation. Here, we present a comprehensive analysis of protein evolution and changes in RNA secondary structures of ZIKV strains, including the current 2015-16 outbreak. To identify the constraints on ZIKV evolution, selection pressure at individual codons, immune epitopes, co-evolving sites, and RNA structures were analyzed. The proteome of current 2015/16 epidemic ZIKV strains of Asian genotype is found to be genetically conserved due to genome-wide negative selection on codons, with limited positive selection. Predicted RNA structures at the 5’ and 3’ ends of ZIKV strains reveal substantial changes, such as an additional stem loop that makes them similar to those of Yellow Fever Virus. In summary, the targeted changes at both the amino acid and the RNA levels contribute to the better adaptation of ZIKV strains to the human host, with an enhanced neurotropism.

Functional genomics of autoimmune diseases

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2019-216794 ◽

2021 ◽

pp. annrheumdis-2019-216794

Author(s):

Akari Suzuki ◽

Matteo Maurizio Guerrini ◽

Kazuhiko Yamamoto

Keyword(s):

Autoimmune Diseases ◽

Association Studies ◽

Linkage Disequilibrium Block ◽

Causal Variant ◽

Genome Wide Association Studies ◽

Coding Region ◽

Risk Variants ◽

Non Coding RNA ◽

Genome Wide ◽

Long Non Coding RNA

For more than a decade, genome-wide association studies have been applied to autoimmune diseases and have expanded our understanding of their pathogenesis. Genetic risk factors associated with diseases and traits are essentially causative. However, elucidating the biological mechanism of disease from genetic factors is challenging. In fact, it is difficult to identify the causal variant among multiple variants located on the same haplotype or linkage disequilibrium block, and thus the responsible biological genes remain elusive. Recently, multiple studies have revealed that the majority of risk variants are located in the non-coding region of the genome and are most likely to regulate gene expression, for example as quantitative trait loci. Enhancers, promoters and long non-coding RNAs appear to be the main target mechanisms of the risk variants. In this review, we discuss functional genetics approaches to address these puzzles.

Conserved long-range base pairings are associated with pre-mRNA processing of human genes

Nature Communications ◽

10.1038/s41467-021-22549-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Svetlana Kalmykova ◽

Marina Kalinina ◽

Stepan Denisov ◽

Alexey Mironov ◽

Dmitry Skvortsov ◽

...

Keyword(s):

Long Range ◽

RNA Folding ◽

Current Knowledge ◽

RNA Structures ◽

Base Pairs ◽

Protein Coding ◽

Proximity Ligation ◽

Transcriptional Suppression ◽

Human Genes ◽

Cleavage And Polyadenylation

The ability of nucleic acids to form double-stranded structures is essential for all living systems on Earth. Current knowledge on functional RNA structures is focused on locally-occurring base pairs. However, crosslinking and proximity ligation experiments demonstrated that long-range RNA structures are highly abundant. Here, we present the most complete to-date catalog of conserved complementary regions (PCCRs) in human protein-coding genes. PCCRs tend to occur within introns, suppress intervening exons, and obstruct cryptic and inactive splice sites. Double-stranded structure of PCCRs is supported by decreased icSHAPE nucleotide accessibility, high abundance of RNA editing sites, and frequent occurrence of forked eCLIP peaks. Introns with PCCRs show a distinct splicing pattern in response to RNAPII slowdown suggesting that splicing is widely affected by co-transcriptional RNA folding. The enrichment of 3’-ends within PCCRs raises the intriguing hypothesis that coupling between RNA folding and splicing could mediate co-transcriptional suppression of premature pre-mRNA cleavage and polyadenylation.

Characterization of a Novel Thermobifida fusca Bacteriophage P318

Viruses ◽

10.3390/v11111042 ◽

2019 ◽

Vol 11 (11) ◽

pp. 1042

Author(s):

Cheepudom ◽

Lin ◽

Lee ◽

Meng

Keyword(s):

Plant Cell Wall ◽

Hydrolytic Enzymes ◽

Thermobifida Fusca ◽

Genome Replication ◽

Putative ORFs ◽

Base Pairs ◽

Double Stranded DNA ◽

Virion Morphogenesis ◽

Genome Information

Thermobifida fusca is of biotechnological interest due to its ability to produce an array of plant cell wall hydrolytic enzymes. Nonetheless, only one T. fusca bacteriophage with genome information has been reported to date. This study was aimed at discovering more relevant bacteriophages to expand the existing knowledge of phage diversity for this host species. With this end in view, a thermostable T. fusca bacteriophage P318, which belongs to the Siphoviridae family, was isolated and characterized. P318 has a double-stranded DNA genome of 48,045 base pairs with 3′-extended COS ends, on which 52 putative ORFs are organized into clusters responsible for the order of genome replication, virion morphogenesis, and the regulation of the lytic/lysogenic cycle. In comparison with T. fusca and the previously discovered bacteriophage P1312, P318 has a much lower G+C content in its genome except at the region encompassing ORF42, which produced a protein with unknown function. P1312 and P318 share very few similarities in their genomes except for the regions encompassing ORF42 of P318 and ORF51 of P1312 that are homologous. Thus, acquisition of ORF42 by lateral gene transfer might be an important step in the evolution of P318.

RAD3 gene of Saccharomyces cerevisiae: nucleotide sequence of wild-type and mutant alleles, transcript mapping, and aspects of gene regulation

Molecular and Cellular Biology ◽

10.1128/mcb.5.1.17-26.1985 ◽

1985 ◽

Vol 5 (1) ◽

pp. 17-26

Author(s):

L Naumovski ◽

G Chu ◽

P Berg ◽

E C Friedberg

Keyword(s):

Saccharomyces Cerevisiae ◽

Nucleotide Sequence ◽

Coding Region ◽

Base Pairs ◽

S1 Nuclease ◽

Calculated Molecular Weight ◽

Base Pair Deletion ◽

S1 Nuclease Mapping ◽

Essential Function ◽

Nuclease Mapping

We determined the complete nucleotide sequence of the RAD3 gene of Saccharomyces cerevisiae. The coding region of the gene contained 2,334 base pairs that could encode a protein with a calculated molecular weight of 89,796. Analysis of RAD3 mRNA by Northern blots and by S1 nuclease mapping indicated that the transcript was approximately 2.5 kilobases and did not contain intervening sequences. Fusions between the RAD3 gene and the lac'Z gene of Escherichia coli were constructed and used to demonstrate that the RAD3 gene was not inducible by DNA damage caused by UV radiation or 4-nitroquinoline-1-oxide. Two UV-sensitive chromosomal mutant alleles of RAD3, rad3-1 and rad3-2, were rescued by gap repair of a centromeric plasmid, and their sequences were determined. The rad3-1 mutation changed a glutamic acid to lysine, and the rad3-2 mutation changed a glycine to arginine. Previous studies have shown that disruption of the RAD3 gene results in loss of an essential function and is associated with inviability of haploid cells. In the present experiments, plasmids carrying the rad3-1 and rad3-2 mutations were introduced into haploid cells containing a disrupted RAD3 gene. These plasmids expressed the essential function of RAD3 but not its DNA repair function. A 74-base-pair deletion at the 3' end of the RAD3 coding region or a fusion of this deletion to the E. coli lac'Z gene did not affect either function of RAD3.

Modeling cell-free DNA fragment size densities for non-invasive detection of cancer.

Journal of Clinical Oncology ◽

10.1200/jco.2021.39.15_suppl.3058 ◽

2021 ◽

Vol 39 (15_suppl) ◽

pp. 3058-3058

Author(s):

Jacob Carey ◽

Bryan Chesnick ◽

Denise Butler ◽

Michael Rongione ◽

Giovanni Parmigiani ◽

...

Keyword(s):

Fragment Size ◽

Length Distribution ◽

Mixture Component ◽

Base Pairs ◽

Machine Model ◽

Cell Free DNA ◽

Non Invasive ◽

Free DNA ◽

Genome Wide ◽

Low Coverage

Background: Circulating cell-free DNA (cfDNA) is largely nucleosomal in origin, with typical fragment lengths of 167 base pairs reflecting the length of DNA wrapped around the histone and H1 linker. Given the nucleosomal origin of cfDNA, we have previously used low coverage whole genome sequencing to evaluate DNA fragmentation profiles to sensitively and specifically detect tumor-derived DNA with altered fragment lengths or coverage. Methods: Here we evaluate the use of Bayesian finite mixtures to model the fragment length distribution and demonstrate how the parameters from these models can be useful to distinguish between individuals with and without cancer. We examined the number of cfDNA fragments by size ranging from 100-220 bp and approximated the mixture component location, scale, and weight using Markov Chain Monte Carlo. The performance of the method was determined using a ten-fold, ten-repeat cross-validation of a Gradient Boosted Machine model using 1) our previously described genome-wide fragmentation profile approach, 2) the parameters from the mixture model and 3) a combination of approaches 1) and 2) as features. Results: In this study of 215 cancer patients and 208 cancer-free individuals, we observed cross-validated AUCs of 1) 0.94, 2) 0.95, and 3) 0.97 among the three approaches. Conclusions: Our findings indicate that parsimonious mixture models may improve detection of cancer in conjunction with fragmentation profile analyses across the genome.
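To make the "location, scale, and weight" of mixture components concrete, the sketch below fits a two-component Gaussian mixture to a list of fragment lengths. It deliberately swaps in expectation-maximization (EM) for the Markov Chain Monte Carlo estimation described in the abstract, because EM is shorter to show; the number of components, the Gaussian form, the starting values, and the function names are all assumptions made for this example rather than details of the authors' model.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(lengths, means=(140.0, 170.0),
                              sds=(15.0, 15.0), n_iter=100):
    """EM fit of a two-component Gaussian mixture (illustrative only).

    Returns (weights, means, sds): the weight, location, and scale
    of each component. Starting values are arbitrary assumptions.
    """
    weights, means, sds = [0.5, 0.5], list(means), list(sds)
    n = len(lengths)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each fragment.
        resp = []
        for x in lengths:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k])
                    for k in range(2)]
            total = sum(dens) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])
        # M-step: update weight, location, and scale per component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, lengths)) / nk
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, lengths)) / nk
            sds[k] = max(math.sqrt(var), 1.0)  # floor avoids collapse
    return weights, means, sds
```

The six fitted numbers (two weights, two locations, two scales) are the kind of compact per-sample summary that could then be passed to a classifier as features, which is the role the abstract assigns to the mixture parameters.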

The structure of an RNA dodecamer shows how tandem U–U base pairs increase the range of stable RNA structures and the diversity of recognition sites

Structure ◽

10.1016/s0969-2126(96)00099-8 ◽

1996 ◽

Vol 4 (8) ◽

pp. 917-930 ◽

Cited By ~ 54

Author(s):

Susan E Lietzke ◽

Cindy L Barnes ◽

J Andrew Berglund ◽

Craig E Kundrot

Keyword(s):

RNA Structures ◽

Base Pairs ◽

Recognition Sites

Star-PAP RNA Binding Landscape Reveals Novel Role of Star-PAP in mRNA Metabolism That Requires RBM10-RNA Association

International Journal of Molecular Sciences ◽

10.3390/ijms22189980 ◽

2021 ◽

Vol 22 (18) ◽

pp. 9980

Author(s):

Ganesh R. Koshre ◽

Feba Shaji ◽

Neeraja K. Mohanan ◽

Nimmy Mohan ◽

Jamshaid Ali ◽

...

Keyword(s):

RNA Binding ◽

Coding Region ◽

High Association ◽

Target mRNAs ◽

mRNA Metabolism ◽

mRNA Targets ◽

Genome Wide ◽

Global Profile ◽

Specific mRNA

Star-PAP is a non-canonical poly(A) polymerase that selects mRNA targets for polyadenylation. Yet the genome-wide direct targets of Star-PAP and the mechanism of specific mRNA recognition remain unclear. Here, we employ HITS-CLIP to map the cellular Star-PAP binding landscape and the mechanism of global Star-PAP mRNA association. We show a transcriptome-wide association of Star-PAP that is diminished on Star-PAP depletion. Consistent with its role in 3′-UTR processing, we observed a high association of Star-PAP at the 3′-UTR region. Strikingly, there is an enrichment of Star-PAP at the coding region exons (CDS) in 42% of target mRNAs. We demonstrate that Star-PAP binding destabilises these mRNAs, indicating a new role of Star-PAP in mRNA metabolism. Comparison with earlier microarray data reveals that while UTR-associated transcripts are down-regulated, CDS-associated mRNAs are largely up-regulated on Star-PAP depletion. Strikingly, knockdown of the Star-PAP coregulator RBM10 resulted in a global loss of Star-PAP association on target mRNAs. Consistently, RBM10 depletion compromises 3′-end processing of a set of Star-PAP target mRNAs, while regulating stability/turnover of a different set of mRNAs. Our results establish a global profile of Star-PAP mRNA association and a novel role of Star-PAP in mRNA metabolism that requires RBM10-mRNA association in the cell.
