The C. elegans 3’-UTRome V2: an updated genomic resource to study 3’-UTR biology

10.1101/704098 ◽

2019 ◽

Author(s):

HS Steber ◽

C Gallante ◽

S O’Brien ◽

P.-L Chiu ◽

M Mangone

Keyword(s):

Living Organism ◽

Regulatory Elements ◽

Untranslated Regions ◽

High Quality ◽

Protein Coding ◽

C Elegans ◽

Mrna Cleavage ◽

Entire Collection ◽

Cleavage And Polyadenylation ◽

Genomic Resource

3’-Untranslated regions (3’-UTRs) of mRNAs have emerged as central regulators of cellular function because they contain important but poorly characterized cis-regulatory elements targeted by a multitude of regulatory factors. The model nematode C. elegans is ideal for studying these interactions since it possesses a well-defined 3’-UTRome. To improve its annotation, we used a genomics approach to download raw data for 1,088 transcriptome datasets, corresponding to the entire collection of C. elegans transcriptomes from 2015 to 2018, from the Sequence Read Archive at the NCBI. We then extracted and mapped high-quality 3’-UTR data at ultra-deep coverage. Here we describe and release to the community the updated version of the worm 3’-UTRome, which we named 3’-UTRome v2. This resource contains high-quality 3’-UTR data mapped at single-base ultra-resolution for 23,084 3’-UTR isoform variants corresponding to 14,788 protein-coding genes and is updated to the latest release of WormBase. We used this dataset to study and probe principles of mRNA cleavage and polyadenylation in C. elegans. The worm 3’-UTRome v2 represents the most comprehensive and high-resolution 3’-UTR dataset available in C. elegans and provides a novel resource to investigate the mRNA cleavage and polyadenylation reaction, 3’-UTR biology and miRNA targeting in a living organism.

Chromatin accessibility is dynamically regulated across C. elegans development and ageing

10.1101/279158 ◽

2018 ◽

Cited By ~ 1

Author(s):

Jürgen Jänes ◽

Yan Dong ◽

Michael Schoof ◽

Jacques Serizay ◽

Alex Appert ◽

...

Keyword(s):

Regulatory Mechanism ◽

Regulatory Elements ◽

Chromatin Accessibility ◽

Protein Coding ◽

C Elegans ◽

Transcription Profiles ◽

Physiological Processes ◽

Global Identification ◽

Identification And Characterization

An essential step for understanding the transcriptional circuits that control development and physiology is the global identification and characterization of regulatory elements. Here we present the first map of regulatory elements across the development and ageing of an animal, identifying 42,245 elements accessible in at least one C. elegans stage. Based on nuclear transcription profiles, we define 15,714 protein-coding promoters and 19,231 putative enhancers, and find that both types of element can drive orientation-independent transcription. Additionally, hundreds of promoters produce transcripts antisense to protein coding genes, suggesting involvement in a widespread regulatory mechanism. We find that the accessibility of most elements is regulated during development and/or ageing and that patterns of accessibility change are linked to specific developmental or physiological processes. The map and characterization of regulatory elements across C. elegans life provides a platform for understanding how transcription controls development and ageing.

Chromosome-Level Genome Assembly and Annotation of a Sciaenid Fish, Argyrosomus japonicus

Genome Biology and Evolution ◽

10.1093/gbe/evaa246 ◽

2021 ◽

Vol 13 (2) ◽

Author(s):

Linlin Zhao ◽

Shengyong Xu ◽

Zhiqiang Han ◽

Qi Liu ◽

Wensi Ke ◽

...

Keyword(s):

Genome Assembly ◽

Wide Distribution ◽

High Quality ◽

Protein Coding ◽

Repeat Elements ◽

Long Reads ◽

The Family ◽

Solid Foundation ◽

Genomic Resource ◽

Chromosome Level

Argyrosomus japonicus is an economically and ecologically important fish species in the family Sciaenidae with a wide distribution in the world’s oceans. Here, we report a high-quality, chromosome-level genome assembly of A. japonicus based on PacBio and Hi-C sequencing technology. A 673.7-Mb genome containing 282 contigs with an N50 length of 18.4 Mb was obtained based on PacBio long reads. These contigs were further ordered and clustered into 24 chromosome groups based on Hi-C data. In addition, a total of 217.2 Mb (32.24% of the assembled genome) of sequences were identified as repeat elements, and 23,730 protein-coding genes were predicted based on multiple approaches. More than 97% of BUSCO genes were identified in the A. japonicus genome. The high-quality genome assembled in this work not only provides a valuable genomic resource for future population genetics, conservation biology and selective breeding studies of A. japonicus but also lays a solid foundation for the study of Sciaenidae evolution.

Chromatin accessibility dynamics across C. elegans development and ageing

eLife ◽

10.7554/elife.37344 ◽

2018 ◽

Vol 7 ◽

Cited By ~ 22

Author(s):

Jürgen Jänes ◽

Yan Dong ◽

Michael Schoof ◽

Jacques Serizay ◽

Alex Appert ◽

...

Keyword(s):

Regulatory Mechanism ◽

Regulatory Elements ◽

Chromatin Accessibility ◽

Protein Coding ◽

C Elegans ◽

Transcription Profiles ◽

Physiological Processes ◽

Global Identification ◽

Identification And Characterization

An essential step for understanding the transcriptional circuits that control development and physiology is the global identification and characterization of regulatory elements. Here, we present the first map of regulatory elements across the development and ageing of an animal, identifying 42,245 elements accessible in at least one Caenorhabditis elegans stage. Based on nuclear transcription profiles, we define 15,714 protein-coding promoters and 19,231 putative enhancers, and find that both types of element can drive orientation-independent transcription. Additionally, more than 1000 promoters produce transcripts antisense to protein coding genes, suggesting involvement in a widespread regulatory mechanism. We find that the accessibility of most elements changes during development and/or ageing and that patterns of accessibility change are linked to specific developmental or physiological processes. The map and characterization of regulatory elements across C. elegans life provides a platform for understanding how transcription controls development and ageing.

The RS Domain of Human CFIm68 Plays a Key Role in Selection Between Alternative Sites of Pre-mRNA Cleavage and Polyadenylation

10.1101/177980 ◽

2017 ◽

Cited By ~ 2

Author(s):

Jessica G. Hardy ◽

Michael Tellier ◽

Shona Murphy ◽

Chris J. Norbury

Keyword(s):

Molecular Mechanisms ◽

Alternative Polyadenylation ◽

Hek293 Cells ◽

Translation Efficiency ◽

Mrna Isoforms ◽

Protein Coding ◽

Mrna Cleavage ◽

Cleavage And Polyadenylation ◽

Alternative Sites

Many eukaryotic protein-coding genes give rise to alternative mRNA isoforms with identical protein-coding capacities but which differ in the extents of their 3′ untranslated regions (3′UTRs), due to the usage of alternative sites of pre-mRNA cleavage and polyadenylation. By governing the presence of regulatory 3′UTR sequences, this type of alternative polyadenylation (APA) can significantly influence the stability, localisation and translation efficiency of mRNA. Though a variety of molecular mechanisms for APA have been proposed, previous studies have identified a pivotal role for the multi-subunit cleavage factor I (CFIm) in this process in mammals. Here we show that, in line with previous reports, depletion of the CFIm 68 kDa subunit (CFIm68) by CRISPR/Cas9-mediated gene disruption in HEK293 cells leads to a shift towards the use of promoter-proximal poly(A) sites. Using these cells as the basis for a complementation assay, we show that CFIm68 lacking its arginine/serine-rich (RS) domain retains the ability to form a nuclear complex with other CFIm subunits, but selectively lacks the capacity to restore polyadenylation at promoter-distal sites. In addition, nanoparticle-mediated analysis indicates that the RS domain is extensively phosphorylated in vivo. Overall, these results suggest that the CFIm68 RS domain makes a key regulatory contribution to APA.

The C. elegans 3′ UTRome v2 resource for studying mRNA cleavage and polyadenylation, 3′-UTR biology, and miRNA targeting

Genome Research ◽

10.1101/gr.254839.119 ◽

2019 ◽

Vol 29 (12) ◽

pp. 2104-2116 ◽

Cited By ~ 2

Author(s):

Hannah S. Steber ◽

Christina Gallante ◽

Shannon O'Brien ◽

Po-Lin Chiu ◽

Marco Mangone

Keyword(s):

C Elegans ◽

Mrna Cleavage ◽

Cleavage And Polyadenylation ◽

Mirna Targeting

Conserved long-range base pairings are associated with pre-mRNA processing of human genes

Nature Communications ◽

10.1038/s41467-021-22549-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Svetlana Kalmykova ◽

Marina Kalinina ◽

Stepan Denisov ◽

Alexey Mironov ◽

Dmitry Skvortsov ◽

...

Keyword(s):

Long Range ◽

Rna Folding ◽

Current Knowledge ◽

Rna Structures ◽

Base Pairs ◽

Protein Coding ◽

Proximity Ligation ◽

Transcriptional Suppression ◽

Human Genes ◽

Cleavage And Polyadenylation

The ability of nucleic acids to form double-stranded structures is essential for all living systems on Earth. Current knowledge on functional RNA structures is focused on locally-occurring base pairs. However, crosslinking and proximity ligation experiments demonstrated that long-range RNA structures are highly abundant. Here, we present the most complete to-date catalog of conserved complementary regions (PCCRs) in human protein-coding genes. PCCRs tend to occur within introns, suppress intervening exons, and obstruct cryptic and inactive splice sites. Double-stranded structure of PCCRs is supported by decreased icSHAPE nucleotide accessibility, high abundance of RNA editing sites, and frequent occurrence of forked eCLIP peaks. Introns with PCCRs show a distinct splicing pattern in response to RNAPII slowdown suggesting that splicing is widely affected by co-transcriptional RNA folding. The enrichment of 3’-ends within PCCRs raises the intriguing hypothesis that coupling between RNA folding and splicing could mediate co-transcriptional suppression of premature pre-mRNA cleavage and polyadenylation.

Structure and expression of canary myc family genes

Molecular and Cellular Biology ◽

10.1128/mcb.11.3.1770-1776.1991 ◽

1991 ◽

Vol 11 (3) ◽

pp. 1770-1776

Author(s):

R G Collum ◽

D F Clayton ◽

F W Alt

Keyword(s):

Untranslated Region ◽

Untranslated Regions ◽

Coding Region ◽

Protein Coding ◽

Coding Regions ◽

Neuronal Precursors ◽

Myc Gene ◽

Mature Neurons

We found that the canary N-myc gene is highly related to mammalian N-myc genes in both the protein-coding region and the long 3' untranslated region. Examined coding regions of the canary c-myc gene were also highly related to their mammalian counterparts, but in contrast to N-myc, the canary and mammalian c-myc genes were quite divergent in their 3' untranslated regions. We readily detected N-myc and c-myc expression in the adult canary brain and found N-myc expression both at sites of proliferating neuronal precursors and in mature neurons.

The Long Road to Understanding RNAPII Transcription Initiation and Related Syndromes

Annual Review of Biochemistry ◽

10.1146/annurev-biochem-090220-112253 ◽

2021 ◽

Vol 90 (1) ◽

pp. 193-219

Author(s):

Emmanuel Compe ◽

Jean-Marc Egly

Keyword(s):

Transcription Factors ◽

Rna Polymerase Ii ◽

Chromatin Remodeling ◽

Transcription Initiation ◽

Rna Synthesis ◽

Regulatory Elements ◽

Initiation Mechanism ◽

Protein Coding ◽

Core Promoters ◽

General Transcription Factors

In eukaryotes, transcription of protein-coding genes requires the assembly at core promoters of a large preinitiation machinery containing RNA polymerase II (RNAPII) and general transcription factors (GTFs). Transcription is potentiated by regulatory elements called enhancers, which are recognized by specific DNA-binding transcription factors that recruit cofactors and convey, following chromatin remodeling, the activating cues to the preinitiation complex. This review summarizes nearly five decades of work on transcription initiation by describing the sequential recruitment of diverse molecular players including the GTFs, the Mediator complex, and DNA repair factors that support RNAPII to enable RNA synthesis. The elucidation of the transcription initiation mechanism has greatly benefited from the study of altered transcription components associated with human diseases that could be considered transcription syndromes.

The dark matter of the cancer genome: aberrations in regulatory elements, untranslated regions, splice sites, non‐coding RNA and synonymous mutations

EMBO Molecular Medicine ◽

10.15252/emmm.201506055 ◽

2016 ◽

Vol 8 (5) ◽

pp. 442-457 ◽

Cited By ~ 115

Author(s):

Sven Diederichs ◽

Lorenz Bartsch ◽

Julia C Berkmann ◽

Karin Fröse ◽

Jana Heitmann ◽

...

Keyword(s):

Dark Matter ◽

Regulatory Elements ◽

Cancer Genome ◽

Untranslated Regions ◽

Splice Sites ◽

Synonymous Mutations ◽

Non Coding Rna

Whole-Genome Sequencing of Chinese Yellow Catfish Provides a Valuable Genetic Resource for High-Throughput Identification of Toxin Genes

Toxins ◽

10.3390/toxins10120488 ◽

2018 ◽

Vol 10 (12) ◽

pp. 488 ◽

Cited By ~ 5

Author(s):

Shiyong Zhang ◽

Jia Li ◽

Qin Qin ◽

Wei Liu ◽

Chao Bian ◽

...

Keyword(s):

High Throughput ◽

Genome Assembly ◽

Raw Materials ◽

Pelteobagrus Fulvidraco ◽

Yellow Catfish ◽

High Quality ◽

Protein Coding ◽

Toxin Genes ◽

Sequencing Platforms ◽

High Quality Genome

Naturally derived toxins from animals are good raw materials for drug development. As a representative venomous teleost, Chinese yellow catfish (Pelteobagrus fulvidraco) can provide valuable resources for studies on toxin genes. Its venom glands are located in the pectoral and dorsal fins. Despite these interesting biological traits and its great economic value, Chinese yellow catfish still lacks a sequenced genome. Here, we report a high-quality genome assembly of Chinese yellow catfish using a combination of next-generation Illumina and third-generation PacBio sequencing platforms. The final assembly reached 714 Mb, with a contig N50 of 970 kb and a scaffold N50 of 3.65 Mb. We also annotated 21,562 protein-coding genes, of which 97.59% were assigned at least one functional annotation. Based on the genome sequence, we analyzed toxin genes in Chinese yellow catfish. Finally, we identified 207 toxin genes and classified them into three major groups. Interestingly, we also expanded a previously reported sex-related region (to ≈6 Mb) in the genome assembly, and localized two important toxin genes within this region. In summary, we assembled a high-quality genome of Chinese yellow catfish and performed high-throughput identification of toxin genes from a genomic view. Therefore, the limited number of toxin sequences in public databases will be remarkably improved once we integrate multi-omics data from more and more sequenced species.
