Comprehensive genome-wide identification of angiosperm upstream ORFs with peptide sequences conserved in various taxonomic ranges using a novel pipeline, ESUCA

2019 ◽  
Author(s):  
Hiro Takahashi ◽  
Noriya Hayashi ◽  
Yui Yamashita ◽  
Satoshi Naito ◽  
Anna Takahashi ◽  
...  

Abstract Background Upstream open reading frames (uORFs) in the 5′-untranslated regions (5′-UTRs) of certain eukaryotic mRNAs encode evolutionarily conserved functional peptides, such as cis-acting regulatory peptides that control translation of downstream main ORFs (mORFs). For genome-wide searches for uORFs with conserved peptide sequences (CPuORFs), comparative genomic studies have been conducted, in which uORF sequences were compared between selected species. To increase chances of identifying CPuORFs, we previously developed an approach in which uORF sequences were compared using BLAST between Arabidopsis and any other plant species with available transcript sequence databases. If this approach is applied to multiple plant species belonging to phylogenetically distant clades, it is expected to further comprehensively identify CPuORFs conserved in various plant lineages, including those conserved among relatively small taxonomic groups. Results To efficiently compare uORF sequences among many species and efficiently identify CPuORFs conserved in various taxonomic lineages, we developed a novel pipeline, ESUCA. We applied ESUCA to the genomes of five angiosperm species, which belong to phylogenetically distant clades, and selected CPuORFs conserved among at least three different orders. Through these analyses, we identified 88 novel CPuORF families. As expected, ESUCA analysis of each of the five angiosperm genomes identified many CPuORFs that were not identified from ESUCA analyses of the other four species. However, unexpectedly, these CPuORFs include those conserved in wide taxonomic ranges, indicating that the approach used here is useful not only for comprehensive identification of narrowly conserved CPuORFs but also for that of widely conserved CPuORFs.
Examination of the effects of 11 selected CPuORFs on mORF translation revealed that CPuORFs conserved only in relatively narrow taxonomic ranges can have sequence-dependent regulatory effects, suggesting that most of the identified CPuORFs are conserved because of functional constraints of their encoded peptides. Conclusions This study demonstrates that ESUCA is capable of efficiently identifying CPuORFs likely to be conserved because of the functional importance of their encoded peptides. Furthermore, our data show that the approach in which uORF sequences from multiple species are compared with those of many other species, using ESUCA, is highly effective in comprehensively identifying CPuORFs conserved in various taxonomic ranges.
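
The selection criterion mentioned in the abstract above (keeping CPuORFs conserved among at least three different orders) can be illustrated with a small sketch. This is not the ESUCA implementation: the function name `conserved_in_min_orders`, the `(uorf_id, species, order)` hit format, and the example records are all assumptions made up for illustration, and the homology search that would produce such hits is not shown.

```python
from collections import defaultdict


def conserved_in_min_orders(hits, min_orders=3):
    """Keep uORFs whose homologs span at least `min_orders` taxonomic orders.

    `hits` is an iterable of (uorf_id, species, order) tuples. This record
    layout is a hypothetical stand-in for the output of a homology search.
    """
    orders_by_uorf = defaultdict(set)
    for uorf_id, _species, order in hits:
        orders_by_uorf[uorf_id].add(order)
    return {
        uorf_id: sorted(orders)
        for uorf_id, orders in orders_by_uorf.items()
        if len(orders) >= min_orders
    }


# Made-up example records; identifiers and hits are illustrative only.
example_hits = [
    ("uORF_A", "Arabidopsis thaliana", "Brassicales"),
    ("uORF_A", "Glycine max", "Fabales"),
    ("uORF_A", "Oryza sativa", "Poales"),
    ("uORF_B", "Arabidopsis thaliana", "Brassicales"),
    ("uORF_B", "Brassica rapa", "Brassicales"),
]

selected = conserved_in_min_orders(example_hits)
```

With these example records, only `uORF_A` passes the filter, because `uORF_B` has homologs in a single order even though it is found in two species.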

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Hiro Takahashi ◽  
Shido Miyaki ◽  
Hitoshi Onouchi ◽  
Taichiro Motomura ◽  
Nobuo Idesako ◽  
...  

Abstract Upstream open reading frames (uORFs) are present in the 5′-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and how many previously identified CPuORFs encode regulatory peptides is unclear. By using ESUCA, 1517 (1373 novel and 144 known) CPuORFs were extracted from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. We discovered a much higher number of animal CPuORFs than previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, many CPuORFs encoding regulatory peptides are expected to be found in the identified CPuORFs.


2019 ◽  
Author(s):  
Hiro Takahashi ◽  
Shido Miyaki ◽  
Hitoshi Onouchi ◽  
Taichiro Motomura ◽  
Nobuo Idesako ◽  
...  

Abstract Upstream open reading frames (uORFs) are present in the 5′-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and how many previously identified CPuORFs encode regulatory peptides is unclear. By using ESUCA, 1,517 (1,373 novel and 144 known) CPuORFs were extracted from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. We discovered a much higher number of animal CPuORFs than previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, many CPuORFs encoding regulatory peptides are expected to be found in the identified CPuORFs.


2021 ◽  
Author(s):  
Yuta Hiragori ◽  
Hiro Takahashi ◽  
Noriya Hayashi ◽  
Shun Sasaki ◽  
Kodai Nakao ◽  
...  

Upstream open reading frames (uORFs) are short ORFs found in the 5′-UTRs of many eukaryotic transcripts and can influence the translation of protein-coding main ORFs (mORFs). Recent genome-wide ribosome profiling studies have revealed that thousands of uORFs initiate translation at non-AUG start codons. However, the physiological significance of these non-AUG uORFs has so far been demonstrated for only a few of them. It is conceivable that physiologically important non-AUG uORFs are evolutionarily conserved across species. In this study, using a combination of bioinformatics and experimental approaches, we searched the Arabidopsis genome for non-AUG-initiated uORFs with conserved sequences that control the expression of the mORF-encoded proteins. As a result, we identified four novel regulatory non-AUG uORFs. Among these, two exerted repressive effects on mORF expression in an amino acid sequence-dependent manner. These two non-AUG uORFs are likely to encode regulatory peptides that cause ribosome stalling, thereby enhancing their repressive effects. In contrast, one of the identified regulatory non-AUG uORFs promoted mORF expression by alleviating the inhibitory effect of a downstream AUG-initiated uORF. These findings provide insights into the mechanisms that enable non-AUG uORFs to play regulatory roles despite their low translation initiation efficiencies.


2020 ◽  
Vol 21 (17) ◽  
pp. 6238
Author(s):  
Ting Zhang ◽  
Anqi Wu ◽  
Yaping Yue ◽  
Yu Zhao

Gene expression is regulated at many levels, including mRNA transcription, translation, and post-translational modification. Compared with transcriptional regulation, mRNA translational control is a more critical step in gene expression and allows for more rapid changes of encoded protein concentrations in cells. Translation is highly regulated by complex interactions between cis-acting elements and trans-acting factors. Initiation is not only the first phase of translation, but also the core of translational regulation, because it limits the rate of protein synthesis. As potent cis-regulatory elements in eukaryotic mRNAs, upstream open reading frames (uORFs) generally inhibit the translation initiation of downstream major ORFs (mORFs) through ribosome stalling. During the past few years, with the development of RNA-seq and ribosome profiling, functional uORFs have been identified and characterized in many organisms. Here, we review uORF identification, uORF classification, and uORF-mediated translation initiation. More importantly, we summarize the translational regulation of uORFs in plant metabolic pathways, morphogenesis, disease resistance, and nutrient absorption, which opens up an avenue for precisely modulating plant growth and development, as well as environmental adaptation. We also discuss prospective applications of uORFs in plant breeding.


2005 ◽  
Vol 2 (1) ◽  
pp. 59-66
Author(s):  
Jin Yong-Feng ◽  
Jin Hui-Qing ◽  
Zhou Ping ◽  
Bian Teng-Fei

Abstract Upstream open reading frames (uORFs) in 5′-untranslated regions (5′-UTRs) of eukaryotic mRNAs play an important role in translation efficiency. Computational analysis of the upstream ATG (uATG) and uORFs of 5′-UTRs of plant mRNAs, adopted from the nucleotide sequence databank, was carried out. Statistical analysis revealed that up to 18% of 5′-UTRs contain uATG, which is much higher than the earlier estimate. Among them, about 50% of the genes have one uATG and nearly 20% of them have two uATGs. About 85% of uORFs are non-overlapping. Thirty per cent of uORF peptides comprise 1–5 aa, and about 80% of uORFs fall in the range of below 20 aa. Sequences flanking the uATG codon differ strikingly from the functional initiation codon and the uATG triplet is more frequently located in a non-optimal context. Consensus sequences of the ATG codon context of mRNA with and without uATG are similar, whereas the ATG codon context of mRNA without uATG is more frequently located in an optimal context than is mRNA with uATG. Most mRNAs with uATGs are possibly related to regulatory functions. In addition, most mRNA uORFs have no similarity between plant species whereas sequences of a few uORFs are highly conserved. For example, mRNA uORFs encoding S-adenosyl-L-methionine decarboxylase (AdoMetDC) share 75–100% homology between plant species, which is much more conserved than AdoMetDC protein.
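
The kind of scan described in the abstract above (locating uATGs in a 5′-UTR and measuring the length of each uORF peptide) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the function name `find_uorfs` and the example sequence are made up, only uORFs that terminate inside the 5′-UTR are reported, and coordinates are 0-based with an exclusive end.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5, start_codon="ATG"):
    """Return (start, end, peptide_length_aa) for each uORF inside a 5'-UTR.

    Every occurrence of `start_codon` is treated as a candidate uATG. The
    scan then walks codon by codon in the same frame until the first stop
    codon. Candidates with no in-frame stop inside the UTR are skipped.
    """
    seq = utr5.upper().replace("U", "T")  # accept RNA or DNA letters
    uorfs = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != start_codon:
            continue
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                # Peptide length counts the initiator Met, not the stop codon.
                uorfs.append((i, j + 3, (j - i) // 3))
                break
    return uorfs


# Made-up 5'-UTR containing two short uORFs (ATG AAA TAG and ATG TGA).
example_utr = "GGATGAAATAGCCATGTGACC"
uorfs = find_uorfs(example_utr)
short_fraction = sum(1 for _, _, aa in uorfs if aa <= 5) / len(uorfs)
```

Applied to a full set of 5′-UTRs, the same per-sequence output could be tallied to obtain statistics of the kind quoted above, such as the share of UTRs with at least one uATG or the share of uORF peptides of 1–5 aa.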


2021 ◽  
Vol 12 ◽  
Author(s):  
Yi Ren ◽  
Yue Song ◽  
Lipeng Zhang ◽  
Dinghan Guo ◽  
Juan He ◽  
...  

Peptides composed of a short chain of amino acids can play significant roles in plant growth, development, and stress responses. Most of these functional peptides are derived by either processing precursor proteins or direct translation of small open reading frames present in the genome and sometimes located in the untranslated region sequence of a messenger RNA. Generally, canonical peptides serve as local signal molecules mediating short- or long-distance intercellular communication. Also, they are commonly used as ligands perceived by an associated receptor, triggering cellular signal transduction. In recent years, increasing evidence from studies in both plants and animals has revealed that peptides are also encoded by RNAs currently defined as non-coding RNAs (ncRNAs), including long ncRNAs, circular RNAs, and primary microRNAs. Primary microRNAs (miRNAs) have been reported to encode regulatory peptides in Arabidopsis, grapevine, soybean, and Medicago, called miRNA-encoded peptides (miPEPs). Remarkably, overexpression or exogenous applications of miPEPs specifically increase the expression level of their corresponding miRNAs by enhancing the transcription of the MIRNA (MIR) genes. Here, we first outline the current knowledge regarding the coding of putative ncRNAs. Notably, we review in detail the limited studies available regarding the translation of miPEPs and their relevant regulatory mechanisms. Furthermore, we discuss the potential cellular and molecular mechanisms in which miPEPs might be involved in plants and raise problems that need to be solved.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Audrey Montigny ◽  
Patrizia Tavormina ◽  
Carine Duboe ◽  
Hélène San Clémente ◽  
Marielle Aguilar ◽  
...  

Abstract Background Recent genome-wide studies of many species reveal the existence of a myriad of RNAs differing in size, coding potential and function. Among these are the long non-coding RNAs, some of them producing functional small peptides via the translation of short ORFs. It now appears that any kind of RNA presumably has a potential to encode small peptides. Accordingly, our team recently discovered that plant primary transcripts of microRNAs (pri-miRs) produce small regulatory peptides (miPEPs) involved in auto-regulatory feedback loops enhancing their cognate microRNA expression which in turn controls plant development. Here we investigate whether this regulatory feedback loop is present in Drosophila melanogaster. Results We perform a survey of ribosome profiling data and reveal that many pri-miRNAs exhibit ribosome translation marks. Focusing on miR-8, we show that pri-miR-8 can produce a miPEP-8. Functional assays performed in Drosophila reveal that miPEP-8 affects development when overexpressed or knocked down. Combining genetic and molecular approaches as well as genome-wide transcriptomic analyses, we show that miR-8 expression is independent of miPEP-8 activity and that miPEP-8 acts in parallel to miR-8 to regulate the expression of hundreds of genes. Conclusion Taken together, these results reveal that several Drosophila pri-miRs exhibit translation potential. Contrasting with the mechanism described in plants, these data shed light on the function of yet undescribed primary-microRNA-encoded peptides in Drosophila and their regulatory potential on genome expression.


Author(s):  
Zhen Tian ◽  
Xiaodong Qin ◽  
Hui Wang ◽  
Ji Li ◽  
Jinfeng Chen

Abstract The CONSTANS-like (COL) gene family is one of the plant-specific transcription factor families that play important roles in plant growth and development. However, knowledge of COLs in cucumber is limited, and their biological functions, especially in the photoperiod-dependent flowering process, are still unclear. In this study, twelve CsaCOL genes were identified in the cucumber genome. Phylogenetic and conserved motif analyses provided insights into the evolutionary relationship between the CsaCOLs. Further, the comparative genome analysis revealed that COL genes are conserved in different plant species, especially collinearity gene pairs related to CsaCOL5. Ten kinds of cis-acting elements were detected in CsaCOL promoter regions, including five light-responsive elements, which echo the diurnal rhythm expression patterns of seven CsaCOL genes under SD and LD photoperiod regimes. Combined with the expression data of developmental stage, three CsaCOL genes are involved in the flowering network and play pivotal roles in the floral induction process. Our results provide useful information for further elucidating the structural characteristics, expression patterns, and biological functions of COL family genes in many plants.

