Formation of human long intergenic non-coding RNA genes and pseudogenes: ancestral sequences are key players

2019 ◽  
Author(s):  
Nicholas Delihas

Abstract Pathways leading to formation of non-coding RNA and protein genes are varied and complex. We report finding a highly conserved repeat sequence present in both human and chimpanzee genomes that appears to have originated from a common primate ancestor. This sequence is repeatedly copied in human chromosome 22 (chr22) low copy repeats (LCR22) or segmental duplications and forms twenty-one different genes, which include human long intergenic non-coding RNA (lincRNA) gene and pseudogene families, as well as the gamma-glutamyltransferase (GGT) protein gene family and the RNA pseudogenes that originate from GGT sequences. In sharp contrast, only predicted protein genes stem from the homologous repeat sequence present in chr22 of chimpanzee. The data point to an ancestral DNA sequence, highly conserved through evolution and duplicated in humans by chromosomal repeat sequences, that serves as a functional genomic element in the development of new and diverse genes in humans and chimpanzees.

2018 ◽  
Vol 4 (3) ◽  
pp. 16 ◽  
Author(s):  
Nicholas Delihas

A family of long intergenic noncoding RNA (lincRNA) genes, FAM230, is formed via gene sequence duplication, specifically in human chromosomal low copy repeats (LCR) or segmental duplications. This is the first group of lincRNA genes known to be formed by segmental duplications and is consistent with current views of evolution and the creation of new genes via DNA low copy repeats. It appears to be an efficient way to form multiple lincRNA genes. However, because these genes lie in the 22q11.2 region, a chromosomal region that is critical with respect to the incidence of abnormal translocations and resulting genetic abnormalities, and also carry a translocation breakpoint motif, several intriguing questions arise concerning the presence and function of the translocation breakpoint sequence in RNA genes situated in LCR22s.


2020 ◽  
Vol 6 (3) ◽  
pp. 36
Author(s):  
Nicholas Delihas

A small phylogenetically conserved sequence of 11,231 bp, termed FAM247, is repeated in human chromosome 22 by segmental duplications. This sequence forms part of diverse genes that span evolutionary time, the protein genes being the earliest as they are present in zebrafish and/or mice genomes, and the long noncoding RNA genes and pseudogenes the most recent as they appear to be present only in the human genome. We propose that the conserved sequence provides a nucleation site for new gene development at evolutionarily conserved chromosomal loci where the FAM247 sequences reside. The FAM247 sequence also carries information in its open reading frames that provides protein exon amino acid sequences; one exon plays an integral role in immune system regulation, specifically, the function of ubiquitin-specific protease (USP18) in the regulation of interferon. An analysis of this multifaceted sequence and the genesis of genes that contain it is presented.


Author(s):  
Nicholas Delihas

A small phylogenetically conserved sequence of 11,231 bp, termed FAM247, is repeated in human chromosome 22 by segmental duplications. This sequence forms part of diverse genes that span evolutionary time, the protein genes being the earliest as they are present in zebrafish and/or mice genomes, and the long non-coding RNA genes and pseudogenes the most recent as they appear to be present only in the human genome. We propose that the conserved sequence provides a nucleation site for new gene development at evolutionarily conserved chromosomal loci where the FAM247 sequences reside. The FAM247 sequence also carries information in its open reading frames that provides protein exon amino acid sequences; one exon plays an integral role in immune system regulation, specifically, the function of ubiquitin-specific protease (USP18) in the regulation of interferon. An analysis of this multifaceted sequence and the genesis of genes that contain it is presented.


Genetics ◽  
2002 ◽  
Vol 161 (3) ◽  
pp. 1065-1075
Author(s):  
David K Butler ◽  
David Gillespie ◽  
Brandi Steele

Abstract Large DNA palindromes form sporadically in many eukaryotic and prokaryotic genomes and are often associated with amplified genes. The presence of a short inverted repeat sequence near a DNA double-strand break has been implicated in the formation of large palindromes in a variety of organisms. Previously we have established that in Saccharomyces cerevisiae a linear DNA palindrome is efficiently formed from a single-copy circular plasmid when a DNA double-strand break is introduced next to a short inverted repeat sequence. In this study we address whether the linear palindromes form by an intermolecular reaction (that is, a reaction between two identical fragments in a head-to-head arrangement) or by an unusual intramolecular reaction, as apparently occurs in other examples of palindrome formation. Our evidence supports a model in which palindromes are primarily formed by an intermolecular reaction involving homologous recombination of short inverted repeat sequences. We have also extended our investigation into the requirement for DNA double-strand break repair genes in palindrome formation. We have found that a deletion of the RAD52 gene significantly reduces palindrome formation by intermolecular recombination and that deletions of two other genes in the RAD52-epistasis group (RAD51 and MRE11) have little or no effect on palindrome formation. In addition, palindrome formation is dramatically reduced by a deletion of the nucleotide excision repair gene RAD1.


2021 ◽  
Vol 11 (8) ◽  
pp. 1306-1312
Author(s):  
Li Song ◽  
Ningchao Du ◽  
Haitao Luo ◽  
Furong Li

This study aimed to identify the association of protein-coding and long non-coding RNA genes with immunotherapy response in melanoma. Based on RNA sequencing data of melanoma specimens, the expression levels of protein-coding and long non-coding RNA genes were calculated using the Kallisto RNA-seq quantification method, and differentially expressed genes were detected using the DESeq2 method. Cox proportional hazards regression was used to evaluate the effects of gene expression on survival. According to the clinical data of 14 patients with drug response and 11 patients without drug response, 18 protein-coding genes and 14 long non-coding RNAs showed differential expression (fold change > 2 and P < 0.01 after correction), among which the differentially expressed coding genes were significantly enriched in the process of cell adhesion (P < 0.01). The results of survival analysis showed that 18 coding genes and 14 long non-coding RNA genes had significant effects on patient survival (P < 0.01). In this study, magnetic nanoparticles were used to extract genomic DNA and total RNA owing to their paramagnetism and biocompatibility, and transcriptome high-throughput sequencing was then performed. The method has the advantages of removing dangerous reagents such as phenol and chloroform, replacing inorganic coating such as silica with organic oil, and shortening reaction time. Protein-coding and long non-coding RNA genes as well as magnetic nanoparticles may serve as potential cancer immune biomarker targets for developing future oncological treatments.
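The differential-expression criterion quoted in this abstract (fold change > 2 and corrected P < 0.01) can be sketched as a simple filter. This is only an illustration, not the authors' pipeline: the function name `is_differential` and its parameters are hypothetical, and treating the fold-change cut-off as symmetric (so that a gene down-regulated more than two-fold also passes) is an assumption that the abstract does not state.

```python
import math


def is_differential(fold_change: float, adj_p: float,
                    min_fold: float = 2.0, alpha: float = 0.01) -> bool:
    """Return True when a gene passes both thresholds.

    fold_change is the ratio of mean expression between the two groups
    (responders / non-responders). The test is applied on the log2 scale
    so that up- and down-regulation are treated symmetrically, which is
    an assumption of this sketch.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    passes_fold = abs(math.log2(fold_change)) > math.log2(min_fold)
    passes_p = adj_p < alpha
    return passes_fold and passes_p


# Made-up values for illustration; these are not data from the study.
examples = {
    "geneA": (4.0, 0.001),   # strongly up, significant
    "geneB": (0.25, 0.001),  # strongly down, significant
    "geneC": (1.5, 0.001),   # significant but below the fold-change cut-off
    "geneD": (4.0, 0.05),    # large change but not significant
}
selected = [name for name, (fc, p) in examples.items() if is_differential(fc, p)]
```

With the made-up values above, only `geneA` and `geneB` are selected. Both inequalities are strict, so a gene with a fold change of exactly 2 does not pass.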


2016 ◽  
Vol 18 (suppl_6) ◽  
pp. vi84-vi85
Author(s):  
Siyuan Liu ◽  
Max Horlbeck ◽  
Seung Woo Cho ◽  
Harjus Birk ◽  
Martina Malatesta ◽  
...  

2006 ◽  
Vol 22 (21) ◽  
pp. 2590-2596 ◽  
Author(s):  
C. Wang ◽  
C. Ding ◽  
R. F. Meraz ◽  
S. R. Holbrook

Genetics ◽  
1987 ◽  
Vol 115 (4) ◽  
pp. 597-604
Author(s):  
Daniel L Wulff ◽  
Michael E Mahoney

ABSTRACT We have investigated the activation of transcription from the pRE promoters of phages λ, 21 and P22 by the λ and 21 cII proteins and the P22 c1 (cII-like) protein, using an in vivo system in which cII protein from a derepressed prophage activates transcription from a pRE DNA fragment on a multicopy plasmid. We find that each protein is highly specific for its own cognate pRE promoter, although measurable cross-reactions are observed. The primary recognition sequence for cII protein on λ pRE is a pair of TTGC repeat sequences in the sequence 5′-TTGCN6TTGC-3′ at the -35 region of the promoter. This same sequence is found in 21 pRE, while P22 pRE has the sequence 5′-TTGCN6TTGT-3′, which is the same as that of λctr1, a pRE+ variant of λ. λctr1 pRE is half as active as λ+ pRE when assayed with either the λ cII or the P22 c1 proteins. Therefore, the single base change in the P22 repeat sequence cannot explain why the P22 c1 protein is much more active with P22 pRE than λ pRE. The dya5 mutation, a G→A change at position -43 of pRE, makes pRE a stronger promoter when assayed with either the λ or 21 cII proteins or the P22 c1 protein. We conclude that efficient activation of a cII-dependent promoter by a cII protein requires sequence information in addition to the TTGC repeat sequences. We do not know the characteristics of the proteins that are responsible for the specificity of each protein for its own cognate promoter. However, λdya8, which has a Glu27→Lys alteration in the λ cII protein and a cII+ phenotype, results in a mutant cII protein that is much more highly specific than wild-type cII protein for its own cognate λ pRE promoter. This is especially remarkable because the dya8 amino acid alteration makes the helix-2 region (the region of the protein predicted to make contact with the phosphodiester backbone of the DNA) of λ cII protein conform exactly with the helix-2 region of the P22 c1 protein in both charge and charge distribution.
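The two recognition sequences named in this abstract, 5′-TTGCN6TTGC-3′ (λ and 21 pRE) and 5′-TTGCN6TTGT-3′ (P22 pRE and λctr1), can be written as a short pattern search. This is a sketch under stated assumptions: N6 is read as any six bases, and the names `CII_SITE` and `find_cii_sites` are hypothetical. It scans one strand only and makes no claim about promoter activity, which the abstract says depends on further sequence information.

```python
import re

# TTGC, six bases of any identity (assumed reading of "N6"), then TTGC or TTGT.
CII_SITE = re.compile(r"TTGC[ACGT]{6}TTG[CT]")


def find_cii_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    seq = seq.upper()
    return [(m.start(), m.group()) for m in CII_SITE.finditer(seq)]


# Made-up sequences for illustration; these are not real pRE sequences.
lambda_like = "AAATTGCAAAAAATTGCGGG"   # TTGC ... TTGC form
p22_like = "ttgcacgtacttgt"            # TTGC ... TTGT form, lower case input

hits_lambda = find_cii_sites(lambda_like)
hits_p22 = find_cii_sites(p22_like)
```

Here `hits_lambda` is `[(3, 'TTGCAAAAAATTGC')]` and `hits_p22` is `[(0, 'TTGCACGTACTTGT')]`. Because `finditer` reports non-overlapping matches, overlapping candidate sites would need a lookahead-based pattern instead.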

