Gene Fusions Derived by Transcriptional Readthrough are Driven by Segmental Duplication in Human

2019 ◽  
Vol 11 (9) ◽  
pp. 2678-2690 ◽  
Author(s):  
Ann M McCartney ◽  
Edel M Hyland ◽  
Paul Cormican ◽  
Raymond J Moran ◽  
Andrew E Webb ◽  
...  

Abstract Gene fusion occurs when two or more individual genes with independent open reading frames become juxtaposed under the same open reading frame, creating a new fused gene. A small number of gene fusions described in detail have been associated with novel functions, for example, the hominid-specific PIPSL gene, TNFSF12, and the TWE-PRIL gene family. We use Sequence Similarity Networks and species level comparisons of great ape genomes to identify 45 new genes that have emerged by transcriptional readthrough, that is, transcription-derived gene fusion. For 35 of these putative gene fusions, we have been able to assess available RNAseq data to determine whether there are reads that map to each breakpoint. A total of 29 of the putative gene fusions had annotated transcripts (9/29 of which are human-specific). We carried out RT-qPCR in a range of human tissues (placenta, lung, liver, brain, and testes) and found that 23 of the putative gene fusion events were expressed in at least one tissue. Examining the available ribosome foot-printing data, we find evidence for translation of three of the fused genes in human. Finally, we find enrichment for transcription-derived gene fusions in regions of known segmental duplication in human. Together, our results implicate chromosomal structural variation brought about by segmental duplication in the emergence of novel transcripts and translated protein products.

2013 ◽  
Vol 57 (6) ◽  
pp. 2603-2612 ◽  
Author(s):  
Narutoshi Uda ◽  
Yasuyuki Matoba ◽  
Takanori Kumagai ◽  
Kosuke Oda ◽  
Masafumi Noda ◽  
...  

ABSTRACT We have recently cloned a DNA fragment containing a gene cluster that is responsible for the biosynthesis of an antituberculosis antibiotic, d-cycloserine. The gene cluster is composed of 10 open reading frames, designated dcsA to dcsJ. Judging from the sequence similarity between each putative gene product and known proteins, DcsC, which displays high homology to diaminopimelate epimerase, may catalyze the racemization of O-ureidoserine. DcsD is similar to O-acetylserine sulfhydrylase, which generates l-cysteine using O-acetyl-l-serine with sulfide, and therefore, DcsD may be a synthase to generate O-ureido-l-serine using O-acetyl-l-serine and hydroxyurea. DcsG, which exhibits similarity to a family of enzymes with an ATP-grasp fold, may be an ATP-dependent synthetase converting O-ureido-d-serine into d-cycloserine. In the present study, to characterize the enzymatic functions of DcsC, DcsD, and DcsG, each protein was overexpressed in Escherichia coli and purified to near homogeneity. The biochemical function of each of the reactions catalyzed by these three proteins was verified by thin-layer chromatography (TLC), high-performance liquid chromatography (HPLC), and, in some cases, mass spectrometry. The results from this study demonstrate that by using a mixture of the three purified enzymes and the two commercially available substrates O-acetyl-l-serine and hydroxyurea, synthesis of d-cycloserine was successfully attained. These in vitro studies yield the conclusion that DcsD and DcsG are necessary for the syntheses of O-ureido-l-serine and d-cycloserine, respectively. DcsD was also able to catalyze the synthesis of l-cysteine when sulfide was added instead of hydroxyurea. Furthermore, the present study shows that DcsG can also form other cyclic d-amino acid analogs, such as d-homocysteine thiolactone.


2004 ◽  
Vol 78 (21) ◽  
pp. 11544-11550 ◽  
Author(s):  
Paul Kraft ◽  
Andrea Oeckinghaus ◽  
Daniel Kümmel ◽  
George H. Gauss ◽  
John Gilmore ◽  
...  

ABSTRACT Sulfolobus spindle-shaped viruses (SSVs), or Fuselloviridae, are ubiquitous crenarchaeal viruses found in high-temperature acidic hot springs around the world (pH ≤4.0; temperature of ≥70°C). Because they are relatively easy to isolate, they represent the best studied of the crenarchaeal viruses. This is particularly true for the type virus, SSV1, which contains a double-stranded DNA genome of 15.5 kilobases, encoding 34 putative open reading frames. Interestingly, the genome shows little sequence similarity to organisms other than its SSV homologues. Together, sequence similarity and biochemical analyses have suggested functions for only 6 of the 34 open reading frames. Thus, even though SSV1 is the best-studied crenarchaeal virus, functions for most (28) of its open reading frames remain unknown. We have undertaken biochemical and structural studies for the gene product of open reading frame F-93. We find that F-93 exists as a homodimer in solution and that a tight dimer is also present in the 2.7-Å crystal structure. Further, the crystal structure reveals a fold that is homologous to the SlyA and MarR subfamilies of winged-helix DNA binding proteins. This strongly suggests that F-93 functions as a transcription factor that recognizes a (pseudo-)palindromic DNA target sequence.


1989 ◽  
Vol 35 (1) ◽  
pp. 200-204 ◽  
Author(s):  
Johannes Auer ◽  
Konrad Lechner ◽  
August Bock

Two transcriptional units coding for ribosomal proteins and protein synthesis elongation factors in Methanococcus vannielii have been cloned and analysed in detail. They correspond to the "streptomycin operon" and "spectinomycin operon" of the Escherichia coli chromosome. The following general conclusions can be drawn from comparison of the nucleotide and the derived amino acid sequences of ribosomal proteins from Methanococcus with those from eubacteria and eukaryotes. (i) Ribosomal protein and elongation factor genes in Methanococcus are clustered in transcriptional units corresponding closely to E. coli ribosomal protein operons with respect to both gene composition and organization. (ii) These transcriptional units contain, in addition, a few open reading frames whose putative gene products share sequence similarity with eukaryotic 80S, but not with eubacterial, ribosomal proteins. They may correspond to "additional" ribosomal proteins of the Methanococcus ribosome, there being no functional homologues in the eubacterial ribosome. (iii) Methanococcus ribosomal proteins and elongation factors almost exclusively exhibit a higher sequence similarity to eukaryotic 80S ribosomal proteins than to those of eubacteria. (iv) Many Methanococcus ribosomal proteins have a size intermediate between those of their eukaryotic and eubacterial homologues. These results are discussed in terms of a hypothesis which implies that the recent eubacterial ribosome developed by a "minimization" process from a more complex organelle and that the archaebacterial ribosome has maintained features of this ancestor. Key words: archaebacteria, Methanococcus, transcription factors, clonal analysis.


1990 ◽  
Vol 10 (6) ◽  
pp. 3067-3077
Author(s):  
P D Friesen ◽  
M S Nissen

A single copy of the retrotransposon TED, from the moth Trichoplusia ni (a lepidopteran noctuid), was identified within the DNA genome of the baculovirus Autographa californica nuclear polyhedrosis virus. Determination of the complete nucleotide sequence (7,510 base pairs) of the integrated copy indicated that TED belongs to the family of retrotransposons that includes Drosophila melanogaster elements 17.6 and gypsy and thus represents the first nondipteran member of this invertebrate group to be identified. The internal portion of TED, flanked by long terminal repeats (LTRs), is composed of three long open reading frames comparable in size and location to the gag, pol, and env genes of the vertebrate retroviruses. Sequence similarity with the dipteran elements was the highest within individual domains of TED open reading frame 2 (pol region) that are also conserved among the retroviruses and encode protease, reverse transcriptase, and integrase functions, respectively. Mapping the 5' and 3' termini of TED RNAs indicated that the LTRs have a retroviral U3-R-U5 structural organization that is capable of directing the synthesis of transcripts that represent potential substrates for reverse transcription and intermediates in transposition. Abundant RNAs were also initiated from a site within the 5' LTR that matches the consensus motif for the promoter of late, hyperexpressed baculovirus genes. The presence of this viruslike promoter within TED and its subsequent activation only after integration within the viral genome suggest a possible symbiotic relationship with the baculovirus that could extend transposon host range.


Genes ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 648
Author(s):  
Yaqing Ou ◽  
James O. McInerney

The formation of new genes by combining parts of existing genes is an important evolutionary process. Remodelled genes, which we call composites, have been investigated in many species; however, their distribution across all of life is still unknown. We set out to examine the extent to which genomes from cells and mobile genetic elements contain composite genes. We identify composite genes as those that show partial homology to at least two unrelated component genes. In order to identify composite and component genes, we constructed sequence similarity networks (SSNs) of more than one million genes from all three domains of life, as well as viruses and plasmids. We identified non-transitive triplets of nodes in this network and explored the homology relationships in these triplets to see if the middle nodes were indeed composite genes. In total, we identified 221,043 (18.57%) composite genes, which were distributed across all genomic and functional categories. In particular, the presence of composite genes is statistically more likely in eukaryotes than prokaryotes.
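
The non-transitive triplet idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `find_composites`, the tuple layout of `hits`, and the `max_overlap` parameter are all hypothetical and chosen for illustration. The assumption is that each hit records which region of a gene aligns to a partner gene, and that a gene is a candidate composite when two of its partners are not themselves linked and align to non-overlapping regions of it.

```python
from itertools import combinations


def find_composites(hits, max_overlap=0):
    """Return {gene: [(component_a, component_b), ...]} for candidate composites.

    hits: iterable of (gene, partner, start, end), where start..end is the
    region of `gene` (1-based, inclusive) that aligns to `partner`.
    This tuple layout is an assumption made for this sketch.
    """
    regions = {}     # gene -> {partner: (start, end)}
    neighbours = {}  # undirected adjacency of the similarity network
    for gene, partner, start, end in hits:
        regions.setdefault(gene, {})[partner] = (start, end)
        neighbours.setdefault(gene, set()).add(partner)
        neighbours.setdefault(partner, set()).add(gene)

    composites = {}
    for gene, aligned in regions.items():
        for a, b in combinations(sorted(aligned), 2):
            # Skip transitive triplets: the two partners are similar to each other.
            if b in neighbours.get(a, set()):
                continue
            (s1, e1), (s2, e2) = aligned[a], aligned[b]
            overlap = min(e1, e2) - max(s1, s2) + 1
            # Partners must cover distinct parts of the middle gene.
            if overlap <= max_overlap:
                composites.setdefault(gene, []).append((a, b))
    return composites


# Toy data: C aligns to A and B over separate regions; D's partners overlap.
example_hits = [
    ("C", "A", 1, 100),
    ("C", "B", 120, 250),
    ("A", "C", 1, 100),
    ("B", "C", 1, 130),
    ("D", "E", 1, 50),
    ("D", "F", 10, 60),
]
print(find_composites(example_hits))
```

In this toy input only C is reported, because A and B share no edge and cover different parts of C, whereas the two partners of D align to overlapping regions. A real analysis would also need alignment score and coverage thresholds, which the abstract does not specify.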


2020 ◽  
Vol 36 (19) ◽  
pp. 4827-4832
Author(s):  
C S Casimiro-Soriguer ◽  
M M Rigual ◽  
A M Brokate-Llanos ◽  
M J Muñoz ◽  
A Garzón ◽  
...  

Abstract Motivation: Short bioactive peptides encoded by small open reading frames (sORFs) play important roles in eukaryotes. Bioinformatics prediction of ORFs is an early step in a genome sequence analysis, but sORFs encoding short peptides, often using non-AUG initiation codons, are not easily discriminated from false ORFs occurring by chance. Results: AnABlast is a computational tool designed to highlight putative protein-coding regions in genomic DNA sequences. This protein-coding finder is independent of ORF length and reading frame shifts, making AnABlast a potentially useful tool to predict sORFs. Using this algorithm, here, we report the identification of 82 putative new intergenic sORFs in the Caenorhabditis elegans genome. Sequence similarity, motif presence, expression data and RNA interference experiments support that the underlined sORFs likely encode functional peptides, encouraging the use of AnABlast as a new approach for the accurate prediction of intergenic sORFs in annotated eukaryotic genomes. Availability and implementation: AnABlast is freely available at http://www.bioinfocabd.upo.es/ab/. The C. elegans genome browser with AnABlast results, annotated genes and all data used in this study is available at http://www.bioinfocabd.upo.es/celegans. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 2 (2) ◽  
Author(s):  
James Gallant ◽  
Jomien Mouton ◽  
Roy Ummels ◽  
Corinne ten Hagen-Jongman ◽  
Nastassja Kriel ◽  
...  

Abstract Mycobacterium tuberculosis is a facultative intracellular pathogen responsible for causing tuberculosis. The harsh environment in which M. tuberculosis survives requires this pathogen to continuously adapt in order to maintain an evolutionary advantage. However, the apparent absence of horizontal gene transfer in M. tuberculosis restricts the ways in which evolution can occur. Large-scale changes in the genome can be introduced through genome reduction, recombination events and structural variation. Here, we identify a functional chimeric protein in the ppe38–71 locus, the absence of which is known to have an impact on protein secretion and virulence. To examine whether this approach was used more often by this pathogen, we further develop software that detects potential gene fusion events from multigene deletions using whole genome sequencing data. With this software we could identify a number of other putative gene fusion events within the genomes of M. tuberculosis isolates. We were able to demonstrate the expression of one of these gene fusions at the protein level using mass spectrometry. Therefore, gene fusions may provide an additional means of evolution for M. tuberculosis in its natural environment whereby novel chimeric proteins and functions can arise.


2013 ◽  
Vol 29 (7) ◽  
pp. 837-844 ◽  
Author(s):  
Pierre-Alain Jachiet ◽  
Romain Pogorelcnik ◽  
Anne Berry ◽  
Philippe Lopez ◽  
Eric Bapteste


1998 ◽  
Vol 64 (9) ◽  
pp. 3140-3146 ◽  
Author(s):  
Christoph Heidrich ◽  
Ulrike Pag ◽  
Michaele Josten ◽  
Jörg Metzger ◽  
Ralph W. Jack ◽  
...  

ABSTRACT Epicidin 280 is a novel type A lantibiotic produced by Staphylococcus epidermidis BN 280. During C18 reverse-phase high-performance liquid chromatography two epicidin 280 peaks were obtained; the two compounds had molecular masses of 3,133 ± 1.5 and 3,136 ± 1.5 Da, comparable antibiotic activities, and identical amino acid compositions. Amino acid sequence analysis revealed that epicidin 280 exhibits 75% similarity to Pep5. The strains that produce epicidin 280 and Pep5 exhibit cross-immunity, indicating that the immunity peptides cross-function in antagonization of both lantibiotics. The complete epicidin 280 gene cluster was cloned and was found to comprise at least five open reading frames (eciI, eciA, eciP, eciB, and eciC, in that order). The proteins encoded by these open reading frames exhibit significant sequence similarity to the biosynthetic proteins of the Pep5 operon of Staphylococcus epidermidis 5. A gene for an ABC transporter, which is present in the Pep5 gene cluster but is necessary only for high yields (G. Bierbaum, M. Reis, C. Szekat, and H.-G. Sahl, Appl. Environ. Microbiol. 60:4332–4338, 1994), was not detected. Instead, upstream of the immunity gene eciI we found an open reading frame, eciO, which could code for a novel lantibiotic modification enzyme involved in reduction of an N-terminally located oxopropionyl residue. Epicidin 280 produced by the heterologous host Staphylococcus carnosus TM 300 after introduction of eciIAPBC (i.e., no eciO was present) behaved homogeneously during reverse-phase chromatography.

