SplicedFamAlign: CDS-to-gene spliced alignment and identification of transcript orthology groups

2018 ◽  
Author(s):  
Safa Jammali ◽  
Jean-David Aguilar ◽  
Esaie Kuitche ◽  
Aïda Ouangraoua

Abstract Motivation: The inference of splicing orthology relationships between gene transcripts is a basic step for the prediction of transcripts and the annotation of gene structures in genomes. Spliced alignment, which consists of aligning a spliced cDNA sequence against an unspliced genomic sequence, constitutes a promising, yet unexplored approach for the identification of splicing orthology relationships. Existing spliced alignment algorithms do not exploit the information on the splicing structure of the input sequences, namely the exon structure of the cDNA sequence and the exon-intron structure of the genomic sequences. Yet, this information is often available for coding DNA sequences (CDS) and gene sequences annotated in databases, and it can help improve the accuracy of the computed spliced alignments. To address this issue, we introduce a new spliced alignment problem and a method called SplicedFamAlign (SFA) for computing the alignment of a spliced CDS against a gene sequence while accounting for the splicing structures of the input sequences, and then inferring transcript splicing orthology groups in a gene family based on spliced alignments. Results: The experimental results show that SFA outperforms existing spliced alignment methods in terms of accuracy and execution time for CDS-to-gene alignment. We also show that the performance of SFA remains high for various levels of sequence similarity between input sequences, thanks to accounting for the splicing structure of the input sequences. Notably, unlike current spliced alignment methods, which are meant for cDNA-to-genome alignments but can be used for CDS-to-gene alignments, SFA is the first method specifically designed for CDS-to-gene alignments. We show its usefulness for the comparison of genes and transcripts within a gene family for the purpose of analyzing splicing orthologies. It can also be used for gene structure annotation and alternative splicing analyses. Availability: SplicedFamAlign was implemented in Python. Source code is freely available at https://github.com/UdeS-CoBIUS/[email protected]

Genome ◽  
1999 ◽  
Vol 42 (6) ◽  
pp. 1077-1087 ◽  
Author(s):  
Erin N Yoshida ◽  
Bernhard F Benkel ◽  
Ying Fong ◽  
Donal A Hickey

To optimize gene expression under different environmental conditions, many organisms have evolved systems that can quickly up- and down-regulate the activity of other genes. Recently, the SNF1 kinase complex from yeast and the AMP-activated protein kinase complex from mammals have been shown to represent homologous metabolic sensors that are key to regulating energy levels under times of metabolic stress. Using heterologous probing, we have cloned the Drosophila melanogaster homologue of SNF4, the noncatalytic effector subunit from this kinase complex. A sequence corresponding to the partial genomic sequence as well as the full-length cDNA was obtained, and shows that the D. melanogaster SNF4 is encoded in a 1944-bp cDNA representing a protein of 648 amino acids (aa). Southern analysis of Drosophila genomic DNA in concert with a survey of mammalian SNF4 ESTs indicates that in metazoans, SNF4 is a duplicated gene, and possibly even a larger gene family. We propose that one gene copy codes for a short (330 aa) protein, whereas the second locus codes for a longer version (>410 aa) that is extended at the carboxy terminus, as typified by the Drosophila homologue presented here. Phylogenetic analysis of yeast, invertebrate, and multiple mammalian isoforms of SNF4 shows that the gene duplication likely occurred early in the metazoan lineage, as the protein products of the different loci are relatively divergent. When the phylogeny was extended beyond the SNF4 gene family, SNF4 was found to share sequence similarity with other cystathionine-β-synthase domain-containing proteins, including IMP dehydrogenase and a variety of uncharacterized Methanococcus proteins. Key words: SNF4, AMPK gamma subunit, derepression, gene family, phylogeny.


2005 ◽  
Vol 32 (5) ◽  
pp. 467 ◽  
Author(s):  
Hans H. Gehrig ◽  
Joshua A. Wood ◽  
Mary Ann Cushman ◽  
Aurelio Virgo ◽  
John C. Cushman ◽  
...  

Clones coding for a 1100-bp cDNA sequence of phosphoenolpyruvate carboxylase (PEPC) of the constitutive crassulacean acid metabolism (CAM) plant Kalanchoe pinnata (Lam.) Pers. were isolated by reverse transcription-polymerase chain reaction (RT–PCR) and characterised by restriction fragment length polymorphism analysis and DNA sequencing. Seven distinct PEPC isogenes were recovered, four in leaves and three in roots (EMBL accession numbers: AJ344052–AJ344058). Sequence similarity comparisons and distance neighbour-joining calculations separate the seven PEPC isoforms into two clades, one of which contains the three PEPCs found in roots. The second clade contains the four isoforms found in leaves and is divided into two branches, one of which contains the two PEPCs most similar to previously described CAM isoforms. Of these two isoforms, however, only one exhibited abundant expression in CAM-performing leaves, but not in very young leaves, which do not exhibit CAM, suggesting this isoform encodes a CAM-specific PEPC. Protein sequence calculations suggest that all isogenes are likely derived from a common ancestor gene, presumably by serial gene duplication events. To our knowledge, this is the most comprehensive identification of a PEPC gene family from a CAM plant, and the greatest number of PEPC isogenes reported for any vascular plant to date.


Agronomy ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. 1855
Author(s):  
Dan Luo ◽  
Ziqi Jia ◽  
Yong Cheng ◽  
Xiling Zou ◽  
Yan Lv

The β-amylase (BAM) gene family, known for its catalytic ability to hydrolyze starch into maltose units, has been recognized to play critical roles in metabolism and gene regulation. To date, BAM genes have not been characterized in oil crops. In this study, a genome-wide survey identified 30 BnaBAM genes in Brassica napus L. (B. napus L.), 11 BraBAM genes in Brassica rapa L. (B. rapa L.), and 20 BoBAM genes in Brassica oleracea L. (B. oleracea L.), which were divided into four subfamilies according to sequence similarity and phylogenetic relationships. All the BAM genes identified in the allotetraploid genome of B. napus, as well as in two parental-related species (B. rapa and B. oleracea), were analyzed for gene structure, chromosomal distribution and collinearity. Sequence alignment of the core glucosyl-hydrolase domains was further applied, indicating six candidate β-amylases (BnaBAM1, BnaBAM3.1-3.4 and BnaBAM5) and 25 β-amylase-like proteins. The current results also showed that 30 BnaBAMs, 11 BraBAMs and 17 BoBAMs exhibited uneven distribution on chromosomes of Brassica L. crops. The similar structural compositions of BAM genes in the same subfamily suggested that they were relatively conserved. Abiotic stresses pose one of the significant constraints to plant growth and productivity worldwide. Thus, the responsiveness of BnaBAM genes under abiotic stresses was analyzed in B. napus. The expression patterns revealed a stress-responsive behaviour of all members, of which BnaBAM3s were more prominent. These differential expression patterns suggested an intricate regulation of BnaBAMs elicited by environmental stimuli. Altogether, the present study provides the first insights into the BAM gene family of Brassica crops, which lays the foundation for investigating the roles of stress-responsive BnaBAM candidates in B. napus.


Genetics ◽  
1993 ◽  
Vol 135 (2) ◽  
pp. 575-588 ◽  
Author(s):  
S M Cocciolone ◽  
K C Cone

Abstract Anthocyanins are purple pigments that can be produced in virtually all parts of the maize plant. The spatial distribution of anthocyanin synthesis is dictated by the organ-specific expression of a few regulatory genes that control the transcription of the structural genes. The regulatory genes are grouped into families based on functional identity and DNA sequence similarity. The C1/Pl gene family consists of C1, which controls pigmentation of the kernel, and Pl, which controls pigmentation of the vegetative and floral organs. We have determined the relationship of another gene, Blotched (Bh), to the C1 gene family. Bh was originally described as a gene that conditions blotches of pigmentation in kernels homozygous for recessive c1, suggesting that Bh could functionally replace C1 in the kernel. Our genetic and molecular analyses indicate that Bh is an allele of Pl, that we designate Pl-Bh. Pl-Bh differs from wild-type Pl alleles in two respects. In contrast to the uniform pigmentation observed in plants carrying Pl, the pattern of pigmentation in plants carrying Pl-Bh is variegated. Pl-Bh leads to variegated pigmentation in virtually all tissues of the plant, including the kernel, an organ not pigmented by other Pl alleles. To address the molecular basis for the unusual pattern of expression of Pl-Bh, we cloned and sequenced the gene. The nucleotide sequence of Pl-Bh showed only a single base-pair difference from that of Pl. However, genomic DNA sequences associated with Pl-Bh were found to be hypermethylated relative to the same sequences around the wild-type Pl allele. The methylation was inversely correlated with Pl mRNA levels in variegated plant tissues. Thus, we conclude that DNA methylation may play a role in regulating Pl-Bh expression.


2020 ◽  
Author(s):  
Yan Lv ◽  
Dan Luo ◽  
Ziqi Jia ◽  
Yong Cheng ◽  
Xiling Zou

Abstract Background: The β-amylase (BAM) gene family, known for its catalytic ability to hydrolyze starch into maltose units, has been recognized to play critical roles in metabolism and gene regulation. To date, BAM genes have not been characterized in oil crops. Results: In this study, a genome-wide survey identified 30 BnaBAM genes in Brassica napus (B. napus), 11 BraBAM genes in Brassica rapa (B. rapa), and 20 BoBAM genes in Brassica oleracea (B. oleracea), which were divided into 4 subfamilies according to sequence similarity and phylogenetic relationships. All the BAM genes identified in the allotetraploid genome of B. napus, as well as in two parental-related species (B. rapa and B. oleracea), were analyzed for gene structure, chromosomal distribution and collinearity; sequence alignment of the core glucosyl hydrolase domains was further applied. 30 BnaBAMs, 11 BraBAMs and 17 BoBAMs exhibited uneven distribution on chromosomes of Brassica crops. The similar structural compositions of BAM genes in the same subfamily suggested that they were relatively conserved. Abiotic stresses pose one of the major constraints to plant growth and productivity worldwide. Thus, the responsiveness of BnaBAM genes under abiotic stresses was analyzed in B. napus. The expression patterns revealed a stress-responsive behavior of all members, of which BnaBAM3s were more prominent. These differential expression patterns suggested an intricate regulation of BnaBAMs elicited by environmental stimuli. Conclusion: Altogether, the present study provides the first insights into the BAM gene family of Brassica crops, which lays the foundation for investigating the roles of stress-responsive BnaBAM candidates in B. napus.


Genetics ◽  
2004 ◽  
Vol 166 (2) ◽  
pp. 947-957 ◽  
Author(s):  
John G Jelesko ◽  
Kristy Carter ◽  
Whitney Thompson ◽  
Yuki Kinoshita ◽  
Wilhelm Gruissem

Abstract Paralogous genes organized as a gene cluster can rapidly evolve by recombination between misaligned paralogs during meiosis, leading to duplications, deletions, and novel chimeric genes. To model unequal recombination within a specific gene cluster, we utilized a synthetic RBCSB gene cluster to isolate recombinant chimeric genes resulting from meiotic recombination between paralogous genes on sister chromatids. Several F1 populations hemizygous for the synthRBCSB1 gene cluster gave rise to Luc+ F2 plants at frequencies ranging from 1 to 3 × 10⁻⁶. A nonuniform distribution of recombination resolution sites resulted in the biased formation of recombinant RBCS3B/1B::LUC genes with nonchimeric exons. The positioning of approximately half of the mapped resolution sites was effectively modeled by the fractional length of identical DNA sequences. In contrast, the other mapped resolution sites fit an alternative model in which recombination resolution was stimulated by an abrupt transition from a region of relatively high sequence similarity to a region of low sequence similarity. Thus, unequal recombination between paralogous RBCSB genes on sister chromatids created an allelic series of novel chimeric genes that effectively resulted in the diversification rather than the homogenization of the synthRBCSB1 gene cluster.


Genome ◽  
2009 ◽  
Vol 52 (7) ◽  
pp. 647-657 ◽  
Author(s):  
P. J. Maughan ◽  
T. B. Turner ◽  
C. E. Coleman ◽  
D. B. Elzinga ◽  
E. N. Jellen ◽  
...  

Salt tolerance is an agronomically important trait that affects plant species around the globe. The Salt Overly Sensitive 1 (SOS1) gene encodes a plasma membrane Na+/H+ antiporter that plays an important role in germination and growth of plants in saline environments. Quinoa (Chenopodium quinoa Willd.) is a halophytic, allotetraploid grain crop of the family Amaranthaceae with impressive nutritional content and an increasing worldwide market. Many quinoa varieties have considerable salt tolerance, and research suggests quinoa may utilize novel mechanisms to confer salt tolerance. Here we report the cloning and characterization of two homoeologous SOS1 loci (cqSOS1A and cqSOS1B) from C. quinoa, including full-length cDNA sequences, genomic sequences, relative expression levels, fluorescent in situ hybridization (FISH) analysis, and a phylogenetic analysis of SOS1 genes from 13 plant taxa. The cqSOS1A and cqSOS1B genes each span 23 exons spread over 3477 bp and 3486 bp of coding sequence, respectively. These sequences share a high level of similarity with SOS1 homologs of other species and contain two conserved domains, a Nhap cation-antiporter domain and a cyclic-nucleotide binding domain. Genomic sequence analysis of two BAC clones (98 357 bp and 132 770 bp) containing the homoeologous SOS1 genes suggests possible conservation of synteny across the C. quinoa sub-genomes. This report represents the first molecular characterization of salt-tolerance genes in a halophytic species in the Amaranthaceae as well as the first comparative analysis of coding and non-coding DNA sequences of the two homoeologous genomes of C. quinoa.


2007 ◽  
pp. 373-390
Author(s):  
Alex Aronov ◽  
Al Pierce ◽  
Guy Bemis ◽  
Marc Jacobs ◽  
Harmon Zuccola ◽  
...  

Genetics ◽  
1992 ◽  
Vol 130 (2) ◽  
pp. 263-271
Author(s):  
C E Paquin ◽  
M Dorsey ◽  
S Crable ◽  
K Sprinkel ◽  
M Sondej ◽  
...  

Abstract A spontaneous antimycin A-resistant mutant carrying approximately four extra copies of ADH2 on chromosome XII was isolated from yeast strain 315-1D which lacks a functional copy of ADH1 and thus is antimycin A-sensitive. The additional copies of the normally glucose-repressed ADH2 are expressed during growth on glucose accounting for the antimycin A resistance. These extra copies are inserted into nonadjacent ribosomal DNA sequences (rDNA) near the recombination stimulating sequence HOT1. Each extra copy of the ADH2 gene (1548 bp) replaces most of the 37S transcript (approximately 7400 bp) in one of the approximately 200 copies of the rDNA present in the yeast genome. All four extra copies of ADH2 are lost at a rate of approximately 1 × 10⁻⁵ deletions per cell per generation. One of the joints between the rDNA and ADH2 DNA is located 7 nucleotides downstream from 20 adenine residues in the normal copy of ADH2. This joint occurs at the end of a stretch of 16-29 thymidines in the rDNA which has been expanded to 57-59 thymidines. The other novel joint is located in a short region of sequence similarity between ADH2 and the rDNA. These observations suggest that amplification of ADH2 was a two step process: first the ADH2 gene was inserted into the rDNA, then multiple copies were generated by unequal crossing over or gene conversion within the rDNA.

