Variation across Species and Levels: Implications for Model Species Research

Georg F. Striedter

doi:10.1159/000499664

Brain Behavior and Evolution ◽

10.1159/000499664 ◽

2019 ◽

Vol 93 (2-3) ◽

pp. 57-69 ◽

Cited By ~ 4

Author(s):

Georg F. Striedter

Keyword(s):

Species Differences ◽

Divergence Time ◽

Orthologous Gene ◽

Phylogenetic Distance ◽

Biological Organization ◽

Model Species ◽

Protein Coding ◽

Homologous Genes ◽

Generation Times ◽

High Level

The selection of model species tends to involve two typically unstated assumptions, namely: (1) that the similarity between species decreases steadily with phylogenetic distance, and (2) that similarities are greater at lower levels of biological organization. The first assumption holds on average, but species similarities tend to decrease with the square root of divergence time, rather than linearly, and lineages with short generation times (which include most model species) tend to diverge faster than average, making the decrease in similarity non-monotonic. The second assumption is more difficult to test. Comparative molecular research has traditionally emphasized species similarities over differences, whereas comparative research at higher levels of organization frequently highlights species differences. However, advances in comparative genomics have brought to light a great variety of species differences, not just in gene regulation but also in protein-coding genes. Particularly relevant are cases in which homologous high-level characters are based on non-homologous genes. This phenomenon of non-orthologous gene displacement, or “deep non-homology,” indicates that species differences at the molecular level can be surprisingly large. Given these observations, it is not surprising that some findings obtained in model species do not generalize across species as well as researchers had hoped, even if the research is molecular.
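
As a rough illustration of the first point, the sketch below contrasts a linear decline in similarity with a square-root decline. The function names, the rate constants, and the 0-to-1 similarity scale are assumptions chosen for illustration; none of them come from the paper.

```python
import math

def similarity_linear(t_myr, rate=0.002):
    """Hypothetical linear model: similarity falls in proportion to divergence time."""
    return max(0.0, 1.0 - rate * t_myr)

def similarity_sqrt(t_myr, rate=0.04):
    """Hypothetical square-root model: similarity falls with sqrt(divergence time)."""
    return max(0.0, 1.0 - rate * math.sqrt(t_myr))

if __name__ == "__main__":
    # Divergence times in millions of years (illustrative values only).
    for t in (25, 100, 400):
        print(t, round(similarity_linear(t), 3), round(similarity_sqrt(t), 3))
```

Under the square-root model, quadrupling the divergence time only doubles the loss in similarity, so a large share of the divergence accumulates early and distant species differ less than a linear extrapolation would suggest.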

Discovering the Hidden Secondary Metabolome of Myxococcus xanthus: a Study of Intraspecific Diversity

Applied and Environmental Microbiology ◽

10.1128/aem.02863-07 ◽

2008 ◽

Vol 74 (10) ◽

pp. 3058-3068 ◽

Cited By ~ 105

Author(s):

Daniel Krug ◽

Gabriela Zurek ◽

Ole Revermann ◽

Michiel Vos ◽

Gregory J. Velicer ◽

...

Keyword(s):

Secondary Metabolites ◽

Myxococcus Xanthus ◽

Single Species ◽

Intraspecific Diversity ◽

Metabolic Diversity ◽

Model Species ◽

Metabolite Profiles ◽

High Level ◽

New Compounds ◽

Promising Source

As a monophyletic group, the myxobacteria are known to produce a broad spectrum of secondary metabolites. However, the degree of metabolic diversity that can be found within a single species remains unexplored. The model species Myxococcus xanthus produces several metabolites also present in other myxobacterial species, but only one compound unique to M. xanthus has been found to date. Here, we compare the metabolite profiles of 98 M. xanthus strains that originate from 78 locations worldwide and include 20 centimeter-scale isolates from one location. This screen reveals a strikingly high level of intraspecific diversity in the M. xanthus secondary metabolome. The identification of 37 nonubiquitous candidate compounds greatly exceeds the small number of secondary metabolites previously known to derive from this species. These results suggest that M. xanthus may be a promising source of future natural products and that thorough intraspecific screens of other species could reveal many new compounds of interest.

EnTAP: Bringing Faster and Smarter Functional Annotation to Non-Model Eukaryotic Transcriptomes

10.1101/307868 ◽

2018 ◽

Cited By ~ 5

Author(s):

Alexander J. Hart ◽

Samuel Ginzburg ◽

Muyang (Sam) Xu ◽

Cera R. Fisher ◽

Nasim Rahmatpour ◽

...

Keyword(s):

Similarity Search ◽

De Novo ◽

Gene Annotation ◽

Enrichment Analysis ◽

Orthologous Gene ◽

Protein Domain ◽

Family Assessment ◽

Ontology Term ◽

Protein Coding ◽

Functional Gene Annotation

EnTAP (Eukaryotic Non-Model Transcriptome Annotation Pipeline) was designed to improve the accuracy, speed, and flexibility of functional gene annotation for de novo assembled transcriptomes in non-model eukaryotes. This software package addresses the fragmentation and related assembly issues that result in inflated transcript estimates and poor annotation rates, while focusing primarily on protein-coding transcripts. Following filters applied through assessment of true expression and frame selection, open-source tools are leveraged to functionally annotate the translated proteins. Downstream features include fast similarity search across three repositories, protein domain assignment, orthologous gene family assessment, and Gene Ontology term assignment. The final annotation integrates across multiple databases and selects an optimal assignment from a combination of weighted metrics describing similarity search score, taxonomic relationship, and informativeness. Researchers have the option to include additional filters to identify and remove contaminants, identify associated pathways, and prepare the transcripts for enrichment analysis. This fully featured pipeline is easy to install and configure, and it runs significantly faster than comparable annotation packages. EnTAP is optimized to generate extensive functional information for the gene space of organisms with limited or poorly characterized genomic resources.

Gene recruitments and dismissals in argonaut octopus genome provide insights to pelagic lifestyle adaptation and shell-like eggcase reacquisition

10.1101/2021.11.08.467834 ◽

2021 ◽

Author(s):

Masa-aki Yoshida ◽

Kazuki Hirota ◽

Junichi Imoto ◽

Miki Okuno ◽

Hiroyuki Tanaka ◽

...

Keyword(s):

Genome Sequence ◽

Average Length ◽

Draft Genome ◽

Gene Clusters ◽

Draft Genome Sequence ◽

Shell Formation ◽

Protein Coding ◽

Homologous Genes ◽

Phenotypic Structure ◽

Comparative Genomics Analysis

The paper nautilus, Argonauta argo, also known as the greater argonaut, is an octopod species distinguished by its pelagic lifestyle and by the spiral, shell-like eggcase produced by females. The eggcase protects the eggs laid inside it, and it takes in and retains air for buoyancy. To reveal the genomic background of the species' adaptation to a pelagic lifestyle and the acquisition of its shell-like eggcase, we sequenced a draft genome of the species. The genome size was 1.1 Gb, the smallest among cephalopods known to date, with the top 215 scaffolds (average length 5,064,479 bp) covering 81% (1.09 Gb) of the total assembly. A total of 26,433 protein-coding genes were predicted from 16,802 assembled scaffolds. From these, we identified nearly intact HOX, Parahox, and Wnt clusters and some gene clusters probably related to the pelagic lifestyle, such as reflectin, tyrosinase, and opsin. For example, opsin might have undergone extensive duplication as an adaptation to the pelagic lifestyle, in contrast to other octopuses, which are mostly benthic. Our gene models also included several genes homologous to those involved in calcified shell formation in conchiferan mollusks, such as Pif-like, SOD, and TRX. Interestingly, comparative genomic analysis revealed that homologs of these genes are also present in the genome of the octopus, which does not have a shell, as well as in the basal cephalopod Nautilus. The draft genome sequence of A. argo presented here therefore provides further insight not only into the genetic background of the dynamic recruitment and dismissal of genes in the formation of an important, convergent extended phenotypic structure such as the shell and the shell-like eggcase, but also into the evolution of lifestyles in cephalopods and octopods, from benthic to pelagic.

Term Matrix: a novel Gene Ontology annotation quality control system based on ontology term co-annotation patterns

Open Biology ◽

10.1098/rsob.200149 ◽

2020 ◽

Vol 10 (9) ◽

pp. 200149 ◽

Cited By ~ 1

Author(s):

Valerie Wood ◽

Seth Carbon ◽

Midori A. Harris ◽

Antonia Lock ◽

Stacia R. Engel ◽

...

Keyword(s):

Quality Control ◽

Gene Ontology ◽

Biological Processes ◽

Gene Products ◽

Model Species ◽

Ontology Term ◽

Protein Coding ◽

Ontology Structure ◽

Exclusive Processes ◽

Annotation Quality

Biological processes are accomplished by the coordinated action of gene products. Gene products often participate in multiple processes, and can therefore be annotated to multiple Gene Ontology (GO) terms. Nevertheless, processes that are functionally, temporally and/or spatially distant may have few gene products in common, and co-annotation to unrelated processes probably reflects errors in literature curation, ontology structure or automated annotation pipelines. We have developed an annotation quality control workflow that uses rules based on mutually exclusive processes to detect annotation errors, based on and validated by case studies including the three we present here: fission yeast protein-coding gene annotations over time; annotations for cohesin complex subunits in human and model species; and annotations using a selected set of GO biological process terms in human and five model species. For each case study, we reviewed available GO annotations, identified pairs of biological processes which are unlikely to be correctly co-annotated to the same gene products (e.g. amino acid metabolism and cytokinesis), and traced erroneous annotations to their sources. To date we have generated 107 quality control rules, and corrected 289 manual annotations in eukaryotes and over 52 700 automatically propagated annotations across all taxa.
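
The rule-based check described above can be sketched in a few lines. This is a minimal illustration and not the Term Matrix implementation: the rule set, the gene names, and the function name `find_suspect_coannotations` are hypothetical, and the sketch compares term labels directly without walking the ontology hierarchy, which a real workflow would need to do.

```python
from itertools import combinations

# Hypothetical rule set: unordered pairs of processes treated as mutually exclusive.
EXCLUSION_RULES = {
    frozenset({"amino acid metabolism", "cytokinesis"}),
}

def find_suspect_coannotations(annotations, rules=EXCLUSION_RULES):
    """Return (gene, term_a, term_b) for every co-annotation that violates a rule."""
    suspects = []
    for gene, terms in sorted(annotations.items()):
        # Check every unordered pair of terms annotated to this gene product.
        for a, b in combinations(sorted(terms), 2):
            if frozenset({a, b}) in rules:
                suspects.append((gene, a, b))
    return suspects

if __name__ == "__main__":
    toy = {
        "geneA": {"amino acid metabolism", "cytokinesis"},
        "geneB": {"cytokinesis", "mitotic spindle assembly"},
    }
    print(find_suspect_coannotations(toy))
```

A flagged pair is only a candidate for review; as the abstract notes, each one still has to be traced to its source before it is corrected.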

Complete Mitogenomes of Three Carangidae (Perciformes) Fishes: Genome Description and Phylogenetic Considerations

International Journal of Molecular Sciences ◽

10.3390/ijms21134685 ◽

2020 ◽

Vol 21 (13) ◽

pp. 4685

Author(s):

Zhenhai Li ◽

Min Li ◽

Shannan Xu ◽

Li Liu ◽

Zuozhi Chen ◽

...

Keyword(s):

Divergence Time ◽

rRNA Genes ◽

12S rRNA ◽

tRNA Genes ◽

Coding Region ◽

Protein Coding ◽

Protein Coding Genes ◽

GC Skew ◽

The Family ◽

Conserved Genes

Carangidae are ecologically and economically important marine fish. The complete mitogenomes of three Carangidae species (Alectis indicus, Decapterus tabl, and Alepes djedaba) were sequenced, characterized, and compared with 29 other species of the family Carangidae in this study. The length of the three mitogenomes ranged from 16,530 to 16,610 bp, and the structures included 2 rRNA genes (12S rRNA and 16S rRNA), 1 control region (a non-coding region), 13 protein-coding genes, and 22 tRNA genes. Among the 22 tRNA genes, only tRNA-Ser (GCT) was not folded into a typical cloverleaf secondary structure and had no recognizable DHU stem. The full-length sequences and protein-coding genes (PCGs) of the mitogenomes of the three species all had obvious AT biases. The majority of the AT-skew and GC-skew values of the PCGs among the three species were negative, indicating that bases T and C were more abundant than A and G. Analyses of Ka/Ks and overall p-genetic distance showed that ATP8 had the highest evolutionary rate and that COXI/COXII were the most conserved genes in the three species. The phylogenetic tree based on the PCG sequences of the mitogenomes, inferred using maximum likelihood and Bayesian inference, recovered three clades corresponding to the subfamilies Caranginae, Naucratinae, and Trachinotinae. The monophyly of each subfamily was generally well supported. The divergence time analyses showed that Carangidae evolved during three geological periods, the Cretaceous, Paleogene, and Neogene. A. indicus began to differentiate from other species about 27.20 million years ago (Mya) in the early Miocene, while D. tabl (21.25 Mya) and A. djedaba (14.67 Mya) differentiated in the middle Oligocene.
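
To make the skew statistics concrete, the sketch below computes AT-skew and GC-skew with the conventional definitions (A − T)/(A + T) and (G − C)/(G + C). The abstract does not state the formula, so the use of these definitions is an assumption, and the function name `base_skews` and the toy sequence are hypothetical.

```python
def base_skews(sequence):
    """Return (AT-skew, GC-skew) for a DNA sequence using the conventional ratios."""
    seq = sequence.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    # Guard against division by zero when a base pair class is absent.
    at_skew = (a - t) / (a + t) if a + t else 0.0
    gc_skew = (g - c) / (g + c) if g + c else 0.0
    return at_skew, gc_skew

if __name__ == "__main__":
    # Toy sequence, not taken from any of the mitogenomes.
    print(base_skews("AATTTTGCCC"))
```

A negative AT-skew means T outnumbers A on the strand analyzed, and a negative GC-skew means C outnumbers G, which is the pattern the abstract reports for most PCGs.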

LINC01106 drives colorectal cancer growth and stemness through a positive feedback loop to regulate the Gli family factors

Cell Death and Disease ◽

10.1038/s41419-020-03026-3 ◽

2020 ◽

Vol 11 (10) ◽

Author(s):

Kun Guo ◽

Wenbin Gong ◽

Qin Wang ◽

Guosheng Gu ◽

Tao Zheng ◽

...

Keyword(s):

Feedback Loop ◽

Colon Adenocarcinoma ◽

Family Factors ◽

Transcription Activator ◽

Normal Colon ◽

Protein Coding ◽

Non-Coding RNAs ◽

High Level ◽

UCSC Database

Long non-coding RNAs (lncRNAs) are essential contributors to the progression of various human cancers. Long intergenic non-protein coding RNA 1106 (LINC01106) is a member of the lncRNA family. Until now, the specific role of LINC01106 in colorectal cancer (CRC) has remained undefined. The aim of the current study was to unveil the functions of LINC01106 and explore its potential molecular mechanism in CRC. Based on data from the online database GEPIA, we determined that LINC01106 was expressed at a high level in colon adenocarcinoma (COAD) tissues compared with normal colon tissues. More importantly, a high level of LINC01106 was negatively correlated with the overall survival of COAD patients. Additionally, we confirmed the low level of LINC01106 in normal colon tissues based on the UCSC database. Through qRT-PCR, we found that LINC01106 was highly expressed in CRC tissues compared with adjacent normal tissues. Similarly, LINC01106 was expressed at a higher level in CRC cells than in normal cells. LINC01106 was mainly distributed in the cytoplasm. LINC01106 induced the proliferation, migration, and stem-like phenotype of CRC cells. Mechanistically, cytoplasmic LINC01106 positively modulated Gli4 in CRC cells by serving as a miR-449b-5p sponge. Furthermore, nuclear LINC01106 could activate the transcription of Gli1 and Gli2 by recruiting FUS to the Gli1 and Gli2 promoters. Mechanistic investigation revealed that Gli2 was a transcription activator of LINC01106. In conclusion, Gli2-induced upregulation of LINC01106 aggravates CRC progression by upregulating Gli1, Gli2, and Gli4.

lncRNA_Mdeep: An Alignment-Free Predictor for Distinguishing Long Non-Coding RNAs from Protein-Coding Transcripts by Multimodal Deep Learning

International Journal of Molecular Sciences ◽

10.3390/ijms21155222 ◽

2020 ◽

Vol 21 (15) ◽

pp. 5222 ◽

Cited By ~ 1

Author(s):

Xiao-Nan Fan ◽

Shao-Wu Zhang ◽

Song-Yao Zhang ◽

Jin-Jie Ni

Keyword(s):

Deep Learning ◽

Prediction Accuracy ◽

Protein Coding ◽

Learning Framework ◽

Alignment Free ◽

Independent Test ◽

Non-Coding RNAs ◽

High Level ◽

Fold Cross Validation ◽

Human Complex

Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and complex human diseases. Distinguishing lncRNAs from protein-coding transcripts is a fundamental step in analyzing lncRNA functional mechanisms. However, the experimental identification of lncRNAs is expensive and time-consuming. In this study, we present an alignment-free multimodal deep learning framework, lncRNA_Mdeep, to distinguish lncRNAs from protein-coding transcripts. lncRNA_Mdeep incorporates three different input modalities, on which a multimodal deep learning framework is built to learn high-level abstract representations and to predict the probability that a transcript is an lncRNA. lncRNA_Mdeep achieved 98.73% prediction accuracy in a 10-fold cross-validation test on human data. In an independent test on human data, lncRNA_Mdeep achieved 93.12% prediction accuracy, which was 0.94% to 15.41% higher than that of eight other state-of-the-art methods. In addition, the results on 11 cross-species datasets showed that lncRNA_Mdeep is a powerful predictor of lncRNAs.
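
For readers unfamiliar with the evaluation scheme, the sketch below shows how a 10-fold cross-validation split partitions a dataset. It is a generic illustration and not the authors' code: the function name `k_fold_indices` is hypothetical, and the sketch splits indices in order, whereas in practice the data would usually be shuffled or stratified by class first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

if __name__ == "__main__":
    for train, test in k_fold_indices(25, k=10):
        print(len(train), test)
```

The reported cross-validation accuracy would then be an aggregate over the k held-out folds, while the independent test uses data that never enters any training fold.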

The genome of Hyperthermus butylicus: a sulfur-reducing, peptide fermenting, neutrophilic Crenarchaeote growing up to 108 °C

Archaea ◽

10.1155/2007/745987 ◽

2007 ◽

Vol 2 (2) ◽

pp. 127-135 ◽

Cited By ~ 30

Author(s):

Kim Brügger ◽

Lanming Chen ◽

Markus Stark ◽

Arne Zibat ◽

Peter Redder ◽

...

Keyword(s):

Genome Structure ◽

tRNA Genes ◽

Growing Up ◽

Circular Chromosome ◽

Protein Coding ◽

Electron Transfer Proteins ◽

Good Sequence ◽

High Level ◽

Large Clusters ◽

Single Circular Chromosome

Hyperthermus butylicus, a hyperthermophilic neutrophile and anaerobe, is a member of the archaeal kingdom Crenarchaeota. Its genome consists of a single circular chromosome of 1,667,163 bp with a 53.7% G+C content. A total of 1672 genes were annotated, of which 1602 are protein-coding, and up to a third are specific to H. butylicus. In contrast to some other crenarchaeal genomes, a high level of GUG and UUG start codons is predicted. Two cdc6 genes are present, but neither could be linked unambiguously to an origin of replication. Many of the predicted metabolic gene products are associated with the fermentation of peptide mixtures, including several peptidases with diverse specificities, and there are many encoded transporters. Most of the sulfur-reducing enzymes, hydrogenases and electron-transfer proteins associated with energy production by reducing sulfur to H2S were identified. Two large clusters of regularly interspaced repeats (CRISPRs) are present, one of which is associated with a crenarchaeal-type cas gene superoperon; none of the spacer sequences yielded good sequence matches with known archaeal chromosomal elements. The genome carries no detectable transposable or integrated elements and no inteins, and introns are exclusive to tRNA genes. This suggests that the genome structure is quite stable, possibly reflecting a constant, and relatively uncompetitive, natural environment.

Structure and tissue-specific expression of the Drosophila melanogaster organellar-type Ca2+-ATPase gene

Biochemical Journal ◽

10.1042/bj3100757 ◽

1995 ◽

Vol 310 (3) ◽

pp. 757-763 ◽

Cited By ~ 18

Author(s):

A Magyar ◽

E Bakos ◽

A Váradi

Keyword(s):

Drosophila Melanogaster ◽

Transcription Initiation ◽

Initiation Site ◽

Drosophila Genome ◽

Coding Region ◽

Specific Expression ◽

S1 Nuclease ◽

Protein Coding ◽

ATPase Gene ◽

High Level

A 14 kb genomic clone covering the organellar-type Ca(2+)-ATPase gene of Drosophila melanogaster has been isolated and characterized. The sequence of a 7132 bp region extending from 1.1 kb 5′ upstream of the initiation ATG codon over the polyadenylation signal at the 3′ end has been determined. The gene consists of nine exons including one with an exceptional size of 2172 bp representing 72% of the protein coding region. Introns are relatively small (< 100 bp) except for the 3′ intron which has a size of 2239 bp, an exceptionally large size among Drosophila introns. Five of the introns are in the same positions in Drosophila, Artemia and rabbit SERCA1 Ca(2+)-ATPase genes. There is only one organellar-type Ca(2+)-ATPase gene in the Drosophila genome, as was shown by Southern-blot analysis [Váradi, Gilmore-Hebert and Benz (1989) FEBS Lett. 258, 203-207] and by chromosomal localization [Magyar and Váradi (1990) Biochem. Biophys. Res. Commun. 173, 872-877]. Primer extension and S1-nuclease assays revealed a potential transcription initiation site 876 bp upstream of the translation initiation ATG with a TATA-box 23 bp upstream of this site. Analysis of the 5′ region of the Drosophila organellar-type Ca(2+)-ATPase gene suggests the presence of potential recognition sequences of various muscle-specific transcription factors and shows a region with remarkable similarity to that in the rabbit SERCA2 gene. The tissue distribution of expression of the organellar-type Ca(2+)-ATPase gene has been studied by in situ RNA-RNA hybridization on microscopic sections. A low mRNA abundance can be detected in each tissue of adult flies, suggesting a housekeeping function for the gene. On the other hand a pronounced tissue specificity of expression has also been found as the organellar-type Ca(2+)-ATPase is expressed at a very high level in cell bodies of the central nervous system and in various muscles.

Rhabditophanes diutinus a parthenogenetic clade IV nematode with dauer larvae

PLoS Pathogens ◽

10.1371/journal.ppat.1009113 ◽

2020 ◽

Vol 16 (12) ◽

pp. e1009113

Author(s):

Alex Dulovic ◽

Tess Renahan ◽

Waltraud Röseler ◽

Christian Rödelsperger ◽

Ann M. Rose ◽

...

Keyword(s):

Life Cycle ◽

Expression Patterns ◽

Parasitic Nematodes ◽

Comparative Genomic ◽

Phylogenetic Distance ◽

Life Stages ◽

Model Species ◽

Free Living ◽

Infective Larvae ◽

Dauer Larvae

Comparative studies using non-parasitic model species such as Caenorhabditis elegans have been very helpful in investigating the basic biology and evolution of parasitic nematodes. However, as phylogenetic distance increases, these comparisons become more difficult, particularly outside the nematode clade to which C. elegans belongs (Clade V). One of the reasons C. elegans has nevertheless been used for these comparisons is that closely related, well-characterized free-living species that can serve as models for parasites of interest are frequently not available. The Clade IV parasitic nematodes Strongyloides are of great research interest due to their life cycle and other unique biological features, as well as their medical and veterinary importance. Rhabditophanes, a closely related free-living genus, forms part of the Strongyloidoidea nematode superfamily. Rhabditophanes diutinus (= R. sp. KR3021) was included in the recent comparative genomic analysis of the Strongyloididae, providing some insight into the genomic nature of parasitism. However, very little is known about this species, limiting its usefulness as a research model. Here we provide a species description, name the species R. diutinus, and investigate its life cycle and, subsequently, gene expression in multiple life stages. We identified two previously unreported starvation-induced life stages: dauer larvae and arrested J2 (J2A) larvae. The dauer larvae are morphologically similar to, and are the same developmental stage as, dauers in C. elegans and infective larvae in Strongyloides. As in C. elegans and Strongyloides, dauer formation is inhibited by treatment with dafachronic acid, indicating that some genetic control mechanisms are conserved. Similarly, the expression patterns of putative dauer/infective larva control genes resemble each other, in particular between R. diutinus and Strongyloides spp. These findings illustrate and increase the usefulness of R. diutinus as a non-parasitic, easy-to-work-with model species for the Strongyloididae for studying the evolution of parasitism as well as many aspects of the biology of Strongyloides spp., in particular the formation of infective larvae.
