Statistical Evidence for Common Ancestry: Testing for Signal in Silent Sites

2016 ◽  
Author(s):  
Martin Bontrager ◽  
Bret Larget ◽  
Cécile Ané ◽  
David Baum

1. The common ancestry of life is supported by an enormous body of evidence and is universally accepted within the scientific community. However, some potential sources of data that can be used to test the thesis of common ancestry have not yet been formally analyzed. 2. We developed a new test of common ancestry based on nucleotide sequences at amino acid invariant sites in aligned homologous protein coding genes. We reasoned that since nucleotide variation at amino acid invariant sites is selectively neutral and, thus, unlikely to be due to convergent evolution, the observation that an amino acid is consistently encoded by the same codon sequence in different species could provide strong evidence of their common ancestry. Our method uses the observed variation in codon sequences at amino acid invariant sites as a test statistic, and compares such variation to that which is expected under three different models of codon frequency under the alternative hypothesis of separate ancestry. We also examine hierarchical structure in the nucleotide sequences at amino acid invariant sites and quantify agreement between trees generated from amino acid sequence and those inferred from the nucleotide sequences at amino acid invariant sites. 3. When these tests are applied to the primate families as a test case, we find that observed nucleotide variation at amino acid invariant sites is considerably lower than nucleotide variation predicted by any model of codon frequency under separate ancestry. Phylogenetic trees generated from amino-acid invariant site nucleotide data agree with those generated from protein-coding data, and there is far more hierarchical structure in amino-acid invariant site data than would be expected under separate ancestry. 4. We definitively reject the separate ancestry of the primate families, and demonstrate that our tests can be applied to any group of interest to test common ancestry.
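The logic of the test described above can be sketched as a small Monte Carlo comparison. This is only an illustration under stated assumptions, not the authors' implementation: the statistic (number of distinct codons per invariant site, summed over sites), the uniform synonymous-codon model, and the names `SYNONYMOUS`, `variation`, and `separate_ancestry_pvalue` are all hypothetical choices made for the example. The abstract mentions three codon-frequency models without specifying them.

```python
import random

# Synonymous codons for three amino acids (subset of the standard genetic code).
SYNONYMOUS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "K": ["AAA", "AAG"],
}

def variation(columns):
    """Hypothetical test statistic: distinct codons per site, summed over sites."""
    return sum(len(set(col)) for col in columns)

def separate_ancestry_pvalue(amino_acids, columns, n_sim=2000, seed=0):
    """Compare observed codon variation with a separate-ancestry null.

    amino_acids: the invariant amino acid at each site.
    columns: for each site, the codon observed in every species.
    Null model (an assumption): each species draws its codon independently
    and uniformly from the synonymous codons of that amino acid.
    """
    rng = random.Random(seed)
    n_species = len(columns[0])
    observed = variation(columns)
    hits = 0
    for _ in range(n_sim):
        simulated = [
            [rng.choice(SYNONYMOUS[aa]) for _ in range(n_species)]
            for aa in amino_acids
        ]
        # Count simulations at least as uniform as the observed data.
        if variation(simulated) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_sim + 1)
```

A low p-value here means the observed codons are far more uniform across species than independent draws would produce, which is the pattern the abstract reports for the primate families.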

1980 ◽  
Vol 187 (1) ◽  
pp. 65-74 ◽  
Author(s):  
D Penny ◽  
M D Hendy ◽  
L R Foulds

We have recently reported a method to identify the shortest possible phylogenetic tree for a set of protein sequences [Foulds, Hendy & Penny (1979) J. Mol. Evol. 13, 127–150; Foulds, Penny & Hendy (1979) J. Mol. Evol. 13, 151–166]. The present paper discusses issues that arise during the construction of minimal phylogenetic trees from protein-sequence data. The conversion of the data from amino acid sequences into nucleotide sequences is shown to be advantageous. A new variation of a method for constructing a minimal tree is presented. Our previous methods have involved first constructing a tree and then either proving that it is minimal or transforming it into a minimal tree. The approach presented in the present paper progressively builds up a tree, taxon by taxon. We illustrate this approach by using it to construct a minimal tree for ten mammalian haemoglobin alpha-chain sequences. Finally we define a measure of the complexity of the data and illustrate a method to derive a directed phylogenetic tree from the minimal tree.
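To make "shortest possible tree" concrete, the sketch below scores the length of a given tree on nucleotide data with Fitch's small-parsimony algorithm. This is a swapped-in, standard technique used for illustration, not the taxon-by-taxon construction method the abstract describes. The nested-tuple tree encoding and the names `fitch_length` and `tree_length` are assumptions made for the example.

```python
def fitch_length(tree, states):
    """Minimum number of changes for one site on a rooted binary tree.

    tree: nested 2-tuples whose leaves are taxon names (strings).
    states: mapping from taxon name to the nucleotide at this site.
    """
    def visit(node):
        if isinstance(node, str):          # leaf: its state set, zero cost
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:                         # children agree: no extra change
            return common, left_cost + right_cost
        # children disagree: take the union and count one change
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]

def tree_length(tree, alignment):
    """Sum the Fitch length over every column of an alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_length(tree, {taxon: seq[i] for taxon, seq in alignment.items()})
        for i in range(n_sites)
    )
```

A minimal tree in the sense of the abstract is one whose length, computed this way, cannot be reduced by any other topology. Comparing `tree_length` across candidate topologies shows which grouping requires fewer changes.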


2017 ◽  
Author(s):  
Jeremy M. Beaulieu ◽  
Brian C. O’Meara ◽  
Russell Zaretzki ◽  
Cedric Landerer ◽  
Juanjuan Chai ◽  
...  

We present a new phylogenetic approach, SelAC (Selection on Amino acids and Codons), whose substitution rates are based on a nested model linking protein expression to population genetics. Unlike simpler codon models, which assume a single substitution matrix for all sites, our model more realistically represents the evolution of protein-coding DNA under the assumption of consistent, stabilizing selection using a cost-benefit approach. This cost-benefit approach allows us to generate a set of 20 optimal amino acid specific matrix families using just a handful of parameters and naturally links the strength of stabilizing selection to protein synthesis levels, which we can estimate. Using a yeast dataset of 100 orthologs for 6 taxa, we find SelAC fits the data much better than popular models by 10^4–10^5 AICc units. Our results indicate there is great potential for more accurate inference of phylogenetic trees and branch lengths from already existing data through the use of nested, mechanistic models. Additional parameters estimated by SelAC indicate that a large amount of non-phylogenetic, but biologically meaningful, information can be inferred from existing data. For example, SelAC prediction of gene specific protein synthesis rates correlates well with both empirical (r=0.33−0.48) and other theoretical predictions (r=0.45−0.64) for multiple yeast species. SelAC also provides estimates of the optimal amino acid at each site. Finally, because SelAC is a nested approach based on clearly stated biological assumptions, future modifications, such as including shifts in the optimal amino acid sequence within or across lineages, are possible.
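The model comparison above is reported in AICc units. As a reminder of what that quantity is, here is a minimal sketch of the standard small-sample corrected Akaike information criterion. The function name `aicc` and its argument names are illustrative and not taken from the SelAC software.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected AIC; lower values indicate a better fit.

    log_likelihood: maximized log-likelihood of the model.
    n_params: number of free parameters (k).
    n_obs: sample size (n); requires n_obs > n_params + 1.
    """
    if n_obs <= n_params + 1:
        raise ValueError("AICc is undefined unless n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic + correction
```

Two models fitted to the same data are compared by the difference in their `aicc` values. What counts as the sample size for a sequence alignment is a modelling choice that the abstract does not state.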


Insects ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 668
Author(s):  
Tinghao Yu ◽  
Yalin Zhang

More studies are using mitochondrial genomes of insects to explore the sequence variability, evolutionary traits, monophyly of groups and phylogenetic relationships. Controversies remain on the classification of the Mileewinae and the phylogenetic relationships between Mileewinae and other subfamilies remain ambiguous. In this study, we present two newly completed mitogenomes of Mileewinae (Mileewa rufivena Cai and Kuoh 1997 and Ujna puerana Yang and Meng 2010) and conduct comparative mitogenomic analyses based on several different factors. These species have quite similar features, including their nucleotide content, codon usage of protein genes and the secondary structure of tRNA. Gene arrangement is identical and conserved, the same as the putative ancestral pattern of insects. All protein-coding genes of U. puerana began with the start codon ATN, while 5 Mileewa species had the abnormal initiation codon TTG in ND5 and ATP8. Moreover, M. rufivena had an intergenic spacer of 17 bp that could not be found in other mileewine species. Phylogenetic analysis based on three datasets (PCG123, PCG12 and AA) with two methods (maximum likelihood and Bayesian inference) recovered the Mileewinae as a monophyletic group with strong support values. All results in our study indicate that Mileewinae has a closer phylogenetic relationship to Typhlocybinae compared to Cicadellinae. Additionally, six species within Mileewini revealed the relationship (U. puerana + (M. ponta + (M. rufivena + M. alara) + (M. albovittata + M. margheritae))) in most of our phylogenetic trees. These results contribute to the study of the taxonomic status and phylogenetic relationships of Mileewinae.


Genetics ◽  
2000 ◽  
Vol 155 (1) ◽  
pp. 431-449 ◽  
Author(s):  
Ziheng Yang ◽  
Rasmus Nielsen ◽  
Nick Goldman ◽  
Anne-Mette Krabbe Pedersen

Comparison of relative fixation rates of synonymous (silent) and nonsynonymous (amino acid-altering) mutations provides a means for understanding the mechanisms of molecular sequence evolution. The nonsynonymous/synonymous rate ratio (ω = dN/dS) is an important indicator of selective pressure at the protein level, with ω = 1 meaning neutral mutations, ω < 1 purifying selection, and ω > 1 diversifying positive selection. Amino acid sites in a protein are expected to be under different selective pressures and have different underlying ω ratios. We develop models that account for heterogeneous ω ratios among amino acid sites and apply them to phylogenetic analyses of protein-coding DNA sequences. These models are useful for testing for adaptive molecular evolution and identifying amino acid sites under diversifying selection. Ten data sets of genes from nuclear, mitochondrial, and viral genomes are analyzed to estimate the distributions of ω among sites. In all data sets analyzed, the selective pressure indicated by the ω ratio is found to be highly heterogeneous among sites. Previously unsuspected Darwinian selection is detected in several genes in which the average ω ratio across sites is <1, but in which some sites are clearly under diversifying selection with ω > 1. Genes undergoing positive selection include the β-globin gene from vertebrates, mitochondrial protein-coding genes from hominoids, the hemagglutinin (HA) gene from human influenza virus A, and HIV-1 env, vif, and pol genes. Tests for the presence of positively selected sites and their subsequent identification appear quite robust to the specific distributional form assumed for ω and can be achieved using any of several models we implement. However, we encountered difficulties in estimating the precise distribution of ω among sites from real data sets.
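The distinction between synonymous and nonsynonymous changes, and the reading of ω given above, can be illustrated with a short sketch. It only counts raw codon differences between two aligned coding sequences under the standard genetic code. It does not estimate dN/dS, which requires normalizing by the number of synonymous and nonsynonymous sites and a substitution model such as those the abstract describes. The names `count_differences` and `classify_omega` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def count_differences(seq1, seq2):
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    synonymous = nonsynonymous = 0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if c1 == c2:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            synonymous += 1        # codon changed, amino acid unchanged
        else:
            nonsynonymous += 1     # amino acid replaced
    return synonymous, nonsynonymous

def classify_omega(omega, tolerance=1e-9):
    """Interpret an omega value using the thresholds stated in the abstract."""
    if abs(omega - 1.0) <= tolerance:
        return "neutral"
    if omega < 1.0:
        return "purifying selection"
    return "diversifying positive selection"
```

Because ω varies among sites, a gene-wide average below 1 can hide individual sites with ω > 1, which is the situation the site-heterogeneous models are designed to detect.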


2001 ◽  
Vol 45 (9) ◽  
pp. 2559-2562 ◽  
Author(s):  
Rui Kano ◽  
Ken Okabayashi ◽  
Yuka Nakamura ◽  
Shinichi Watanabe ◽  
Atsuhiko Hasegawa

The expression of the ubiquitin (Ub) gene in dermatophytes was examined for its relation to resistance against the antifungal drug fluconazole. The nucleotide sequences and the deduced amino acid sequences of the Ub gene in Microsporum canis were proven to be 99% similar to those of the Ub gene in Trichophyton mentagrophytes. Expression of mRNA of Ub in M. canis and T. mentagrophytes was enhanced when the fungi were cultured with fluconazole. The antifungal activity of fluconazole against these dermatophytes was increased in the presence of Ub proteasome inhibitor.


Insects ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 453
Author(s):  
Zi-Yi Zhang ◽  
Jia-Yin Guan ◽  
Yu-Rou Cao ◽  
Xin-Yi Dai ◽  
Kenneth B. Storey ◽  
...  

We determined the mitochondrial gene sequence of Monochamus alternatus and three other mitogenomes of Lamiinae (Insecta: Coleoptera: Cerambycidae) belonging to three genera (Aulaconotus, Apriona and Paraglenea) to enrich the mitochondrial genome database of Lamiinae and further explore the phylogenetic relationships within the subfamily. Phylogenetic trees of the Lamiinae were built using the Bayesian inference (BI) and maximum likelihood (ML) methods and the monophyly of the genera Monochamus, Anoplophora, and Batocera was supported. Anoplophora chinensis, An. glabripennis and Aristobia reticulator were closely related, suggesting they may also be potential vectors for the transmission of the pine wood pathogenic nematode (Bursaphelenchus xylophilus) in addition to M. alternatus, a well-known vector of pine wilt disease. There is a special symbiotic relationship between M. alternatus and Bursaphelenchus xylophilus. As the native sympatric sibling species of B. xylophilus, B. mucronatus also has a specific relationship with M. alternatus that is often overlooked. The analysis of mitochondrial gene expression aimed to explore the effect of B. mucronatus on the energy metabolism of the respiratory chain of M. alternatus adults. Using RT-qPCR, we determined and analyzed the expression of eight mitochondrial protein-coding genes (COI, COII, COIII, ND1, ND4, ND5, ATP6, and Cyt b) between M. alternatus infected by B. mucronatus and M. alternatus without the nematode. Expression of all eight mitochondrial genes was up-regulated, particularly the ND4 and ND5 genes, which were up-regulated by 4–5-fold (p < 0.01). Since longicorn beetles have immune responses to nematodes, we believe that their relationship should not be viewed as symbiotic, but classed as parasitic.


Virology ◽  
1993 ◽  
Vol 193 (1) ◽  
pp. 66-72 ◽  
Author(s):  
Mohinderjit S. Sidhu ◽  
Walter Husar ◽  
Stuart D. Cook ◽  
Peter C. Dowling ◽  
Stephen A. Udem

Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 936
Author(s):  
Dan Wang

In this paper, a ratio test based on bootstrap approximation is proposed to detect the persistence change in heavy-tailed observations. This paper focuses on the symmetry testing problems of I(1)-to-I(0) and I(0)-to-I(1). On the basis of residual CUSUM, the test statistic is constructed in a ratio form. I derive the null distribution of the test statistic. The consistency under the alternative hypothesis is also discussed. However, the null distribution of the test statistic contains an unknown tail index. To address this challenge, I present a bootstrap approximation method for determining the rejection region of this test. Simulation studies of artificial data are conducted to assess the finite sample performance, and they show that the proposed method is better than the kernel method in all listed cases. The analysis of real data also demonstrates the excellent performance of this method.
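The abstract does not give the statistic's formula, so the following is only a sketch of one common ratio-type construction from residual CUSUMs (in the spirit of Kim-type persistence-change tests), not the author's exact statistic. The split fraction `tau`, the demeaning step, and the name `cusum_ratio` are assumptions made for the example. The bootstrap step for the rejection region is omitted.

```python
from itertools import accumulate

def cusum_ratio(series, tau=0.5):
    """Ratio of scaled squared residual CUSUMs after and before a split point.

    series: sequence of observations.
    tau: assumed break fraction in (0, 1); the series is split at int(n * tau).
    Large values suggest the second segment is more persistent than the first
    (I(0)-to-I(1)); values near zero suggest the reverse.
    """
    k = int(len(series) * tau)
    first, second = series[:k], series[k:]

    def scaled_cusum_sq(block):
        mean = sum(block) / len(block)
        # Partial sums of the demeaned block (the residual CUSUM process).
        partial_sums = accumulate(x - mean for x in block)
        return sum(s * s for s in partial_sums) / len(block) ** 2

    return scaled_cusum_sq(second) / scaled_cusum_sq(first)
```

In practice the statistic would be evaluated over a range of candidate `tau` values, with critical values taken from a bootstrap because, as the abstract notes, the null distribution depends on an unknown tail index.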


Genetics ◽  
1997 ◽  
Vol 145 (2) ◽  
pp. 311-323 ◽  
Author(s):  
Brent Richter ◽  
Manyuan Long ◽  
R C Lewontin ◽  
Eiji Nitasaka

A study of polymorphism and species divergence of the dpp gene of Drosophila has been made. Eighteen lines from a population of D. melanogaster were sequenced for 5200 bp of the Hin region of the gene, coding for the dpp polypeptide. A comparison was made with sequence from D. simulans. Ninety-six silent polymorphisms and three amino acid replacement polymorphisms were found. The overall silent polymorphism (0.0247) is low, but haplotype diversity (0.0066 for effectively silent sites and 0.0054 for all sites) is in the range found for enzyme loci. Amino acid variation is absent in the N-terminal signal peptide, the C-terminal TGF-β peptide and in the N-terminal half of the pro-protein region. At the nucleotide level there is strong conservation in the middle half of the large intron and in the 3′ untranslated sequence of the last exon. The 3′ untranslated conservation, which is perfect for 110 bp among all the divergent species, is unexplained. There is strong positive linkage disequilibrium among polymorphic sites, with stretches of apparent gene conversion among originally divergent sequences. The population apparently is a migration mixture of divergent clades.

