Characterizing gene tree conflict in plastome-inferred phylogenies

PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7747 ◽  
Author(s):  
Joseph F. Walker ◽  
Nathanael Walker-Hale ◽  
Oscar M. Vargas ◽  
Drew A. Larson ◽  
Gregory W. Stull

Evolutionary relationships among plants have been inferred primarily using chloroplast data. To date, no study has comprehensively examined the plastome for gene tree conflict. Using a broad sampling of angiosperm plastomes, we characterize gene tree conflict among plastid genes at various time scales and explore correlates to conflict (e.g., evolutionary rate, gene length, molecule type). We uncover notable gene tree conflict against a backdrop of largely uninformative genes. We find alignment length and tree length are strong predictors of concordance, and that nucleotides outperform amino acids. Of the most commonly used markers, matK greatly outperforms rbcL; however, the rarely used gene rpoC2 is the top-performing gene in every analysis. We find that rpoC2 reconstructs angiosperm phylogeny as well as the entire concatenated set of protein-coding chloroplast genes. Our results suggest that longer genes are superior for phylogeny reconstruction. The alleviation of some conflict through the use of nucleotides suggests that stochastic and systematic error is likely the root of most of the observed conflict, but further research on biological conflict within the plastome is warranted given documented cases of heteroplasmic recombination. We suggest that researchers should filter genes for topological concordance when performing downstream comparative analyses on phylogenetic data, even when using chloroplast genomes.
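As a rough illustration of what filtering genes for topological concordance could look like, the sketch below scores a gene tree by the fraction of reference-tree clades it recovers. It is a minimal sketch, not the authors' pipeline: the nested-tuple tree encoding, the function names `clades`, `concordance`, and `filter_genes`, the taxon labels, and the 0.6 threshold are all assumptions made for this example, and it ignores branch support and treats trees as rooted.

```python
def clades(tree):
    """Return (all leaves, set of non-trivial clades) for a rooted nested-tuple tree."""
    found = set()

    def walk(node):
        # A leaf is a taxon name; an internal node is a tuple of children.
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = walk(tree)
    found.discard(all_leaves)  # the root clade carries no topological information
    return all_leaves, found


def concordance(gene_tree, reference_tree):
    """Fraction of reference clades that also appear in the gene tree."""
    gene_leaves, gene_clades = clades(gene_tree)
    ref_leaves, ref_clades = clades(reference_tree)
    if gene_leaves != ref_leaves:
        raise ValueError("trees must share the same taxon set")
    if not ref_clades:
        return 1.0  # nothing to disagree with
    return len(gene_clades & ref_clades) / len(ref_clades)


def filter_genes(gene_trees, reference_tree, threshold=0.6):
    """Keep gene names whose tree meets the (hypothetical) concordance threshold."""
    return sorted(
        name for name, tree in gene_trees.items()
        if concordance(tree, reference_tree) >= threshold
    )


# Hypothetical five-taxon example.
reference = ((("A", "B"), "C"), ("D", "E"))
gene_trees = {
    "gene1": ((("A", "B"), "C"), ("D", "E")),   # identical to the reference
    "gene2": ((("A", "C"), "B"), ("D", "E")),   # one conflicting clade
    "gene3": ((("A", "D"), "E"), ("B", "C")),   # no shared clades
}
kept = filter_genes(gene_trees, reference)
```

In practice a dedicated phylogenetics library and support-aware bipartition comparison would be preferable; the point here is only the shape of the filtering step.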

2019 ◽  
Author(s):  
Joseph F. Walker ◽  
Gregory W. Stull ◽  
Nathanael Walker-Hale ◽  
Oscar M. Vargas ◽  
Drew A. Larson

Abstract Premise of the study: Evolutionary relationships among plants have been inferred primarily using chloroplast data. To date, no study has comprehensively examined the plastome for gene tree conflict. Methods: Using a broad sampling of angiosperm plastomes, we characterized gene tree conflict among plastid genes at various time scales and explored correlates to conflict (e.g., evolutionary rate, gene length, molecule type). Key results: We uncover notable gene tree conflict against a backdrop of largely uninformative genes. We find gene length is the strongest correlate to concordance, and that nucleotides outperform amino acids. Of the most commonly used markers, matK greatly outperforms rbcL; however, the rarely used gene rpoC2 is the top-performing gene in every analysis. We find that rpoC2 reconstructs angiosperm phylogeny as well as the entire concatenated set of protein-coding chloroplast genes. Conclusions: Our results suggest that longer genes are superior for phylogeny reconstruction. The alleviation of some conflict through the use of nucleotides suggests that systematic error is likely the root of most of the observed conflict, but further research on biological conflict within the plastome is warranted given the documented cases of heteroplasmic recombination. We suggest rpoC2 as a useful marker for reconstructing angiosperm phylogeny, reducing the effort and expense of assembling and analyzing entire plastomes.


2020 ◽  
Author(s):  
Hong-Xin Wang ◽  
Diego F. Morales-Briones ◽  
Michael J. Moore ◽  
Jun Wen ◽  
Hua-Feng Wang

Abstract The use of diverse datasets in phylogenetic studies aiming to understand the evolutionary histories of species can yield conflicting inferences. Phylogenetic conflicts observed in animal and plant systems have often been explained by hybridization, incomplete lineage sorting (ILS), or horizontal gene transfer. Here, we employed target enrichment data and species tree and species network approaches to infer the backbone phylogeny of the family Caprifoliaceae, while distinguishing among sources of incongruence. We used 713 nuclear loci and 46 protein-coding sequences of plastome data from 43 samples representing 38 species from all major clades to reconstruct the phylogeny of the group using concatenation and coalescence approaches. We found significant nuclear gene tree conflict as well as cytonuclear discordance. Additionally, coalescent simulations and phylogenetic species network analyses suggest putative ancient hybridization among subfamilies of Caprifoliaceae, which seems to be the main source of phylogenetic discordance. Ancestral state reconstruction of six morphological characters revealed some homoplasy for each character examined. By dating the branching events, we inferred the origin of Caprifoliaceae at approximately 69.38 Ma in the late Cretaceous. By integrating evidence from molecular phylogeny, divergence times, and morphology, we herein recognize Zabelioideae as a new subfamily in Caprifoliaceae. This work shows the necessity to use a combination of multiple approaches to identify the sources of gene tree discordance. Our study also highlights the importance of using data from both nuclear and chloroplast genomes to reconstruct deep and shallow phylogenies of plants.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Joonhyung Jung ◽  
Changkyun Kim ◽  
Joo-Hwan Kim

Abstract Background Commelinaceae (Commelinales) comprise 41 genera and are widely distributed in both the Old and New Worlds, except in Europe. The relationships among genera in this family have been suggested in several morphological and molecular studies. However, it is difficult to explain their relationships due to high morphological variation and low support values. Currently, many researchers use complete chloroplast genome data for inferring the evolution of land plants. In this study, we completed 15 new plastid genome sequences of subfamily Commelinoideae using the MiSeq platform. We utilized genome data to reveal the structural variations and reconstruct the problematic positions of genera for the first time. Results All examined species of Commelinoideae have three pseudogenes (accD, rpoA, and ycf15), and the former two might be a synapomorphy within Commelinales. Only four species in tribe Commelineae presented IR expansion, which affected duplication of the rpl22 gene. We identified inversions that range from approximately 3 to 15 kb in four taxa (Amischotolype, Belosynapsis, Murdannia, and Streptolirion). The phylogenetic analysis using 77 chloroplast protein-coding genes with maximum parsimony, maximum likelihood, and Bayesian inference suggests that Palisota is most closely related to tribe Commelineae, supported by high support values. This result differs significantly from the current classification of Commelinaceae. Also, we resolved the unclear position of Streptoliriinae and the monophyly of Dichorisandrinae. Among the ten CDS (ndhH, rpoC2, ndhA, rps3, ndhG, ndhD, ccsA, ndhF, matK, and ycf1) that have high nucleotide diversity values (Pi > 0.045) and are over 500 bp in length, four CDS (ndhH, rpoC2, matK, and ycf1) are congruent with the topology derived from 77 chloroplast protein-coding genes. Conclusions In this study, we provide detailed information on the 15 complete plastid genomes of Commelinoideae taxa. We identified characteristic pseudogenes and nucleotide diversity, which can be used to infer the family's evolutionary history. Also, further research is needed to revise the position of Palisota in the current classification of Commelinaceae.
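The nucleotide diversity screen mentioned in the results (Pi > 0.045 and length over 500 bp) can be illustrated with a small calculation. The sketch below computes Pi as the mean proportion of differing sites over all sequence pairs in an alignment and then applies the two cutoffs. It is only a sketch under stated assumptions: the abstract does not say which software or gap-handling rule was used, so the function names, the choice to skip gapped or ambiguous sites pairwise, and the toy data are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Mean pairwise proportion of differing sites (Pi) for equal-length aligned sequences."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    total = 0.0
    pairs = 0
    for a, b in combinations(aligned_seqs, 2):
        compared = diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            # Assumption: sites with gaps or ambiguity codes are skipped for this pair.
            if x in "ACGT" and y in "ACGT":
                compared += 1
                diffs += x != y
        if compared:
            total += diffs / compared
            pairs += 1
    return total / pairs if pairs else 0.0


def select_candidate_genes(gene_stats, min_pi=0.045, min_length=500):
    """Return gene names whose (Pi, length in bp) exceed both cutoffs from the abstract."""
    return sorted(
        name for name, (pi, length) in gene_stats.items()
        if pi > min_pi and length > min_length
    )


# Toy alignment: three sequences, four sites, one variable site.
pi = nucleotide_diversity(["ACGT", "ACGA", "ACGT"])

# Hypothetical per-gene summaries: (Pi, length in bp).
candidates = select_candidate_genes(
    {"geneA": (0.05, 600), "geneB": (0.05, 400), "geneC": (0.01, 900)}
)
```

Only genes passing both cutoffs are returned, mirroring the two-condition screen described above; the congruence check against the 77-gene topology would be a separate step.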


Forests ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 744
Author(s):  
Yunyan Zhang ◽  
Yongjing Tian ◽  
David Y. P. Tng ◽  
Jingbo Zhou ◽  
Yuntian Zhang ◽  
...  

Litsea Lam. is an ecologically and economically important genus of the “core Lauraceae” group in the Lauraceae. The few studies to date on the comparative chloroplast genomics and phylogenomics of Litsea have been conducted as part of other studies on the Lauraceae. Here, we sequenced the whole chloroplast genome sequence of Litsea auriculata, an endangered tree endemic to eastern China, and compared this with previously published chloroplast genome sequences of 11 other Litsea species. The chloroplast genomes of the 12 Litsea species ranged from 152,132 (L. szemaois) to 154,011 bp (L. garrettii) and exhibited a typical quadripartite structure with conserved genome arrangement and content, with length variations in the inverted repeat regions (IRs). No codon usage preferences were detected within the 30 codons used in the chloroplast genomes, indicating a conserved evolution model for the genus. Ten intergenic spacers (psbE–petL, trnH–psbA, petA–psbJ, ndhF–rpl32, ycf4–cemA, rpl32–trnL, ndhG–ndhI, psbC–trnS, trnE–trnT, and psbM–trnD) and five protein coding genes (ndhD, matK, ccsA, ycf1, and ndhF) were identified as divergence hotspot regions and DNA barcodes of Litsea species. In total, 876 chloroplast microsatellites were located within the 12 chloroplast genomes. Phylogenetic analyses conducted using the 51 additional complete chloroplast genomes of “core Lauraceae” species demonstrated that the 12 Litsea species grouped into four sub-clades within the Laurus-Neolitsea clade, and that Litsea is polyphyletic and closely related to the genera Lindera and Laurus. Our phylogeny strongly supported the monophyly of the following three clades (Laurus–Neolitsea, Cinnamomum–Ocotea, and Machilus–Persea) among the above investigated “core Lauraceae” species. Overall, our study highlighted the taxonomic utility of chloroplast genomes in Litsea, and the genetic markers identified here will facilitate future studies on the evolution, conservation, population genetics, and phylogeography of L. auriculata and other Litsea species.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yiheng Wang ◽  
Sheng Wang ◽  
Yanlei Liu ◽  
Qingjun Yuan ◽  
Jiahui Sun ◽  
...  

Abstract Background Atractylodes DC is the source plant of the widely used herbal medicines “Baizhu” and “Cangzhu” and an endemic genus in East Asia. Species within the genus have minor morphological differences, and the universal DNA barcodes cannot clearly resolve the systematic relationships or identify the species of the genus. To address this question, we sequenced the chloroplast genomes of all species of Atractylodes using high-throughput sequencing. Results The results indicate that the chloroplast genome of Atractylodes has a typical quadripartite structure and ranges from 152,294 bp (A. carlinoides) to 153,261 bp (A. macrocephala) in size. The genome of all species contains 113 genes, including 79 protein-coding genes, 30 transfer RNA genes and four ribosomal RNA genes. Four hotspots, rpl22-rps19-rpl2, psbM-trnD, trnR-trnT(GGU), and trnT(UGU)-trnL, and a total of 42–47 simple sequence repeats (SSR) were identified as the most promising potentially variable markers for species delimitation and population genetic studies. Phylogenetic analyses of the whole chloroplast genomes indicate that Atractylodes is a clade within the tribe Cynareae; Atractylodes species form a monophyletic group that clearly reflects the relationships within the genus. Conclusions Our study included investigations of the sequences and structural genomic variations, phylogenetics and mutation dynamics of Atractylodes chloroplast genomes and will facilitate future studies in population genetics, taxonomy and species identification.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Bobby Lim-Ho Kong ◽  
Hyun-Seung Park ◽  
Tai-Wai David Lau ◽  
Zhixiu Lin ◽  
Tae-Jin Yang ◽  
...  

Abstract Ilex is a monogeneric plant group (containing approximately 600 species) in the Aquifoliaceae family and one of the most commonly used medicinal herbs. However, its taxonomy and phylogenetic relationships at the species level are debatable. Herein, we obtained the complete chloroplast genomes of all 19 Ilex types that are native to Hong Kong. The genomes are conserved in structure, gene content and arrangement. The chloroplast genomes range in size from 157,119 bp in Ilex graciliflora to 158,020 bp in Ilex kwangtungensis. All these genomes contain 125 genes, of which 88 are protein-coding and 37 are tRNA genes. Four highly varied sequences (rps16-trnQ, rpl32-trnL, ndhD-psaC and ycf1) were found. The number of repeats in the Ilex genomes is mostly conserved, but the number of repeating motifs varies. The phylogenetic relationship among the 19 Ilex genomes, together with eight other available genomes in other studies, was investigated. Most of the species could be correctly assigned to the section or even series level, consistent with previous taxonomy, except Ilex rotunda var. microcarpa, Ilex asprella var. tapuensis and Ilex chapaensis. These species were reclassified; I. rotunda was placed in the section Micrococca, while the other two were grouped with the section Pseudoaquifolium. These studies provide a better understanding of Ilex phylogeny and refine its classification.


2018 ◽  
Vol 19 (12) ◽  
pp. 3780 ◽  
Author(s):  
Dingxuan He ◽  
Andrew Gichira ◽  
Zhizhong Li ◽  
John Nzei ◽  
Youhao Guo ◽  
...  

The order Nymphaeales, consisting of three families with a record of eight genera, has gained significant interest from botanists, probably due to its position as a basal angiosperm. The phylogenetic relationships within the order have been well studied; however, a few controversial nodes still remain in the Nymphaeaceae. The position of the Nuphar genus and the monophyly of the Nymphaeaceae family remain uncertain. This study adds to the increasing number of the completely sequenced plastid genomes of the Nymphaeales and applies a large chloroplast gene data set in reconstructing the intergeneric relationships within the Nymphaeaceae. Five complete chloroplast genomes were newly generated, including a first for the monotypic Euryale genus. Using a set of 66 protein-coding genes from the chloroplast genomes of 17 taxa, the phylogenetic position of Nuphar was determined and a monophyletic Nymphaeaceae family was obtained with convincing statistical support from both partitioned and unpartitioned data schemes. Although genomic comparative analyses revealed a high degree of synteny among the chloroplast genomes of the ancient angiosperms, key minor variations were evident, particularly in the contraction/expansion of the inverted-repeat regions and in RNA-editing events. Genome structure, and gene content and arrangement were highly conserved among the chloroplast genomes. The intergeneric relationships defined in this study are congruent with those inferred using morphological data.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e8450 ◽  
Author(s):  
Sunan Huang ◽  
Xuejun Ge ◽  
Asunción Cano ◽  
Betty Gaby Millán Salazar ◽  
Yunfei Deng

The genus Dicliptera (Justicieae, Acanthaceae) consists of approximately 150 species distributed throughout the tropical and subtropical regions of the world. Newly obtained chloroplast genomes (cp genomes) are reported for five species of Dicliptera (D. acuminata, D. peruviana, D. montana, D. ruiziana and D. mucronata) in this study. These cp genomes have circular structures of 150,689–150,811 bp and exhibit quadripartite organizations made up of a large single copy region (LSC, 82,796–82,919 bp), a small single copy region (SSC, 17,084–17,092 bp), and a pair of inverted repeat regions (IRs, 25,401–25,408 bp). Guanine-Cytosine (GC) content makes up 37.9%–38.0% of the total content. The complete cp genomes contain 114 unique genes, including 80 protein-coding genes, 30 transfer RNA (tRNA) genes, and four ribosomal RNA (rRNA) genes. Comparative analyses of nucleotide variability (Pi) reveal the five most variable regions (trnY-GUA-trnE-UUC, trnG-GCC, psbZ-trnG-GCC, petN-psbM, and rps4-trnL-UUA), which may be used as molecular markers in future taxonomic identification and phylogenetic analyses of Dicliptera. A total of 55–58 simple sequence repeats (SSRs) and 229 long repeats were identified in the cp genomes of the five Dicliptera species. Phylogenetic analysis identified a close relationship between D. ruiziana and D. montana, followed by D. acuminata, D. peruviana, and D. mucronata. Evolutionary analysis of orthologous protein-coding genes within the family Acanthaceae revealed only one gene, ycf15, to be under positive selection, which may contribute to future studies of its adaptive evolution. The completed genomes are useful for future research on species identification, phylogenetic relationships, and the adaptive evolution of the Dicliptera species.


Molecules ◽  
2018 ◽  
Vol 23 (9) ◽  
pp. 2137 ◽  
Author(s):  
Xiang-Xiao Meng ◽  
Yan-Fang Xian ◽  
Li Xiang ◽  
Dong Zhang ◽  
Yu-Hua Shi ◽  
...  

The genus Sanguisorba, which contains about 30 species around the world and seven species in China, is the source of the medicinal plant Sanguisorba officinalis, which is commonly used as a hemostatic agent as well as to treat burns and scalds. Here we report the complete chloroplast (cp) genome sequences of four Sanguisorba species (S. officinalis, S. filiformis, S. stipulata, and S. tenuifolia var. alba). These four Sanguisorba cp genomes exhibit typical quadripartite and circular structures, and are 154,282 to 155,479 bp in length, consisting of large single-copy regions (LSC; 84,405–85,557 bp), small single-copy regions (SSC; 18,550–18,768 bp), and a pair of inverted repeats (IRs; 25,576–25,615 bp). The average GC content was ~37.24%. The four Sanguisorba cp genomes harbored 112 different genes arranged in the same order; these identical sections include 78 protein-coding genes, 30 tRNA genes, and four rRNA genes, if duplicated genes in IR regions are counted only once. A total of 39–53 long repeats and 79–91 simple sequence repeats (SSRs) were identified in the four Sanguisorba cp genomes, which provides opportunities for future studies of the population genetics of Sanguisorba medicinal plants. A phylogenetic analysis using the maximum parsimony (MP) method strongly supports a close relationship between S. officinalis and S. tenuifolia var. alba, followed by S. stipulata, and finally S. filiformis. The availability of these cp genomes provides valuable genetic information for future studies of Sanguisorba identification and provides insights into the evolution of the genus Sanguisorba.


Molecules ◽  
2018 ◽  
Vol 23 (9) ◽  
pp. 2165 ◽  
Author(s):  
Xiao Zhang ◽  
Tao Zhou ◽  
Jia Yang ◽  
Jingjing Sun ◽  
Miaomiao Ju ◽  
...  

Cucurbitaceae is the fourth most important economic plant family with creeping herbaceous species mainly distributed in tropical and subtropical regions. Here, we described and compared the complete chloroplast genome sequences of ten representative species from Cucurbitaceae. The lengths of the ten complete chloroplast genomes ranged from 155,293 bp (C. sativus) to 158,844 bp (M. charantia), and they shared the most common genomic features. A total of 618 repeats of three categories and 813 microsatellites were found. Sequence divergence analysis showed that the coding and IR regions were highly conserved. Three protein-coding genes (accD, clpP, and matK) were under selection and their coding proteins often have functions in chloroplast protein synthesis, gene transcription, energy transformation, and plant development. An unconventional translation initiation codon of the psbL gene was found and provided evidence for RNA editing. Applying BI and ML methods, phylogenetic analysis strongly supported the position of Gomphogyne, Hemsleya, and Gynostemma as the relatively original lineage in Cucurbitaceae. This study suggested that the complete chloroplast genome sequences were useful for phylogenetic studies. It would also help determine potential molecular markers and candidate DNA barcodes for future studies and enrich the valuable complete chloroplast genome resources of Cucurbitaceae.

