Sequencing and Structural Analysis of the Complete Chloroplast Genome of the Medicinal Plant Lycium chinense Mill

Zerui Yang; Yuying Huang; Wenli An; Xiasheng Zheng; Song Huang; Lingling Liang

doi:10.3390/plants8040087

Complete chloroplast genome sequence and structural analysis of the medicinal plant Lycium chinense Mill

10.7287/peerj.preprints.27305v1 ◽

2018 ◽

Author(s):

Zerui Yang ◽

Yuying Huang ◽

Xiasheng Zheng ◽

Song Huang ◽

Lingling Liang

Keyword(s):

Chloroplast Genome ◽

Gc Content ◽

Single Copy ◽

Start Codon ◽

Lycium Chinense ◽

Protein Coding ◽

Engineering Research ◽

Complete Sequences ◽

Chloroplast Genome Sequence ◽

Cp Genome

Lycium chinense Mill, an important Chinese herbal medicine, is valued as a health food and is widely used as a dietary supplement. Here we sequenced and analyzed the complete chloroplast (CP) genome of L. chinense, which is 155,756 bp in length with a GC content of 37.8%. The CP genome consists of a pair of inverted repeat regions (IRa and IRb) of 25,476 bp each, separated by a large single-copy (LSC) region of 86,595 bp and a small single-copy (SSC) region of 18,209 bp. Annotation revealed that the L. chinense CP genome contains 114 genes, 16 of which are duplicated. Most of the 85 protein-coding genes have the usual ATG start codon; the exceptions are rps12, psbL and ndhD. Furthermore, most of the simple sequence repeats (SSRs) are short polyadenine or polythymine repeats, which contribute to the high AT content of the chloroplast genome. The complete sequence and annotation of the L. chinense chloroplast genome will facilitate phylogenetic, population and genetic engineering research involving this species.
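
The region lengths reported in this abstract can be cross-checked against the stated total. The following is a minimal sketch, assuming the usual quadripartite layout (one LSC, one SSC and two IR copies); the function name `quadripartite_length` is a hypothetical helper introduced only for illustration.

```python
# Region lengths (bp) as reported in the abstract above.
LSC = 86_595
SSC = 18_209
IR = 25_476  # assumed to be the length of each of IRa and IRb


def quadripartite_length(lsc, ssc, ir):
    """Total length of a circular genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total)  # 155756
```

The sum matches the reported genome length of 155,756 bp, which supports reading the 25,476 bp figure as the length of each IR copy rather than of the pair.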

Complete Chloroplast Genomes from Sanguisorba: Identity and Variation Among Four Species

Molecules ◽

10.3390/molecules23092137 ◽

2018 ◽

Vol 23 (9) ◽

pp. 2137 ◽

Cited By ~ 6

Author(s):

Xiang-Xiao Meng ◽

Yan-Fang Xian ◽

Li Xiang ◽

Dong Zhang ◽

Yu-Hua Shi ◽

...

Keyword(s):

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Protein Coding ◽

Future Studies ◽

Chloroplast Genomes ◽

Close Relationship ◽

Cp Genome ◽

Sanguisorba Officinalis

The genus Sanguisorba, which contains about 30 species around the world and seven species in China, is the source of the medicinal plant Sanguisorba officinalis, which is commonly used as a hemostatic agent as well as to treat burns and scalds. Here we report the complete chloroplast (cp) genome sequences of four Sanguisorba species (S. officinalis, S. filiformis, S. stipulata, and S. tenuifolia var. alba). These four Sanguisorba cp genomes exhibit typical quadripartite and circular structures, and are 154,282 to 155,479 bp in length, consisting of large single-copy regions (LSC; 84,405–85,557 bp), small single-copy regions (SSC; 18,550–18,768 bp), and a pair of inverted repeats (IRs; 25,576–25,615 bp). The average GC content was ~37.24%. The four Sanguisorba cp genomes harbored 112 different genes arranged in the same order; these include 78 protein-coding genes, 30 tRNA genes, and four rRNA genes, if duplicated genes in IR regions are counted only once. A total of 39–53 long repeats and 79–91 simple sequence repeats (SSRs) were identified in the four Sanguisorba cp genomes, which provide opportunities for future studies of the population genetics of Sanguisorba medicinal plants. A phylogenetic analysis using the maximum parsimony (MP) method strongly supports a close relationship between S. officinalis and S. tenuifolia var. alba, followed by S. stipulata, and finally S. filiformis. The availability of these cp genomes provides valuable genetic information for future studies of Sanguisorba identification and provides insights into the evolution of the genus Sanguisorba.

Characterization of the complete chloroplast genome sequence and phylogenetic analysis of B. oleracea var. italica

10.21203/rs.2.20976/v1 ◽

2020 ◽

Author(s):

Zhenchao Zhang ◽

Zhongliang Dai ◽

Yuemei Yao ◽

Yongfei Pan ◽

Guosheng Sun ◽

...

Keyword(s):

Chloroplast Genome ◽

Genome Sequence ◽

Genomic Structure ◽

Gc Content ◽

Single Copy ◽

Biological Research ◽

Protein Coding ◽

Protein Coding Genes ◽

Cp Genome ◽

Functional Components

Background: Broccoli (Brassica oleracea var. italica L.) is known as one of the most nutritionally rich vegetables and is rich in functional components that benefit health. The main purpose of this research was to sequence, assemble and annotate the chloroplast (cp) genome of broccoli using the Illumina HiSeq2500 sequencing platform. Results: The size of the broccoli cp genome is 153,364 bp, including two inverted repeat (IR) regions of 26,197 bp each, separated by a small single-copy (SSC) region of 17,834 bp and a large single-copy (LSC) region of 83,136 bp. The GC content of the complete genome is 36.36%, while those of the SSC, LSC, and IR regions are 29.1%, 34.15% and 42.35%, respectively. It harbors 134 functional genes, including 87 protein-coding genes, 39 tRNAs and 8 rRNAs, with 31 duplicates in the IRs. The most abundant amino acid in the protein-coding genes is leucine, while the least abundant is cysteine. Codon usage frequency showed a bias toward A/T-ending codons in the cp genome. In the repeat structure analysis, a total of 34 repeat sequences and 291 simple sequence repeats (SSRs) were detected. Although cp genomic structure and size are highly conserved, the SC-IR boundary regions are variable between the 7 cp genomes. The phylogenetic relationships based on complete cp genomes from 9 species suggest that B. oleracea var. italica is closely related to Brassica juncea. Conclusions: The complete cp genome sequence was obtained and annotated for broccoli for the first time. The information acquired from this research will be useful for further species identification, population genetics and biological research of broccoli.
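
The per-region GC figures in this abstract can be related to the whole-genome value by a length-weighted average. The sketch below is illustrative only; it assumes the reported IR figure (42.35%) applies to each of the two IR copies, and `weighted_gc` is a hypothetical helper name, not part of the authors' pipeline.

```python
# (length in bp, GC fraction) per region, as reported in the abstract above.
# Assumption: the IR GC content of 42.35% applies to both IR copies.
regions = {
    "LSC": (83_136, 0.3415),
    "SSC": (17_834, 0.2910),
    "IRa": (26_197, 0.4235),
    "IRb": (26_197, 0.4235),
}


def weighted_gc(parts):
    """Return (total length, length-weighted GC fraction) over all regions."""
    total_len = sum(length for length, _ in parts.values())
    gc_bases = sum(length * gc for length, gc in parts.values())
    return total_len, gc_bases / total_len


total_len, gc = weighted_gc(regions)
print(total_len)           # 153364
print(round(gc * 100, 2))  # 36.36
```

Both the summed length and the weighted GC content agree with the reported whole-genome values (153,364 bp and 36.36%).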

Complete Chloroplast Genome Sequence of Justicia flava: Genome Comparative Analysis and Phylogenetic Relationships among Acanthaceae

BioMed Research International ◽

10.1155/2019/4370258 ◽

2019 ◽

Vol 2019 ◽

pp. 1-17 ◽

Cited By ~ 4

Author(s):

Samaila S. Yaradua ◽

Dhafer A. Alzahrani ◽

Enas J. Albokhary ◽

Abidina Abba ◽

Abubakar Bello

Keyword(s):

Comparative Analysis ◽

Chloroplast Genome ◽

Phylogenetic Relationships ◽

Inverted Repeat ◽

Gc Content ◽

Single Copy ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Protein Coding Genes ◽

Cp Genome

The complete chloroplast genome of J. flava, an endangered medicinal plant in Saudi Arabia, was sequenced and compared with the cp genomes of three Acanthaceae species to characterize the cp genome, identify SSRs, and detect variation among the cp genomes of the sampled Acanthaceae. NOVOPlasty was used to assemble the complete chloroplast genome from the whole-genome data. The cp genome of J. flava is 150,888 bp in length with a GC content of 38.2% and has a quadripartite structure: one pair of inverted repeats (IRa and IRb, 25,500 bp each) separated by a large single-copy region (LSC, 82,995 bp) and a small single-copy region (SSC, 16,893 bp). There are 132 genes in the genome, including 80 protein-coding genes, 30 tRNA, and 4 rRNA; 113 are unique while the remaining 19 are duplicated in the IR regions. The repeat analysis indicates that the genome contains all types of repeats, with palindromic repeats occurring more frequently; the analysis also identified a total of 98 simple sequence repeats (SSRs), the majority of which are A/T mononucleotides found in the intergenic spacers. The comparative analysis with the other cp genomes sampled indicated that the inverted repeat regions are more conserved than the single-copy regions and that the noncoding regions show a higher rate of variation than the coding regions. All the genomes have the ndhF and ycf1 genes at the border junction of IRb and SSC. Sequence divergence analysis of the protein-coding genes showed that seven genes (petB, atpF, psaI, rpl32, rpl16, ycf1, and clpP) are under positive selection. The phylogenetic analysis revealed that Justiceae is sister to Ruellieae. This study reports the first cp genome of the largest genus in Acanthaceae and provides resources for studying the genetic diversity of J. flava as well as resolving phylogenetic relationships within the core Acanthaceae.

Characterization of the complete chloroplast genome sequence of Vitis vinifera ‘Guifeimeigui’

Scientific Journal of Genetics and Gene Therapy ◽

10.17352/sjggt.000019 ◽

2021 ◽

pp. 001-003

Author(s):

Liu Li ◽

Yang Yang ◽

Li Xiujie ◽

Li Bo

Keyword(s):

Vitis Vinifera ◽

Gc Content ◽

Single Copy ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Chloroplast Genome Sequence ◽

Cp Genome ◽

Eurasian Species ◽

Rna Genes

Vitis vinifera ‘Guifeimeigui’ is a diploid table grape belonging to a Eurasian species. This study reports the complete chloroplast (cp) genome of Vitis vinifera ‘Guifeimeigui’ for the first time. The complete cp genome is 160,928 bp in size with a GC content of 37.38%, and comprises a pair of inverted repeats (26,353 bp each) separated by large (89,150 bp) and small (19,072 bp) single-copy regions. It encodes 85 genes, including 40 protein-coding genes, 37 transfer RNA (tRNA) genes, and 8 ribosomal RNA (rRNA) genes. The maximum likelihood (ML) phylogenetic tree showed that Vitis vinifera ‘Guifeimeigui’ is close to Vitis vinifera.

Characterization of the Complete Chloroplast Genome of Buddleja Lindleyana

Journal of AOAC International ◽

10.1093/jaoacint/qsab066 ◽

2021 ◽

Author(s):

Shanshan Liu ◽

Shiyin Feng ◽

Yuying Huang ◽

Wenli An ◽

Zerui Yang ◽

...

Keyword(s):

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Future Research ◽

Trna Genes ◽

Similar Species ◽

Protein Coding ◽

Genome Data ◽

Cp Genome ◽

Genomic Resource

Background: Buddleja lindleyana Fort., which belongs to the Loganiaceae with a distribution throughout the tropics, is widely used as an ornamental plant in China. Buddleja contains several morphologically similar species that require molecular identification, but there is little molecular research on the genus. Objective: To sequence and analyze the complete chloroplast (cp) genome of B. lindleyana using molecular biology techniques. Methods: The genome was sequenced by next-generation sequencing, and a series of bioinformatics tools were used to assemble and analyze the molecular structure of the B. lindleyana cp genome. Results: The complete cp genome of B. lindleyana is a circular 154,487-bp-long molecule with a GC content of 38.1%. It has the familiar quadripartite structure, including a large single-copy region (LSC; 85,489 bp), a small single-copy region (SSC; 17,898 bp) and a pair of inverted repeats (IRs; 25,550 bp). A total of 133 genes were identified in the genome, including 86 protein-coding genes, 37 tRNA genes, 8 rRNA genes and 2 pseudogenes. Conclusions: These results suggest that the B. lindleyana cp genome could be used as a genomic resource to resolve the phylogenetic positions and relationships of Loganiaceae, and that it will offer valuable information for future research on the identification of Buddleja species and contribute to genomic investigations of these species.

Comparative Analyses of Euonymus Chloroplast Genomes: Genetic Structure, Screening for Loci With Suitable Polymorphism, Positive Selection Genes, and Phylogenetic Relationships Within Celastrineae

Frontiers in Plant Science ◽

10.3389/fpls.2020.593984 ◽

2021 ◽

Vol 11 ◽

Author(s):

Yongtan Li ◽

Yan Dong ◽

Yichao Liu ◽

Xiaoyue Yu ◽

Minsheng Yang ◽

...

Keyword(s):

Positive Selection ◽

Chloroplast Genome ◽

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Evolutionary Relationships ◽

Trna Genes ◽

Protein Coding ◽

Chloroplast Genomes ◽

Cp Genome

In this study, we assembled and annotated the chloroplast (cp) genomes of the Euonymus species Euonymus fortunei, Euonymus phellomanus, and Euonymus maackii, and performed a series of analyses to investigate gene structure, GC content, sequence alignment, and nucleic acid diversity, with the objectives of identifying positively selected genes and understanding evolutionary relationships. The results indicated that the Euonymus cp genomes were 156,860–157,611 bp in length and exhibited a typical circular quadripartite structure. Similar to the majority of angiosperm chloroplast genomes, each comprised a large single-copy region (LSC; 85,826–86,299 bp) and a small single-copy region (SSC; 18,319–18,536 bp), separated by a pair of inverted repeats (IRA and IRB; 26,341–26,700 bp), sequences with the same content but opposite orientation. The chloroplast genomes were annotated with 130–131 genes, including 85–86 protein-coding genes, 37 tRNA genes, and eight rRNA genes, with GC contents of 37.26–37.31%. The GC content was variable among regions and was highest in the inverted repeat (IR) region. The IR boundary in Euonymus has expanded, so that rps19 lies entirely within the IR region and is fully duplicated. Such fluctuations at the border positions might be helpful in determining evolutionary relationships among Euonymus. The simple sequence repeats (SSRs) of Euonymus species were composed primarily of the single nucleotides (A)n and (T)n, and were mostly 10–12 bp in length, with an obvious A/T bias. We identified several loci with suitable polymorphism for potential use as molecular markers for inferring the phylogeny within the genus Euonymus. Signatures of positive selection were seen in the protein-coding gene rpoB. Based on data from the whole chloroplast genome, common single-copy genes, and the LSC, SSC, and IR regions, we constructed an evolutionary tree of Euonymus and related species, the results of which were consistent with traditional taxonomic classifications. It showed that E. fortunei is sister to Euonymus japonicus, whereas E. maackii appeared as sister to Euonymus hamiltonianus. Our study provides important genetic information to support further investigations into the phylogenetic development and adaptive evolution of Euonymus species.
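
The abstract above describes SSRs dominated by (A)n and (T)n runs of about 10–12 bp. The following is a minimal sketch of how such mononucleotide runs could be located in a sequence; it is not the authors' method (the abstract does not name the SSR tool or its thresholds), and the function name `find_mono_ssrs`, the 10-base minimum, and the toy sequence are all assumptions made for illustration.

```python
import re


def find_mono_ssrs(seq, min_len=10):
    """Return (start, base, length) for each A or T run of at least min_len bases."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [
        (m.start(), m.group()[0], len(m.group()))
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequence for illustration only; not taken from any Euonymus genome.
toy = "GC" + "A" * 11 + "CGC" + "T" * 9 + "G" + "T" * 12 + "C"
print(find_mono_ssrs(toy))  # [(2, 'A', 11), (26, 'T', 12)]
```

The 9-base T run is skipped because it falls below the assumed threshold; start positions are 0-based.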

Comprehensive Analysis of Rhodomyrtus tomentosa Chloroplast Genome

Plants ◽

10.3390/plants8040089 ◽

2019 ◽

Vol 8 (4) ◽

pp. 89 ◽

Cited By ~ 7

Author(s):

Yuying Huang ◽

Zerui Yang ◽

Song Huang ◽

Wenli An ◽

Jing Li ◽

...

Keyword(s):

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Sister Relationship ◽

Protein Coding ◽

Protein Coding Genes ◽

Plastid Genomes ◽

Cp Genome ◽

Rhodomyrtus Tomentosa

In the last decade, several studies have relied on a small number of plastid genomes to deduce deep phylogenetic relationships in the species-rich Myrtaceae. Nevertheless, the plastome of Rhodomyrtus tomentosa, an important representative of the genus Rhodomyrtus (DC.), has not yet been reported. Here, we sequenced and analyzed the complete chloroplast (CP) genome of R. tomentosa, which is a 156,129-bp-long circular molecule with 37.1% GC content. This CP genome displays a typical quadripartite structure with two inverted repeats (IRa and IRb) of 25,824 bp each, separated by a small single-copy region (SSC, 18,183 bp) and a large single-copy region (LSC, 86,298 bp). The CP genome encodes 129 genes, including 84 protein-coding genes, 37 tRNA genes, eight rRNA genes and three pseudogenes (ycf1, rps19, ndhF). A considerable number of protein-coding genes have the universal ATG start codon, except for psbL and ndhD. Premature termination codons (PTCs) were found in one protein-coding gene, namely atpE, a feature rarely reported in plant CP genomes. Phylogenetic analysis revealed that R. tomentosa has a sister relationship with Eugenia uniflora and Psidium guajava. In conclusion, this study identified unique characteristics of the R. tomentosa CP genome, providing valuable information for further investigations on species identification and the phylogenetic evolution of R. tomentosa and related species.
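
The start-codon observation in this abstract (ATG in most genes, with psbL and ndhD as exceptions) corresponds to a simple check over annotated coding sequences. The sketch below is a hypothetical illustration: the helper `non_atg_starts`, the gene names, and the toy sequences are assumptions and do not come from the R. tomentosa annotation.

```python
def non_atg_starts(cds_by_gene):
    """Map gene name -> first codon for every CDS that does not begin with ATG."""
    return {
        gene: seq[:3].upper()
        for gene, seq in cds_by_gene.items()
        if seq[:3].upper() != "ATG"
    }


# Toy sequences for illustration only; not taken from the R. tomentosa genome.
toy_cds = {
    "geneA": "ATGGCTTAA",
    "geneB": "ACGACTTAA",
    "geneC": "gtgaaataa",
}
print(non_atg_starts(toy_cds))  # {'geneB': 'ACG', 'geneC': 'GTG'}
```

In practice the coding sequences would come from the genome annotation, extracted on the correct strand before the check is applied.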

The complete chloroplast genome of Colobanthus apetalus (Labill.) Druce: genome organization and comparison with related species

PeerJ ◽

10.7717/peerj.4723 ◽

2018 ◽

Vol 6 ◽

pp. e4723 ◽

Cited By ~ 3

Author(s):

Piotr Androsiuk ◽

Jan Paweł Jastrzębski ◽

Łukasz Paukszto ◽

Adam Okorski ◽

Agnieszka Pszczółkowska ◽

...

Keyword(s):

Gc Content ◽

Large Family ◽

Detailed Comparison ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Protein Coding ◽

Protein Coding Genes ◽

Cp Genome ◽

Unique Genes

Colobanthus apetalus is a member of the genus Colobanthus, one of the 86 genera of the large family Caryophyllaceae which groups annual and perennial herbs (rarely shrubs) that are widely distributed around the globe, mainly in the Holarctic. The genus Colobanthus consists of 25 species, including Colobanthus quitensis, an extremophile plant native to the maritime Antarctic. Complete chloroplast (cp) genomes are useful for phylogenetic studies and species identification. In this study, next-generation sequencing (NGS) was used to identify the cp genome of C. apetalus. The complete cp genome of C. apetalus has the length of 151,228 bp, 36.65% GC content, and a quadripartite structure with a large single copy (LSC) of 83,380 bp and a small single copy (SSC) of 17,206 bp separated by inverted repeats (IRs) of 25,321 bp. The cp genome contains 131 genes, including 112 unique genes and 19 genes which are duplicated in the IRs. The group of 112 unique genes features 73 protein-coding genes, 30 tRNA genes, four rRNA genes and five conserved chloroplast open reading frames (ORFs). A total of 12 forward repeats, 10 palindromic repeats, five reverse repeats and three complementary repeats were detected. In addition, a simple sequence repeat (SSR) analysis revealed 41 (mono-, di-, tri-, tetra-, penta- and hexanucleotide) SSRs, most of which were AT-rich. A detailed comparison of C. apetalus and C. quitensis cp genomes revealed identical gene content and order. A phylogenetic tree was built based on the sequences of 76 protein-coding genes that are shared by the eleven sequenced representatives of Caryophyllaceae and C. apetalus, and it revealed that C. apetalus and C. quitensis form a clade that is closely related to Silene species and Agrostemma githago. Moreover, the genus Silene appeared as a polymorphic taxon. The results of this study expand our knowledge about the evolution and molecular biology of Caryophyllaceae.
