Long-reads reveal that Rhododendron delavayi plastid genome contains extensive repeat sequences, and recombination exists among plastid genomes of photosynthetic Ericaceae

PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9048
Author(s):  
Huie Li ◽  
Qiqiang Guo ◽  
Qian Li ◽  
Lan Yang

Background Rhododendron delavayi Franch. var. delavayi is a wild ornamental plant species in Guizhou Province, China. The lack of its plastid genome information seriously hinders the further application and conservation of the valuable resource. Methods The complete plastid genome of R. delavayi was assembled from long sequence reads. The genome was then characterized, and compared with those of other photosynthetic Ericaceae species. Results The plastid genome of R. delavayi has a typical quadripartite structure, and a length of 202,169 bp. It contains a large number of repeat sequences and shows preference for codon usage. The comparative analysis revealed the irregular recombination of gene sets, including rearrangement and inversion, in the large single copy region. The extreme expansion of the inverted repeat region shortened the small single copy, and expanded the full length of the genome. In addition, consistent with traditional taxonomy, R. delavayi with nine other species of the same family were clustered into Ericaceae based on the homologous protein-coding sequences of the plastid genomes. Thus, the long-read assembly of the plastid genome of R. delavayi would provide basic information for the further study of the evolution, genetic diversity, and conservation of R. delavayi and its relatives.

2021 ◽  
Vol 104 (4) ◽  
pp. 003685042110599
Author(s):  
Dhafer Alzahrani ◽  
Enas Albokhari ◽  
Abidina Abba ◽  
Samaila Yaradua

Caylusea hexagyna and Ochradenus baccatus are two species in the Resedaceae family. In this study, we analysed the complete plastid genomes of these two species using high-throughput sequencing technology and compared their genomic data. The length of the plastid genome of C. hexagyna was 154,390 bp while that of O. baccatus was 153,380 bp. The lengths of the inverted repeat (IR) regions were 26,526 bp and 26,558 bp, those of the large single copy (LSC) regions were 83,870 bp and 83,023 bp, and those of the small single copy (SSC) regions were 17,468 bp and 17,241 bp in C. hexagyna and O. baccatus, respectively. Both genomes consisted of 113 genes: 79 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Repeat analysis showed that the plastid genome included all types of repeats, with palindromic sequences occurring more frequently. Comparative studies of SSR markers showed that there were 256 markers in C. hexagyna and 255 in O. baccatus; the majority of the SSRs in these plastid genomes were mononucleotide repeats (A/T). All the clusters in the phylogenetic tree had high support. This study reported the first complete plastid genomes of the genera Caylusea and Ochradenus and the first for the Resedaceae family.
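The region lengths reported above can be checked against the stated totals, because a quadripartite plastome consists of one LSC, one SSC and two IR copies. The following is a minimal sketch of that arithmetic using only the figures quoted in the abstract; the function name `plastome_length` and the `REPORTED` table are illustrative and are not taken from the study.

```python
def plastome_length(lsc, ssc, ir, ir_copies=2):
    """Total length of a quadripartite plastome: LSC + SSC + IR copies."""
    return lsc + ssc + ir_copies * ir


# (LSC, SSC, IR, reported total) in bp, as quoted in the abstract above.
REPORTED = {
    "Caylusea hexagyna": (83_870, 17_468, 26_526, 154_390),
    "Ochradenus baccatus": (83_023, 17_241, 26_558, 153_380),
}

for species, (lsc, ssc, ir, total) in REPORTED.items():
    computed = plastome_length(lsc, ssc, ir)
    status = "consistent" if computed == total else "MISMATCH"
    print(f"{species}: computed {computed} bp, reported {total} bp ({status})")
```

The `ir_copies` parameter is an assumption added so that the same check could be applied to plastomes that retain only one IR copy.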


2021 ◽  
Author(s):  
Mahtab Moghaddam ◽  
Atsushi Ohta ◽  
Motoki Shimizu ◽  
Ryohei Terauchi ◽  
Shahrokh Kazempour-Osaloo

Abstract Plastid genome sequences provide valuable markers for surveying the evolutionary relationships and population genetics of plant species. In the present study, the complete plastid genome of Onobrychis gaubae, endemic to Iran, was sequenced using Illumina paired-end sequencing and was compared with previously known genomes of the IRLC species of legumes. The O. gaubae plastid genome was 123,645 bp in length and included a large single-copy (LSC) region of 81,034 bp, a small single-copy (SSC) region of 13,788 bp and one copy of the inverted repeat (IRb) of 28,823 bp. The genome encoded 110 genes, including 76 protein-coding genes, 30 transfer RNA (tRNA) genes and four ribosome RNA (rRNA) genes and possessed 89 simple sequence repeats (SSRs) and 28 repeated structures with the highest proportion in the LSC. Comparative analysis of the chloroplast genomes across IRLC revealed three hotspot genes (ycf1, ycf2, clpP) which could be used as molecular markers for resolving phylogenetic relationships and species identification. IRLC plastid genomes also showed multiple gene losses and inversions. Phylogenetic analyses revealed that O. gaubae is closely related to Hedysarum. The complete O. gaubae genome is a valuable resource for investigating evolution of Onobrychis species and can be used to identify related species.


2021 ◽  
Author(s):  
Chi Yang ◽  
Lu Ma ◽  
Donglai Xiao ◽  
Xiaoyu Liu ◽  
Xiaoling Jiang ◽  
...  

Sparassis latifolia is a valuable edible mushroom cultivated in China. In 2018, our research group reported an incomplete and low-quality genome of S. latifolia obtained by Illumina HiSeq 2500 sequencing. These limitations in the available genome have constrained genetic and genomic studies in this mushroom resource. Herein, an updated draft genome sequence of S. latifolia was generated by Oxford Nanopore sequencing and the Hi-C technique. A total of 8.24 Gb of Oxford Nanopore long reads representing ~198.08X coverage of the S. latifolia genome were generated. Subsequently, a high-quality genome of 41.41 Mb, with scaffold and contig N50 sizes of 3.31 Mb and 1.51 Mb, respectively, was assembled. Hi-C scaffolding of the genome resulted in 12 pseudochromosomes containing 93.56% of the bases in the assembled genome. Genome annotation further revealed that 17.47% of the genome was composed of repetitive sequences. In addition, 13,103 protein-coding genes were predicted, among which 98.72% were functionally annotated. BUSCO assay results further revealed that there were 92.07% complete BUSCOs. The improved chromosome-scale assembly and genome features described here will aid further molecular elucidation of various traits, breeding of S. latifolia, and evolutionary studies with related taxa.
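The N50 values quoted above summarise assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The sketch below shows the standard definition only. The contig lengths are made-up toy values, not data from the S. latifolia assembly, and the function name `n50` is illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("N50 is undefined for an empty assembly")


# Hypothetical toy assembly (lengths in arbitrary units).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```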


Forests ◽  
2020 ◽  
Vol 11 (11) ◽  
pp. 1179
Author(s):  
Ueric José Borges de Souza ◽  
Luciana Cristina Vitorino ◽  
Layara Alexandre Bessa ◽  
Fabiano Guimarães Silva

Understanding the plastid genome is extremely important for the interpretation of the genetic mechanisms associated with essential physiological and metabolic functions, the identification of possible marker regions for phylogenetic or phylogeographic analyses, and the elucidation of the modes through which natural selection operates in different regions of this genome. In the present study, we assembled the plastid genome of Artocarpus camansi, compared its repetitive structures with Artocarpus heterophyllus, and searched for evidence of synteny within the family Moraceae. We also constructed a phylogeny based on 56 chloroplast genes to assess the relationships among three families of the order Rosales, that is, the Moraceae, Rhamnaceae, and Cannabaceae. The plastid genome of A. camansi has 160,096 bp, and presents the typical circular quadripartite structure of the Angiosperms, comprising a large single copy (LSC) of 88,745 bp and a small single copy (SSC) of 19,883 bp, separated by a pair of inverted repeat (IR) regions each with a length of 25,734 bp. The total GC content was 36.0%, which is very similar to Artocarpus heterophyllus (36.1%) and other moraceous species. A total of 23,068 codons and 80 SSRs were identified in the A. camansi plastid genome, with the majority of the SSRs being mononucleotide (70.0%). A total of 50 repeat structures were observed in the A. camansi plastid genome, in contrast with 61 repeats in A. heterophyllus. A purifying selection signal was found in 70 of the 79 protein-coding genes, indicating that they have all been highly conserved throughout the evolutionary history of the genus. The comparative analysis of the structural characteristics of the chloroplast among different moraceous species found a high degree of similarity in the sequences, which indicates a highly conserved evolutionary model in these plastid genomes. The phylogenetic analysis also recovered a high degree of similarity between the chloroplast genes of A. camansi and A. heterophyllus, and reconfirmed the hypothesis of the intense conservation of the plastome in the family Moraceae.
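The abstract above does not say which tool or thresholds were used to identify SSRs, so the following is only a sketch of how perfect simple sequence repeats can be located in a sequence. The minimum repeat counts in `MIN_REPEATS`, the function names and the toy sequence are all assumptions made for illustration. They are not the study's settings or data.

```python
# Assumed thresholds: minimum number of tandem copies per motif length.
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_n in sorted(min_repeats.items()):
        i = 0
        while i + size <= len(seq):
            motif = seq[i:i + size]
            # Skip ambiguous bases and motifs such as "AA" that belong
            # to a shorter repeat class.
            if set(motif) - set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            copies = 1
            while seq[i + copies * size:i + (copies + 1) * size] == motif:
                copies += 1
            if copies >= min_n:
                hits.append((i, motif, copies))
                i += copies * size  # continue after the repeat
            else:
                i += 1
    return sorted(hits)


# Hypothetical toy sequence: one A homopolymer and one AT dinucleotide repeat.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
ssrs = find_ssrs(toy)
mono_share = sum(1 for _, motif, _ in ssrs if len(motif) == 1) / len(ssrs)
print(ssrs, f"mononucleotide share: {mono_share:.0%}")
```

A proportion such as the 70.0% mononucleotide share quoted above would be computed in the same way as `mono_share`, from whatever SSR list the chosen tool reports.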


Molecules ◽  
2019 ◽  
Vol 24 (2) ◽  
pp. 261 ◽  
Author(s):  
Yongfu Li ◽  
Steven Paul Sylvester ◽  
Meng Li ◽  
Cheng Zhang ◽  
Xuan Li ◽  
...  

Magnolia zenii is a critically endangered species known from only 18 trees that survive on Baohua Mountain in Jiangsu province, China. Little information is available regarding its molecular biology, with no genomic study performed on M. zenii until now. We determined the complete plastid genome of M. zenii and identified microsatellites. Whole sequence alignment and phylogenetic analysis using BI and ML methods were also conducted. The plastome of M. zenii was 160,048 bp long with 39.2% GC content and included a pair of inverted repeats (IRs) of 26,596 bp that separated a large single-copy (LSC) region of 88,098 bp and a small single-copy (SSC) region of 18,757 bp. One hundred thirty genes were identified, of which 79 were protein-coding genes, 37 were transfer RNAs, and eight were ribosomal RNAs. Thirty seven simple sequence repeats (SSRs) were also identified. Comparative analyses of genome structure and sequence data of closely-related species revealed five mutation hotspots, useful for future phylogenetic research. Magnolia zenii was placed as sister to M. biondii with strong support in all analyses. Overall, this study providing M. zenii genomic resources will be beneficial for the evolutionary study and phylogenetic reconstruction of Magnoliaceae.


Plants ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 618 ◽  
Author(s):  
Maria D. Logacheva ◽  
Mikhail I. Schelkunov ◽  
Aleksey N. Fesenko ◽  
Artem S. Kasianov ◽  
Aleksey A. Penin

Fagopyrum esculentum (common buckwheat) is an important agricultural non-cereal grain plant. Despite extensive genetic studies, the information on its mitochondrial genome is still lacking. Using long reads generated by single-molecule real-time technology coupled with circular consensus sequencing (CCS) protocol, we assembled the buckwheat mitochondrial genome and detected that its prevalent form consists of 10 circular chromosomes with a total length of 404 Kb. In order to confirm the presence of a multipartite structure, we developed a new targeted assembly tool capable of processing long reads. The mitogenome contains all genes typical for plant mitochondrial genomes and long inserts of plastid origin (~6.4% of the total mitogenome length). Using this new information, we characterized the genetic diversity of mitochondrial and plastid genomes in 11 buckwheat cultivars compared with the ancestral subspecies, F. esculentum ssp. ancestrale. We found it to be surprisingly low within cultivars: Only three to six variations in the mitogenome and one to two in the plastid genome. In contrast, the divergence with F. esculentum ssp. ancestrale is much higher: 220 positions differ in the mitochondrial genome and 159 in the plastid genome. The SNPs in the plastid genome are enriched in non-synonymous substitutions, in particular in the genes involved in photosynthesis: psbA, psbC, and psbH. This presumably reflects the selection for the increased photosynthesis efficiency as a part of the buckwheat breeding program.


Author(s):  
Wojciech Pląder ◽  
Yasushi Yukawa ◽  
Masahiro Sugiura ◽  
Stefan Malepszy

Abstract The complete nucleotide sequence of the cucumber (C. sativus L. var. Borszczagowski) chloroplast genome has been determined. The genome is composed of 155,293 bp containing a pair of inverted repeats of 25,191 bp, which are separated by two single-copy regions, a small 18,222-bp one and a large 86,688-bp one. The chloroplast genome of cucumber contains 130 known genes, including 89 protein-coding genes, 8 ribosomal RNA genes (4 rRNA species), and 37 tRNA genes (30 tRNA species), with 18 of them located in the inverted repeat region. Of these genes, 16 contain one intron, and two genes and one ycf contain 2 introns. Twenty-one small inversions that form stem-loop structures, ranging from 18 to 49 bp, have been identified. Eight of them show similarity to those of other species, while eight seem to be cucumber specific. Detailed comparisons of ycf2 and ycf15, and the overall structure to other chloroplast genomes were performed.


2006 ◽  
Vol 84 (9) ◽  
pp. 1434-1443 ◽  
Author(s):  
Gernot G. Presting

All oligonucleotides of the sugarcane chloroplast genome that are conserved in one or more of 36 other completed plastid genomes have been identified by computer-assisted sequence comparison. These regions are of interest because they (i) are indicative of strong selection pressures to maintain specific nucleotide sequences that may yield insights into plastid biology and (ii) can be used as priming sites for amplifying intervening sequences that represent potential DNA barcodes for species identification. The majority of conserved sites are located in the inverted repeat (IR) region, but several sites in the single copy region (predominantly in tRNA and psa/psb genes) are conserved among chloroplasts of all higher plants examined here. Of particular interest are protein coding regions that have been conserved at the nucleotide level, as these may be involved in transcript regulation. This analysis also provides the basis for rational design of a DNA barcode for plastids, and several potential barcode regions have been identified. In particular, two oligonucleotides of length 33 and 25, and separated by approximately 362 nucleotides, are found in all cyanobacteria, red, brown and green algae, as well as diatoms, euglenids, apicomplexans and land plants that have been examined to date. Their widespread occurrence makes the intervening sequence a universal marker for all photosynthetic lineages. Analysis of 160 GenBank accessions illustrates that this region discriminates many algae at the species level, but lacks sufficient variation among the more recently diverged land plants to serve as a single DNA barcode for this taxon. However, this marker should be particularly useful for the DNA barcoding of algal lineages and lichens, as well as for environmental sampling. More rapidly evolving regions of the plastid genome also identified here serve as a starting point to design and test barcodes for more narrowly defined lineages, including the more recently diverged angiosperms.


2017 ◽  
Author(s):  
Jia-Xing Yue ◽  
Gianni Liti

Abstract Long-read sequencing technologies have become increasingly popular in genome projects due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding yeast, Saccharomyces cerevisiae, has many isolates currently being sequenced with long reads. However, analyzing long-read sequencing data to produce high-quality genome assembly and annotation remains challenging. Here we present LRSDAY, the first one-stop solution to streamline this process. LRSDAY can produce chromosome-level end-to-end genome assembly and comprehensive annotations for various genomic features (including centromeres, protein-coding genes, tRNAs, transposable elements and telomere-associated elements) that are ready for downstream analysis. Although tailored for S. cerevisiae, we designed LRSDAY to be highly modular and customizable, making it adaptable for virtually any eukaryotic organism. Applying LRSDAY to a S. cerevisiae strain takes ∼43 hrs to generate a complete and well-annotated genome from ∼100X Pacific Biosciences (PacBio) reads using four threads.


Author(s):  
Priyanka Sharma ◽  
Valentine Murigneux ◽  
Jasmine Haimovitz ◽  
Catherine J. Nock ◽  
Wei Tian ◽  
...  

Summary Macadamia, a recently domesticated expanding nut crop in the tropical and subtropical regions of the world, is one of the most economically important genera in the diverse and widely adapted Proteaceae family. All four species of Macadamia are rare in the wild with the most recently discovered, M. jansenii, being endangered. The M. jansenii genome has been used as a model for testing sequencing methods using a wide range of long read sequencing techniques. Here we report a chromosome level genome assembly, generated using a combination of Pacific Biosciences sequencing and Hi-C, comprising 14 pseudo-molecules, with an N50 of 58 Mb and a total genome assembly size of 758 Mb, of which 56% is repetitive. Completeness assessment revealed that the assembly covered 96.9% of the conserved single copy genes. Annotation predicted 31,591 protein coding genes and allowed the characterization of genes encoding biosynthesis of cyanogenic glycosides, fatty acid metabolism and anti-microbial proteins. Re-sequencing of seven other genotypes confirmed low diversity and low heterozygosity within this endangered species. Important morphological characteristics of this species such as small tree size and high kernel recovery suggest that M. jansenii is an important source of these commercial traits for breeding. As a member of a small group of families that are sister to the core eudicots, this high-quality genome also provides a key resource for evolutionary and comparative genomics studies.

