Transcriptome Sequencing of Different Avocado Ecotypes: de novo Transcriptome Assembly, Annotation, Identification and Validation of EST-SSR Markers

Forests ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 411 ◽  
Author(s):  
Yu Ge ◽  
Lin Tan ◽  
Bin Wu ◽  
Tao Wang ◽  
Teng Zhang ◽  
...  

Avocado (Persea americana Mill.) is an important tropical and subtropical woody oil crop with high economic and nutritional value. Despite the importance of this species, genomic information is currently unavailable for avocado and closely related congeners. In this study, we generated more than 216 million clean reads from different avocado ecotypes using Illumina HiSeq high-throughput sequencing technology. The high-quality reads were assembled into 154,310 unigenes with an average length of 922 bp. A total of 55,558 simple sequence repeat (SSR) loci detected among the 43,270 SSR-containing unigene sequences were used to develop 74,580 expressed sequence tag (EST)-SSR markers. From these markers, a subset of 100 EST-SSR markers was randomly chosen to identify polymorphic EST-SSR markers in 28 avocado accessions. Sixteen EST-SSR markers with moderate to high polymorphism levels were detected, with polymorphism information contents ranging from 0.33 to 0.84 and averaging 0.63. These 16 polymorphic EST-SSRs could clearly and effectively distinguish the 28 avocado accessions. In summary, our study is the first to present transcriptome data for different avocado ecotypes and the first comprehensive study on the development and analysis of a set of EST-SSR markers in avocado. The application of next-generation sequencing techniques for SSR development is a potentially powerful tool for genetic studies.
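The polymorphism information content (PIC) values quoted above summarize how informative each marker is. The abstract does not say which formula was used; the sketch below assumes the commonly used Botstein et al. (1980) form, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i at a locus. The function name `pic` and the example frequencies are illustrative, not taken from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content of one locus.

    Assumes the Botstein et al. (1980) formula; `freqs` holds the
    allele frequencies of the locus and must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two random alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Illustrative frequencies only: two equally common alleles.
print(pic([0.5, 0.5]))  # 0.375
```

Under this formula a locus with a single fixed allele has PIC = 0, and the value rises as more alleles occur at similar frequencies.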

2018 ◽  
Vol 54 (No. 1) ◽  
pp. 17-25 ◽  
Author(s):  
D.-D. Vu ◽  
T.T.-X. Bui ◽  
T.H.-N. Nguyen ◽  
S.N.M. Shah ◽  
N.-H. Vu ◽  
...  

A total of 20 074 230 sequencing reads were generated by Illumina HiSeq™ 2500 from three different Toxicodendron vernicifluum tissue samples. In total, 48 693 unigenes with an average length of 703.34 bp were obtained by de novo assembly. From unigenes with lengths exceeding 1 kb, 3392 potential EST-SSRs (expressed sequence tag-simple sequence repeats) were identified as candidate molecular markers. A total of 80 pairs of PCR primers were randomly selected to validate the assembly quality and develop EST-SSR markers from genomic DNA. Of these primer pairs, 14 successfully amplified DNA fragments and detected significant amounts of polymorphism within the lacquer tree population in Langao, Shaanxi province, China. Genetic diversity in the natural lacquer tree population was high (number of alleles per locus (A) = 2.93, polymorphic information content (PIC) = 0.53, observed heterozygosity (Ho) = 0.62 and expected heterozygosity (He) = 0.85). Four loci deviated significantly from Hardy-Weinberg equilibrium. These results suggested high homozygosity and a heterozygosity deficiency in the population (inbreeding coefficient (Fis) = 0.27). These polymorphic EST-SSR markers will provide a basis for further studies of genetic structure and breeding in T. vernicifluum.
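The reported inbreeding coefficient can be related to the two heterozygosity values with the standard definition Fis = 1 - Ho/He. The following is a minimal sketch under that assumption; the abstract does not state how Fis was computed (it may, for example, be averaged per locus), and the function name is hypothetical.

```python
def inbreeding_coefficient(ho, he):
    """Fis = 1 - Ho/He, where Ho and He are the observed and
    expected heterozygosity. Positive values indicate a
    heterozygote deficit relative to Hardy-Weinberg expectations."""
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he


# Values quoted in the abstract: Ho = 0.62, He = 0.85.
print(round(inbreeding_coefficient(0.62, 0.85), 2))  # 0.27
```

With the abstract's own Ho and He, this definition reproduces the quoted Fis of 0.27.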


Author(s):  
Boyun Yang ◽  
Huolin Luo ◽  
Yuan Tao ◽  
Wenjing Yu ◽  
Liping Luo

Cymbidium kanran is an important commercially grown member of the Chinese orchid family. However, little information regarding the molecular biology of this species is available. In this study, the C. kanran root, shoot, stem, leaf, and flower transcriptomes were sequenced with the Illumina HiSeq 4000 system, which resulted in 8.9 Gb of clean reads that were assembled into 74,620 unigenes, with an average length and N50 of 983 bp and 1,640 bp, respectively. The screening of seven databases (NR, NT, GO, KOG, KEGG, Swiss-Prot, and InterPro) for similar sequences resulted in the functional annotation of 49,813 unigenes. Additionally, 173 MADS-box genes, which help to control major aspects of plant development, were identified and their codon usage bias was analyzed. Only 26 genes had a low ENC (less than or equal to 35), suggesting the codon usage bias was weak. Base mutations were the major determinants of codon usage, although natural selection pressure also influenced codon usage bias. Moreover, 22 optimal codons were identified based on ΔRSCU, and 20 codons ended with A/U. The results of this study provide the foundation for the molecular breeding of new varieties
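For readers unfamiliar with the N50 statistic quoted alongside the average unigene length: N50 is the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of that definition follows; the function name and the toy lengths are illustrative and not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length at which the running total first reaches half of the
    total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly: 32 bases in total, half of them in the two 8-base sequences.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # 8
```

Because long sequences dominate the running total, N50 is typically larger than the mean length, as in the figures reported above (983 bp average, 1,640 bp N50).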


2020 ◽  
Author(s):  
Duy Dinh Vu ◽  
Syed Noor Muhammad Shah ◽  
Mai Phuong Pham ◽  
Van Thang Bui ◽  
Minh Tam Nguyen ◽  
...  

Abstract Background: Understanding the genetic diversity of threatened species that occur in forest remnants is necessary to establish efficient strategies for species conservation, restoration and management. Panax vietnamensis Ha et Grushv. is a medicinally important, endemic and endangered species of Vietnam. However, the genetic diversity and population structure of the species are unknown owing to a lack of efficient molecular markers. Results: In this study, we employed Illumina HiSeq™ 4000 sequencing to analyze the transcriptomes of P. vietnamensis (roots, leaves and stems). A total of 23,741,783 raw reads were obtained and assembled, from which 89,271 unigenes with an average length of 598.3191 nt were generated. During functional annotation, 31,686 unigenes were annotated in Gene Ontology categories, Kyoto Encyclopedia of Genes and Genomes pathways, the Swiss-Prot database, and the Nucleotide Collection (NR/NT) database. In addition, 11,343 expressed sequence tag-simple sequence repeats (EST-SSRs) were detected. From 7,774 primer pairs, 101 were selected for polymorphism validation, of which 20 primer pairs successfully amplified DNA fragments, and significant polymorphism was observed within populations. Nine polymorphic microsatellite loci were used to analyze the genetic diversity and structure of the natural populations. The results revealed high levels of genetic diversity in the populations; the average observed and expected heterozygosities were Ho = 0.422 and He = 0.479. Bottleneck analysis using the TPM and SMM models (p < 0.01) showed that the targeted populations were significantly heterozygote deficient. This suggests signs of a bottleneck in all populations. Genetic differentiation among populations was moderate (FST = 0.133), indicating limited gene flow (Nm = 1.63). Analysis of molecular variance (AMOVA) showed 63.17% of variation within individuals and 12.45% among populations. These results showed a moderate genetic structure of P. vietnamensis. STRUCTURE analysis and the unweighted pair-group method with arithmetic means (UPGMA) tree also revealed strong genetic structure and two genetic clusters related to geographical distances. Conclusion: Our study will assist conservators in future conservation management, breeding, production and habitat restoration of the species.
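The gene-flow estimate quoted with FST can be reproduced with Wright's island-model approximation, Nm = (1 - FST) / (4 FST). The abstract does not state which estimator was used, so the sketch below is an assumption that happens to agree with the reported figures; the function name is hypothetical.

```python
def gene_flow(fst):
    """Effective number of migrants per generation, Nm, from FST,
    using Wright's island-model approximation
    Nm = (1 - FST) / (4 * FST)."""
    if not 0 < fst < 1:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


# FST quoted in the abstract.
print(round(gene_flow(0.133), 2))  # 1.63
```

Under this approximation, larger FST values translate into smaller Nm, which is why moderate differentiation is read as limited gene flow.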



Agronomy ◽  
2019 ◽  
Vol 9 (9) ◽  
pp. 512 ◽  
Author(s):  
Ge ◽  
Zang ◽  
Tan ◽  
Wang ◽  
Liu ◽  
...  

Avocado (Persea americana Mill.) is an important fruit crop commercially grown in tropical and subtropical regions. Despite the importance of avocado, there is relatively little available genomic information regarding this fruit species. In this study, we functionally annotated the full-length avocado transcriptome sequence based on single-molecule real-time sequencing technology, and predicted the coding sequences (CDSs), transcription factors (TFs), and long non-coding RNA (lncRNA) sequences. Moreover, 76,777 simple sequence repeat (SSR) loci detected among the 42,096 SSR-containing transcript sequences were used to develop 149,733 expressed sequence tag (EST)-SSR markers. A subset of 100 EST-SSR markers was randomly chosen for an analysis that detected 15 polymorphic EST-SSR markers, with an average polymorphism information content of 0.45. These 15 markers were able to clearly and effectively characterize 46 avocado accessions based on geographical origin. In summary, our study is the first to generate a full-length transcriptome sequence and develop and analyze a set of EST-SSR markers in avocado. The application of third-generation sequencing techniques for developing SSR markers is a potentially powerful tool for genetic studies.


2021 ◽  
Vol 12 ◽  
Author(s):  
Gabriela Torres-Silva ◽  
Ludmila Nayara Freitas Correia ◽  
Diego Silva Batista ◽  
Andréa Dias Koehler ◽  
Sheila Vitória Resende ◽  
...  

Melocactus glaucescens is an endangered cactus highly valued for its ornamental properties. In vitro shoot production of this species provides a sustainable alternative to overharvesting from the wild; however, its propagation could be improved if the genetic regulation underlying its developmental processes were known. The present study generated de novo transcriptome data, describing in vitro shoot organogenesis induction in M. glaucescens. Total RNA was extracted from explants before (control) and after shoot organogenesis induction (treated). A total of 14,478 unigenes (average length, 520 bases) were obtained using Illumina HiSeq 3000 (Illumina Inc., San Diego, CA, USA) sequencing and transcriptome assembly. Filtering for differential expression yielded 2,058 unigenes. Pairwise comparison of treated vs. control genes revealed that 1,241 (60.3%) unigenes exhibited no significant change, 226 (11%) were downregulated, and 591 (28.7%) were upregulated. Based on database analysis, more transcription factor families and unigenes appeared to be upregulated in the treated samples than in controls. Expression of WOUND INDUCED DEDIFFERENTIATION 1 (WIND1) and CALMODULIN (CaM) genes, both of which were upregulated in treated samples, was further validated by real-time quantitative PCR (RT-qPCR). Differences in gene expression patterns between control and treated samples indicate substantial changes in the primary and secondary metabolism of M. glaucescens after the induction of shoot organogenesis. These results help to clarify the molecular genetics and functional genomic aspects underlying propagation in the Cactaceae family.


2020 ◽  
Vol 69 (1) ◽  
pp. 116-122
Author(s):  
Tsam Ju ◽  
Perla Farhat ◽  
Wenjing Tao ◽  
Jibin Miao ◽  
Jialiang Li ◽  
...  

Abstract Juniperus squamata, an endemic conifer of Asia, is an ecologically and economically important shrub. Yet little is known about its genetic diversity and population structure owing to a lack of highly polymorphic molecular markers. In this study, expressed sequence tag microsatellite (EST-SSR) markers were developed for Juniperus squamata. Illumina HiSeq data were used to reconstruct the transcriptome of this species by de novo assembly. Based on this transcriptome, 18 SSR markers were designed and successfully amplified. One locus was eliminated because null alleles were detected, and the remaining 17 loci were polymorphic, generating five to 14 alleles per locus in J. squamata. Marker cross-amplification tests were successful in two closely related species of J. squamata. These markers will serve as a basis for further studies to assess the genetic diversity and population structure of J. squamata. They could also be useful in promoting sustainable forest management strategies for this species in the face of global climate change.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Qiuyu Sun ◽  
Jie Liu ◽  
Keyu Zhang ◽  
Chong Huang ◽  
Leifu Li ◽  
...  

Abstract Southern corn rust is a destructive maize disease caused by Puccinia polysora Underw. that can lead to severe yield losses. However, genomic information and microsatellite markers are currently unavailable for this pathogen. In this study, we generated a total of 27,295,216 high-quality cDNA sequence reads using Illumina sequencing technology. These reads were assembled into 17,496 unigenes with an average length of 1,015 bp. The functional annotation indicated that 8,113 (46.37%), 1,933 (11.04%) and 5,516 (31.52%) unigenes showed significant similarity to known proteins in the NCBI Nr, Nt and Swiss-Prot databases, respectively. In addition, 2,921 (16.70%) unigenes were assigned to KEGG database categories; 4,218 (24.11%), to KOG database categories; and 6,603 (37.74%), to GO database categories. Furthermore, we identified 8,798 potential SSRs among 6,653 unigenes. A total of 9 polymorphic SSR markers were developed to evaluate the genetic diversity and population structure of 96 isolates collected from Guangdong Province in China. Clonal reproduction of P. polysora in Guangdong was dominant. The YJ (Yangjiang) population had the highest genotypic diversity and the greatest number of multilocus genotypes, followed by the HY (Heyuan), HZ (Huizhou) and XY (Xinyi) populations. These results provide valuable information for the molecular genetic analysis of P. polysora and related species.


2019 ◽  
Author(s):  
Duy Dinh Vu ◽  
Syed Noor Muhammad Shah ◽  
Mai Phuong Pham ◽  
Van Thang Bui ◽  
Minh Tam Nguyen ◽  
...  

Abstract Background: Understanding the genetic diversity of threatened species that occur in forest remnants is necessary to establish efficient strategies for species conservation, restoration and management. Panax vietnamensis Ha et Grushv. is a medicinally important, endemic and endangered species of Vietnam. However, the genetic diversity and population structure of the species are unknown owing to a lack of efficient molecular markers. Results: In this study, we employed Illumina HiSeq™ 4000 sequencing to analyze the transcriptomes of P. vietnamensis (roots, leaves and stems). A total of 23,741,783 raw reads were obtained and assembled, from which 89,271 unigenes with an average length of 598.3191 nt were generated. During functional annotation, 31,686 unigenes were annotated in Gene Ontology categories, Kyoto Encyclopedia of Genes and Genomes pathways, the Swiss-Prot database, and the Nucleotide Collection (NR/NT) database. In addition, 11,343 expressed sequence tag-simple sequence repeats (EST-SSRs) were detected. From 7,774 primer pairs, 101 were selected for polymorphism validation, of which 20 primer pairs successfully amplified DNA fragments, and significant polymorphism was observed within populations. Nine polymorphic microsatellite loci were used to analyze the genetic diversity and structure of the natural populations. The results revealed high levels of genetic diversity in the populations; the average observed and expected heterozygosities were Ho = 0.422 and He = 0.479. Bottleneck analysis using the TPM and SMM models (p < 0.01) showed that the targeted populations were significantly heterozygote deficient. This suggests signs of a bottleneck in all populations. Genetic differentiation among populations was moderate (FST = 0.133), indicating limited gene flow (Nm = 1.63). Analysis of molecular variance (AMOVA) showed 63.17% of variation within individuals and 12.45% among populations. These results showed a moderate genetic structure of P. vietnamensis. STRUCTURE analysis and the unweighted pair-group method with arithmetic means (UPGMA) tree also revealed strong genetic structure and two genetic clusters related to geographical distances. Conclusion: Our study will assist conservators in future conservation management, breeding, production and habitat restoration of the species. Keywords: Conservation; EST-SSRs; Transcriptome; Panax vietnamensis; Population genetics


2018 ◽  
Vol 16 (6) ◽  
pp. 564-567
Author(s):  
Ning Zhao ◽  
Yanyan Yan ◽  
Weitang Liu ◽  
Jinxin Wang

Abstract Shortawn foxtail (Alopecurus aequalis Sobol.) is an invasive and highly troublesome weed species originating from North America that has become widespread across China. Since its proliferation seriously threatens crop production worldwide, understanding its genetic diversity is critical for developing a forecasting system for integrated pest management plans. To accelerate the application of molecular markers in A. aequalis, this study aimed to develop a set of expressed sequence tag-simple sequence repeat (EST-SSR) markers using previous high-throughput sequencing data. A total of 1411 SSR loci were identified from 95,479 unigenes. Tri-nucleotide repeat motifs were the most abundant type with a frequency of 66.27%, followed by di- (24.95%) and tetra-nucleotide (8.78%) motifs. Among the loci, 584 primer pairs were successfully designed for marker development. Subsequently, a subset of 36 primer pairs was randomly selected and synthesized, of which 12 (33.33%) pairs successfully revealed abundant allelic polymorphism. Additionally, to investigate their utility, the genotypes of 160 individuals from 20 natural populations representing diverse wild genotypes of A. aequalis were analysed using these 12 polymorphic markers. The novel SSR markers developed here are reliable and useful for genetic analysis of this invasive plant and will greatly enrich its genetic resources.
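The di-, tri- and tetra-nucleotide motif classes counted above are usually found by scanning each unigene for perfect tandem repeats. The abstract does not name the tool or thresholds used, so the following is only a sketch of the general idea using regular expressions; the minimum repeat counts in `MIN_REPEATS` and the function names are hypothetical choices for illustration.

```python
import re

# Hypothetical minimum repeat counts per motif length (not from the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit
    (so 'ATAT' is rejected as a disguised 'AT' repeat, 'AA' as 'A')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect
    di-, tri- and tetra-nucleotide repeats in a DNA sequence."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Toy sequence: an (AT)7 repeat and a (CAG)5 repeat with short flanks.
example = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "TT"
print(find_ssrs(example))  # [(3, 'AT', 7), (20, 'CAG', 5)]
```

Tallying the motif lengths of the returned hits across all unigenes would give class frequencies of the kind reported in the abstract; imperfect and compound repeats are deliberately ignored in this sketch.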

