Mining of candidate genes involved in the biosynthesis of dextrorotatory borneol in Cinnamomum burmannii by transcriptomic analysis on three chemotypes

PeerJ ◽

10.7717/peerj.9311 ◽

2020 ◽

Vol 8 ◽

pp. e9311 ◽

Cited By ~ 1

Author(s):

Zerui Yang ◽

Wenli An ◽

Shanshan Liu ◽

Yuying Huang ◽

Chunzhu Xie ◽

...

Keyword(s):

Candidate Genes ◽

Biosynthetic Pathway ◽

De Novo ◽

Expression Patterns ◽

Biological Synthesis ◽

Terpenoid Biosynthesis ◽

Monoterpene Synthase ◽

Patent Remedies ◽

Long Time ◽

Additional Approach

Background Dextrorotatory borneol (D-borneol), a cyclic monoterpene, is widely used in traditional Chinese medicine as an efficient topical analgesic drug. Fresh leaves of Cinnamomum trees, e.g., C. burmannii and C. camphora, are the main sources from which D-borneol is extracted by steam distillation, yet with low yields. Insufficient supply of D-borneol has hampered its clinical use and the production of patent remedies for a long time. Biological synthesis of D-borneol offers an additional approach; however, the mechanisms of D-borneol biosynthesis remain mostly unresolved. Hence, it is important to elucidate the biosynthetic pathway of D-borneol. Results Comparative analysis of gene expression patterns in C. burmannii samples that differ in D-borneol production facilitates elucidation of the underlying biosynthetic pathway. Herein, we collected three chemotypes of C. burmannii that differ in D-borneol content. A total of 100,218 unigenes with an N50 of 1,128 bp were assembled de novo using Trinity from a total of 21.21 Gb clean bases. We used BLASTx analysis against several public databases to annotate 45,485 unigenes (45.38%) to at least one database, among which 82 unigenes were assigned to terpenoid biosynthesis pathways by KEGG annotation. In addition, we defined 8,860 unigenes as differentially expressed genes (DEGs), among which 13 DEGs were associated with terpenoid biosynthesis pathways. One 1-deoxy-D-xylulose-5-phosphate synthase (DXS) and two monoterpene synthases, designated as CbDXS9, CbTPS2 and CbTPS3, were up-regulated in the high-borneol group compared to the low-borneol and borneol-free groups, and might be vital to biosynthesis of D-borneol in C. burmannii. In addition, we identified one WRKY, two BHLH, one AP2/ERF and three MYB candidate genes, which exhibited the same expression patterns as CbTPS2 and CbTPS3, suggesting that these transcription factors might potentially regulate D-borneol biosynthesis. 
Finally, quantitative real-time PCR was conducted to detect the actual expression level of those candidate genes related to the D-borneol biosynthesis pathway, and the result showed that the expression patterns of the candidate genes related to D-borneol biosynthesis were basically consistent with those revealed by transcriptome analysis. Conclusions We used transcriptome sequencing to analyze three different chemotypes of C. burmannii, identifying three candidate structural genes (one DXS, two monoterpene synthases) and seven potential transcription factor candidates (one WRKY, two BHLH, one AP2/ERF and three MYB) involved in D-borneol biosynthesis. These results provide new insight into our understanding of the production and accumulation of D-borneol in C. burmannii.
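
The abstract above reports an N50 of 1,128 bp for the assembled unigenes. As a minimal sketch of what that statistic measures, and not the authors' pipeline, the following Python function computes N50 from a list of sequence lengths. The function name `n50` and the toy lengths are illustrative assumptions; the definition used is the common one, the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and accumulate them; the N50 is
    the length at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not data from the study).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembly FASTA file, and assemblers such as Trinity ship their own statistics scripts; this sketch only illustrates the definition.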


Full-Length Transcriptome Analysis Reveals Candidate Genes Involved in Terpenoid Biosynthesis in Artemisia argyi

Frontiers in Genetics ◽

10.3389/fgene.2021.659962 ◽

2021 ◽

Vol 12 ◽

Author(s):

Yupeng Cui ◽

Xinqiang Gao ◽

Jianshe Wang ◽

Zengzhen Shang ◽

Zhibin Zhang ◽

...

Keyword(s):

Candidate Genes ◽

Single Molecule ◽

Molecular Mechanisms ◽

De Novo ◽

Enrichment Analysis ◽

Full Length ◽

Functional Enrichment ◽

Terpenoid Biosynthesis ◽

Sequencing Data ◽

Artemisia Argyi

Artemisia argyi is an important medicinal plant widely utilized for moxibustion heat therapy in China. The terpenoid biosynthesis process in A. argyi is speculated to play a key role in conferring its medicinal value. However, the molecular mechanism underlying terpenoid biosynthesis remains unclear, in part because the reference genome of A. argyi is unavailable. Moreover, the full-length transcriptome of A. argyi has not yet been sequenced. Therefore, in this study, de novo transcriptome sequencing of A. argyi root, stem, and leaf tissues was performed to identify candidate genes related to terpenoid biosynthesis by combining the PacBio single-molecule real-time (SMRT) and Illumina NGS platforms. More than 55.4 Gb of sequencing data and 108,846 full-length non-chimeric reads were generated by the Illumina and PacBio platforms, respectively. Then, 53,043 consensus isoforms were clustered and used to represent 36,820 non-redundant transcripts, of which 34,839 (94.62%) were annotated in public databases. In the comparison sets of leaves vs roots, and leaves vs stems, 13,850 (7,566 up-regulated, 6,284 down-regulated) and 9,502 (5,284 up-regulated, 4,218 down-regulated) differentially expressed transcripts (DETs) were obtained, respectively. Specifically, the expression profile and KEGG functional enrichment analysis of these DETs indicated that they were significantly enriched in the biosynthesis of amino acids, carotenoids, diterpenoids and flavonoids, as well as the metabolism processes of glycine, serine and threonine. Moreover, multiple genes encoding significant enzymes or transcription factors related to diterpenoid biosynthesis were highly expressed in the A. argyi leaves. Additionally, several transcription factor families, such as RLK-Pelle_LRR-L-1 and RLK-Pelle_DLSV, were also identified. 
In conclusion, this study offers a valuable resource for transcriptome information, and provides a functional genomic foundation for further research on molecular mechanisms underlying the medicinal use of A. argyi leaves.


Transcriptome analysis reveals candidate genes involved in luciferin metabolism in Luciola aquatilis (Coleoptera: Lampyridae)

PeerJ ◽

10.7717/peerj.2534 ◽

2016 ◽

Vol 4 ◽

pp. e2534 ◽

Cited By ~ 10

Author(s):

Wanwipa Vongsangnak ◽

Pramote Chumnanpuen ◽

Ajaraporn Sriboonlert

Keyword(s):

Candidate Genes ◽

Transcriptome Analysis ◽

Biosynthetic Pathway ◽

De Novo ◽

Rt Pcr ◽

Reverse Transcription Pcr ◽

Key Enzyme ◽

Coleopteran Insect ◽

Living Organisms ◽

Pathway Databases

Bioluminescence, the process by which living organisms such as fireflies emit light, has been studied extensively for over half a century. This intriguing reaction, having its origins in nature where glowing insects can signal things such as attraction or defense, is now widely used in biotechnology with applications of bioluminescence and chemiluminescence. Luciferase, a key enzyme in this reaction, has been well characterized; however, the enzymes involved in the biosynthetic pathway of its substrate, luciferin, remain unresolved at present. To elucidate luciferin metabolism, we performed a de novo transcriptome analysis using larvae of the firefly species Luciola aquatilis. Here, a comparative analysis is performed with the model coleopteran insect Tribolium castaneum to elucidate the metabolic pathways in L. aquatilis. Based on a template luciferin biosynthetic pathway, combined with a range of protein and pathway databases, and various prediction tools for functional annotation, the candidate genes, enzymes, and biochemical reactions involved in luciferin metabolism are proposed for L. aquatilis. The candidate gene expression is validated in the adult L. aquatilis using reverse transcription PCR (RT-PCR). This study provides useful information on the bio-production of luciferin in the firefly and will benefit future applications of the valuable firefly bioluminescence system.


A machine learning approach to predicting autism risk genes: Validation of known genes and discovery of new candidates

10.1101/463547 ◽

2018 ◽

Cited By ~ 4

Author(s):

Ying Lin ◽

Anjali M. Rajadhyaksha ◽

James B. Potash ◽

Shizhong Han

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Human Brain ◽

Candidate Genes ◽

De Novo ◽

Expression Patterns ◽

Autism Spectrum ◽

Gene Expression Patterns ◽

Risk Genes ◽

Gene Level

Abstract Autism spectrum disorder (ASD) is a complex neurodevelopmental condition with a strong genetic basis. The role of de novo mutations in ASD has been well established, but the set of genes implicated to date is still far from complete. The current study employs a machine learning-based approach to predict ASD risk genes using features from spatiotemporal gene expression patterns in human brain, gene-level constraint metrics, and other gene variation features. The genes identified through our prediction model were enriched for independent sets of ASD risk genes, and tended to be differentially expressed in ASD brains, especially in the frontal and parietal cortex. The highest-ranked genes not only included those with strong prior evidence for involvement in ASD (for example, TCF20 and FBOX11), but also indicated potentially novel candidates, such as DOCK3, MYCBP2 and CAND1, which are all involved in neuronal development. Through extensive validations, we also showed that our method outperformed state-of-the-art scoring systems for ranking ASD candidate genes. Gene ontology enrichment analysis of our predicted risk genes revealed biological processes clearly relevant to ASD, including neuronal signaling, neurogenesis, and chromatin remodeling, but also highlighted other potential mechanisms that might underlie ASD, such as regulation of RNA alternative splicing and ubiquitination pathway related to protein degradation. Our study demonstrates that human brain spatiotemporal gene expression patterns and gene-level constraint metrics can help predict ASD risk genes. Our gene ranking system provides a useful resource for prioritizing ASD candidate genes.


Insights into the heat-responsive transcriptional network of tomato contrasting genotypes

Plant Genetic Resources ◽

10.1017/s1479262121000083 ◽

2021 ◽

Vol 19 (1) ◽

pp. 44-57

Author(s):

Sirine Werghi ◽

Charfeddine Gharsallah ◽

Nishi Kant Bhardwaj ◽

Hatem Fakhfakh ◽

Faten Gorsane

Keyword(s):

Heat Stress ◽

Heat Shock ◽

Candidate Genes ◽

Expression Patterns ◽

Response Strategies ◽

Transcriptional Reprogramming ◽

Genes Encoding ◽

Breeding Programmes ◽

Gene Transcripts ◽

Resource Data

Abstract During recent decades, global warming has intensified, altering crop growth, development and survival. To overcome changes in their environment, plants undergo transcriptional reprogramming to activate stress response strategies/pathways. To evaluate the genetic bases of the response to heat stress, Conserved DNA-derived Polymorphism (CDDP) markers were applied across the tomato genomes of eight cultivars. Despite scattered genotypes, cluster analysis allowed two neighbouring panels to be discriminated. Tomato CDDP-genotypic and visual phenotypic assortment permitted the selection of two contrasting heat-tolerant and heat-sensitive cultivars. Further analysis explored differential expression in transcript levels of genes encoding heat shock transcription factors (HSFs: HsfA1, HsfA2, HsfB1), members of the heat shock protein (HSP) family (HSP101, HSP17, HSP90) and ascorbate peroxidase (APX) enzymes (APX1, APX2). Based on discriminating CDDP markers, a protein functional network was built, allowing prediction of candidate genes and their regulating miRNAs. Expression pattern analysis revealed that miR156d and miR397 were heat-responsive, showing a typical inverse relation with the abundance of their target gene transcripts. Heat stress induces a set of candidate genes whose expression seems to be modulated through a complex regulatory network. Integrating genetic resource data is required for identifying valuable tomato genotypes that can be considered in marker-assisted breeding programmes to improve tomato heat tolerance.


The ‘Tommy Atkins’ mango genome reveals candidate genes for fruit quality

BMC Plant Biology ◽

10.1186/s12870-021-02858-1 ◽

2021 ◽

Vol 21 (1) ◽

Cited By ~ 1

Author(s):

Ian S. E. Bally ◽


Aureliano Bombarely ◽

Alan H. Chambers ◽

Yuval Cohen ◽

...

Keyword(s):

Candidate Genes ◽

Fruit Quality ◽

De Novo ◽

Mapping Population ◽

Mangifera Indica ◽

Consensus Sequence ◽

Fruit Size ◽

Hybrid Population ◽

High Molecular Weight Dna ◽

Tropical Fruit

Abstract Background Mango, Mangifera indica L., an important tropical fruit crop, is grown for its sweet and aromatic fruits. Past improvement of this species has predominantly relied on chance seedlings derived from over 1000 cultivars in the Indian sub-continent with a large variation for fruit size, yield, biotic and abiotic stress resistance, and fruit quality among other traits. Historically, mango has been an orphan crop with very limited molecular information. Only recently have molecular and genomics-based analyses enabled the creation of linkage maps, transcriptomes, and diversity analysis of large collections. Additionally, the combined analysis of genomic and phenotypic information is poised to improve mango breeding efficiency. Results This study sequenced, de novo assembled, analyzed, and annotated the genome of the monoembryonic mango cultivar ‘Tommy Atkins’. The draft genome sequence was generated using NRGene de-novo Magic on high molecular weight DNA of ‘Tommy Atkins’, supplemented by 10X Genomics long read sequencing to improve the initial assembly. A hybrid population between ‘Tommy Atkins’ x ‘Kensington Pride’ was used to generate phased haplotype chromosomes and a highly resolved phased SNP map. The final ‘Tommy Atkins’ genome assembly was a consensus sequence that included 20 pseudomolecules representing the 20 chromosomes of mango and included ~ 86% of the ~ 439 Mb haploid mango genome. Skim sequencing identified ~ 3.3 M SNPs using the ‘Tommy Atkins’ x ‘Kensington Pride’ mapping population. Repeat masking identified 26,616 genes with a median length of 3348 bp. A whole genome duplication analysis revealed an ancestral 65 MYA polyploidization event shared with Anacardium occidentale. Two regions, one on LG4 and one on LG7 containing 28 candidate genes, were associated with the commercially important fruit size characteristic in the mapping population. 
Conclusions The availability of the complete ‘Tommy Atkins’ mango genome will aid global initiatives to study mango genetics.


Combining Metabolic and Monoterpene Synthase Engineering for de Novo Production of Monoterpene Alcohols in Escherichia coli

ACS Synthetic Biology ◽

10.1021/acssynbio.1c00081 ◽

2021 ◽

Author(s):

Dengwei Lei ◽

Zetian Qiu ◽

Jihua Wu ◽

Bin Qiao ◽

Jianjun Qiao ◽

...

Keyword(s):

Escherichia Coli ◽

De Novo ◽

Monoterpene Synthase


Identification and Expression Analysis of the Genes Involved in the Raffinose Family Oligosaccharides Pathway of Phaseolus vulgaris and Glycine max

Plants ◽

10.3390/plants10071465 ◽

2021 ◽

Vol 10 (7) ◽

pp. 1465

Author(s):

Ramon de Koning ◽

Raphaël Kiekens ◽

Mary Esther Muyoka Toili ◽

Geert Angenon

Keyword(s):

Common Bean ◽

Seed Development ◽

Expression Analysis ◽

De Novo ◽

Expression Patterns ◽

Gene Families ◽

Rna Seq ◽

Raffinose Family Oligosaccharides ◽

Specific Expression ◽

Raffinose Synthase

Raffinose family oligosaccharides (RFO) play an important role in plants but are also considered to be antinutritional factors. A profound understanding of the galactinol and RFO biosynthetic gene families and the expression patterns of the individual genes is a prerequisite for the sustainable reduction of the RFO content in the seeds, without compromising normal plant development and functioning. In this paper, an overview of the annotation and genetic structure of all galactinol- and RFO biosynthesis genes is given for soybean and common bean. In common bean, three galactinol synthase genes, two raffinose synthase genes and one stachyose synthase gene were identified for the first time. To discover the expression patterns of these genes in different tissues, two expression atlases have been created through re-analysis of publicly available RNA-seq data. De novo expression analysis through an RNA-seq study during seed development of three varieties of common bean gave more insight into the expression patterns of these genes during the seed development. The results of the expression analysis suggest that different classes of galactinol- and RFO synthase genes have tissue-specific expression patterns in soybean and common bean. With the obtained knowledge, important galactinol- and RFO synthase genes that specifically play a key role in the accumulation of RFOs in the seeds are identified. These candidate genes may play a pivotal role in reducing the RFO content in the seeds of important legumes which could improve the nutritional quality of these beans and would solve the discomforts associated with their consumption.


Comprehensive analysis and identification of drought-responsive candidate NAC genes in three semi-arid tropics (SAT) legume crops

BMC Genomics ◽

10.1186/s12864-021-07602-5 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Sadhana Singh ◽

Himabindu Kudapa ◽

Vanika Garg ◽

Rajeev K. Varshney

Keyword(s):

Drought Stress ◽

Candidate Genes ◽

Developmental Stages ◽

Biological Activities ◽

Expression Patterns ◽

Genome Wide ◽

Legume Crops ◽

Root Tissues ◽

Semi Arid ◽

Comprehensive Study

Abstract Background Chickpea, pigeonpea, and groundnut are the primary legume crops of semi-arid tropics (SAT) and their global productivity is severely affected by drought stress. The plant-specific NAC (NAM - no apical meristem, ATAF - Arabidopsis transcription activation factor, and CUC - cup-shaped cotyledon) transcription factor family is known to be involved in the majority of abiotic stresses, especially in the drought stress tolerance mechanism. Despite the knowledge available regarding NAC function, not much information is available on NAC genes in SAT legume crops. Results In this study, 72, 96, and 166 NAC proteins were identified genome-wide in chickpea, pigeonpea, and groundnut, respectively, and were grouped into 10 clusters in chickpea and pigeonpea and 12 clusters in groundnut. Phylogeny with well-known stress-responsive NACs in Arabidopsis thaliana, Oryza sativa (rice), Medicago truncatula, and Glycine max (soybean) enabled prediction of putative stress-responsive NACs in chickpea (22), pigeonpea (31), and groundnut (33). Transcriptome data revealed putative stress-responsive NACs at various developmental stages that showed differential expression patterns in the different tissues studied. Quantitative real-time PCR (qRT-PCR) was performed to validate the expression patterns of selected stress-responsive Ca_NAC (Cicer arietinum - 14), Cc_NAC (Cajanus cajan - 15), and Ah_NAC (Arachis hypogaea - 14) genes using drought-stressed and well-watered root tissues from two contrasting drought-responsive genotypes of each of the three legumes. Based on expression analysis, Ca_06899, Ca_18090, Ca_22941, Ca_04337, Ca_04069, Ca_04233, Ca_12660, Ca_16379, Ca_16946, and Ca_21186; Cc_26125, Cc_43030, Cc_43785, Cc_43786, Cc_22429, and Cc_22430; Ah_ann1.G1V3KR.2, Ah_ann1.MI72XM.2, Ah_ann1.V0X4SV.1, Ah_ann1.FU1JML.2, and Ah_ann1.8AKD3R.1 were identified as potential drought stress-responsive candidate genes. 
Conclusion As NAC genes are known to play a role in several physiological and biological activities, a more comprehensive study on genome-wide identification and expression analyses of the NAC proteins has been carried out in chickpea, pigeonpea and groundnut. We have identified a total of 21 potential drought-responsive NAC genes in these legumes. These genes displayed correlation between gene expression, transcriptional regulation, and better tolerance against drought. The identified candidate genes, after validation, may serve as a useful resource for molecular breeding for drought tolerance in the SAT legume crops.


MED12-Related (Neuro)Developmental Disorders: A Question of Causality

Genes ◽

10.3390/genes12050663 ◽

2021 ◽

Vol 12 (5) ◽

pp. 663

Author(s):

Stijn van de Plassche ◽

Arjan PM de Brouwer

Keyword(s):

Developmental Disorders ◽

De Novo ◽

Expression Patterns ◽

Mediator Complex ◽

Gene Expression Patterns ◽

Facial Dysmorphism ◽

Regulation Of Transcription ◽

Feeding Difficulties ◽

Missense Variants ◽

Pathogenic Variants

MED12 is a member of the Mediator complex that is involved in the regulation of transcription. Missense variants in MED12 cause FG syndrome, Lujan-Fryns syndrome, and Ohdo syndrome, as well as non-syndromic intellectual disability (ID) in hemizygous males. Recently, female patients with de novo missense variants and de novo protein truncating variants in MED12 were described, resulting in a clinical spectrum centered around ID and Hardikar syndrome without ID. The missense variants are found throughout MED12, whether they are inherited in hemizygous males or de novo in females. They can result in syndromic or nonsyndromic ID. The de novo nonsense variants resulting in Hardikar syndrome that is characterized by facial clefting, pigmentary retinopathy, biliary anomalies, and intestinal malrotation, are found more N-terminally, whereas the more C-terminally positioned variants are de novo protein truncating variants that cause a severe, syndromic phenotype consisting of ID, facial dysmorphism, short stature, skeletal abnormalities, feeding difficulties, and variable other abnormalities. This broad range of distinct phenotypes calls for a method to distinguish between pathogenic and non-pathogenic variants in MED12. We propose an isogenic iNeuron model to establish the unique gene expression patterns that are associated with the specific MED12 variants. The discovery of these patterns would help in future diagnostics and determine the causality of the MED12 variants.


Rare variant analysis of 4241 pulmonary arterial hypertension cases from an international consortium implicates FBLN2, PDGFD, and rare de novo variants in PAH

Genome Medicine ◽

10.1186/s13073-021-00891-1 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Na Zhu ◽


Emilia M. Swietlik ◽

Carrie L. Welch ◽

Michael W. Pauciulo ◽

...

Keyword(s):

Pulmonary Arterial Hypertension ◽

Candidate Genes ◽

Rare Variant ◽

De Novo ◽

Risk Genes ◽

International Consortium ◽

Rare Variant Analysis ◽

New Genes ◽

Pulmonary Arterial ◽

Variant Analysis

Abstract Background Pulmonary arterial hypertension (PAH) is a lethal vasculopathy characterized by pathogenic remodeling of pulmonary arterioles leading to increased pulmonary pressures, right ventricular hypertrophy, and heart failure. PAH can be associated with other diseases (APAH: connective tissue diseases, congenital heart disease, and others) but often the etiology is idiopathic (IPAH). Mutations in bone morphogenetic protein receptor 2 (BMPR2) are the cause of most heritable cases but the vast majority of other cases are genetically undefined. Methods To identify new risk genes, we utilized an international consortium of 4241 PAH cases with exome or genome sequencing data from the National Biological Sample and Data Repository for PAH, Columbia University Irving Medical Center, and the UK NIHR BioResource – Rare Diseases Study. The strength of this combined cohort is a doubling of the number of IPAH cases compared to either national cohort alone. We identified protein-coding variants and performed rare variant association analyses in unrelated participants of European ancestry, including 1647 IPAH cases and 18,819 controls. We also analyzed de novo variants in 124 pediatric trios enriched for IPAH and APAH-CHD. Results Seven genes with rare deleterious variants were associated with IPAH with false discovery rate smaller than 0.1: three known genes (BMPR2, GDF2, and TBX4), two recently identified candidate genes (SOX17, KDR), and two new candidate genes (fibulin 2, FBLN2; platelet-derived growth factor D, PDGFD). The new genes were identified based solely on rare deleterious missense variants, a variant type that could not be adequately assessed in either cohort alone. The candidate genes exhibit expression patterns in lung and heart similar to that of known PAH risk genes, and most variants occur in conserved protein domains. 
For pediatric PAH, predicted deleterious de novo variants exhibited a significant burden compared to the background mutation rate (2.45×, p = 2.5e−5). At least eight novel pediatric candidate genes carrying de novo variants have plausible roles in lung/heart development. Conclusions Rare variant analysis of a large international consortium identified two new candidate genes—FBLN2 and PDGFD. The new genes have known functions in vasculogenesis and remodeling. Trio analysis predicted that ~ 15% of pediatric IPAH may be explained by de novo variants.
