Stage specific classification of DEGs via statistical profiling and network analysis reveals potential biomarker associated with various stages of TB

Mapping Intimacies ◽

10.1101/414110 ◽

2018 ◽

Author(s):

Romana Ishrat

Keyword(s):

Gene Expression ◽

Network Analysis ◽

Regulatory Networks ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Body Part ◽

Specific Pattern ◽

Host Responses ◽

Data Set ◽

Potential Biomarker

Abstract Background Tuberculosis (TB) is a deadly transmissible disease that can infect almost any body part of the host but mostly infects the lungs. It is one of the top 10 causes of death worldwide. In 2016, 87% of new TB cases occurred in the 30 high TB burden countries. Seven countries (India, Indonesia, China, Philippines, Pakistan, Nigeria, and South Africa) accounted for 64% of the new TB cases. Early detection of TB is important to stop infection and progression of the disease. In our study, we used microarray data sets to compare gene expression profiles obtained from blood samples of healthy controls, patients with latent infection, and patients with active TB, and performed network-based analysis of differentially expressed genes (DEGs) to identify potential biomarkers. Objectives We aim to observe the transition of genes from the normal condition to the different stages of TB, and to identify and annotate the genes, pathways, and processes that play a key role in the progression of TB disease during its cyclic interventions in the human body. Results We identified 319 genes that are differentially expressed across the stages of TB (normal to LTTB, normal to active TB, and LTTB to active TB) and allocated them to pathways from multiple databases, each comprising a curated class of associated genes. The importance of each pathway was then evaluated according to the number of DEGs present in it; these genes show the broad spectrum of processes that take part in every state. In addition, we studied the regulatory networks of these classified genes. Network analysis considers the interactions between genes (specific to TB) or proteins and provides new facts about TB disease, which in turn can be used to identify potential biomarkers. We identified a total of 29 biomarkers from the various comparison groups of TB stages: 14 genes are overexpressed as host responses against the pathogen, whereas 15 genes are downregulated, which suggests that these genes have allowed the host defense process to cease and given the pathogen time to progress. Conclusions This study revealed that gene expression profiles can be used to identify and classify genes in a stage-specific pattern among normal, LTTB, and active TB. Network modules associated with the various stages of TB were elucidated, which in turn provided a basis for identifying potential pathways and key regulatory genes that may be involved in the progression of TB disease.
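
The pathway-ranking step described in the Results (scoring each pathway by how many DEGs it contains) can be illustrated with a minimal Python sketch. The function name, pathway names, and gene symbols below are hypothetical placeholders, not taken from the study, and the tie-breaking rule is an assumption.

```python
def rank_pathways_by_deg_count(pathways, degs):
    """Return (pathway, n_degs) pairs sorted by descending DEG count."""
    deg_set = set(degs)
    # Count how many members of each pathway are in the DEG list.
    counts = {name: len(deg_set & set(genes)) for name, genes in pathways.items()}
    # Ties are broken alphabetically (an assumption) so the order is reproducible.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy input: placeholder pathway names and gene symbols.
toy_pathways = {
    "pathway_A": ["G1", "G2", "G3"],
    "pathway_B": ["G3", "G4"],
    "pathway_C": ["G5"],
}
toy_degs = ["G1", "G3", "G4", "G6"]

ranking = rank_pathways_by_deg_count(toy_pathways, toy_degs)
for name, n_degs in ranking:
    print(name, n_degs)
```

A raw count favors large pathways; an enrichment test that corrects for pathway size would be a natural alternative, but the abstract describes only the count.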

Inference of gene regulatory networks and compound mode of action from time course gene expression profiles

Bioinformatics ◽

10.1093/bioinformatics/btl003 ◽

2006 ◽

Vol 22 (7) ◽

pp. 815-822 ◽

Cited By ~ 254

Author(s):

M. Bansal ◽

G. D. Gatta ◽

D. di Bernardo

Keyword(s):

Gene Expression ◽

Gene Regulatory Networks ◽

Mode Of Action ◽

Regulatory Networks ◽

Time Course ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Gene Regulatory

Information Technology of Gene Expression Profiles Processing for Purpose of Gene Regulatory Networks Reconstruction

2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP) ◽

10.1109/dsmp.2018.8478452 ◽

2018 ◽

Cited By ~ 6

Author(s):

S. Babichev ◽

V. Lytvynenko ◽

J. Skvor ◽

M. Korobchynskyi ◽

M. Voronenko

Keyword(s):

Gene Expression ◽

Information Technology ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Gene Regulatory

Inference of Gene Regulatory Networks Using Time Sliding Comparison and Transcriptional Lagging Time from Time Series Gene Expression Profiles

2007 IEEE 7th International Symposium on BioInformatics and BioEngineering ◽

10.1109/bibe.2007.4375684 ◽

2007 ◽

Author(s):

Sheehyun Kim ◽

Dongsup Kim

Keyword(s):

Gene Expression ◽

Time Series ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Time Series Gene Expression ◽

Gene Regulatory

Comparative transcriptomics identifies differences in the regulation of the floral transition between Arabidopsis and Brassica rapa cultivars

10.1101/2020.08.26.266494 ◽

2020 ◽

Author(s):

Alexander Calderwood ◽

Jo Hepworth ◽

Shannon Woodhouse ◽

Lorelei Bilham ◽

D. Marc Jones ◽

...

Keyword(s):

Gene Expression ◽

Brassica Rapa ◽

Regulatory Networks ◽

Expression Profiles ◽

Expression Patterns ◽

Gene Expression Profiles ◽

Detailed Comparison ◽

Developmental Time ◽

Floral Transition ◽

Gene Regulatory

Abstract The timing of the floral transition affects reproduction and yield; however, its regulation in crops remains poorly understood. Here, we use RNA-Seq to determine and compare gene expression dynamics through the floral transition in the model species Arabidopsis thaliana and the closely related crop Brassica rapa. A direct comparison of gene expression over time between species shows little similarity, which could lead to the inference that different gene regulatory networks are at play. However, these differences can be largely resolved by synchronisation, through curve registration, of gene expression profiles. We find that different registration functions are required for different genes, indicating that there is no common ‘developmental time’ to which Arabidopsis and B. rapa can be mapped through gene expression. Instead, the expression patterns of different genes progress at different rates. We find that co-regulated genes show similar changes in synchronisation between species, suggesting that similar gene regulatory sub-network structures may be active with different wiring between them. A detailed comparison of the regulation of the floral transition between Arabidopsis and B. rapa, and between two B. rapa accessions reveals different modes of regulation of the key floral integrator SOC1, and that the floral transition in the B. rapa accessions is triggered by different pathways, even when grown under the same environmental conditions. Our study adds to the mechanistic understanding of the regulatory network of flowering time in rapid cycling B. rapa under long days and highlights the importance of registration methods for the comparison of developmental gene expression data.
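
Curve registration of one gene's expression profile can be sketched as a search for a time map that aligns a query series to a reference series. The abstract does not say which family of registration functions was used; the linear map t -> stretch * t + shift, the grid search, and all names and toy values below are assumptions made for illustration.

```python
def interpolate(times, values, t):
    """Linearly interpolate (times, values) at t; return None outside the sampled range."""
    if t < times[0] or t > times[-1]:
        return None
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            frac = (t - t0) / (t1 - t0)
            return values[i] + frac * (values[i + 1] - values[i])
    return values[-1]


def register(ref_t, ref_y, qry_t, qry_y, shifts, stretches, min_overlap=3):
    """Grid-search the map t -> stretch * t + shift that minimises the mean squared error."""
    best = None
    for stretch in stretches:
        for shift in shifts:
            errors = []
            for t, y in zip(qry_t, qry_y):
                ref_val = interpolate(ref_t, ref_y, stretch * t + shift)
                if ref_val is not None:  # score only points that overlap the reference
                    errors.append((ref_val - y) ** 2)
            if len(errors) < min_overlap:
                continue
            score = sum(errors) / len(errors)
            if best is None or score < best[0]:
                best = (score, stretch, shift)
    return best


# Hypothetical toy profiles: the query follows the reference on a compressed, offset clock.
ref_t = list(range(11))
ref_y = [t ** 2 for t in ref_t]
qry_t = [0, 1, 2, 3, 4]
qry_y = [(2 * t + 1) ** 2 for t in qry_t]

score, stretch, shift = register(ref_t, ref_y, qry_t, qry_y,
                                 shifts=[0, 1, 2], stretches=[1, 1.5, 2])
print(stretch, shift, score)
```

Fitting the map separately for each gene is what would expose the effect described in the abstract: genes that need different stretch and shift values cannot share a single developmental clock.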

Convergent evolution of venom gland transcriptomes across Metazoa

10.1101/2021.07.04.451048 ◽

2021 ◽

Author(s):

Giulia Zancolli ◽

Maarten Reijnders ◽

Robert Waterhouse ◽

Marc Robinson-Rechavi

Keyword(s):

Gene Expression ◽

Regulatory Networks ◽

Molecular Mechanisms ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Venom Gland ◽

Bioactive Molecules ◽

Specific Gene ◽

Animal Kingdom ◽

Venom Glands

Animals have repeatedly evolved specialized organs and anatomical structures to produce and deliver a cocktail of potent bioactive molecules to subdue prey or predators: venom. This makes it one of the most widespread convergent functions in the animal kingdom. Whether animals have adopted the same genetic toolkit to evolve venom systems is a fascinating question that still eludes us. Here, we performed the first comparative analysis of venom gland transcriptomes from 20 venomous species spanning the main Metazoan lineages, to test whether different animals have independently adopted similar molecular mechanisms to perform the same function. We found a strong convergence in gene expression profiles, with venom glands being more similar to each other than to any other tissue from the same species, and their differences closely mirroring the species phylogeny. Although venom glands secrete some of the fastest evolving molecules (toxins), their gene expression does not evolve faster than evolutionarily older tissues. We found 15 venom gland specific gene modules enriched in endoplasmic reticulum stress and unfolded protein response pathways, indicating that animals have independently adopted stress response mechanisms to cope with mass production of toxins. This, in turn, activates regulatory networks for epithelial development, cell turnover and maintenance which seem composed of both convergent and lineage-specific factors, possibly reflecting the different developmental origins of venom glands. This study represents the first step towards an understanding of the molecular mechanisms underlying the repeated evolution of one of the most successful adaptive traits in the animal kingdom.

Improved Feature Selection by Incorporating Gene Similarity into the LASSO

International Journal of Knowledge Discovery in Bioinformatics ◽

10.4018/jkdb.2012010101 ◽

2012 ◽

Vol 3 (1) ◽

pp. 1-22 ◽

Cited By ~ 1

Author(s):

Christopher E. Gillies ◽

Xiaoli Gao ◽

Nilesh V. Patel ◽

Mohammad-Reza Siadat ◽

George D. Wilson

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Personalized Medicine ◽

Objective Function ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Genetic Profile ◽

Data Set ◽

Coordinate Descent Algorithm ◽

Gene Similarity

Personalized medicine is customizing treatments to a patient’s genetic profile and has the potential to revolutionize medical practice. An important process used in personalized medicine is gene expression profiling. Analyzing gene expression profiles is difficult, because there are usually few patients and thousands of genes, leading to the curse of dimensionality. To combat this problem, researchers suggest using prior knowledge to enhance feature selection for supervised learning algorithms. The authors propose an enhancement to the LASSO, a shrinkage and selection technique that induces parameter sparsity by penalizing a model’s objective function. Their enhancement gives preference to the selection of genes that are involved in similar biological processes. The authors’ modified LASSO selects similar genes by penalizing interaction terms between genes. They devise a coordinate descent algorithm to minimize the corresponding objective function. To evaluate their method, the authors created simulation data where they compared their model to the standard LASSO model and an interaction LASSO model. The authors’ model outperformed both the standard and interaction LASSO models in terms of detecting important genes and gene interactions for a reasonable number of training samples. They also demonstrated the performance of their method on a real gene expression data set from lung cancer cell lines.
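
For orientation, coordinate descent for the standard LASSO (the baseline the authors compare against, not their modified penalty) can be sketched as follows. The objective assumed here is (1/(2n))·||y − Xβ||² + λ·||β||₁; the function names and toy data are hypothetical, and the gene-similarity and interaction terms of the authors' method are deliberately omitted because the abstract does not define them.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; values inside the band become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Standard LASSO fit: update one coefficient at a time with the others held fixed."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta


# Hypothetical toy data: feature 0 carries signal, feature 1 carries almost none.
X = [[1, 0], [0, 1], [1, 0], [0, 1]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_coordinate_descent(X, y, lam=0.1)
print(beta)
```

The soft-threshold step is what produces sparsity: the weak feature's coefficient is set exactly to zero rather than merely shrunk.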

PCA-based unsupervised feature extraction for gene expression analysis of COVID-19 patients

Scientific Reports ◽

10.1038/s41598-021-95698-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Kota Fujisawa ◽

Mamoru Shimo ◽

Y.-H. Taguchi ◽

Shinya Ikematsu ◽

Ryota Miyata

Keyword(s):

Gene Expression ◽

Feature Extraction ◽

Target Genes ◽

Gene Selection ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Principal Component ◽

Data Set ◽

Immune Related Genes ◽

Unsupervised Feature Extraction

Abstract Coronavirus disease 2019 (COVID-19) is raging worldwide. This potentially fatal infectious disease is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, the complete mechanism of COVID-19 is not well understood. Therefore, we analyzed gene expression profiles of COVID-19 patients to identify disease-related genes through an innovative machine learning method that enables a data-driven strategy for gene selection from a data set with a small number of samples and many candidates. Principal-component-analysis-based unsupervised feature extraction (PCAUFE) was applied to the RNA expression profiles of 16 COVID-19 patients and 18 healthy control subjects. The results identified 123 genes as critical for COVID-19 progression from 60,683 candidate probes, including immune-related genes. The 123 genes were enriched in binding sites for transcription factors NFKB1 and RELA, which are involved in various biological phenomena such as immune response and cell survival: the primary mediator of canonical nuclear factor-kappa B (NF-κB) activity is the heterodimer RelA-p50. The genes were also enriched in histone modification H3K36me3, and they largely overlapped the target genes of NFKB1 and RELA. We found that the overlapping genes were downregulated in COVID-19 patients. These results suggest that canonical NF-κB activity was suppressed by H3K36me3 in COVID-19 patient blood.
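
The abstract does not spell out how PCAUFE selects genes. One simplified reading, offered only as a sketch: project genes onto a principal component, treat the gene scores as roughly Gaussian, and flag genes whose scores are outliers after multiple-testing correction. Using the first component, the chi-squared tail test, the Benjamini-Hochberg step, the 0.05 cut-off, and every name and value below are assumptions for illustration, not details confirmed by the source.

```python
import math


def first_pc(matrix, n_iter=100):
    """Return (gene_scores, sample_loadings) for the first principal component.

    matrix[i][j] is the expression of gene i in sample j.
    """
    n_genes, n_samples = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_genes for j in range(n_samples)]
    centred = [[row[j] - means[j] for j in range(n_samples)] for row in matrix]
    # Sample-by-sample Gram matrix; power iteration finds its leading eigenvector.
    gram = [[sum(r[a] * r[b] for r in centred) for b in range(n_samples)]
            for a in range(n_samples)]
    v = [float(j + 1) for j in range(n_samples)]  # asymmetric start vector
    for _ in range(n_iter):
        w = [sum(gram[a][b] * v[b] for b in range(n_samples)) for a in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(n_samples)) for r in centred]
    return scores, v


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_outlier_genes(names, matrix, alpha=0.05):
    """Flag genes whose first-PC score is an outlier (alpha is a hypothetical cut-off)."""
    scores, _ = first_pc(matrix)
    mean = sum(scores) / len(scores)
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    if sd == 0.0:
        return []
    # Upper tail of chi-squared (1 d.o.f.) at z**2 equals erfc(|z| / sqrt(2)).
    pvals = [math.erfc(abs(s - mean) / sd / math.sqrt(2)) for s in scores]
    adjusted = benjamini_hochberg(pvals)
    return [name for name, q in zip(names, adjusted) if q < alpha]


# Hypothetical toy matrix: 18 flat background genes plus two genes that change in samples 2-3.
names = ["bg%d" % i for i in range(18)] + ["up", "down"]
matrix = [[5.0, 5.0, 5.0, 5.0] for _ in range(18)]
matrix += [[5.0, 5.0, 9.0, 9.0], [5.0, 5.0, 1.0, 1.0]]
selected = select_outlier_genes(names, matrix)
print(selected)
```

No class labels enter the selection, which is the sense in which the extraction is unsupervised. In practice the component of interest would be chosen by inspecting how the sample loadings separate patients from controls, rather than by defaulting to the first one.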

Technique of Gene Expression Profiles Extraction Based on the Complex Use of Clustering and Classification Methods

Diagnostics ◽

10.3390/diagnostics10080584 ◽

2020 ◽

Vol 10 (8) ◽

pp. 584

Author(s):

Sergii Babichev ◽

Jiří Škvor

Keyword(s):

Gene Expression ◽

Regulatory Networks ◽

Fuzzy Inference ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Quality Criterion ◽

Gene Expressions ◽

Inference System ◽

Stepwise Procedure ◽

Binary Classifiers

In this paper, we present the results of research on extracting informative gene expression profiles from a high-dimensional array of gene expressions, taking the state of the patients’ health into account, using a clustering method, ML-based binary classifiers, and a fuzzy inference system. Applying the proposed stepwise procedure allows us to extract the most informative genes, taking into account both the subtypes of the disease and the state of the patient’s health, for further reconstruction of gene regulatory networks based on the allocated genes and subsequent simulation of the reconstructed models. As experimental data, we used publicly available gene expression data obtained from DNA microarray experiments, containing two types of gene expression profiles: those of patients with a lung cancer tumor and those of healthy patients. The stepwise data-processing procedure consists of the following steps. First, we reduce the number of genes by removing genes that are non-informative in terms of statistical criteria and Shannon entropy. Then, we perform stepwise hierarchical clustering of the gene expression profiles at hierarchical levels 1 to 10 using the SOTA (Self-Organizing Tree Algorithm) clustering algorithm with the correlation distance metric. The quality of the resulting clustering was evaluated using a complex clustering quality criterion that considers both the distribution of the gene expression profiles relative to the centers of the clusters to which they are allocated and the distribution of the cluster centers. The result of this stage was the selection, at each hierarchical level, of the optimal cluster, which corresponded to the minimum value of the quality criterion. At the next step, we implemented a classification procedure for the examined objects using four well-known binary classifiers: logistic regression, support-vector machine, decision trees, and random forest. The effectiveness of each technique was evaluated by ROC (Receiver Operating Characteristic) analysis using criteria that include, as components, errors of both the first and the second kind. The final decision on the extraction of the most informative subset of gene expression profiles was made using the fuzzy inference system, whose inputs are the results of the individual classifiers and whose output is the final decision on the state of the patient’s health. In our view, implementing the proposed stepwise procedure for extracting informative gene expression profiles creates the conditions for increasing the effectiveness of the subsequent reconstruction of gene regulatory networks and the simulation of the reconstructed models, considering the subtypes of the disease and/or the state of the patient’s health.
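
The Shannon entropy used in the first filtering step can be computed per gene from a histogram of its expression values. The sketch below is a minimal illustration; the equal-width binning, the number of bins, and the gene names and values are assumptions, and the abstract does not state the threshold or the direction of the authors' filter, so none is applied here.

```python
import math


def shannon_entropy(values, n_bins=4):
    """Entropy (in bits) of an expression profile after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant profile occupies a single bin
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


# Hypothetical toy profiles; gene names and values are placeholders.
profiles = {
    "gene_flat": [1, 1, 1, 1, 1, 1, 1, 1],
    "gene_bimodal": [0, 0, 0, 0, 7, 7, 7, 7],
    "gene_spread": [0, 1, 2, 3, 4, 5, 6, 7],
}
entropies = {name: shannon_entropy(vals) for name, vals in profiles.items()}
for name, h in entropies.items():
    print(name, round(h, 3))
```

Entropy computed this way depends on the binning, so a fixed bin count should be used across all genes when the values are compared against a common cut-off.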

The effects of a globin blocker on the resolution of 3’mRNA sequencing data in porcine blood

BMC Genomics ◽

10.1186/s12864-019-6122-2 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Kyu-Sang Lim ◽

Qian Dong ◽

Pamela Moll ◽

Jana Vitkovska ◽

Gregor Wiktorin ◽

...

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Data Sets ◽

Globin Genes ◽

Sequencing Data ◽

Globin Mrna ◽

Data Set ◽

Mrna Sequencing ◽

Porcine Blood

Abstract Background Gene expression profiling in blood is a potential source of biomarkers to evaluate or predict phenotypic differences between pigs but is expensive and inefficient because of the high abundance of globin mRNA in porcine blood. These limitations can be overcome by the use of QuantSeq 3’mRNA sequencing (QuantSeq) combined with a method to deplete or block the processing of globin mRNA prior to or during library construction. Here, we validated the effectiveness of QuantSeq using a novel specific globin blocker (GB) that is included in the library preparation step of QuantSeq. Results In data set 1, four concentrations of the GB were applied to RNA samples from two pigs. The GB significantly reduced the proportion of globin reads compared to non-GB (NGB) samples (P = 0.005) and increased the number of detectable non-globin genes. The highest evaluated concentration (C1) of the GB resulted in the largest reduction of globin reads compared to the NGB (from 56.4 to 10.1%). The second highest concentration C2, which showed very similar globin depletion rates (12%) as C1 but a better correlation of the expression of non-globin genes between NGB and GB (r = 0.98), allowed the expression of an additional 1295 non-globin genes to be detected, although 40 genes that were detected in the NGB sample (at a low level) were not present in the GB library. Concentration C2 was applied in the rest of the study. In data set 2, the distribution of the percentage of globin reads for NGB (n = 184) and GB (n = 189) samples clearly showed the effects of the GB on reducing globin reads, in particular for HBB, similar to results from data set 1. Data set 3 (n = 84) revealed that the proportion of globin reads that remained in GB samples was significantly and positively correlated with the reticulocyte count in the original blood sample (P < 0.001). 
Conclusions The effect of the GB on reducing the proportion of globin reads in porcine blood QuantSeq was demonstrated in three data sets. In addition to increasing the efficiency of sequencing non-globin mRNA, the GB for QuantSeq has an advantage that it does not require an additional step prior to or during library creation. Therefore, the GB is a useful tool in the quantification of whole gene expression profiles in porcine blood.
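
The central quantity in these results, the percentage of reads that map to globin genes, is simple arithmetic on a per-gene count table. The sketch below assumes a plain dictionary of read counts; the gene set (HBB is named in the abstract, HBA is assumed here), the other gene names, and all counts are invented toy values, not data from the study.

```python
def globin_read_percentage(counts, globin_genes):
    """Return the percentage of all reads that map to the given globin genes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    globin = sum(n for gene, n in counts.items() if gene in globin_genes)
    return 100.0 * globin / total


# Hypothetical globin gene set and toy read counts.
GLOBIN_GENES = {"HBB", "HBA"}
ngb_counts = {"HBB": 300, "HBA": 200, "GENE1": 300, "GENE2": 200}  # no blocker
gb_counts = {"HBB": 60, "HBA": 40, "GENE1": 500, "GENE2": 400}     # with blocker

ngb_pct = globin_read_percentage(ngb_counts, GLOBIN_GENES)
gb_pct = globin_read_percentage(gb_counts, GLOBIN_GENES)
print(ngb_pct, gb_pct)
```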
