Evaluation of gene–drug common module identification methods using pharmacogenomics data

Briefings in Bioinformatics ◽

10.1093/bib/bbaa087 ◽

2020 ◽

Author(s):

Jie Huang ◽

Jiazhou Chen ◽

Bin Zhang ◽

Lei Zhu ◽

Hongmin Cai

Keyword(s):

Drug Interactions ◽

Drug Repositioning ◽

Simulated Data ◽

Supplementary Information ◽

Supplementary File ◽

Cancer Drugs ◽

Real World Data ◽

Data Set ◽

Module Identification ◽

Network Analyses

Abstract Accurately identifying the interactions between genomic factors and the response of cancer drugs plays an important role in drug discovery, drug repositioning and cancer treatment. A number of studies revealed that interactions between genes and drugs were ‘many-genes-to-many-drugs’ interactions, i.e. common modules, as opposed to ‘one-gene-to-one-drug’ interactions. Such modules fully explain the interactions between complex biological regulatory mechanisms and cancer drugs. However, strategies for effectively and robustly identifying the underlying common modules among pharmacogenomics data remain to be improved. In this paper, we aim to provide a detailed evaluation of three categories of state-of-the-art common module identification techniques from a machine learning perspective, including non-negative matrix factorization (NMF), partial least squares (PLS) and network analyses. We first evaluate the performance of six methods, namely SNMNMF, NetNMF, SNPLS, O2PLS, NSBM and HOGMMNC, using two series of simulated data sets with different noise levels and outlier ratios. Then, we conduct experiments using a real-world data set of 2091 genes and 101 drugs in 392 cancer cell lines and compare the experimental results in terms of biological process term enrichment, gene–drug and drug–drug interactions. Finally, we present interesting findings from our evaluation study and discuss the advantages and drawbacks of each method. Supplementary information: Supplementary file is available at Briefings in Bioinformatics online.


Inference in multiscale geographically weighted regression

10.31219/osf.io/4dksb ◽

2018 ◽

Cited By ~ 1

Author(s):

Hanchen Yu ◽

Stewart Fotheringham ◽

Ziqi Li ◽

Taylor Oshan ◽

Wei Kang ◽

...

Keyword(s):

Geographically Weighted Regression ◽

Additive Model ◽

Simulated Data ◽

Weighted Regression ◽

Parameter Estimates ◽

Real World Data ◽

Data Set ◽

Smoothing Factor ◽

Hat Matrix ◽

Response Vector

A recent paper (Fotheringham et al. 2017) expands the well-known Geographically Weighted Regression (GWR) framework significantly by allowing the bandwidth or smoothing factor in GWR to be derived separately for each covariate in the model – a framework referred to as Multiscale GWR (MGWR). However, one limitation of the MGWR framework is that, until now, no inference about the local parameter estimates was possible. Formally, the so-called “hat matrix,” which projects the observed response vector into the predicted response vector, was available in GWR but not in MGWR. This paper addresses this limitation by reframing GWR as a Generalized Additive Model (GAM), extending this framework to MGWR and then deriving standard errors for the local parameters in MGWR. In addition, we also demonstrate how the effective number of parameters (ENP) can be obtained for the overall fit of an MGWR model and for each of the covariates within the model. This statistic is essential for comparing model fit between MGWR, GWR, and traditional global models, as well as adjusting for multiple hypothesis tests. We demonstrate these advances to the MGWR framework with both a simulated data set and a real-world data set.
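The hat matrix and the effective number of parameters (ENP) mentioned in this abstract can be illustrated for ordinary single-bandwidth GWR. The following is a minimal sketch, not the authors' MGWR derivation: it assumes a Gaussian kernel, an intercept plus one covariate, and hypothetical names (`gwr_hat_matrix`, `coords`, `bandwidth`). Row i of the hat matrix S is x_i (X'W_i X)^{-1} X'W_i, the fitted values are S y, and the ENP is taken as the trace of S.

```python
import math

def gwr_hat_matrix(coords, x, bandwidth):
    """Hat matrix S of a GWR with an intercept and one covariate.

    Row i of S is x_i (X' W_i X)^{-1} X' W_i, where W_i holds Gaussian
    kernel weights centred on location i (kernel choice is an assumption).
    """
    n = len(x)
    S = []
    for i in range(n):
        # Gaussian kernel weights for the local regression centred on location i.
        w = [math.exp(-0.5 * (math.dist(coords[i], coords[j]) / bandwidth) ** 2)
             for j in range(n)]
        # Entries of the 2x2 matrix X' W_i X for the design columns [1, x].
        s0 = sum(w)
        s1 = sum(wj * xj for wj, xj in zip(w, x))
        s2 = sum(wj * xj * xj for wj, xj in zip(w, x))
        det = s0 * s2 - s1 * s1
        # [1, x_i] (X' W_i X)^{-1} [1, x_j]' w_j, using the closed-form 2x2 inverse.
        row = [w[j] * ((s2 - s1 * x[j]) + x[i] * (s0 * x[j] - s1)) / det
               for j in range(n)]
        S.append(row)
    return S

# Hypothetical data: a 5x5 grid of locations and an exactly linear response.
coords = [(float(i % 5), float(i // 5)) for i in range(25)]
x = [math.sin(i) for i in range(25)]
y = [1.0 + 2.0 * xi for xi in x]

S = gwr_hat_matrix(coords, x, bandwidth=2.0)
# The hat matrix maps the observed response to the predicted response.
y_hat = [sum(S[i][j] * y[j] for j in range(25)) for i in range(25)]
# Effective number of parameters: the trace of the hat matrix.
enp = sum(S[i][i] for i in range(25))
```

As the bandwidth grows, every local regression approaches the global fit, so the trace of S approaches the number of global parameters (two in this sketch).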


PCN317 - THE CANCER DRUGS FUND: KEY UNCERTAINTIES, DATA COLLECTION PLANS, ANALYTICAL METHODS AND USE OF THE SYSTEMATIC ANTI-CANCER THERAPY (SACT) REAL WORLD DATA SET

Value in Health ◽

10.1016/j.jval.2018.09.399 ◽

2018 ◽

Vol 21 ◽

pp. S68

Author(s):

N.R. Latimer

Keyword(s):

Cancer Therapy ◽

Data Collection ◽

Real World ◽

Analytical Methods ◽

Cancer Drugs ◽

Real World Data ◽

Data Set ◽

World Data ◽

Anti Cancer


DISSECT: an assignment-free Bayesian discovery method for species delimitation under the multispecies coalescent

10.1101/003178 ◽

2014 ◽

Cited By ~ 3

Author(s):

Graham Jones ◽

Bengt Oxelman

Keyword(s):

Simulated Data ◽

Degree Of Approximation ◽

Species Tree ◽

Supplementary Information ◽

Data Set ◽

Multispecies Coalescent ◽

Formal Framework ◽

Beast Analysis ◽

Tree Topologies ◽

Discovery Method

Motivation: The multispecies coalescent model provides a formal framework for the assignment of individual organisms to species, where the species are modeled as the branches of the species tree. None of the available approaches so far have simultaneously co-estimated all the relevant parameters in the model, without restricting the parameter space by requiring a guide tree and/or prior assignment of individuals to clusters or species. Results: We present DISSECT, which explores the full space of possible clusterings of individuals and species tree topologies in a Bayesian framework. It uses an approximation to avoid the need for reversible-jump MCMC, in the form of a prior that is a modification of the birth-death prior for the species tree. It incorporates a spike near zero in the density for node heights. The model has two extra parameters: one controls the degree of approximation, and the second controls the prior distribution on the numbers of species. It is implemented as part of BEAST and requires only a few changes from a standard *BEAST analysis. The method is evaluated on simulated data and demonstrated on an empirical data set. The method is shown to be insensitive to the degree of approximation, but quite sensitive to the second parameter, suggesting that large numbers of sequences are needed to draw firm conclusions. Availability: http://code.google.com/p/beast-mcmc/, http://www.indriid.com/dissectinbeast.html Contact: [email protected], www.indriid.com Supplementary information: Supplementary material is available.


3MCor: an integrative web server for metabolome-microbiome-metadata correlation analysis

Bioinformatics ◽

10.1093/bioinformatics/btab818 ◽

2021 ◽

Author(s):

Tao Sun ◽

Mengci Li ◽

Xiangtian Yu ◽

Dandan Liang ◽

Guoxiang Xie ◽

...

Keyword(s):

Correlation Analysis ◽

Interaction Analysis ◽

Rapid Identification ◽

Supplementary Information ◽

Omics Data ◽

Real World Data ◽

Data Set ◽

Flexible Operation ◽

Key Nodes ◽

Omics Data Analysis

Abstract Motivation: The metabolome and microbiome disorders are highly associated with human health and there are great demands for dual-omics interaction analysis. Here, we designed and developed an integrative platform, 3MCor, for metabolome and microbiome correlation analysis under the instruction of phenotype and with the consideration of confounders. Results: Many traditional and novel correlation analysis methods were integrated for intra- and inter-correlation analysis. Three inter-correlation pipelines are provided for global, hierarchical, and pairwise analysis. The incorporated network analysis function is conducive to rapid identification of network clusters and key nodes from a complicated correlation network. Complete numerical results (csv files) and rich figures (pdf files) will be generated in minutes. To our knowledge, 3MCor is the first platform developed specifically for the correlation analysis of metabolome and microbiome. Its functions were compared with corresponding modules of existing omics data analysis platforms. A real-world data set was used to demonstrate its simple and flexible operation, comprehensive outputs, and distinctive contribution to dual-omics studies. Availability: 3MCor is available at http://3mcor.cn and the backend R script is available at https://github.com/chentianlu/3MCorServer. Supplementary information: Supplementary data are available at Bioinformatics online.


MNBDR: A Module Network Based Method for Drug Repositioning

Genes ◽

10.3390/genes12010025 ◽

2020 ◽

Vol 12 (1) ◽

pp. 25

Author(s):

He-Gang Chen ◽

Xiong-Hui Zhou

Keyword(s):

Gene Expression ◽

Protein Interactions ◽

Drug Response ◽

Drug Repositioning ◽

Expression Profiles ◽

Drug Repurposing ◽

Recent Decade ◽

Data Set ◽

Module Network ◽

The Cross

Drug repurposing/repositioning, which aims to find novel indications for existing drugs, contributes to reducing the time and cost of drug development. In the recent decade, gene expression profiles of drug-stimulated samples have been successfully used in drug repurposing. However, most of the existing methods neglect gene modules and the interactions among the modules, although cross-talk among pathways is common in drug response. It is essential to develop a method that uses cross-talk information to predict reliable candidate associations. In this study, we developed MNBDR (Module Network Based Drug Repositioning), a novel method that screens drugs based on a module network. It integrates protein–protein interactions and human gene expression profiles to predict drug candidates for diseases. Specifically, MNBDR mined dense modules from the protein–protein interaction (PPI) network and constructed a module network to reveal cross-talk among modules. Then, using the module network together with existing gene expression data sets of drug stimulation samples and disease samples, we applied random walk algorithms to capture essential modules in disease development and proposed a new indicator to screen potential drugs for a given disease. Results showed that MNBDR could provide better performance than popular methods. Moreover, functional analysis of the essential modules in the network indicated that our method could reveal biological mechanisms in drug response.


Effects of management decisions on genetic evaluation of simulated calving records using random regression

Translational Animal Science ◽

10.1093/tas/txab078 ◽

2021 ◽

Author(s):

M D MacNeil ◽

J W Buchanan ◽

M L Spangler ◽

E Hay

Keyword(s):

Reproductive Success ◽

Simulated Data ◽

Genetic Evaluation ◽

Random Regression ◽

Management Decisions ◽

Third Order ◽

Data Set ◽

Binary Phenotype ◽

Random Regression Model ◽

Missing Observation

Abstract The objective of this study was to evaluate the effects of various data structures on the genetic evaluation for the binary phenotype of reproductive success. The data were simulated based on an existing pedigree and an underlying fertility phenotype with a heritability of 0.10. A data set of complete observations was generated for all cows. This data set was then modified mimicking the culling of cows when they first failed to reproduce, cows having a missing observation at either their second or fifth opportunity to reproduce as if they had been selected as donors for embryo transfer, and censoring records following the sixth opportunity to reproduce as in a cull-for-age strategy. The data were analyzed using a third order polynomial random regression model. The EBV of interest for each animal was the sum of the age-specific EBV over the first 10 observations (reproductive success at ages 2-11). Thus, the EBV might be interpreted as the genetic expectation of number of calves produced when a female is given ten opportunities to calve. Culling open cows resulted in the EBV for 3 year-old cows being reduced from 8.27 ± 0.03 when open cows were retained to 7.60 ± 0.02 when they were culled. The magnitude of this effect decreased as cows grew older when they first failed to reproduce and were subsequently culled. Cows that did not fail over the 11 years of simulated data had an EBV of 9.43 ± 0.01 and 9.35 ± 0.01 based on analyses of the complete data and the data in which cows that failed to reproduce were culled, respectively. Cows that had a missing observation for their second record had a significantly reduced EBV, but the corresponding effect at the fifth record was negligible. The current study illustrates that culling and management decisions, and particularly those that impact the beginning of the trajectory of sustained reproductive success, can influence both the magnitude and accuracy of resulting EBV.
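The summed EBV described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: the polynomial basis (plain powers of age rescaled to [-1, 1]), the function names and the example coefficients are all hypothetical, and the age-specific EBV is treated as directly additive across ages.

```python
def age_specific_ebv(coeffs, age, min_age=2, max_age=11):
    """Evaluate one animal's random-regression curve at a single age.

    Age is rescaled to [-1, 1]; plain powers are used as the basis
    (an assumption: the source does not state which basis was used).
    """
    t = 2.0 * (age - min_age) / (max_age - min_age) - 1.0
    return sum(c * t ** k for k, c in enumerate(coeffs))

def cumulative_ebv(coeffs, ages=range(2, 12)):
    """Sum the age-specific EBV over the ten ages 2 to 11."""
    return sum(age_specific_ebv(coeffs, a) for a in ages)

# Hypothetical regression coefficients for one animal
# (intercept, linear, quadratic, cubic terms).
example = [0.9, 0.02, -0.05, 0.01]
total = cumulative_ebv(example)
```

With only a constant coefficient of 1.0 the sum is 10, which matches the interpretation of the EBV as an expected number of calves over ten opportunities to calve.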


A Traveler’s Guide to the Multiverse: Promises, Pitfalls, and a Framework for the Evaluation of Analytic Decisions

Advances in Methods and Practices in Psychological Science ◽

10.1177/2515245920954925 ◽

2021 ◽

Vol 4 (1) ◽

pp. 251524592095492

Author(s):

Marco Del Giudice ◽

Steven W. Gangestad

Keyword(s):

Degrees Of Freedom ◽

A Priori ◽

Simulated Data ◽

Style Analysis ◽

Data Set ◽

Scant Attention ◽

Equivalence Type ◽

The Impact ◽

Biased Estimates

Decisions made by researchers while analyzing data (e.g., how to measure variables, how to handle outliers) are sometimes arbitrary, without an objective justification for choosing one alternative over another. Multiverse-style methods (e.g., specification curve, vibration of effects) estimate an effect across an entire set of possible specifications to expose the impact of hidden degrees of freedom and/or obtain robust, less biased estimates of the effect of interest. However, if specifications are not truly arbitrary, multiverse-style analyses can produce misleading results, potentially hiding meaningful effects within a mass of poorly justified alternatives. So far, a key question has received scant attention: How does one decide whether alternatives are arbitrary? We offer a framework and conceptual tools for doing so. We discuss three kinds of a priori nonequivalence among alternatives—measurement nonequivalence, effect nonequivalence, and power/precision nonequivalence. The criteria we review lead to three decision scenarios: Type E decisions (principled equivalence), Type N decisions (principled nonequivalence), and Type U decisions (uncertainty). In uncertain scenarios, multiverse-style analysis should be conducted in a deliberately exploratory fashion. The framework is discussed with reference to published examples and illustrated with the help of a simulated data set. Our framework will help researchers reap the benefits of multiverse-style methods while avoiding their pitfalls.
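A multiverse-style analysis of the kind described in this abstract can be sketched in a few lines: the same effect is estimated under every combination of analytic choices, and the sorted estimates form a specification curve. The data, the two decisions (outlier cut-off and summary statistic) and all names below are hypothetical; whether such alternatives are truly arbitrary is exactly the question the framework asks the analyst to settle first.

```python
import itertools
import random
import statistics

random.seed(0)
# Simulated two-group data with a true mean difference of 0.5.
control = [random.gauss(0.0, 1.0) for _ in range(200)]
treated = [random.gauss(0.5, 1.0) for _ in range(200)]

def trim(values, z_cut):
    """Drop observations more than z_cut standard deviations from the mean."""
    if z_cut is None:
        return list(values)
    m, s = statistics.mean(values), statistics.stdev(values)
    return [v for v in values if abs(v - m) <= z_cut * s]

# Two analytic decisions, each with several alternatives (hypothetical).
outlier_rules = [None, 2.0, 2.5, 3.0]
summaries = {"mean": statistics.mean, "median": statistics.median}

def run_multiverse(control, treated):
    """Estimate the group difference under every combination of choices."""
    results = []
    for z_cut, (name, summary) in itertools.product(outlier_rules, summaries.items()):
        effect = summary(trim(treated, z_cut)) - summary(trim(control, z_cut))
        results.append({"outlier_cut": z_cut, "summary": name, "effect": effect})
    return results

results = run_multiverse(control, treated)
# Sorting the estimates gives the ordering used in a specification-curve plot.
curve = sorted(r["effect"] for r in results)
```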


DRUG REPOSITION OF NON-CANCER DRUGS FOR CANCER TREATMENTS VIA PHARMACOVIGILANCE APPROACH - REPURPOSING DRUGS IN ONCOLOGY

Asian Journal of Pharmaceutical and Clinical Research ◽

10.22159/ajpcr.2019.v12i2.29523 ◽

2019 ◽

pp. 310-314 ◽

Cited By ~ 1

Author(s):

Mrugank Bhaskarkumar Parmar ◽

Shital Panchal

Keyword(s):

Statistical Analysis ◽

Cancer Treatment ◽

Drug Repositioning ◽

Adverse Event Reporting System ◽

Adverse Event Reporting ◽

Cancer Drug ◽

Cancer Drugs ◽

Management Activity ◽

Post Marketing Surveillance ◽

Source Data

This drug repositioning study was performed on drugs that have been on the market for more than a decade and are approved with well-established efficacy and safety in humans. The objective was to reposition, for cancer treatment, existing non-cancer drug therapies that have a well-characterized pharmacologic profile with greater efficacy and the least toxicity as anti-neoplastic agents. We retrieved source data from the FDA Adverse Event Reporting System (FAERS) for the 13 years from 2004 to 2016 and analysed them using a pharmacovigilance approach, ‘a proposed future novel pharmaceutical tool for drug reposition’. Signal management activity was performed for statistical analysis. The statistical analysis indicated that propranolol, metformin, pioglitazone, dabigatran and nitroglycerin are existing non-cancer drugs that merit direct or indirect repositioning for cancer treatment and anti-neoplastic activity. Further studies that retrieve source data from other regulatory databases (e.g. EudraVigilance of the EMA and VigiFlow of the WHO), together with post-marketing surveillance studies with the same objective, may support our results on repositioning existing drugs through a pharmacovigilance approach.


Improving the performance of a radio-frequency localization system in adverse outdoor applications

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-021-02001-6 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Marcelo N. de Sousa ◽

Ricardo Sant’Ana ◽

Rigel P. Fernandes ◽

Julio Cesar Duarte ◽

José A. Apolinário ◽

...

Keyword(s):

Random Forest ◽

Ray Tracing ◽

Real World ◽

Practical Implication ◽

Real Life ◽

Simulated Data ◽

Real Data ◽

Gradient Boosting ◽

Real World Data ◽

Localization Accuracy

Abstract In outdoor RF localization systems, particularly where line of sight cannot be guaranteed or where multipath effects are severe, information about the terrain may improve the position estimate’s performance. Given the difficulties in obtaining real data, a ray-tracing fingerprint is a viable option. Nevertheless, despite good simulation results, systems trained only with simulated features suffer degraded performance when processing real-life data. This work intends to improve the localization accuracy when using ray-tracing fingerprints and a few field data obtained from an adverse environment where a large number of measurements is not an option. We employ a machine learning (ML) algorithm to explore the multipath information. We selected the random forest and gradient boosting algorithms, both considered efficient tools in the literature. In a strict simulation scenario (simulated data for training, validating, and testing), we obtained the same good results found in the literature (error around 2 m). In a real-world system (simulated data for training, real data for validating and testing), both ML algorithms resulted in a mean positioning error around 100 m. We have also obtained experimental results for noisy (artificially added Gaussian noise) and mismatched (with a null subset of) features. From the simulations carried out in this work, our study revealed that enhancing the ML model with a few real-world data improves localization’s overall performance. Of the ML algorithms employed herein, we also observed that, under noisy conditions, the random forest algorithm achieved a slightly better result than the gradient boosting algorithm. However, they achieved similar results in a mismatch experiment. This work’s practical implication is that multipath information, once rejected in old localization techniques, now represents a significant source of information whenever we have prior knowledge to train the ML algorithm.


Auto-sharing parameters for transfer learning based on multi-objective optimization

Integrated Computer-Aided Engineering ◽

10.3233/ica-210655 ◽

2021 ◽

pp. 1-13

Author(s):

Hailin Liu ◽

Fangqing Gu ◽

Zixian Lin

Keyword(s):

Transfer Learning ◽

Optimization Problem ◽

Data Sets ◽

Multi Objective Optimization ◽

Particle Swarm Optimizer ◽

Real World Data ◽

Data Set ◽

Target Task ◽

Main Research ◽

Multi Objective

Transfer learning methods exploit similarities between different data sets to improve the performance of the target task by transferring knowledge from source tasks to the target task. “What to transfer” is a main research issue in transfer learning. Existing transfer learning methods generally need to acquire the shared parameters by integrating human knowledge. However, in many real applications, it is not known beforehand which parameters can be shared. A transfer learning model is essentially a special multi-objective optimization problem. Consequently, this paper proposes a novel auto-sharing parameter technique for transfer learning based on multi-objective optimization and solves the optimization problem by using a multi-swarm particle swarm optimizer. Each task objective is simultaneously optimized by a sub-swarm. The current best particle from the sub-swarm of the target task is used to guide the search of particles of the source tasks and vice versa. The target task and source task are jointly solved by sharing the information of the best particle, which works as an inductive bias. Experiments are carried out to evaluate the proposed algorithm on several synthetic data sets and two real-world data sets (a school data set and a landmine data set), which show that the proposed algorithm is effective.
