variable subset selection
Recently Published Documents


TOTAL DOCUMENTS

20
(FIVE YEARS 1)

H-INDEX

5
(FIVE YEARS 0)

Author(s):  
Pablo Roman Duchowicz ◽  
Silvina Fioressi ◽  
Gustavo Romanelli ◽  
Daniel E. Bacelo

This work applied the quantitative structure-activity relationships (QSAR) theory to predict the inhibitory activity exhibited by 40 unsymmetrical aromatic disulfide compounds against the SARS-CoV main protease. Different freely available molecular descriptor programs provided 67,116 independent non-conformational molecular descriptors. This great number of descriptors contained multidimensional representations of the chemical structure and was analyzed through multivariable linear regressions and the replacement method variable subset selection technique. The developed QSAR model achieved an acceptable statistical quality and provided a prospective guide that was considered useful for predicting the inhibitory activity of structurally-related aromatic disulfide compounds on the SARS-CoV main protease.
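
The abstract does not spell out the replacement method itself, so the following is only a simplified sketch of a replacement-style search, assuming the goal is to pick `d` descriptors that minimize the residual standard deviation of a multivariable linear regression. The function names (`residual_std`, `replacement_search`) and the stopping rule are illustrative assumptions, not the authors' implementation; the published technique includes refinements not reproduced here.

```python
import numpy as np

def residual_std(X, y, cols):
    """Residual standard deviation of a least-squares fit on the chosen columns."""
    A = np.column_stack([np.ones(len(y)), X[:, cols]])  # intercept + descriptors
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = len(y) - A.shape[1]  # requires more samples than fitted parameters
    return float(np.sqrt(resid @ resid / dof))

def replacement_search(X, y, d, seed=0, max_sweeps=20):
    """Simplified replacement-style search (hypothetical helper, illustrative only).

    Starts from a random subset of d descriptors, then repeatedly tries to
    replace each position with every unused descriptor, keeping any swap
    that lowers the residual standard deviation.
    """
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    subset = [int(c) for c in rng.choice(n_desc, size=d, replace=False)]
    best = residual_std(X, y, subset)
    for _ in range(max_sweeps):
        improved = False
        for pos in range(d):
            for cand in range(n_desc):
                if cand in subset:
                    continue
                trial = subset.copy()
                trial[pos] = cand
                s = residual_std(X, y, trial)
                if s < best - 1e-12:
                    subset, best, improved = trial, s, True
        if not improved:  # a full sweep with no better swap: stop
            break
    return sorted(subset), best
```

Each call to `residual_std` fits one small regression, so the cost grows with the number of descriptors times `d` per sweep rather than with the number of all possible subsets, which is the practical appeal of this kind of search when the descriptor pool is very large.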


2020 ◽  
Vol 93 (3) ◽  
Author(s):  
Marc Hofmann ◽  
Cristian Gatu ◽  
Erricos J. Kontoghiorghes ◽  
Ana Colubi ◽  
Achim Zeileis

2020 ◽  
Vol 85 (4) ◽  
pp. 467-480 ◽  
Author(s):  
Rana Amiri ◽  
Djelloul Messadi ◽  
Amel Bouakkadia

This study aimed to predict the n-octanol/water partition coefficient (Kow) of 43 organophosphorus insecticides. Quantitative structure-property relationship analysis was performed on the series of 43 insecticides using two different methods, linear (multiple linear regression, MLR) and non-linear (artificial neural network, ANN), which relate the Kow values of these chemicals to their structural descriptors. First, the data set was separated with a duplex algorithm into a training set (28 chemicals) and a test set (15 chemicals) for statistical external validation. A model with four descriptors was developed using, as independent variables, theoretical descriptors derived from Dragon software and selected with a genetic algorithm (GA)-variable subset selection (VSS) procedure. The values of the statistical parameters R², Q²ext, SDEPext and SDEC for the MLR model (94.09 %, 92.43 %, 0.533 and 0.471, respectively) and the ANN model (97.24 %, 92.17 %, 0.466 and 0.332, respectively) obtained for the three approaches are very similar, which confirmed that the employed four-parameter model is stable, robust and significant.
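
The abstract reports R², Q²ext, SDEPext and SDEC without defining them. As a hedged illustration, the sketch below computes them under commonly used definitions: SDEC and SDEP as root-mean-square errors on the training and external sets, and Q²ext referenced to the training-set mean. Several variants of Q²ext exist in the literature, so the exact formulas used by the authors are an assumption here, and `validation_stats` is a hypothetical helper name.

```python
import math

def validation_stats(y_train, yhat_train, y_test, yhat_test):
    """Fit and external-validation statistics under assumed definitions."""
    n_tr, n_te = len(y_train), len(y_test)
    mean_tr = sum(y_train) / n_tr
    # Residual and total sums of squares on the training set
    rss = sum((y - f) ** 2 for y, f in zip(y_train, yhat_train))
    tss = sum((y - mean_tr) ** 2 for y in y_train)
    # Prediction error on the external set, referenced to the training mean
    press_ext = sum((y - f) ** 2 for y, f in zip(y_test, yhat_test))
    tss_ext = sum((y - mean_tr) ** 2 for y in y_test)
    return {
        "R2": 1 - rss / tss,
        "SDEC": math.sqrt(rss / n_tr),
        "Q2_ext": 1 - press_ext / tss_ext,
        "SDEP_ext": math.sqrt(press_ext / n_te),
    }
```

The function returns fractions, so an R² of 0.9409 corresponds to the 94.09 % style of reporting used in the abstract.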


2019 ◽  
Author(s):  
Michael Neidlin ◽  
Efthymia Chantzi ◽  
George Macheras ◽  
Mats G Gustafsson ◽  
Leonidas G Alexopoulos

The pathophysiology of osteoarthritis (OA) involves dysregulation of anabolic and catabolic processes associated with a broad panel of cytokines and other secreted proteins, and ultimately leads to cartilage degradation. An increased understanding of the interactions of these proteins by means of systematic in vitro analyses may give new ideas regarding pharmaceutical candidates for treatment of OA and related cartilage degradation. Therefore, an ex vivo tissue model of cartilage degradation was first established by culturing full-thickness tissue explants with bacterial collagenase II. Responses of healthy and degrading cartilage were then analyzed by measuring protein abundance in tissue supernatant with a 26-multiplex protein profiling assay, after exposing them to a panel of 55 protein stimulations present in synovial joints of OA patients. Multivariate data analysis, including exhaustive pairwise variable subset selection, was used to identify the most outstanding changes in the measured protein secretions. This revealed that the MMP9 response is outstandingly low in degraded compared to healthy cartilage and that there are several protein pairs, such as IFNG and MMP9, that can be used for successful discrimination between degraded and healthy samples. Taken together, the results show that the characteristic changes in protein responses discovered seem promising for accurate detection/diagnosis of degrading cartilage in general and OA in particular. More generally, the employed ex vivo tissue model seems promising for drug discovery and development projects related to cartilage degradation, for example when trying to uncover the unknown interactions between secreted proteins in healthy and degraded tissues.
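
Exhaustive pairwise variable subset selection can be pictured as scoring every pair of measured variables by how well that pair alone separates the two sample classes. The abstract does not say which classifier or score was used, so the sketch below assumes a leave-one-out nearest-centroid accuracy purely for illustration; `loo_centroid_accuracy` and `rank_pairs` are hypothetical names, and the variables are assumed to be on comparable scales.

```python
from itertools import combinations

def loo_centroid_accuracy(rows, labels, pair):
    """Leave-one-out accuracy of a nearest-centroid rule using two variables."""
    i, j = pair
    hits = 0
    for k in range(len(rows)):
        # Accumulate per-class sums over all samples except the held-out one
        sums = {}
        for m, (row, lab) in enumerate(zip(rows, labels)):
            if m == k:
                continue
            s = sums.setdefault(lab, [0.0, 0.0, 0])
            s[0] += row[i]
            s[1] += row[j]
            s[2] += 1

        def sq_dist(lab):
            sx, sy, n = sums[lab]
            return (rows[k][i] - sx / n) ** 2 + (rows[k][j] - sy / n) ** 2

        predicted = min(sums, key=sq_dist)
        hits += predicted == labels[k]
    return hits / len(rows)

def rank_pairs(rows, labels):
    """Score every variable pair and return (accuracy, pair), best first."""
    n_vars = len(rows[0])
    scored = [(loo_centroid_accuracy(rows, labels, p), p)
              for p in combinations(range(n_vars), 2)]
    return sorted(scored, key=lambda t: -t[0])
```

With 26 measured proteins there are only 325 pairs, so an exhaustive scan of this kind is cheap; the cost would grow quickly for larger subset sizes.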


Author(s):  
Cristian Rojas ◽  
Piercosimo Tripaldi ◽  
Pablo R. Duchowicz

The aim of this work was to develop predictive structure-property relationships (QSPR) of natural and synthetic sweeteners in order to predict and model relative sweetness (RS). The data set was composed of 233 sweeteners collected from diverse sources in the literature and was divided into training (163) and test (70) molecules according to a procedure based on k-means cluster analysis. A total of 3763 non-conformational Dragon molecular descriptors were calculated and simultaneously analyzed through multivariable linear regression analysis coupled with the replacement method variable subset selection technique. The established six-parameter model was validated through cross-validation techniques, together with Y-randomization and applicability domain analysis. The results for the training set and the test set showed that the non-conformational descriptors offer relevant information for modeling the RS of a compound. Thus, this model can be used to predict the sweetness of both unevaluated and unsynthesized sweeteners.


Author(s):  
Long Han ◽  
Mark J. Embrechts ◽  
Boleslaw K. Szymanski ◽  
Karsten Sternickel ◽  
Alexander Ross

This chapter introduces a novel Levenberg-Marquardt-like second-order algorithm for tuning the Parzen window σ in a Radial Basis Function (Gaussian) kernel. In this case, each attribute has its own sigma parameter associated with it. The values of the optimized σ parameters are then used as a gauge for variable selection. In this study, the Kernel Partial Least Squares (K-PLS) model is applied to several benchmark data sets in order to estimate the effectiveness of the second-order sigma tuning procedure for an RBF kernel. The variable subset selection method based on these sigma values is then compared with different feature selection procedures such as random forests and sensitivity analysis. The sigma-tuned RBF kernel model outperforms K-PLS and SVM models with a single sigma value. K-PLS models also compare favorably with Least Squares Support Vector Machines (LS-SVM), epsilon-insensitive Support Vector Regression and traditional PLS. The sigma tuning and variable selection procedure introduced in this chapter is applied to industrial magnetocardiogram data for the detection of ischemic heart disease from measurement of the magnetic field around the heart.
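
To make the idea of one sigma per attribute concrete, here is a minimal sketch of a Gaussian kernel with per-attribute widths. It does not reproduce the chapter's second-order tuning algorithm; it only shows why a tuned sigma can act as a gauge: as the sigma of an attribute grows, that attribute's contribution to the kernel vanishes. The ranking rule in `rank_attributes_by_sigma` (smaller sigma first) is an assumption made for illustration, and both function names are hypothetical.

```python
import math

def rbf_kernel_per_attribute(x, z, sigmas):
    """Gaussian kernel exp(-0.5 * sum_j ((x_j - z_j) / sigma_j)^2)."""
    sq = sum(((a - b) / s) ** 2 for a, b, s in zip(x, z, sigmas))
    return math.exp(-0.5 * sq)

def rank_attributes_by_sigma(sigmas):
    """Attribute indices ordered from smallest sigma (most influential
    on the kernel) to largest sigma (least influential)."""
    return sorted(range(len(sigmas)), key=lambda j: sigmas[j])
```

For instance, with `sigmas = [1.0, 1e9]` the kernel value between two points barely changes when only the second attribute differs, which is the sense in which a very large tuned sigma flags an attribute as a candidate for removal.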


Author(s):  
Zheng Zhao

The high dimensionality of data poses a challenge to learning tasks such as classification. In the presence of many irrelevant features, classification algorithms tend to overfit training data (Guyon & Elisseeff, 2003). Many features can be removed without performance deterioration, and feature selection is one effective means to remove irrelevant features (Liu & Yu, 2005). Feature selection, also known as variable selection, feature reduction, attribute selection or variable subset selection, is the technique of selecting a subset of relevant features for building robust learning models. Usually a feature is relevant for one of two reasons: (1) it is strongly correlated with the target concept; or (2) it forms a feature subset with other features and the subset is strongly correlated with the target concept. Optimal feature selection requires searching an exponentially large space (O(2^n), where n is the number of features) (Almuallim & Dietterich, 1994). Researchers often resort to various approximations to determine relevant features, and in many existing feature selection algorithms, feature relevance is determined by the correlation between individual features and the class (Hall, 2000; Yu & Liu, 2003). However, a single feature can be considered irrelevant based on its correlation with the class and yet become very relevant when combined with other features. Unintentional removal of such features can result in the loss of useful information and thus may cause poor classification performance, which is studied as attribute interaction by Jakulin and Bratko (2003). Therefore, it is desirable to consider the effect of feature interaction in feature selection.
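
The interaction effect described above can be seen in a toy example that is not taken from the cited works: with an XOR-style class label, each feature on its own has zero correlation with the class, yet the two features together determine it exactly. The data and helper names below are illustrative assumptions.

```python
import math
from itertools import combinations

# Toy data (hypothetical): the class label is the XOR of the two features
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def subset_determines_label(rows, labels, cols):
    """True if no two rows agree on the chosen columns but differ in label."""
    seen = {}
    for row, lab in zip(rows, labels):
        key = tuple(row[c] for c in cols)
        if seen.setdefault(key, lab) != lab:
            return False
    return True

def all_subsets(n):
    """Yield every non-empty subset of n feature indices (2**n - 1 of them)."""
    for k in range(1, n + 1):
        yield from combinations(range(n), k)
```

A filter that scores features one at a time would discard both features here, while an exhaustive scan over `all_subsets(2)` would find that only the pair `(0, 1)` determines the label. The same generator also shows why exhaustive search does not scale: ten features already yield 1023 candidate subsets.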

