Prediction of Placental Barrier Permeability: A Model Based on Partial Least Squares Variable Selection Procedure

Molecules
2015
Vol 20 (5)
pp. 8270-8286
Author(s):  
Yong-Hong Zhang ◽  
Zhi-Ning Xia ◽  
Li Yan ◽  
Shu-Shen Liu

Author(s):  
Long Han ◽  
Mark J. Embrechts ◽  
Boleslaw K. Szymanski ◽  
Karsten Sternickel ◽  
Alexander Ross

This chapter introduces a novel Levenberg-Marquardt-like second-order algorithm for tuning the Parzen window widths σ in a Radial Basis Function (Gaussian) kernel. In this case, each attribute has its own σ parameter associated with it. The values of the optimized σ are then used as a gauge for variable selection. In this study, the Kernel Partial Least Squares (K-PLS) model is applied to several benchmark data sets in order to estimate the effectiveness of the second-order sigma tuning procedure for an RBF kernel. The variable subset selection method based on these σ values is then compared with different feature selection procedures such as random forests and sensitivity analysis. The sigma-tuned RBF kernel model outperforms K-PLS and SVM models with a single σ value. K-PLS models also compare favorably with Least Squares Support Vector Machines (LS-SVM), epsilon-insensitive Support Vector Regression and traditional PLS. The sigma tuning and variable selection procedure introduced in this chapter is applied to industrial magnetocardiogram data for the detection of ischemic heart disease from measurement of the magnetic field around the heart.
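The per-attribute bandwidth idea from the abstract can be sketched as a small kernel function. This is an illustrative reconstruction, not the chapter's implementation: the function name and the NumPy formulation are assumptions, but the kernel formula (one σ per attribute) follows the description above.

```python
import numpy as np

def rbf_kernel_per_feature(X, Y, sigmas):
    """Gaussian (RBF) kernel with one bandwidth sigma_j per attribute:

        K(x, y) = exp(-sum_j (x_j - y_j)^2 / (2 * sigma_j^2))

    A small optimized sigma_j means the kernel is sensitive to feature j,
    while a very large sigma_j effectively removes feature j -- which is
    why the tuned sigmas can serve as a gauge for variable selection.
    """
    sigmas = np.asarray(sigmas, dtype=float)
    # Scale each attribute by its own bandwidth, then apply a unit RBF.
    Xs = X / sigmas
    Ys = Y / sigmas
    sq = (
        np.sum(Xs**2, axis=1)[:, None]
        + np.sum(Ys**2, axis=1)[None, :]
        - 2.0 * Xs @ Ys.T
    )
    return np.exp(-0.5 * sq)
```

Setting one σ to a huge value makes the corresponding attribute drop out of the kernel, which is the mechanism behind using optimized σ values for feature ranking.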


2002
Vol 56 (3)
pp. 337-345
Author(s):  
S. Kamaledin Setarehdan ◽  
John J. Soraghan ◽  
David Littlejohn ◽  
Daran A. Sadler

Circulation
2021
Vol 143 (Suppl_1)
Author(s):  
Natalie Gasca ◽  
Robyn McClelland

Most nutritional epidemiology studies investigating associations between diet and heart disease use outcome-independent dimension reduction methods, like principal component analysis, to create dietary patterns. While these methods construct patterns that describe important aspects of food consumption, these patterns are not inherently related to heart disease. Incorporating disease data into the pattern construction offers the possibility of more concisely summarizing the most disease-related foods. Sparse partial least squares (SPLS), one such method, was found to have favorable interpretation and prediction properties in the continuous outcome setting; while selecting a subset of relevant foods, it constructed a few dietary patterns that were correlated with BMI while also capturing variation in diet composition. These results were validated with simulated data. We propose incorporating SPLS into the Cox proportional hazards model to analyze a right-censored survival outcome. We hypothesized that this method would inherit the beneficial parsimony properties seen in the continuous setting, and we assessed whether this proposed method could use the most relevant covariates to create a few patterns that were associated with a survival outcome. While the proposed method targets covariate-level sparsity (i.e., variable selection), one competitor method exists that integrates pattern-level parsimony and partial least squares (PLS) in the Cox model, but it imposes more model parameters than the proposed method. We compared the variable selection, pattern selection, and predictive performance of four survival methods (Lasso, PLS, competitor sparse PLS, and proposed SPLS) via a simulation study.
Simulation settings were informed in part by the Multi-Ethnic Study of Atherosclerosis (MESA), which has detailed food frequency questionnaire data on a large multi-ethnic population-based sample (6814 participants aged 45-84), as well as subsequent cardiovascular disease follow-up for over 15 years. In most studied simulation settings, the proposed method selected all 9 relevant predictors and the fewest number of irrelevant predictors (of 15) while creating a similar number of patterns and maintaining predictive ability of the outcome. In the setting most comparable to MESA, PLS chose all 24 predictors (by default) and 3.4 patterns (C-statistic=0.90), the competitor SPLS selected 21.1 predictors and 4.4 patterns (C-statistic=0.91), Lasso chose 16.4 predictors (C-statistic=0.91), and the proposed SPLS selected 11.7 predictors and 4.3 patterns (C-statistic=0.91), on average. We will also present an analysis of a coronary event in MESA using these four survival methods. In conclusion, we propose that using methods like SPLS to summarize food intake can create more heart disease-tailored dietary patterns that can complement the current nutritional epidemiology literature.
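The core SPLS idea in the abstract — an outcome-supervised pattern whose weights are sparsified so that weakly related foods drop out — can be sketched for a single component. This is a minimal illustration, not the authors' estimator: the function name, the soft-threshold sparsity rule, and the single-component restriction are simplifying assumptions.

```python
import numpy as np

def sparse_pls_pattern(X, y, threshold):
    """One SPLS-style pattern (illustrative sketch, first component only).

    The first PLS weight vector is proportional to X^T y, i.e. the
    covariance of each food variable with the outcome.  SPLS adds an
    L1-style soft threshold so foods only weakly related to the outcome
    receive exactly zero weight and are excluded from the pattern.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc                                            # outcome-driven weights
    w = np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)  # soft threshold
    norm = np.linalg.norm(w)
    if norm > 0:
        w = w / norm
    scores = Xc @ w          # the dietary-pattern score for each subject
    return w, scores
```

In the survival setting described above, such pattern scores would then enter a Cox proportional hazards model as covariates; only the sparsification of the weight vector is shown here.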


2014
Vol 70 (5)
Author(s):  
Nor Fazila Rasaruddin ◽  
Mas Ezatul Nadia Mohd Ruah ◽  
Mohamed Noor Hasan ◽  
Mohd Zuli Jaafar

This paper shows the determination of iodine value (IV) of pure and frying palm oils using Partial Least Squares (PLS) regression with variable selection. A total of 28 samples of pure and frying palm oils were acquired from markets; seven were considered high-priced palm oils while the remainder were low-priced. PLS regression models were developed for the determination of IV using Fourier Transform Infrared (FTIR) spectra in absorbance mode over the range 650 cm-1 to 4000 cm-1. A Savitzky-Golay derivative was applied before developing the prediction models. The models were constructed using wavelengths selected in the FTIR region by means of the selectivity ratio (SR) plot and the correlation coefficient with the IV parameter. Each model was validated through the root mean square error of cross-validation (RMSECV) and the cross-validation correlation coefficient (R2cv). The best model using the SR plot was the model with mean centering for the pure samples and the model with a combination of row scaling and standardization for the frying samples. The best model with correlation-coefficient variable selection was the model with a combination of row scaling and standardization for the pure samples and the model with mean-centering pre-processing for the frying samples. It is not necessary to row-scale the variables when developing the model, since the effect of row scaling on model quality is insignificant.


2018
Vol 8 (2)
pp. 313-341
Author(s):  
Jiajie Chen ◽  
Anthony Hou ◽  
Thomas Y Hou

Abstract In Barber & Candès (2015, Ann. Statist., 43, 2055–2085), the authors introduced a new variable selection procedure called the knockoff filter to control the false discovery rate (FDR) and proved that this method achieves exact FDR control. Inspired by the work of Barber & Candès (2015, Ann. Statist., 43, 2055–2085), we propose a pseudo knockoff filter that inherits some advantages of the original knockoff filter and has more flexibility in constructing its knockoff matrix. Moreover, we perform a number of numerical experiments that seem to suggest that the pseudo knockoff filter with the half Lasso statistic has FDR control and offers more power than the original knockoff filter with the Lasso Path or the half Lasso statistic for the numerical examples that we consider in this paper. Although we cannot establish rigorous FDR control for the pseudo knockoff filter, we provide some partial analysis of the pseudo knockoff filter with the half Lasso statistic and establish a uniform false discovery proportion bound and an expectation inequality.
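The selection step shared by the knockoff filter and the pseudo knockoff filter can be sketched from the Barber & Candès (2015) knockoff+ threshold. The construction of the feature statistics W_j (e.g. via the half Lasso) is not shown; only the thresholding rule is, and the function name is an assumption.

```python
import numpy as np

def knockoff_plus_select(W, q):
    """Knockoff+ selection step (Barber & Candes, 2015).

    W_j is an antisymmetric feature statistic: large positive values are
    evidence that variable j is relevant, and the sign of W_j flips under
    the null.  The data-dependent threshold is

        T = min{ t : (1 + #{j : W_j <= -t}) / max(#{j : W_j >= t}, 1) <= q },

    and the selected set is {j : W_j >= T}.
    """
    ts = np.sort(np.abs(W[W != 0]))          # candidate thresholds
    for t in ts:
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= q:
            return np.flatnonzero(W >= t)
    return np.array([], dtype=int)           # no threshold meets the target q
```

The "+1" in the numerator is what distinguishes knockoff+ from the plain knockoff threshold and is the ingredient behind exact (rather than modified) FDR control in the original filter.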

