Estimator and Model Selection Using Cross-Validation Iván Díaz

2016 ◽  
pp. 239-256
2014 ◽  
Vol 19 (1) ◽  
pp. 41-53
Author(s):  
Charanpal Dhanjal ◽  
Nicolas Baskiotis ◽  
Stéphan Clémençon ◽  
Nicolas Usunier

2019 ◽  
Vol 6 (Supplement_2) ◽  
pp. S210-S210
Author(s):  
Mary T Caserta ◽  
Lu Wang ◽  
Chin-Yi Chu ◽  
Christopher Slaunwhite ◽  
Jeanne Holden-Wiltse ◽  
...  

Abstract

Background: RSV infection is common in infants, with a majority of those affected displaying mild clinical symptoms. However, a substantial number develop severe symptoms requiring hospitalization. We currently lack sensitive and specific predictors to identify a majority of those who develop severe disease.

Methods: High-throughput RNA sequencing (RNAseq) of nasal epithelial cells defined airway gene expression patterns in RSV-infected subjects. Using multivariate linear regression analysis with AIC-based model selection, we built a sparse linear predictor of RSV disease severity, the Nasal Gene Severity Score (NGSS1). Using a similar statistical approach, we built an alternate predictor based upon genes displaying stable expression over time (NGSS2). We evaluated the predictive performance of both models using leave-one-out cross-validation analyses.

Results: We defined comprehensive airway gene expression profiles from 106 full-term, previously healthy RSV-infected subjects with a range of RSV disease severity, prospectively enrolled in the AsPIRES study. Nasal samples were obtained during acute infection (day 1–10 of illness; 106 samples) and convalescence (day 14–28 of illness; 69 samples). All subjects had a primary infection and were assigned a cumulative clinical illness severity score (GRSS) (Table 1). From the RNAseq data, 41 genes were identified as NGSS1, which is strongly correlated with disease severity (GRSS) in both the naive (ρ=0.935) and cross-validated (ρ=0.813) analyses. As a binary classifier (mild vs. severe), NGSS1 correctly classifies 89.6% of the subjects following cross-validation (Figure 1). Next, we evaluated genes that were stably expressed in both acute illness and convalescence samples in 54 subjects with data from both time points. Repeating the regression-based stepwise model selection identified 13 genes as NGSS2, which was significantly correlated with GRSS (ρ=0.741). This model has slightly lower but comparable prediction accuracy, with a cross-validated correlation of 0.741 and cross-validated classification accuracy of 84.0% (Figure 2).

Conclusion: Airway gene expression patterns, obtained following a minimally invasive nasal procedure, have potential utility as prognostic biomarkers for severe infant RSV infections.

Disclosures: All authors: No reported disclosures.
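The gap between a naive and a leave-one-out cross-validated correlation, as reported in the abstract above, can be illustrated with a minimal Python sketch. It is not the authors' analysis: it assumes a single synthetic predictor instead of an AIC-selected gene panel, and it assumes Pearson correlation, since the abstract does not say which ρ was used. The helper names `fit_line`, `pearson`, and `loo_predictions` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def pearson(u, v):
    """Pearson correlation (an assumption; the abstract does not specify rho)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

def loo_predictions(xs, ys):
    """Predict each point from a model fitted on all the other points."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds

# Synthetic data only; no values here come from the study.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(30)]
ys = [2.0 * x + rng.gauss(0, 1) for x in xs]

a, b = fit_line(xs, ys)
naive_rho = pearson([a + b * x for x in xs], ys)  # fitted on all data
cv_rho = pearson(loo_predictions(xs, ys), ys)     # each point held out once
print(f"naive rho = {naive_rho:.3f}, cross-validated rho = {cv_rho:.3f}")
```

The naive figure scores predictions on the same subjects used for fitting, whereas the cross-validated figure scores each subject with a model that never saw that subject, which is why it is usually the lower of the two.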


Author(s):  
Federico Belotti ◽  
Franco Peracchi

In this article, we describe jackknife2, a new prefix command for jackknifing linear estimators. It takes full advantage of the available leave-one-out formula, thereby allowing for substantial reduction in computing time. Of special note is that jackknife2 allows the user to compute cross-validation and diagnostic measures that are currently not available after ivregress 2sls, xtreg, and xtivregress.
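The abstract does not spell out the leave-one-out formula it relies on. For ordinary least squares, a well-known identity of this kind states that the leave-one-out prediction error for observation i equals e_i / (1 - h_ii), where e_i is the full-sample residual and h_ii the leverage, so no refitting is needed. The Python sketch below illustrates that identity for a simple one-predictor regression and checks it against brute-force refitting. It is not the jackknife2 implementation (a Stata command), and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def loo_residuals_shortcut(xs, ys):
    """Leave-one-out residuals from a single fit: e_i / (1 - h_ii)."""
    n = len(xs)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    out = []
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - mx) ** 2 / sxx  # h_ii for simple regression
        out.append((y - (a + b * x)) / (1.0 - leverage))
    return out

def loo_residuals_bruteforce(xs, ys):
    """Leave-one-out residuals by refitting n times (slow reference version)."""
    out = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        out.append(ys[i] - (a + b * xs[i]))
    return out

# Small illustrative data set (made up for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.5, 2.0, 4.5, 4.0]
fast = loo_residuals_shortcut(xs, ys)
slow = loo_residuals_bruteforce(xs, ys)
print(max(abs(f - s) for f, s in zip(fast, slow)))  # agreement up to rounding
```

The shortcut needs one fit instead of n, which is the kind of saving in computing time the abstract describes.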

