Consistent Estimation of Generalized Linear Models with High Dimensional Predictors via Stepwise Regression

Alex Pijyan; Qi Zheng; Hyokyoung G. Hong; Yi Li

doi:10.3390/e22090965

Consistent Estimation of Generalized Linear Models with High Dimensional Predictors via Stepwise Regression

Entropy ◽

10.3390/e22090965 ◽

2020 ◽

Vol 22 (9) ◽

pp. 965

Author(s):

Alex Pijyan ◽

Qi Zheng ◽

Hyokyoung G. Hong ◽

Yi Li

Keyword(s):

Generalized Linear Models ◽

Predictive Models ◽

Linear Models ◽

Penalized Regression ◽

Screening Methods ◽

Final Model ◽

Variable Screening ◽

Model Control ◽

Stepwise Procedure ◽

Selection Operator

Predictive models play a central role in decision making. Penalized regression approaches, such as the least absolute shrinkage and selection operator (LASSO), have been widely used to construct predictive models and explain the impacts of the selected predictors, but the estimates are typically biased. Moreover, when data are ultrahigh-dimensional, penalized regression is usable only after applying variable screening methods to reduce the number of candidate variables. We propose a stepwise procedure for fitting generalized linear models with ultrahigh-dimensional predictors. Our procedure can provide a final model; control both false negatives and false positives; and yield consistent estimates, which are useful to gauge the actual effect size of risk factors. Simulations and applications to two clinical studies verify the utility of the method.
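To make the idea of a stepwise fit concrete, the following is a minimal sketch of forward selection for a logistic GLM. It is not the authors' algorithm: it covers a forward stage only, and the Newton-Raphson fitter, the EBIC-style stopping penalty `log(n) + 2 log(p)`, the function names, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson (IRLS); return (beta, log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        w = mu * (1.0 - mu)
        # Tiny ridge term keeps the Hessian invertible (numerical safeguard only).
        hess = X.T @ (X * w[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, X.T @ (y - mu))
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return beta, loglik

def forward_stepwise(X, y, max_steps=10):
    """Greedy forward selection: add the predictor with the largest likelihood gain."""
    n, p = X.shape
    intercept = np.ones((n, 1))
    # Assumed EBIC-style cost per added predictor; other criteria are possible.
    penalty = np.log(n) + 2.0 * np.log(p)
    selected = []
    _, best_ll = fit_logistic(intercept, y)
    for _ in range(max_steps):
        candidates = [j for j in range(p) if j not in selected]
        lls = {j: fit_logistic(np.hstack([intercept, X[:, selected + [j]]]), y)[1]
               for j in candidates}
        j_best = max(lls, key=lls.get)
        if 2.0 * (lls[j_best] - best_ll) <= penalty:
            break  # no candidate improves the fit enough to justify its cost
        selected.append(j_best)
        best_ll = lls[j_best]
    # Refit without any penalty on the selected set, so estimates are not shrunk.
    beta, _ = fit_logistic(np.hstack([intercept, X[:, selected]]), y)
    return selected, beta

# Simulated example (hypothetical data): only predictors 0 and 3 carry signal.
rng = np.random.default_rng(0)
n, p = 400, 30
X = rng.normal(size=(n, p))
eta_true = 2.0 * X[:, 0] - 2.0 * X[:, 3]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta_true))).astype(float)
selected, beta = forward_stepwise(X, y)
print("selected predictors:", selected)
print("coefficients (intercept first):", np.round(beta, 2))
```

Because the final coefficients come from an unpenalized refit on the selected predictors, they are not shrunk toward zero the way LASSO estimates are, which is the contrast the abstract draws.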

Penalized Likelihood Estimation of Gamma Distributed Response Variable via Corrected Solution of Regression Coefficients

Journal of Modern Applied Statistical Methods ◽

10.22237/jmasm/1608552720 ◽

2021 ◽

Vol 19 (1) ◽

Author(s):

Rasaki Olawale Olanrewaju

Keyword(s):

Generalized Linear Models ◽

Linear Models ◽

Penalized Likelihood ◽

Likelihood Estimation ◽

Regression Coefficients ◽

Response Variable ◽

Penalized Likelihood Estimation ◽

Selection Operator ◽

Minimax Concave Penalty

A Gamma-distributed response is fitted by penalized likelihood estimation under the Least Absolute Shrinkage and Selection Operator (LASSO) and the Minimax Concave Penalty (MCP) within the Generalized Linear Model (GLM) framework. The Gamma-related disturbance controls the influence of skewness and spread in the corrected path solutions of the regression coefficients.
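For reference, the two penalties named in this abstract have standard textbook forms, written here for a generic penalized GLM fit. The notation is assumed for illustration and is not taken from the paper: ℓ(β) is the Gamma GLM log-likelihood, n the sample size, p the number of coefficients, λ > 0 the tuning parameter, and γ > 1 the MCP concavity parameter. The paper's corrected path solution is not reproduced.

```latex
\hat{\beta} = \arg\min_{\beta} \left\{ -\frac{1}{n}\,\ell(\beta) + \sum_{j=1}^{p} p_{\lambda}(|\beta_j|) \right\}

p_{\lambda}^{\mathrm{LASSO}}(t) = \lambda t, \qquad t \ge 0

p_{\lambda,\gamma}^{\mathrm{MCP}}(t) =
\begin{cases}
\lambda t - \dfrac{t^{2}}{2\gamma}, & 0 \le t \le \gamma\lambda, \\[4pt]
\dfrac{\gamma\lambda^{2}}{2}, & t > \gamma\lambda,
\end{cases}
\qquad \gamma > 1
```

The MCP matches the LASSO penalty rate near zero but flattens to a constant beyond γλ, so large coefficients are not penalized further.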

The Strategic Use of Rodenticides Against House Mice (Mus Domesticus) Prior to Crop Invasion.

Wildlife Research ◽

10.1071/wr9940011 ◽

1994 ◽

Vol 21 (1) ◽

pp. 11 ◽

Cited By ~ 7

Author(s):

BJ Kay ◽

LE Twigg ◽

HI Nicol

Keyword(s):

Generalized Linear Models ◽

Predictive Models ◽

Linear Models ◽

Control Strategies ◽

Crop Damage ◽

House Mice ◽

Mus Domesticus ◽

Short Term ◽

Over Time

This study evaluated the effect of baiting refuge habitats around irrigated soyabeans with bromadiolone to control house mice and reduce their invasion of crops. Generalized linear models were constructed and used to predict changes in mouse abundance over time in both refuge and crop habitats of treated and untreated plots. Compared with untreated plots, bromadiolone significantly reduced the number of mice inhabiting the refuge habitat and reduced the rate at which mice invaded and colonized the adjacent crops. Despite this, no significant reductions in damage were detected as mice numbers failed to reach critical densities for crop damage on the untreated plots. This indicates a need for short-term predictive models when considering control strategies.

Improving random forest predictions in small datasets from two-phase sampling designs

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01688-3 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Sunwoo Han ◽

Brian D. Williamson ◽

Youyi Fong

Keyword(s):

Random Forest ◽

Generalized Linear Models ◽

Random Forests ◽

Sampling Design ◽

Linear Models ◽

Prediction Performance ◽

Two Phase ◽

Variable Screening ◽

Phase Sampling ◽

Two Phase Sampling

Background: While random forests are one of the most successful machine learning methods, it is necessary to optimize their performance for use with datasets resulting from a two-phase sampling design with a small number of cases, a common situation in biomedical studies, which often have rare outcomes and covariates whose measurement is resource-intensive. Methods: Using an immunologic marker dataset from a phase III HIV vaccine efficacy trial, we seek to optimize random forest prediction performance using combinations of variable screening, class balancing, weighting, and hyperparameter tuning. Results: Our experiments show that while class balancing helps improve random forest prediction performance when variable screening is not applied, class balancing has a negative impact on performance in the presence of variable screening. The impact of the weighting similarly depends on whether variable screening is applied. Hyperparameter tuning is ineffective in situations with small sample sizes. We further show that random forests under-perform generalized linear models for some subsets of markers, and prediction performance on this dataset can be improved by stacking random forests and generalized linear models trained on different subsets of predictors, and that the extent of improvement depends critically on the dissimilarities between candidate learner predictions. Conclusion: In small datasets from two-phase sampling design, variable screening and inverse sampling probability weighting are important for achieving good prediction performance of random forests. In addition, stacking random forests and simple linear models can offer improvements over random forests.

From Generalized Linear Models to Neural Networks, and Back

SSRN Electronic Journal ◽

10.2139/ssrn.3491790 ◽

2019 ◽

Cited By ~ 5

Author(s):

Mario V. Wuthrich

Keyword(s):

Neural Networks ◽

Generalized Linear Models ◽

Linear Models

Effect of Different Dentifrices, Bleaching with 35% Hydrogen Peroxide, and Red Wine on Surface Color and Roughness of Bovine Enamel

Current Dentistry ◽

10.2174/2542579x02999200817112951 ◽

2020 ◽

Vol 02 ◽

Author(s):

RM Garcia ◽

WF Vieira-Junior ◽

JD Theobaldo ◽

NIP Pini ◽

GM Ambrosano ◽

...

Keyword(s):

Hydrogen Peroxide ◽

Generalized Linear Models ◽

Repeated Measures ◽

Red Wine ◽

Linear Models ◽

Artificial Saliva ◽

Surface Color ◽

Dental Bleaching ◽

Surface Alteration ◽

Bovine Enamel

Objective: To evaluate color and roughness of bovine enamel exposed to dentifrices, dental bleaching with 35% hydrogen peroxide (HP), and erosion/staining by red wine. Methods: Bovine enamel blocks were exposed to: artificial saliva (control), Oral-B Pro-Health (stannous fluoride with sodium fluoride, SF), Sensodyne Repair & Protect (bioactive glass, BG), Colgate Pro-Relief (arginine and calcium carbonate, AR), or Chitodent (chitosan, CHI). After toothpaste exposure, half (n=12) of the samples were bleached (35% HP), and the other half were not (n=12). The color (CIE L*a*b*, ΔE), surface roughness (Ra), and scanning electron microscopy were evaluated. Color and roughness were assessed at baseline, post-dentifrice and/or -dental bleaching, and after red wine. The data were subjected to analysis of variance (ANOVA) (ΔE) for repeated measures (Ra), followed by Tukey's test. The L*, a*, and b* values were analyzed by generalized linear models (α=0.05). Results: The HP promoted an increase in Ra values; however, the SF, BG, and AR did not enable this alteration. After red wine, all groups apart from SF (unbleached) showed increases in Ra values; SF and AR promoted decreases in L* values; AR demonstrated higher ΔE values, differing from the control; and CHI decreased the L* variation in the unbleached group. Conclusion: Dentifrices did not interfere with bleaching efficacy of 35% HP. However, dentifrices acted as a preventive agent against surface alteration from dental bleaching (BG, SF, and AR) or red wine (SF). Dentifrices can decrease (CHI) or increase (AR and SF) staining by red wine.

Economic burden of readmission due to postoperative cerebrospinal fluid leak in Chinese patients

Journal of Comparative Effectiveness Research ◽

10.2217/cer-2020-0067 ◽

2020 ◽

Vol 9 (16) ◽

pp. 1105-1115

Author(s):

Shuqing Wu ◽

Xin Cui ◽

Shaoyu Zhang ◽

Wenqi Tian ◽

Jiazhen Liu ◽

...

Keyword(s):

Cerebrospinal Fluid ◽

Generalized Linear Models ◽

Economic Burden ◽

Linear Models ◽

Chinese Patients ◽

Cerebrospinal Fluid Leakage ◽

Index Hospitalization ◽

Real World Data ◽

Factors Associated ◽

Hospitalization Cost

Aim: This real-world data study investigated the economic burden and associated factors of readmissions for cerebrospinal fluid leakage (CSFL) post-cranial, transsphenoidal, or spinal index surgeries. Methods: Costs of CSFL readmissions and index hospitalizations during 2014–2018 were collected. Readmission cost was measured as absolute cost and as percentage of index hospitalization cost. Factors associated with readmission cost were explored using generalized linear models. Results: Readmission cost averaged US$2407–6106, 35–94% of index hospitalization cost. Pharmacy costs were the leading contributor. Generalized linear models showed transsphenoidal index surgery and surgical treatment for CSFL were associated with higher readmission costs. Conclusion: CSFL readmissions are a significant economic burden in China. Factors associated with higher readmission cost should be monitored.

Modelling passenger train arrival delays with Generalized Linear Models and its perspective for scheduling at main stations

8th International Conference on Railway Engineering (ICRE 2018) ◽

10.1049/cp.2018.0049 ◽

2018 ◽

Author(s):

M.M. de Faverges ◽

G. Russolillo ◽

C. Picouleau ◽

B. Merabet ◽

B. Houzel

Keyword(s):

Generalized Linear Models ◽

Linear Models ◽

Passenger Train

Determination of Enzyme or Binding Constants Using Generalized Linear Models, with Particular Reference to Michaelis–Menten Models

Journal of Pharmaceutical Sciences ◽

10.1002/jps.2600780514 ◽

1989 ◽

Vol 78 (5) ◽

pp. 413-416

Author(s):

Gerald Van Belle ◽

Sue Leurgans ◽

Pat Friel ◽

Sunwei Guo ◽

Mark Yerby

Keyword(s):

Generalized Linear Models ◽

Linear Models ◽

Binding Constants

Testing for treatment effect in covariate-adaptive randomized trials with generalized linear models and omitted covariates

Statistical Methods in Medical Research ◽

10.1177/09622802211008206 ◽

2021 ◽

pp. 096228022110082

Author(s):

Yang Li ◽

Wei Ma ◽

Yichen Qin ◽

Feifang Hu

Keyword(s):

Generalized Linear Models ◽

Treatment Effect ◽

Linear Models ◽

Type I Error ◽

Type I ◽

Test Statistic ◽

Asymptotic Results ◽

Adaptive Randomization ◽

Adjustment Method ◽

Inflated Type

Concerns have been expressed over the validity of statistical inference under covariate-adaptive randomization despite its extensive use in clinical trials. In the literature, the inferential properties under covariate-adaptive randomization have been mainly studied for continuous responses; in particular, it is well known that the usual two-sample t-test for treatment effect is typically conservative. This phenomenon of invalid tests has also been found for generalized linear models without adjusting for the covariates and is sometimes more worrisome due to inflated Type I error. The purpose of this study is to examine the unadjusted test for treatment effect under generalized linear models and covariate-adaptive randomization. For a large class of covariate-adaptive randomization methods, we obtain the asymptotic distribution of the test statistic under the null hypothesis and derive the conditions under which the test is conservative, valid, or anti-conservative. Several commonly used generalized linear models, such as logistic regression and Poisson regression, are discussed in detail. An adjustment method is also proposed to achieve a valid size based on the asymptotic results. Numerical studies confirm the theoretical findings and demonstrate the effectiveness of the proposed adjustment method.
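The conservativeness of the unadjusted t-test for continuous responses, mentioned in this abstract, can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' numerical study: it uses stratified permuted-block randomization (block size 2) as one example of a covariate-adaptive scheme, a single binary covariate with an assumed effect of 3.0, a Welch-type t statistic, and a normal critical value of 1.96.

```python
import numpy as np

def stratified_block_assign(z, rng):
    """Assign treatment 1:1 within each stratum of z using permuted blocks of size 2."""
    t = np.zeros(len(z), dtype=int)
    for level in np.unique(z):
        idx = np.flatnonzero(z == level)
        for start in range(0, len(idx), 2):
            block = idx[start:start + 2]
            # A trailing incomplete block simply receives one random arm.
            t[block] = rng.permutation([0, 1])[:len(block)]
    return t

def unadjusted_t_stat(y, t):
    """Two-sample t statistic that ignores the stratification covariate."""
    y1, y0 = y[t == 1], y[t == 0]
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return (y1.mean() - y0.mean()) / se

rng = np.random.default_rng(1)
n, n_rep, rejections = 200, 1000, 0
for _ in range(n_rep):
    z = rng.integers(0, 2, size=n)        # binary covariate used for stratification
    t = stratified_block_assign(z, rng)
    y = 3.0 * z + rng.normal(size=n)      # null hypothesis: no treatment effect
    rejections += int(abs(unadjusted_t_stat(y, t)) > 1.96)

rejection_rate = rejections / n_rep
print("empirical Type I error at nominal 5%:", rejection_rate)
```

Because the randomization balances the covariate across arms while the test's variance estimate still includes the covariate's contribution, the empirical rejection rate under the null falls well below the nominal level in this setup.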

The Relationships Between Serum DHEA-S and AMH Levels in Infertile Women: A Retrospective Cross-Sectional Study

Journal of Clinical Medicine ◽

10.3390/jcm10061211 ◽

2021 ◽

Vol 10 (6) ◽

pp. 1211

Author(s):

Li-Te Lin ◽

Kuan-Hao Tsui

Keyword(s):

Generalized Linear Models ◽

Large Scale ◽

Linear Models ◽

Positive Association ◽

Cross Sectional Study ◽

Dehydroepiandrosterone Sulphate ◽

Sectional Study ◽

Cross Sectional ◽

Infertile Women ◽

The Relationship

The relationship between serum dehydroepiandrosterone sulphate (DHEA-S) and anti-Mullerian hormone (AMH) levels has not been fully established. Therefore, we performed a large-scale cross-sectional study to investigate the association between serum DHEA-S and AMH levels. The study included a total of 2155 infertile women aged 20 to 46 years who were divided into four quartile groups (Q1 to Q4) based on serum DHEA-S levels. We found that there was a weak positive association between serum DHEA-S and AMH levels in infertile women (r = 0.190, p < 0.001). After adjusting for potential confounders, serum DHEA-S levels positively correlated with serum AMH levels in infertile women (β = 0.103, p < 0.001). Infertile women in the highest DHEA-S quartile category (Q4) showed significantly higher serum AMH levels (p < 0.001) compared with women in the lowest DHEA-S quartile category (Q1). The serum AMH levels significantly increased across increasing DHEA-S quartile categories in infertile women (p = 0.014) using generalized linear models after adjustment for potential confounders. Our data show that serum DHEA-S levels are positively associated with serum AMH levels.
