Consistent Estimation of Generalized Linear Models with High Dimensional Predictors via Stepwise Regression

Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 965
Author(s):  
Alex Pijyan ◽  
Qi Zheng ◽  
Hyokyoung G. Hong ◽  
Yi Li

Predictive models play a central role in decision making. Penalized regression approaches, such as least absolute shrinkage and selection operator (LASSO), have been widely used to construct predictive models and explain the impacts of the selected predictors, but the estimates are typically biased. Moreover, when data are ultrahigh-dimensional, penalized regression is usable only after applying variable screening methods to downsize variables. We propose a stepwise procedure for fitting generalized linear models with ultrahigh dimensional predictors. Our procedure can provide a final model; control both false negatives and false positives; and yield consistent estimates, which are useful to gauge the actual effect size of risk factors. Simulations and applications to two clinical studies verify the utility of the method.
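As a rough illustration of what a stepwise fit of a generalized linear model can look like, the sketch below runs a generic forward selection for a logistic model. It is not the authors' procedure: the BIC-type stopping rule, the simulated data, and all function names (`solve`, `fit_logistic`, `forward_stepwise`, `simulate`) are assumptions made for this example, and the abstract does not specify the selection criterion or any backward step.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def fit_logistic(X, y, cols, iters=12):
    """Newton-Raphson fit of a logistic GLM on an intercept plus `cols`."""
    Z = [[1.0] + [row[j] for j in cols] for row in X]
    p = len(cols) + 1
    beta = [0.0] * p

    def linpred(z):
        # Clamp the linear predictor to avoid overflow in exp().
        eta = sum(b * v for b, v in zip(beta, z))
        return max(-30.0, min(30.0, eta))

    for _ in range(iters):
        grad = [0.0] * p
        # Tiny ridge on the diagonal keeps the Hessian invertible.
        hess = [[1e-8 if a == b else 0.0 for b in range(p)] for a in range(p)]
        for z, yi in zip(Z, y):
            mu = 1.0 / (1.0 + math.exp(-linpred(z)))
            w = mu * (1.0 - mu)
            for a in range(p):
                grad[a] += (yi - mu) * z[a]
                for b in range(p):
                    hess[a][b] += w * z[a] * z[b]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    loglik = sum(yi * linpred(z) - math.log1p(math.exp(linpred(z)))
                 for z, yi in zip(Z, y))
    return beta, loglik

def forward_stepwise(X, y):
    """Add one predictor at a time while a BIC-type criterion improves (assumed rule)."""
    n, p = len(X), len(X[0])
    selected = []
    _, ll = fit_logistic(X, y, selected)
    best_bic = -2.0 * ll + math.log(n) * 1  # intercept-only model
    while len(selected) < p:
        cands = [j for j in range(p) if j not in selected]
        ll_new, j_best = max((fit_logistic(X, y, selected + [j])[1], j) for j in cands)
        bic = -2.0 * ll_new + math.log(n) * (len(selected) + 2)
        if bic >= best_bic:
            break  # no candidate improves the criterion
        selected.append(j_best)
        best_bic = bic
    return selected

def simulate(n=250, p=8, seed=1):
    """Toy data: only columns 0 and 3 affect the binary response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = []
    for row in X:
        eta = 1.5 * row[0] - 1.5 * row[3]
        y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)
    return X, y
```

Calling `forward_stepwise(*simulate())` returns the indices of the selected columns. Because the final model is refitted without a penalty, its coefficients are not shrunk toward zero, which is the contrast with penalized estimates that the abstract draws.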

2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Rasaki Olawale Olanrewaju

A Gamma-distributed response is fitted by penalized likelihood regression, using the Least Absolute Shrinkage and Selection Operator (LASSO) and the Minimax Concave Penalty (MCP), within the framework of Generalized Linear Models (GLMs). The Gamma-related disturbance controls the influence of skewness and spread in the corrected path solutions of the regression coefficients.
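For readers unfamiliar with the two penalties named above, the sketch below evaluates them in their standard textbook forms for a single coefficient. The parameter names `lam` (penalty level) and `gamma` (MCP concavity parameter) are notation assumed here, not taken from the abstract, and the code shows only the penalty terms, not the Gamma likelihood or the fitting algorithm.

```python
def lasso_penalty(t, lam):
    """LASSO penalty for one coefficient: lam * |t|."""
    return lam * abs(t)

def mcp_penalty(t, lam, gamma):
    """Minimax Concave Penalty for one coefficient (standard form).

    Equals lam*|t| - t^2 / (2*gamma) while |t| <= gamma*lam,
    then stays constant at gamma * lam^2 / 2.
    """
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam
```

The MCP matches the LASSO penalty near zero but flattens out for large coefficients, so large effects are penalized no further once they pass `gamma * lam`. This is why it is often described as reducing the bias of LASSO estimates.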


1994 ◽  
Vol 21 (1) ◽  
pp. 11 ◽  
Author(s):  
BJ Kay ◽  
LE Twigg ◽  
HI Nicol

This study evaluated the effect of baiting refuge habitats around irrigated soyabeans with bromadiolone to control house mice and reduce their invasion of crops. Generalized linear models were constructed and used to predict changes in mouse abundance over time in both refuge and crop habitats of treated and untreated plots. Compared with untreated plots, bromadiolone significantly reduced the number of mice inhabiting the refuge habitat and reduced the rate at which mice invaded and colonized the adjacent crops. Despite this, no significant reductions in damage were detected as mice numbers failed to reach critical densities for crop damage on the untreated plots. This indicates a need for short-term predictive models when considering control strategies.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Sunwoo Han ◽  
Brian D. Williamson ◽  
Youyi Fong

Background: While random forests are one of the most successful machine learning methods, it is necessary to optimize their performance for use with datasets resulting from a two-phase sampling design with a small number of cases, a common situation in biomedical studies, which often have rare outcomes and covariates whose measurement is resource-intensive. Methods: Using an immunologic marker dataset from a phase III HIV vaccine efficacy trial, we seek to optimize random forest prediction performance using combinations of variable screening, class balancing, weighting, and hyperparameter tuning. Results: Our experiments show that while class balancing helps improve random forest prediction performance when variable screening is not applied, class balancing has a negative impact on performance in the presence of variable screening. The impact of the weighting similarly depends on whether variable screening is applied. Hyperparameter tuning is ineffective in situations with small sample sizes. We further show that random forests under-perform generalized linear models for some subsets of markers, that prediction performance on this dataset can be improved by stacking random forests and generalized linear models trained on different subsets of predictors, and that the extent of improvement depends critically on the dissimilarities between candidate learner predictions. Conclusion: In small datasets from two-phase sampling design, variable screening and inverse sampling probability weighting are important for achieving good prediction performance of random forests. In addition, stacking random forests and simple linear models can offer improvements over random forests.


2020 ◽  
Vol 02 ◽  
Author(s):  
RM Garcia ◽  
WF Vieira-Junior ◽  
JD Theobaldo ◽  
NIP Pini ◽  
GM Ambrosano ◽  
...  

Objective: To evaluate color and roughness of bovine enamel exposed to dentifrices, dental bleaching with 35% hydrogen peroxide (HP), and erosion/staining by red wine. Methods: Bovine enamel blocks were exposed to: artificial saliva (control), Oral-B Pro-Health (stannous fluoride with sodium fluoride, SF), Sensodyne Repair & Protect (bioactive glass, BG), Colgate Pro-Relief (arginine and calcium carbonate, AR), or Chitodent (chitosan, CHI). After toothpaste exposure, half (n=12) of the samples were bleached (35% HP), and the other half were not (n=12). The color (CIE L*a*b*, ΔE), surface roughness (Ra), and scanning electron microscopy were evaluated. Color and roughness were assessed at baseline, post-dentifrice and/or -dental bleaching, and after red wine. The ΔE data were subjected to analysis of variance (ANOVA) and the Ra data to repeated-measures ANOVA, followed by Tukey's test. The L*, a*, and b* values were analyzed by generalized linear models (α = 0.05). Results: The HP promoted an increase in Ra values; however, the SF, BG, and AR prevented this alteration. After red wine, all groups apart from SF (unbleached) showed increases in Ra values; SF and AR promoted decreases in L* values; AR demonstrated higher ΔE values, differing from the control; and CHI decreased the L* variation in the unbleached group. Conclusion: Dentifrices did not interfere with bleaching efficacy of 35% HP. However, dentifrices acted as a preventive agent against surface alteration from dental bleaching (BG, SF, and AR) or red wine (SF). Dentifrices can decrease (CHI) or increase (AR and SF) staining by red wine.


2020 ◽  
Vol 9 (16) ◽  
pp. 1105-1115
Author(s):  
Shuqing Wu ◽  
Xin Cui ◽  
Shaoyu Zhang ◽  
Wenqi Tian ◽  
Jiazhen Liu ◽  
...  

Aim: This real-world data study investigated the economic burden and associated factors of readmissions for cerebrospinal fluid leakage (CSFL) post-cranial, transsphenoidal, or spinal index surgeries. Methods: Costs of CSFL readmissions and index hospitalizations during 2014–2018 were collected. Readmission cost was measured as absolute cost and as percentage of index hospitalization cost. Factors associated with readmission cost were explored using generalized linear models. Results: Readmission cost averaged US$2407–6106, 35–94% of index hospitalization cost. Pharmacy costs were the leading contributor. Generalized linear models showed transsphenoidal index surgery and surgical treatment for CSFL were associated with higher readmission costs. Conclusion: CSFL readmissions are a significant economic burden in China. Factors associated with higher readmission cost should be monitored.


1989 ◽  
Vol 78 (5) ◽  
pp. 413-416
Author(s):  
Gerald Van Belle ◽  
Sue Leurgans ◽  
Pat Friel ◽  
Sunwei Guo ◽  
Mark Yerby

2021 ◽  
pp. 096228022110082
Author(s):  
Yang Li ◽  
Wei Ma ◽  
Yichen Qin ◽  
Feifang Hu

Concerns have been expressed over the validity of statistical inference under covariate-adaptive randomization despite its extensive use in clinical trials. In the literature, the inferential properties under covariate-adaptive randomization have been mainly studied for continuous responses; in particular, it is well known that the usual two-sample t-test for treatment effect is typically conservative. This phenomenon of invalid tests has also been found for generalized linear models without adjusting for the covariates and is sometimes more worrisome due to inflated Type I error. The purpose of this study is to examine the unadjusted test for treatment effect under generalized linear models and covariate-adaptive randomization. For a large class of covariate-adaptive randomization methods, we obtain the asymptotic distribution of the test statistic under the null hypothesis and derive the conditions under which the test is conservative, valid, or anti-conservative. Several commonly used generalized linear models, such as logistic regression and Poisson regression, are discussed in detail. An adjustment method is also proposed to achieve a valid size based on the asymptotic results. Numerical studies confirm the theoretical findings and demonstrate the effectiveness of the proposed adjustment method.


2021 ◽  
Vol 10 (6) ◽  
pp. 1211
Author(s):  
Li-Te Lin ◽  
Kuan-Hao Tsui

The relationship between serum dehydroepiandrosterone sulphate (DHEA-S) and anti-Mullerian hormone (AMH) levels has not been fully established. Therefore, we performed a large-scale cross-sectional study to investigate the association between serum DHEA-S and AMH levels. The study included a total of 2155 infertile women aged 20 to 46 years who were divided into four quartile groups (Q1 to Q4) based on serum DHEA-S levels. We found that there was a weak positive association between serum DHEA-S and AMH levels in infertile women (r = 0.190, p < 0.001). After adjusting for potential confounders, serum DHEA-S levels positively correlated with serum AMH levels in infertile women (β = 0.103, p < 0.001). Infertile women in the highest DHEA-S quartile category (Q4) showed significantly higher serum AMH levels (p < 0.001) compared with women in the lowest DHEA-S quartile category (Q1). The serum AMH levels significantly increased across increasing DHEA-S quartile categories in infertile women (p = 0.014) using generalized linear models after adjustment for potential confounders. Our data show that serum DHEA-S levels are positively associated with serum AMH levels.

