Penalized logistic regression based on L1/2 penalty for high-dimensional DNA methylation data

2020 ◽  
Vol 28 ◽  
pp. 161-171
Author(s):  
Hong-Kun Jiang ◽  
Yong Liang
PLoS ONE ◽  
2019 ◽  
Vol 14 (5) ◽  
pp. e0217057 ◽  
Author(s):  
Sam Doerken ◽  
Marta Avalos ◽  
Emmanuel Lagarde ◽  
Martin Schumacher

Entropy ◽  
2020 ◽  
Vol 22 (5) ◽  
pp. 543 ◽  
Author(s):  
Konrad Furmańczyk ◽  
Wojciech Rejchel

In this paper, we consider prediction and variable selection in misspecified binary classification models under the high-dimensional scenario. We focus on two approaches to classification that are computationally efficient but lead to model misspecification. The first is to apply penalized logistic regression to classification data that possibly do not follow the logistic model. The second method is even more radical: we simply treat the class labels of objects as if they were numbers and apply penalized linear regression. We investigate these two approaches thoroughly and provide conditions that guarantee they are successful in prediction and variable selection. Our results hold even if the number of predictors is much larger than the sample size. The paper concludes with experimental results.
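The two approaches described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the L1 (lasso) penalty, the proximal-gradient (ISTA) solver, the function names, the simulated data, and the penalty levels. The labels are generated from a thresholded Gaussian rule, so the logistic model is deliberately not the true one.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of t * |z| (the lasso shrinkage step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(u):
    # Numerically stable logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def lasso_fit(X, y, lam, link, n_iter=300):
    """L1-penalized fit by proximal gradient descent (ISTA), no intercept.

    `link` maps the linear predictor to a fitted value: `sigmoid` gives
    penalized logistic regression, the identity gives penalized least squares.
    """
    n, p = len(X), len(X[0])
    # Conservative step: n / trace(X'X) is at most 1 / (Lipschitz constant).
    step = n / sum(v * v for row in X for v in row)
    beta = [0.0] * p
    for _ in range(n_iter):
        resid = [link(sum(b * v for b, v in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n
                for j in range(p)]
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


# Hypothetical data: more predictors (p) than observations (n).
random.seed(0)
n, p = 40, 60
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
# Only the first two predictors matter; the noise is Gaussian, not logistic.
y = [1.0 if 2 * row[0] - 2 * row[1] + random.gauss(0, 1) > 0 else 0.0
     for row in X]

# Approach 1: penalized logistic regression on the 0/1 labels.
beta_logit = lasso_fit(X, y, lam=0.1, link=sigmoid)

# Approach 2: treat the labels as numbers (centered) and use penalized
# linear regression.
y_mean = sum(y) / n
y_centered = [yi - y_mean for yi in y]
beta_lin = lasso_fit(X, y_centered, lam=0.05, link=lambda u: u)

print("logistic support:", [j for j, b in enumerate(beta_logit) if b != 0.0])
print("linear support:  ", [j for j, b in enumerate(beta_lin) if b != 0.0])
```

Both fits share one solver and differ only in the link function, which makes the contrast between the two misspecified models explicit. The penalty levels here are arbitrary; in practice they would be tuned, for example by cross-validation.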


2018 ◽  
Vol 16 (04) ◽  
pp. 1850010 ◽  
Author(s):  
Jiyun Choi ◽  
Kipoong Kim ◽  
Hokeun Sun

In genetic association studies, regularization methods are often used because of their computational efficiency in the analysis of high-dimensional genomic data. DNA methylation data generated from the Infinium HumanMethylation450 BeadChip Kit have a group structure in which an individual gene consists of multiple Cytosine–phosphate–Guanine (CpG) sites. Consequently, group-based regularization can precisely detect outcome-related CpG sites. Representative examples are sparse group lasso (SGL) and network-based regularization. The former is powerful when most of the CpG sites within the same gene are associated with a phenotype outcome. In contrast, the latter is preferred when only a few of the CpG sites within the same gene are related to the outcome. In this paper, we propose a new variable selection strategy based on a selection probability that measures the selection frequency of individual variables selected by both SGL and network-based regularization. In an extensive simulation study, we demonstrate that the proposed strategy achieves outstanding selection performance in all settings relative to both SGL and network-based regularization. We also apply the proposed strategy to identify differentially methylated CpG sites and their corresponding genes from ovarian cancer data.
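The idea of a selection probability built from selection frequencies can be sketched as follows. This is a hedged illustration only: the abstract does not say how the frequencies from the two methods are combined, so the half-sampling scheme and the pooling rule are assumptions. The selector `top_k_by_covariance` is a hypothetical stand-in and is neither SGL nor network-based regularization.

```python
import random


def top_k_by_covariance(k):
    """Build a placeholder selector (NOT SGL or network-based regularization).

    The returned function picks the k columns with the largest absolute
    covariance with y.
    """
    def select(X, y):
        n, p = len(X), len(X[0])
        y_bar = sum(y) / n
        score = [abs(sum(row[j] * (yi - y_bar) for row, yi in zip(X, y)))
                 for j in range(p)]
        return set(sorted(range(p), key=lambda j: -score[j])[:k])
    return select


def selection_frequency(X, y, selectors, n_resamples=50, seed=0):
    """Fraction of (half-sample, selector) runs that select each variable."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_resamples):
        idx = rng.sample(range(n), n // 2)  # half-sample without replacement
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        for select in selectors:
            for j in select(Xs, ys):
                counts[j] += 1
    total = n_resamples * len(selectors)
    return [c / total for c in counts]


# Hypothetical data: only the first variable is related to the outcome.
random.seed(1)
n, p = 60, 20
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1.0 if 3 * row[0] + random.gauss(0, 1) > 0 else 0.0 for row in X]

# Two stand-in selectors play the role of the two regularization methods.
freq = selection_frequency(X, y, [top_k_by_covariance(3), top_k_by_covariance(5)])
ranking = sorted(range(p), key=lambda j: -freq[j])
print("variables ranked by selection frequency:", ranking[:5])
```

Variables are then ranked by their pooled frequency, so a variable that either method picks consistently across resamples rises to the top. Replacing the placeholder selectors with real SGL and network-based fits would leave `selection_frequency` unchanged.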


2017 ◽  
Vol 33 (12) ◽  
pp. 1765-1772 ◽  
Author(s):  
Hokeun Sun ◽  
Ya Wang ◽  
Yong Chen ◽  
Yun Li ◽  
Shuang Wang
