Penalized logistic regression based on L1/2 penalty for high-dimensional DNA methylation data

Penalized logistic regression for high-dimensional DNA methylation data with case-control studies

Bioinformatics ◽

10.1093/bioinformatics/bts145 ◽

2012 ◽

Vol 28 (10) ◽

pp. 1368-1375 ◽

Cited By ~ 54

Author(s):

Hokeun Sun ◽

Shuang Wang

Keyword(s):

DNA Methylation ◽

Logistic Regression ◽

Case Control ◽

High Dimensional ◽

Methylation Data ◽

Case Control Studies ◽

Penalized Logistic Regression

Penalized logistic regression with low prevalence exposures beyond high dimensional settings

PLoS ONE ◽

10.1371/journal.pone.0217057 ◽

2019 ◽

Vol 14 (5) ◽

pp. e0217057 ◽

Cited By ~ 10

Author(s):

Sam Doerken ◽

Marta Avalos ◽

Emmanuel Lagarde ◽

Martin Schumacher

Keyword(s):

Logistic Regression ◽

High Dimensional ◽

Penalized Logistic Regression ◽

Low Prevalence

Prediction and Variable Selection in High-Dimensional Misspecified Binary Classification

Entropy ◽

10.3390/e22050543 ◽

2020 ◽

Vol 22 (5) ◽

pp. 543 ◽

Cited By ~ 2

Author(s):

Konrad Furmańczyk ◽

Wojciech Rejchel

Keyword(s):

Logistic Regression ◽

Variable Selection ◽

Logistic Model ◽

Binary Classification ◽

Model Misspecification ◽

High Dimensional ◽

Classification Models ◽

Computationally Efficient ◽

Class Labels ◽

Penalized Logistic Regression

In this paper, we consider prediction and variable selection in misspecified binary classification models under the high-dimensional scenario. We focus on two approaches to classification that are computationally efficient but lead to model misspecification. The first is to apply penalized logistic regression to classification data that possibly do not follow the logistic model. The second is even more radical: we treat the class labels of objects as if they were numbers and apply penalized linear regression. We investigate these two approaches thoroughly and provide conditions that guarantee they are successful in prediction and variable selection. Our results hold even if the number of predictors is much larger than the sample size. The paper concludes with experimental results.
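
The two approaches named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy data, the lasso (L1) penalty, the proximal gradient solver, and every name below (`fit_l1`, `soft_threshold`, `lam`) are assumptions chosen for illustration. The labels are generated from a noisy threshold rule rather than a logistic model, so both fits are misspecified.

```python
import math
import random

def soft_threshold(z, t):
    """Proximal operator of t * |z| (the lasso shrinkage step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_l1(X, y, lam, loss, step=0.1, iters=300):
    """Proximal gradient descent for an L1-penalized fit without intercept.

    loss="logistic" uses the logistic log-likelihood;
    loss="squared" uses least squares on the numeric labels.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, xi))
            if loss == "logistic":
                resid = 1.0 / (1.0 + math.exp(-eta)) - yi
            else:
                resid = eta - yi
            for j in range(p):
                grad[j] += resid * xi[j] / n
        # Gradient step on the smooth loss, then shrink towards zero.
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta

# Toy data (assumed): only feature 0 matters, and the labels follow a
# noisy threshold rule, not a logistic model.
rng = random.Random(0)
n, p = 200, 5
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [1.0 if row[0] + 0.5 * rng.gauss(0.0, 1.0) > 0 else 0.0 for row in X]

# Approach 1: penalized logistic regression on the 0/1 labels.
beta_logistic = fit_l1(X, y, lam=0.1, loss="logistic")

# Approach 2: treat the labels as numbers (centered here, since the
# sketch has no intercept) and apply penalized linear regression.
y_mean = sum(y) / n
beta_linear = fit_l1(X, [yi - y_mean for yi in y], lam=0.1, loss="squared")

print("logistic:", [round(b, 3) for b in beta_logistic])
print("linear:  ", [round(b, 3) for b in beta_linear])
```

In this toy setting both fits should put their largest coefficient on feature 0, which is the kind of variable selection behavior the abstract refers to. The penalty level `lam=0.1` is arbitrary and would normally be tuned, for example by cross-validation.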

New variable selection strategy for analysis of high-dimensional DNA methylation data

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720018500105 ◽

2018 ◽

Vol 16 (04) ◽

pp. 1850010 ◽

Cited By ~ 2

Author(s):

Jiyun Choi ◽

Kipoong Kim ◽

Hokeun Sun

Keyword(s):

DNA Methylation ◽

Variable Selection ◽

Group Structure ◽

Association Studies ◽

High Dimensional ◽

Selection Strategy ◽

Methylation Data ◽

Selection Probability ◽

Cancer Data ◽

CpG Sites

In genetic association studies, regularization methods are often used because of their computational efficiency in the analysis of high-dimensional genomic data. DNA methylation data generated from the Infinium HumanMethylation450 BeadChip Kit have a group structure in which an individual gene consists of multiple Cytosine–phosphate–Guanine (CpG) sites. Consequently, group-based regularization can precisely detect outcome-related CpG sites. Representative examples are sparse group lasso (SGL) and network-based regularization. The former is powerful when most of the CpG sites within the same gene are associated with a phenotype outcome. In contrast, the latter is preferred when only a few of the CpG sites within the same gene are related to the outcome. In this paper, we propose a new variable selection strategy based on a selection probability that measures the selection frequency of individual variables selected by both SGL and network-based regularization. In an extensive simulation study, we demonstrate that the proposed strategy shows strong selection performance relative to both SGL and network-based regularization across the situations considered. We also apply the proposed strategy to identify differentially methylated CpG sites and their corresponding genes from ovarian cancer data.
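
The idea of a selection probability as a selection frequency can be sketched as follows. The abstract does not give the exact definition, so this is only one plausible reading: the selections from repeated runs of two methods are pooled, and each variable's probability is the fraction of runs that selected it. The function name, the example index sets, and the 0.5 cutoff are all hypothetical.

```python
from collections import Counter

def selection_probability(selected_sets, n_features):
    """Fraction of runs in which each feature index was selected."""
    counts = Counter(j for s in selected_sets for j in set(s))
    total = len(selected_sets)
    return [counts[j] / total for j in range(n_features)]

# Hypothetical selections (feature indices) from three resampled runs
# of each method; real runs would come from fitted SGL and
# network-based models.
sgl_runs = [{0, 1, 2}, {0, 1}, {0, 2}]
network_runs = [{0}, {0, 3}, {1}]

# Pool the runs of both methods and compute per-feature frequencies.
probs = selection_probability(sgl_runs + network_runs, n_features=4)

# Keep features selected in at least half of the pooled runs
# (the 0.5 cutoff is an arbitrary choice for this sketch).
selected = [j for j, prob in enumerate(probs) if prob >= 0.5]

print(probs)
print(selected)
```

Pooling lets a variable that is picked consistently by either method rank highly, which matches the motivation that each method is strong in a different sparsity pattern within genes.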

Feature selection using logistic regression in case–control DNA methylation data of Parkinson's disease: A comparative study

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2018.08.018 ◽

2018 ◽

Vol 457 ◽

pp. 14-18 ◽

Cited By ~ 1

Author(s):

Aishwarya Kakade ◽

Baby Kumari ◽

Pankaj Singh Dholaniya

Keyword(s):

DNA Methylation ◽

Parkinson's Disease ◽

Logistic Regression ◽

Feature Selection ◽

Comparative Study ◽

Case Control ◽

Methylation Data

Analyzing High-Dimensional Gene Expression and DNA Methylation Data with R

10.1201/9780429155192 ◽

2020 ◽

Author(s):

Hongmei Zhang

Keyword(s):

Gene Expression ◽

DNA Methylation ◽

High Dimensional ◽

Methylation Data

Penalized logistic regression with the adaptive LASSO for gene selection in high-dimensional cancer classification

Expert Systems with Applications ◽

10.1016/j.eswa.2015.08.016 ◽

2015 ◽

Vol 42 (23) ◽

pp. 9326-9332 ◽

Cited By ~ 52

Author(s):

Zakariya Yahya Algamal ◽

Muhammad Hisyam Lee

Keyword(s):

Logistic Regression ◽

Gene Selection ◽

Cancer Classification ◽

Adaptive LASSO ◽

High Dimensional ◽

Penalized Logistic Regression

Network-based regularization for matched case-control analysis of high-dimensional DNA methylation data

Statistics in Medicine ◽

10.1002/sim.5694 ◽

2012 ◽

Vol 32 (12) ◽

pp. 2127-2139 ◽

Cited By ~ 17

Author(s):

Hokeun Sun ◽

Shuang Wang

Keyword(s):

DNA Methylation ◽

Case Control ◽

High Dimensional ◽

Control Analysis ◽

Methylation Data ◽

Matched Case ◽

Case Control Analysis

pETM: a penalized Exponential Tilt Model for analysis of correlated high-dimensional DNA methylation data

Bioinformatics ◽

10.1093/bioinformatics/btx064 ◽

2017 ◽

Vol 33 (12) ◽

pp. 1765-1772 ◽

Cited By ~ 8

Author(s):

Hokeun Sun ◽

Ya Wang ◽

Yong Chen ◽

Yun Li ◽

Shuang Wang

Keyword(s):

DNA Methylation ◽

High Dimensional ◽

Methylation Data ◽

Exponential Tilt ◽

Tilt Model

Analyzing high‐dimensional gene expression and DNA methylation data with R. Hongmei Zhang (2020). Chapman & Hall/CRC Press, Mathematical and Computational Biology. 202 pages, £59.99 (Paperback), £150.00 (Hardbound), £53.99 (e‐book). ISBN 9780367495169

Journal of the Royal Statistical Society Series A (Statistics in Society) ◽

10.1111/rssa.12706 ◽

2021 ◽

Author(s):

Anoop Chaturvedi

Keyword(s):

Gene Expression ◽

DNA Methylation ◽

Computational Biology ◽

High Dimensional ◽

Methylation Data
