The Higher-Order of Adaptive Lasso and Elastic Net Methods for Classification on High Dimensional Data

Autcha Araveeporn

doi:10.3390/math9101091

Mathematics ◽

10.3390/math9101091 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1091

Author(s):

Autcha Araveeporn

Keyword(s):

Logistic Regression ◽

Categorical Data ◽

High Dimensional Data ◽

Higher Order ◽

Elastic Net ◽

Adaptive Lasso ◽

High Dimensional ◽

Independent Variables ◽

Dependent Variables ◽

Adaptive Elastic Net

The lasso and elastic net methods are popular techniques for parameter estimation and variable selection. The adaptive lasso and adaptive elastic net methods additionally apply adaptive weights to the penalty function, based on the lasso and elastic net estimates. The adaptive weight is related to the power order of the estimator. Normally, these methods are used to estimate parameters in linear regression models, in which the dependent and independent variables are on a continuous scale. In this paper, we compare the lasso and elastic net methods with the higher-order adaptive lasso and adaptive elastic net methods for classification on high dimensional data. The classification concerns a categorical dependent variable that depends on the independent variables, which is modeled by logistic regression. The categorical dependent variable is binary, and the independent variables are continuous. The data are high dimensional when the number of independent variables exceeds the sample size. In the simulation study, the logistic regression model has a binary dependent variable and 20, 30, 40, and 50 independent variables, with sample sizes smaller than the number of independent variables. The independent variables are generated from normal distributions with several variances, and the dependent variable is obtained by computing probabilities from the logit function and transforming them into binary data. For the real-data application, the type of leukemia is the dependent variable and a subset of gene expression is the independent variables. The methods are compared by the average percentage of prediction accuracy. The results show that the higher-order adaptive lasso method performs well under large dispersion, whereas the higher-order adaptive elastic net method outperforms under small dispersion.
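The simulation design and the adaptive weights described in this abstract can be illustrated with a minimal Python sketch. It is not the author's code. It assumes zero-mean normal predictors, a response set to 1 when the logit-model probability exceeds 0.5, and the standard adaptive lasso weight form, in which each weight is the inverse of the absolute initial estimate raised to a power order gamma. The function names, the coefficient vector, and the small constant eps are hypothetical.

```python
import math
import random


def sigmoid(eta):
    """Numerically stable logistic function."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def simulate_logistic_data(n, p, sigma, beta, seed=0):
    """Return (X, y): n rows of p normal predictors and a binary response.

    Assumption: each predictor is N(0, sigma^2) and the response is 1 when the
    logit-model probability exceeds 0.5; the paper's exact transform may differ.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, sigma) for _ in range(p)]
        eta = sum(b * x for b, x in zip(beta, row))
        X.append(row)
        y.append(1 if sigmoid(eta) > 0.5 else 0)
    return X, y


def adaptive_weights(initial_estimates, gamma, eps=1e-8):
    """Adaptive penalty weights w_j = 1 / (|b_j| + eps) ** gamma.

    gamma is the power order; eps (hypothetical) guards against division by zero.
    """
    return [1.0 / (abs(b) + eps) ** gamma for b in initial_estimates]


# High dimensional setting: fewer observations (n) than predictors (p).
beta = [1.0, -1.0] + [0.0] * 18  # hypothetical true coefficients
X, y = simulate_logistic_data(n=15, p=20, sigma=1.0, beta=beta)

# Hypothetical initial estimates; a larger gamma penalizes small estimates more.
weights = adaptive_weights([2.0, 0.5, 0.0], gamma=2)
```

With a higher power order, coefficients whose initial estimates are near zero receive very large weights and are penalized more heavily, while large initial estimates are penalized less.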

Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification

Computers in Biology and Medicine ◽

10.1016/j.compbiomed.2015.10.008 ◽

2015 ◽

Vol 67 ◽

pp. 136-145 ◽

Cited By ~ 45

Author(s):

Zakariya Yahya Algamal ◽

Muhammad Hisyam Lee

Keyword(s):

Logistic Regression ◽

Gene Selection ◽

Elastic Net ◽

Cancer Classification ◽

High Dimensional ◽

Adaptive Elastic Net

Nonnegative estimation and variable selection via adaptive elastic-net for high-dimensional data

Communications in Statistics - Simulation and Computation ◽

10.1080/03610918.2019.1642484 ◽

2019 ◽

pp. 1-17

Author(s):

Ning Li ◽

Hu Yang ◽

Jing Yang

Keyword(s):

Variable Selection ◽

High Dimensional Data ◽

Elastic Net ◽

High Dimensional ◽

Adaptive Elastic Net

Appendix: A Brief Introduction to Analyzing Categorical Data and Finding More Data

Using R for Data Analysis in Social Sciences ◽

10.1093/oso/9780190656218.003.0008 ◽

2018 ◽

pp. 302-326

Author(s):

Quan Li

Keyword(s):

Logistic Regression ◽

Categorical Data ◽

Public Domain ◽

Statistical Independence ◽

Discrete Variables ◽

Short List ◽

The Public ◽

Independent Variables ◽

Dichotomous Variable ◽

Chi Squared

This chapter provides a brief introduction to two techniques often used with discrete data: testing statistical independence between two discrete variables with Chi-squared statistics, and testing the effects of some independent variables on the probability of a dependent variable taking on the value of one rather than zero with logistic regression. Both are illustrated with a dichotomous variable measuring self-reported happiness among respondents in the World Values Survey. The chapter also provides a short list of publicly available data resources to familiarize readers with the wealth of data in the public domain.
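As a rough illustration of the first technique mentioned in this abstract, the Pearson Chi-squared statistic for a contingency table can be computed as follows. This is a minimal Python sketch, not taken from the chapter (which works in R). The counts are invented for illustration and the function name is hypothetical.

```python
def chi_squared_statistic(table):
    """Pearson Chi-squared statistic for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical 2x2 table: rows are two groups, columns are happy / not happy.
table = [[10, 20], [20, 10]]
stat = chi_squared_statistic(table)
dof = (len(table) - 1) * (len(table[0]) - 1)  # degrees of freedom
```

The statistic is then compared against a Chi-squared distribution with `dof` degrees of freedom to judge whether the two variables are independent.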

The cross-validated AUC for MCP-logistic regression with high-dimensional data

Statistical Methods in Medical Research ◽

10.1177/0962280211428385 ◽

2011 ◽

Vol 22 (5) ◽

pp. 505-518 ◽

Cited By ~ 7

Author(s):

Dingfeng Jiang ◽

Jian Huang ◽

Ying Zhang

Keyword(s):

Logistic Regression ◽

High Dimensional Data ◽

High Dimensional ◽

The Cross

Scalable Subspace Logistic Regression Models for High Dimensional Data

Web Technologies and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-642-29253-8_65 ◽

2012 ◽

pp. 685-694 ◽

Cited By ~ 1

Author(s):

Shuang Wang ◽

Xiaojun Chen ◽

Joshua Zhexue Huang ◽

Shengzhong Feng

Keyword(s):

Logistic Regression ◽

Regression Models ◽

High Dimensional Data ◽

High Dimensional ◽

Logistic Regression Models

“Cultural additivity” and how the values and norms of Confucianism, Buddhism, and Taoism co-exist, interact, and influence Vietnamese society: a Bayesian analysis of long-standing folktales, R / Stan

10.31219/osf.io/xv7jz ◽

2018 ◽

Author(s):

Quan-Hoang Vuong ◽

Tung Ho ◽

Viet-Phuong La ◽

Dam Van Nhue ◽

Bui Quang Khiem ◽

...

Keyword(s):

Logistic Regression ◽

Bayesian Analysis ◽

Full Account ◽

Empirical Results ◽

Independent Variables ◽

Religious Intolerance ◽

Dependent Variables ◽

Values And Beliefs ◽

The Common ◽

Bayesian Logistic Regression

Every year, the Vietnamese people reportedly burned about 50,000 tons of joss paper, in the form of not only bank notes but also iPhones, cars, clothes, and even housekeepers, in the hope of pleasing the dead. The practice was mistakenly attributed to traditional Buddhist teachings but in fact originated in China, a fact of which most Vietnamese were not aware. In other aspects of life, there were many similar examples of Vietnamese being ready and comfortable with adding new norms, values, and beliefs, even contradictory ones, to their culture. This phenomenon, dubbed “cultural additivity”, prompted us to study the co-existence, interaction, and influences among core values and norms of the Three Teachings (Confucianism, Buddhism, and Taoism) as shown through Vietnamese folktales. By applying Bayesian logistic regression, we evaluated whether the key message of a story was dominated by a religion (dependent variables), as affected by the appearance of values and anti-values pertaining to the Three Teachings in the story (independent variables). Our main findings included the existence of the cultural additivity of Confucian and Taoist values. More specifically, empirical results showed that the interaction or addition of the values of Taoism and Confucianism in folktales together helped predict whether the key message of a story was about Confucianism, β_{VT⋅VC} = 0.86. Meanwhile, there was no such statistical tendency for Buddhism. The results lead to a number of important implications. First, this showed the dominance of Confucianism, because the joint appearance of Confucian and Taoist values in a story led to the story’s key message being dominated by Confucianism. Thus, it presented evidence of Confucian dominance and against liberal interpretations of the concept of the Common Roots of Three Religions (“tam giáo đồng nguyên”) as religious unification or unicity. Second, the concept of “cultural additivity” could help explain many interesting socio-cultural phenomena, namely the absence of religious intolerance and extremism in Vietnamese society, outrageous cases of sophistry in education, the low productivity in creative endeavors like science and technology, and the misleading branding strategy in business. We are aware that our results are only preliminary and that more studies, both theoretical and empirical, must be carried out to give a full account of the explanatory reach of “cultural additivity”.

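PEMODELAN TINGKAT KERAWANAN DEMAM BERDARAH DI KABUPATEN BANJAR DENGAN METODE ANALISIS REGRESI LOGISTIK YANG TERBOBOTI GEOGRAFIS [Modeling the Dengue Fever Insecurity Rate in Banjar Regency Using Geographically Weighted Logistic Regression Analysis]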

MEDIA BINA ILMIAH ◽

10.33758/mbi.v14i4.348 ◽

2019 ◽

Vol 14 (4) ◽

pp. 2393

Author(s):

Dewi Sri Susanti ◽

Pamona Dwi Rahayu ◽

Oni Soesanto

Keyword(s):

Logistic Regression ◽

Dengue Fever ◽

Estimation Method ◽

Likelihood Estimation ◽

Rate Model ◽

Independent Variables ◽

Dependent Variables ◽

Geographically Weighted Logistic Regression ◽

The Relationship ◽

Maximum Likelihood Estimation Method

Regression analysis is a method for investigating the relationship between a dependent variable (Y) and independent variables (X). Logistic regression is a regression model used when the dependent variable is qualitative. If the logistic regression is influenced by the location of each observation point where the data are collected, it becomes a Geographically Weighted Logistic Regression (GWLR). The insecurity rate of dengue fever has two or more categories, so this case can be handled with GWLR. This research aims to clarify the procedure for testing the parameters of the GWLR model and to build an insecurity rate model of dengue fever with the GWLR method in Banjar Regency. The categorical dependent variable is the insecurity rate of dengue fever, and the independent variables are the population density, the distance from the capital of the subdistrict to the capital of the regency, fogging per subdistrict, the percentage of households living clean and healthy, the percentage of healthy homes, and the percentage of access to decent sanitation. The parameters are estimated using the Maximum Likelihood Estimation method, and the results are presented as a thematic map showing that not all of the variables influence the insecurity rate of dengue fever.

On the Performance of Variable Selection and Classification via Rank-Based Classifier

Mathematics ◽

10.3390/math7050457 ◽

2019 ◽

Vol 7 (5) ◽

pp. 457 ◽

Cited By ~ 1

Author(s):

Md Sarker ◽

Michael Pokojovy ◽

Sangjin Kim

Keyword(s):

Gene Expression ◽

Logistic Regression ◽

Gene Expression Data ◽

Gene Selection ◽

Laplacian Matrix ◽

Adaptive Lasso ◽

High Dimensional ◽

Expression Data ◽

Correlation Pattern ◽

Future Outcomes

In high-dimensional gene expression data analysis, the accuracy and reliability of cancer classification and the selection of important genes play a crucial role. To identify these important genes and predict future outcomes (tumor vs. non-tumor), various methods have been proposed in the literature, but only a few of them take into account correlation patterns and grouping effects among the genes. In this article, we propose a rank-based modification of the popular penalized logistic regression procedure based on a combination of ℓ1 and ℓ2 penalties capable of handling possible correlation among genes in different groups. While the ℓ1 penalty maintains sparsity, the ℓ2 penalty induces smoothness based on the information from the Laplacian matrix, which represents the correlation pattern among genes. We combined logistic regression with the BH-FDR (Benjamini and Hochberg false discovery rate) screening procedure and a newly developed rank-based selection method to come up with an optimal model retaining the important genes. Through simulation studies and a real-world application to high-dimensional colon cancer gene expression data, we demonstrated that the proposed rank-based method outperforms such currently popular methods as lasso, adaptive lasso, and elastic net in both gene selection and classification.
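The combined penalty described in this abstract can be sketched in Python under stated assumptions. The sketch uses the unnormalized graph Laplacian L = D - A of a hypothetical gene adjacency matrix and a penalty equal to lam1 times the sum of absolute coefficients plus lam2 times the quadratic form of the coefficients in L. The exact penalty, Laplacian normalization, and rank-based steps in the article may differ; all names and values below are illustrative.

```python
def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adjacency)
    laplacian = []
    for i in range(n):
        degree = sum(adjacency[i])
        laplacian.append(
            [(degree if i == j else 0.0) - adjacency[i][j] for j in range(n)]
        )
    return laplacian


def combined_penalty(beta, laplacian, lam1, lam2):
    """lam1 * sum |beta_j| (sparsity) + lam2 * beta' L beta (smoothness)."""
    l1_term = sum(abs(b) for b in beta)
    quad_term = sum(
        beta[i] * laplacian[i][j] * beta[j]
        for i in range(len(beta))
        for j in range(len(beta))
    )
    return lam1 * l1_term + lam2 * quad_term


# Two hypothetical genes linked by one edge.
A = [[0.0, 1.0], [1.0, 0.0]]
L = graph_laplacian(A)

smooth = combined_penalty([1.0, 1.0], L, lam1=1.0, lam2=1.0)   # equal coefficients
rough = combined_penalty([1.0, -1.0], L, lam1=1.0, lam2=1.0)   # opposite coefficients
```

Linked genes with equal coefficients incur only the ℓ1 term, whereas linked genes with opposing coefficients are penalized further by the Laplacian term, which is how the quadratic part encourages smoothness across correlated genes.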

High Dimensional Logistic Regression Model using Adjusted Elastic Net Penalty

Pakistan Journal of Statistics and Operation Research ◽

10.18187/pjsor.v11i4.990 ◽

2015 ◽

Vol 11 (4) ◽

pp. 667 ◽

Cited By ~ 8

Author(s):

Zakariya Y Algamal ◽

Muhammad Hisyam Lee

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Elastic Net ◽

High Dimensional

High-dimensional index tracking based on the adaptive elastic net

Quantitative Finance ◽

10.1080/14697688.2020.1737328 ◽

2020 ◽

Vol 20 (9) ◽

pp. 1513-1530

Author(s):

Lianjie Shu ◽

Fangquan Shi ◽

Guoliang Tian

Keyword(s):

Elastic Net ◽

High Dimensional ◽

Index Tracking ◽

Adaptive Elastic Net
