The Higher-Order of Adaptive Lasso and Elastic Net Methods for Classification on High Dimensional Data

Mathematics ◽  
2021 ◽  
Vol 9 (10) ◽  
pp. 1091
Author(s):  
Autcha Araveeporn

The lasso and elastic net are popular techniques for parameter estimation and variable selection. The adaptive lasso and adaptive elastic net extend them by placing adaptive weights on the penalty function; the weights are derived from the lasso and elastic net estimates and depend on the power order of the estimator. These methods are usually applied to linear regression models, in which the dependent and independent variables are measured on a continuous scale. In this paper, we compare the lasso and elastic net with the higher-order adaptive lasso and adaptive elastic net for classification on high-dimensional data. Classification here means predicting a categorical dependent variable from the independent variables, which is done with the logistic regression model. The dependent variable is binary, and the independent variables are continuous. Data are high-dimensional when the number of independent variables exceeds the sample size. In the simulation study, the logistic regression model has a binary dependent variable and 20, 30, 40, or 50 independent variables, with sample sizes smaller than the number of independent variables. The independent variables are generated from normal distributions with several variances, and the dependent variable is obtained by computing probabilities from the logit function and transforming them into binary outcomes. For the real-data application, the type of leukemia is the dependent variable and a subset of gene expression measurements forms the independent variables. The methods are compared by the average percentage of prediction accuracy. The results show that the higher-order adaptive lasso performs well when dispersion is large, whereas the higher-order adaptive elastic net performs better when dispersion is small.
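The two-stage idea described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the proximal-gradient solver, the omission of an intercept, the Bernoulli draw for the binary response, the tuning values (`lam1`, `lam2`, `order`), and all function names are assumptions chosen for illustration. The adaptive weights are taken as 1/|initial estimate|^order, where `order` plays the role of the power order mentioned above.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme linear predictors.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def penalized_logistic(X, y, lam1, weights, lam2=0.0, n_iter=3000):
    """Minimize mean logistic loss + lam1 * sum_j weights[j] * |b_j|
    + (lam2 / 2) * ||b||^2 by proximal gradient descent (no intercept)."""
    n, p = X.shape
    # 1 / step upper-bounds the curvature of the smooth part of the objective.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / (4.0 * n) + lam2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - y) / n + lam2 * beta
        z = beta - step * grad
        # Soft-thresholding: proximal operator of the weighted L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam1 * weights, 0.0)
    return beta

def adaptive_fit(X, y, lam1, lam2=0.0, order=2.0, eps=1e-8):
    """Stage 1: unweighted fit. Stage 2: refit with weights 1/|b_init|^order."""
    p = X.shape[1]
    initial = penalized_logistic(X, y, lam1, np.ones(p), lam2)
    weights = 1.0 / (np.abs(initial) + eps) ** order
    return initial, penalized_logistic(X, y, lam1, weights, lam2)

def accuracy_percent(X, y, beta):
    # Percentage of observations whose predicted class matches the label.
    return 100.0 * np.mean((sigmoid(X @ beta) >= 0.5) == (y == 1))

# Hypothetical simulation: fewer observations (30) than variables (50).
rng = np.random.default_rng(0)
n, p, sigma = 30, 50, 1.0
X = rng.normal(0.0, sigma, size=(n, p))
true_beta = np.zeros(p)
true_beta[:5] = 2.0  # assumed sparse signal
y = rng.binomial(1, sigmoid(X @ true_beta)).astype(float)

lasso, ada_lasso = adaptive_fit(X, y, lam1=0.05, order=2.0)
enet, ada_enet = adaptive_fit(X, y, lam1=0.05, lam2=0.05, order=2.0)
for name, b in [("lasso", lasso), ("adaptive lasso", ada_lasso),
                ("elastic net", enet), ("adaptive elastic net", ada_enet)]:
    print(name, np.count_nonzero(b), accuracy_percent(X, y, b))
```

Because a coefficient that is exactly zero in the first stage receives a very large weight, the second stage can only keep or shrink the initial set of selected variables. The accuracy printed here is in-sample; a comparison like the one in the paper would average accuracy over repeated simulations.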

Author(s):  
Quan Li

This chapter provides a brief introduction to two techniques often used with discrete data: testing statistical independence between two discrete variables with Chi-squared statistics, and testing the effects of some independent variables on the probability of a dependent variable taking on the value of one rather than zero with logistic regression. Both are illustrated by focusing on a dichotomous variable measuring self-reported happiness by survey respondents in the World Values Survey. The chapter also provides a short list of publicly available data resources that help to familiarize readers with the wealth of data in the public domain.
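As a small illustration of the first technique, the Chi-squared statistic for independence can be computed directly from a contingency table. The counts and the row and column labels below are made up for illustration and are not taken from the chapter or from survey data.

```python
import numpy as np

def chi_squared_statistic(table):
    """Pearson Chi-squared statistic and degrees of freedom for a contingency table."""
    observed = np.asarray(table, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    # Expected counts under independence: (row total * column total) / grand total.
    expected = row_totals @ col_totals / observed.sum()
    statistic = float(((observed - expected) ** 2 / expected).sum())
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    return statistic, dof

# Hypothetical counts: rows = two respondent groups, columns = happy / not happy.
table = [[30, 10], [10, 30]]
statistic, dof = chi_squared_statistic(table)
# 3.841 is the 5% critical value of the Chi-squared distribution with 1 degree of freedom.
reject_independence = statistic > 3.841
print(statistic, dof, reject_independence)  # 20.0 1 True
```

Here every expected count is 20, so each of the four cells contributes (10)^2 / 20 = 5 to the statistic.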


2018 ◽  
Author(s):  
Quan-Hoang Vuong ◽  
Tung Ho ◽  
Viet-Phuong La ◽  
Dam Van Nhue ◽  
Bui Quang Khiem ◽  
...  

Every year, the Vietnamese people reportedly burned about 50,000 tons of joss paper, which took the form not only of bank notes but also of iPhones, cars, clothes, and even housekeepers, in the hope of pleasing the dead. The practice was mistakenly attributed to traditional Buddhist teachings but in fact originated in China, which most Vietnamese were not aware of. In other aspects of life, there were many similar examples of Vietnamese people readily and comfortably adding new norms, values, and beliefs, even contradictory ones, to their culture. This phenomenon, dubbed “cultural additivity”, prompted us to study the co-existence, interaction, and influences among the core values and norms of the Three Teachings (Confucianism, Buddhism, and Taoism) as shown through Vietnamese folktales. By applying Bayesian logistic regression, we evaluated the probability that the key message of a story was dominated by a religion (dependent variables), as affected by the appearance of values and anti-values pertaining to the Three Teachings in the story (independent variables). Our main findings included the existence of the cultural additivity of Confucian and Taoist values. More specifically, empirical results showed that the interaction or addition of the values of Taoism and Confucianism in folktales helped predict whether the key message of a story was about Confucianism, β_{VT⋅VC} = 0.86. Meanwhile, there was no such statistical tendency for Buddhism. The results lead to a number of important implications. First, they showed the dominance of Confucianism, because the joint appearance of Confucian and Taoist values in a story led to the story’s key message being dominated by Confucianism. Thus, they presented evidence of Confucian dominance and against liberal interpretations of the concept of the Common Roots of Three Religions (“tam giáo đồng nguyên”) as religious unification or unicity. Second, the concept of “cultural additivity” could help explain many interesting socio-cultural phenomena, namely the absence of religious intolerance and extremism in Vietnamese society, outrageous cases of sophistry in education, the low productivity in creative endeavors like science and technology, and the misleading branding strategy in business. We are aware that our results are only preliminary and that more studies, both theoretical and empirical, must be carried out to give a full account of the explanatory reach of “cultural additivity”.
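To give a sense of scale for the reported coefficient, a logistic regression coefficient can be converted to an odds ratio by exponentiation. This short calculation assumes that β_{VT⋅VC} = 0.86 is expressed on the log-odds scale and that the interaction term is the product of two indicator variables; the abstract does not state either point explicitly.

```python
import math

beta_interaction = 0.86  # value reported in the abstract
# Multiplicative change in the odds associated with the interaction term,
# over and above the main effects (assumes a log-odds coefficient).
odds_ratio = math.exp(beta_interaction)
print(round(odds_ratio, 2))  # 2.36
```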


2019 ◽  
Vol 14 (4) ◽  
pp. 2393
Author(s):  
Dewi Sri Susanti ◽  
Pamona Dwi Rahayu ◽  
Oni Soesanto

Regression analysis is a method for investigating the relationship between a dependent variable (Y) and independent variables (X). Logistic regression is a regression model used when the dependent variable is qualitative. If the logistic regression is influenced by the location of each observation point where the data are collected, the model becomes a Geographically Weighted Logistic Regression (GWLR). The insecurity rate of dengue fever has two or more categories, so it can be modeled with GWLR. This research aims to clarify the procedure for testing the parameters of the GWLR model and to build an insecurity rate model of dengue fever with the GWLR method in Banjar Regency. The categorical dependent variable is the insecurity rate of dengue fever, and the independent variables are the population density, the distance from the subdistrict capital to the regency capital, fogging per subdistrict, the percentage of households living clean and healthy, the percentage of healthy homes, and the percentage of access to decent sanitation. The parameters are estimated with the Maximum Likelihood Estimation method, and the results are presented as a thematic map, which shows that not all independent variables influence the insecurity rate of dengue fever.
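The core of a geographically weighted fit can be sketched as a logistic regression whose likelihood is reweighted around each location. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it uses a binary response, a Gaussian distance kernel with a fixed bandwidth, Newton iterations for the weighted maximum likelihood estimate, and simulated data; all names are hypothetical.

```python
import numpy as np

def gaussian_kernel_weights(coords, target, bandwidth):
    """Weight each observation by its distance to the target location."""
    distance = np.linalg.norm(coords - target, axis=1)
    return np.exp(-0.5 * (distance / bandwidth) ** 2)

def weighted_logistic_mle(X, y, w, n_iter=50, ridge=1e-8):
    """Newton iterations for the weighted logistic log-likelihood (with intercept)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-np.clip(design @ beta, -30.0, 30.0)))
        gradient = design.T @ (w * (y - prob))
        # Weighted information matrix; the tiny ridge term only guards against singularity.
        hessian = (design * (w * prob * (1.0 - prob))[:, None]).T @ design
        hessian += ridge * np.eye(design.shape[1])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Simulated data: the slope of the single covariate drifts from west to east.
rng = np.random.default_rng(1)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=(n, 1))
local_slope = 0.5 + 1.5 * coords[:, 0]
prob = 1.0 / (1.0 + np.exp(-(-0.2 + local_slope * x[:, 0])))
y = rng.binomial(1, prob).astype(float)

# One local fit per target location (for example, one per subdistrict centroid).
targets = np.array([[0.1, 0.5], [0.9, 0.5]])
estimates = np.array([
    weighted_logistic_mle(x, y, gaussian_kernel_weights(coords, t, bandwidth=0.3))
    for t in targets
])
print(estimates)  # rows: locations; columns: intercept, slope
```

Each row of `estimates` is a separate set of coefficients, which is what allows the results to be drawn as a thematic map with one value per location.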


Mathematics ◽  
2019 ◽  
Vol 7 (5) ◽  
pp. 457 ◽  
Author(s):  
Md Sarker ◽  
Michael Pokojovy ◽  
Sangjin Kim

In high-dimensional gene expression data analysis, the accuracy and reliability of cancer classification and the selection of important genes play a crucial role. To identify these important genes and predict future outcomes (tumor vs. non-tumor), various methods have been proposed in the literature, but only a few of them take into account correlation patterns and grouping effects among the genes. In this article, we propose a rank-based modification of the popular penalized logistic regression procedure based on a combination of ℓ1 and ℓ2 penalties capable of handling possible correlation among genes in different groups. While the ℓ1 penalty maintains sparsity, the ℓ2 penalty induces smoothness based on the information from the Laplacian matrix, which represents the correlation pattern among genes. We combined logistic regression with the BH-FDR (Benjamini and Hochberg false discovery rate) screening procedure and a newly developed rank-based selection method to come up with an optimal model retaining the important genes. Through simulation studies and a real-world application to high-dimensional colon cancer gene expression data, we demonstrated that the proposed rank-based method outperforms currently popular methods such as the lasso, adaptive lasso, and elastic net in both gene selection and classification.
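The roles of the two penalties can be seen in a small numerical sketch. It assumes an unnormalized graph Laplacian L = D - A built from a hypothetical gene adjacency matrix and a penalty of the form λ1·‖β‖₁ + λ2·βᵀLβ; the abstract does not specify the exact Laplacian or scaling used, so these choices are illustrative only.

```python
import numpy as np

def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of an undirected graph."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def network_penalty(beta, laplacian, lam1, lam2):
    """Sparsity term (L1) plus smoothness term (quadratic form in the Laplacian)."""
    return lam1 * np.abs(beta).sum() + lam2 * beta @ laplacian @ beta

# Hypothetical network of 4 genes: genes 0-1 are linked, and genes 2-3 are linked.
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = graph_laplacian(A)

smooth = np.array([1.0, 1.0, -2.0, -2.0])  # linked genes share coefficients
rough = np.array([1.0, -1.0, -2.0, 2.0])   # linked genes disagree

# beta' L beta equals the sum of squared differences across linked pairs,
# so only the "rough" vector pays the smoothness cost.
print(network_penalty(smooth, L, lam1=0.1, lam2=0.5))
print(network_penalty(rough, L, lam1=0.1, lam2=0.5))
```

Both vectors have the same ℓ1 norm (6), so the difference between the two penalties comes entirely from the Laplacian term: 0 for `smooth` and (1 - (-1))² + (-2 - 2)² = 20 for `rough`.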


2020 ◽  
Vol 20 (9) ◽  
pp. 1513-1530
Author(s):  
Lianjie Shu ◽  
Fangquan Shi ◽  
Guoliang Tian
