Latent profile analysis of human values: What is the optimal number of clusters?

Mikkel N. Schmidt; Daniel Seddig; Eldad Davidov; Morten Mørup; Kristoffer Jon Albers; Jan Michael Bauer; Fumiko Kano Glückstad

doi:10.5964/meth.5479

Methodology ◽

10.5964/meth.5479 ◽

2021 ◽

Vol 17 (2) ◽

pp. 127-148

Author(s):

Mikkel N. Schmidt ◽

Daniel Seddig ◽

Eldad Davidov ◽

Morten Mørup ◽

Kristoffer Jon Albers ◽

...

Keyword(s):

Model Selection ◽

Selection Criteria ◽

Latent Profile Analysis ◽

Profile Analysis ◽

Information Criterion ◽

Optimal Number ◽

Number Of Clusters ◽

Model Selection Criteria ◽

Optimal Number Of Clusters ◽

Latent Profile

Latent Profile Analysis (LPA) is a method for extracting homogeneous clusters characterized by a common response profile. Previous work applying LPA to human value segmentation tends to select a small number of moderately homogeneous clusters based on model selection criteria such as the Akaike information criterion, the Bayesian information criterion, and entropy. The question is whether a small number of clusters is all that can be gleaned from the data. While some studies have carefully compared different statistical model selection criteria, there are currently no established criteria for assessing whether an increased number of clusters generates meaningful theoretical insights. This article examines the content and meaningfulness of the clusters extracted using two algorithms: Variational Bayesian LPA and Maximum Likelihood LPA. For both methods, our results point towards eight as the optimal number of clusters for characterizing distinctive Schwartz value typologies that generate meaningful insights and predict several external variables.
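
The selection criteria named in this abstract can be sketched in a few lines. The block below is a minimal illustration, not the authors' procedure: it assumes an LPA with profile-specific means and variances and no covariances, and the log-likelihood values in `toy_log_liks` are invented for the example (they are not results from the paper). The helper names `lpa_param_count`, `relative_entropy`, and `best_k` are hypothetical.

```python
import math


def lpa_param_count(n_profiles, n_indicators):
    """Free parameters, assuming profile-specific means and variances
    and no covariances (an assumption, not taken from the paper)."""
    means = n_profiles * n_indicators
    variances = n_profiles * n_indicators
    weights = n_profiles - 1  # mixing proportions sum to one
    return means + variances + weights


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_lik


def relative_entropy(posteriors):
    """Normalized entropy of the posterior membership probabilities.
    Each row holds one respondent's probabilities over the profiles.
    Values near 1 indicate well-separated profiles."""
    n = len(posteriors)
    k = len(posteriors[0])
    total = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1 - total / (n * math.log(k))


def best_k(log_liks, n_indicators, n_obs):
    """Return the number of profiles with the lowest BIC, plus all scores."""
    scores = {
        k: bic(ll, lpa_param_count(k, n_indicators), n_obs)
        for k, ll in log_liks.items()
    }
    return min(scores, key=scores.get), scores


# Invented log-likelihoods for 2, 3, and 4 profiles (illustration only).
toy_log_liks = {2: -5200.0, 3: -5100.0, 4: -5080.0}
chosen, scores = best_k(toy_log_liks, n_indicators=10, n_obs=500)
```

Because the penalty grows with the parameter count, a small gain in log-likelihood from an extra profile may not lower the BIC, which is one reason such criteria tend to favor few clusters.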

Using Model Selection Criteria to Choose the Number of Principal Components

Journal of Statistical Theory and Applications ◽

10.1007/s44199-021-00002-4 ◽

2021 ◽

Vol 20 (3) ◽

pp. 450-461

Author(s):

Stanley L. Sclove

Keyword(s):

Model Selection ◽

Principal Components ◽

Bayesian Information Criterion ◽

Selection Criteria ◽

Information Criterion ◽

Information Criteria ◽

Akaike's Information Criterion ◽

Model Selection Criteria ◽

Adequate Number ◽

Number Of Principal Components

The use of information criteria, especially AIC (Akaike’s information criterion) and BIC (Bayesian information criterion), for choosing an adequate number of principal components is illustrated.

A Comparative Analysis and Evaluation of Model Selection Criteria

Volume 5: Education and Globalization ◽

10.1115/imece2017-70152 ◽

2017 ◽

Author(s):

Ahmed H. Kamel ◽

Ali S. Shaqlaih ◽

Arslan Rozyyev

Keyword(s):

Experimental Data ◽

Model Selection ◽

Selection Criteria ◽

Oil And Gas ◽

Information Criterion ◽

Oil And Gas Industry ◽

Absolute Error ◽

Ongoing Research ◽

Model Selection Criteria ◽

Innovative Strategy

The ongoing research on model choice and selection has generated a plethora of approaches. With such a wealth of methods, it can be difficult for a researcher to know which model selection approach to use to select the appropriate model for prediction. The authors present an evaluation of various model selection criteria from a decision-theoretic perspective, using experimental data to define and recommend a criterion for selecting the best model. In this analysis, six of the most common selection criteria, nineteen friction factor correlations, and eight sets of experimental data are employed. The results show that the traditional correlation coefficient, R², is inappropriate, and that root mean square error, RMSE, can be used to rank models but does not give much insight into their accuracy. Other criteria such as correlation ratio, mean absolute error, and standard deviation are also evaluated. The Akaike information criterion, AIC, has shown its superiority to the other selection criteria. The authors propose AIC as an alternative to use when fitting experimental data or evaluating existing correlations. Indeed, the AIC method is information-theory based, theoretically sound, and stable. The paper presents a detailed discussion of the model selection criteria, their pros and cons, and how they can be used to allow proper comparison of different models so that the best model can be inferred based on sound mathematical theory. In conclusion, model selection is an interesting problem, and an innovative strategy is introduced to help alleviate similar challenges faced by professionals in the oil and gas industry.
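
To make the contrast between error-based ranking and AIC concrete, here is a minimal sketch. It assumes Gaussian errors and uses the common least-squares form AIC = n ln(RSS/n) + 2k, which drops an additive constant. The residual lists and parameter counts are invented for illustration and are not the friction factor data from the paper.

```python
import math


def rmse(residuals):
    """Root mean square error of the residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def mae(residuals):
    """Mean absolute error of the residuals."""
    return sum(abs(r) for r in residuals) / len(residuals)


def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit under Gaussian errors,
    up to an additive constant: n * ln(RSS / n) + 2k."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


# Invented residuals: the 5-parameter model fits slightly better
# than the 2-parameter one on the same eight observations.
simple_residuals = [0.5, -0.4, 0.3, -0.5, 0.4, -0.3, 0.5, -0.4]
complex_residuals = [0.45, -0.4, 0.3, -0.45, 0.4, -0.3, 0.45, -0.4]

simple_aic = aic_least_squares(simple_residuals, n_params=2)
complex_aic = aic_least_squares(complex_residuals, n_params=5)
```

In this toy case the larger model has the lower RMSE, yet the smaller model has the lower AIC, because the three extra parameters cost more than the small reduction in residual error gains.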

Latent Profile Analysis of the Five Facet Mindfulness Questionnaire in a Sample With a History of Recurrent Depression

Assessment ◽

10.1177/1073191117715114 ◽

2017 ◽

Vol 27 (1) ◽

pp. 149-163 ◽

Cited By ~ 6

Author(s):

Jenny Gu ◽

Anke Karl ◽

Ruth Baer ◽

Clara Strauss ◽

Thorsten Barnhofer ◽

...

Keyword(s):

Psychological Functioning ◽

Latent Profile Analysis ◽

Profile Analysis ◽

Optimal Number ◽

Future Research ◽

Five Facet Mindfulness Questionnaire ◽

Recurrent Depression ◽

Mindfulness Skills ◽

History Of ◽

Latent Profile

Extending previous research, we applied latent profile analysis in a sample of adults with a history of recurrent depression to identify subgroups with distinct response profiles on the Five Facet Mindfulness Questionnaire and understand how these relate to psychological functioning. The sample was randomly divided into two subsamples to first examine the optimal number of latent profiles (test sample; n = 343) and then validate the identified solution (validation sample; n = 340). In both test and validation samples, a four-profile solution was revealed where two profiles mapped broadly onto those previously identified in nonclinical samples: “high mindfulness” and “nonjudgmentally aware.” Two additional subgroups, “moderate mindfulness” and “very low mindfulness,” were observed. “High mindfulness” was associated with the most adaptive psychological functioning and “very low mindfulness” with the least adaptive. In most people with recurrent depression, mindfulness skills are expressed evenly across different domains. However, in a small minority a meaningful and replicable uneven profile indicating nonjudgmental awareness is observable. Current findings require replication and future research should examine the extent to which profiles change from periods of wellness to illness in people with recurrent depression and how profiles are influenced by exposure to mindfulness-based intervention.
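
The split-sample design described above can be sketched as follows. This is only an illustration of a random division into a test half and a validation half. The function name `split_sample`, the seed, and the integer participant IDs are assumptions. Only the subsample sizes (343 and 340) come from the abstract.

```python
import random


def split_sample(ids, n_first, seed=0):
    """Randomly divide participant IDs into two disjoint subsamples."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_first], shuffled[n_first:]


# 343 + 340 = 683 participants, numbered here purely for illustration.
participant_ids = range(683)
test_ids, validation_ids = split_sample(participant_ids, n_first=343)
```

The number of profiles would be chosen on the first subsample only, and the chosen solution then refitted on the second to check that it replicates.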

Improved Treatment of the Independent Variables for the Deployment of Model Selection Criteria in the Analysis of Complex Systems

Entropy ◽

10.3390/e23091202 ◽

2021 ◽

Vol 23 (9) ◽

pp. 1202

Author(s):

Luca Spolladore ◽

Michela Gelfusa ◽

Riccardo Rossi ◽

Andrea Murari

Keyword(s):

Model Selection ◽

Complex Systems ◽

Selection Criteria ◽

A Priori ◽

Synthetic Data ◽

Information Criterion ◽

Model Selection Criteria ◽

Dependent Variables ◽

Chi Squared ◽

Highly Correlated

Model selection criteria are widely used to identify the model that best represents the data among a set of potential candidates. Among the different model selection criteria, the Bayesian information criterion (BIC) and the Akaike information criterion (AIC) are the most popular and best understood. In the derivation of these indicators, it was assumed that the model’s dependent variables have already been properly identified and that the entries are not affected by significant uncertainties. These issues can become quite serious when investigating complex systems, especially when variables are highly correlated and the measurement uncertainties associated with them are not negligible. More sophisticated versions of these criteria, capable of better detecting spurious relations between variables when non-negligible noise is present, are proposed in this paper. They are derived within a Bayesian statistics framework by adding an a priori Chi-squared probability distribution function of the model, dependent on a specifically defined information-theoretic quantity that takes into account the redundancy between the dependent variables. The performance of the proposed versions of these criteria is assessed through a series of systematic simulations, using synthetic data for various classes of functions and noise levels. The results show that the upgraded formulation of the criteria clearly outperforms the traditional ones in most of the cases reported.

PERSONALITY TYPES ON NEW GROUND: LATENT PROFILE ANALYSIS BASED ON THREE PSYCHOLEXICAL MODELS OF PERSONALITY

Primenjena psihologija ◽

10.19090/pp.2016.1.41-61 ◽

2016 ◽

Vol 9 (1) ◽

pp. 41

Author(s):

Selka Sadiković ◽

Dina Fesl ◽

Petar Čolović

Keyword(s):

Big Five ◽

Latent Profile Analysis ◽

Profile Analysis ◽

Dimensional Space ◽

Information Criterion ◽

Personality Types ◽

Short Version ◽

Big Five Model ◽

The Stability ◽

Latent Profile

The aim of the research was to determine the number, characteristics, and level of convergence of personality types extracted in the space of the three psycho-lexical conceptualizations of personality – The Big Five, HEXACO, and The Big Seven. The study was conducted on a sample of 343 participants (55.7% female), aged 18–60 (M = 33.99). The participants completed the IPIP-50 (Big Five model operationalization), IPIP-HEXACO (HEXACO model operationalization), and BF+2-70 (short version of the questionnaire for assessing seven lexical dimensions in the Serbian language) questionnaires. Latent profile analysis was conducted in the space of dimension scores of the three questionnaires. The Bayesian information criterion suggested a three-class solution to be optimal in the space of all three questionnaires. Based on the structure of the latent profiles, the classes within the three models were interpreted as “resilient”, “reserved”, and “maladjusted”. The congruency of classes was analyzed by multiple correspondence analyses, which indicated a high convergence of types in the two-dimensional space. Results indicate a distinct similarity between the extracted profiles and the profiles from previous studies, generally pointing towards the stability of the three big personality prototypes.

On the Use of Entropy to Improve Model Selection Criteria

Entropy ◽

10.3390/e21040394 ◽

2019 ◽

Vol 21 (4) ◽

pp. 394 ◽

Cited By ~ 6

Author(s):

Andrea Murari ◽

Emmanuele Peluso ◽

Francesco Cianfrani ◽

Pasquale Gaudio ◽

Michele Lungaroni

Keyword(s):

Model Selection ◽

Shannon Entropy ◽

Selection Criteria ◽

Mean Squared Error ◽

Geodesic Distance ◽

Information Criterion ◽

Residual Distribution ◽

Model Selection Criteria ◽

Synthetic Indicators ◽

Squared Error

The most widely used forms of model selection criteria, the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC), are expressed in terms of synthetic indicators of the residual distribution: the variance and the mean-squared error of the residuals, respectively. In many applications in science, the noise affecting the data can be expected to have a Gaussian distribution. Therefore, at the same level of variance and mean-squared error, models whose residuals are more uniformly distributed should be favoured. The degree of uniformity of the residuals can be quantified by the Shannon entropy. Including the Shannon entropy in the BIC and AIC expressions significantly improves these criteria. The better performance has been demonstrated empirically with a series of simulations for various classes of functions and for different levels and statistics of the noise. In the presence of outliers, a better treatment of the errors, using the Geodesic Distance, has proved essential.
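
As a rough illustration of how the uniformity of residuals might be quantified, the sketch below estimates the Shannon entropy of a residual sample from an equal-width histogram. The binning scheme and the default of ten bins are assumptions made for this example. The abstract does not state how the entropy is estimated or how it enters the modified BIC and AIC expressions, so neither is reproduced here.

```python
import math


def shannon_entropy(residuals, n_bins=10):
    """Histogram estimate of the Shannon entropy (in nats) of residuals.
    Equal-width bins over the observed range are an assumption."""
    lo, hi = min(residuals), max(residuals)
    if hi == lo:
        return 0.0  # all residuals identical: no spread, zero entropy
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for r in residuals:
        # Clamp so the maximum value falls in the last bin.
        idx = min(int((r - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(residuals)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


# Invented residual samples: one spread evenly, one piled up at zero.
spread = [i / 10 for i in range(100)]
peaked = [0.0] * 90 + [float(i) for i in range(1, 11)]

spread_entropy = shannon_entropy(spread)
peaked_entropy = shannon_entropy(peaked)
```

With this estimator the evenly spread sample scores close to the maximum of ln(10) for ten bins, while the sample concentrated in a single bin scores much lower.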

Latent profile analysis with nonnormal mixtures: A Monte Carlo examination of model selection using fit indices

Computational Statistics & Data Analysis ◽

10.1016/j.csda.2015.02.019 ◽

2016 ◽

Vol 93 ◽

pp. 146-161 ◽

Cited By ~ 12

Author(s):

Grant B. Morgan ◽

Kari J. Hodge ◽

Aaron R. Baggett

Keyword(s):

Monte Carlo ◽

Model Selection ◽

Latent Profile Analysis ◽

Profile Analysis ◽

Fit Indices ◽

Latent Profile

Application of statistical model selection criteria to the Stock Synthesis assessment program

Canadian Journal of Fisheries and Aquatic Sciences ◽

10.1139/f00-137 ◽

2000 ◽

Vol 57 (9) ◽

pp. 1784-1793 ◽

Cited By ~ 20

Author(s):

S Langitoto Helu ◽

David B Sampson ◽

Yanshui Yin

Keyword(s):

Model Selection ◽

Akaike Information Criterion ◽

Bayesian Information Criterion ◽

Selection Criteria ◽

Information Criterion ◽

Model Selection Criteria ◽

Assessment Program ◽

Correct Model ◽

Complex Models ◽

Multiple Data Sets

Statistical modeling involves building sufficiently complex models to represent the system being investigated. Overly complex models lead to imprecise parameter estimates, increase the subjective role of the modeler, and can distort the perceived characteristics of the system under investigation. One approach for controlling the tendency to increased complexity and subjectivity is to use model selection criteria that account for these factors. The effectiveness of two selection criteria was tested in an application with the stock assessment program known as Stock Synthesis. This program, which is often used on the U.S. west coast to assess the status of exploited marine fish stocks, can handle multiple data sets and mimic highly complex population dynamics. The Akaike information criterion and Schwarz's Bayesian information criterion are criteria that satisfy the fundamental principles of model selection: goodness-of-fit, parsimony, and objectivity. Their ability to select the correct model form and produce accurate estimates was evaluated in Monte Carlo experiments with the Stock Synthesis program. In general, the Akaike information criterion and the Bayesian information criterion had similar performance in selecting the correct model, and they produced comparable levels of accuracy in their estimates of ending stock biomass.
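
A Monte Carlo check of whether a criterion selects the correct model form can be sketched on a much simpler problem than Stock Synthesis. The block below is an illustrative toy, not the authors' experiment: data are simulated from a constant-mean model with Gaussian noise, a constant model and a straight-line model are fitted by least squares, and the share of trials in which each criterion picks the true (constant) model is recorded. All names, the noise level, the sample size, and the trial count are assumptions.

```python
import math
import random


def gaussian_loglik(rss, n):
    """Maximized Gaussian log-likelihood given the residual sum of squares."""
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1.0)


def rss_constant(y):
    """Residual sum of squares for the constant-mean model."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y)


def rss_linear(x, y):
    """Residual sum of squares for a least-squares straight line."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))


def aic(log_lik, k):
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    return k * math.log(n) - 2 * log_lik


def correct_selection_rates(n_obs=30, n_trials=500, seed=1):
    """Share of trials in which AIC and BIC pick the true constant model."""
    rng = random.Random(seed)
    x = [float(i) for i in range(n_obs)]
    aic_hits = 0
    bic_hits = 0
    for _ in range(n_trials):
        y = [5.0 + rng.gauss(0.0, 1.0) for _ in x]  # true model: constant
        ll_const = gaussian_loglik(rss_constant(y), n_obs)
        ll_line = gaussian_loglik(rss_linear(x, y), n_obs)
        # Parameter counts include the noise variance: 2 versus 3.
        if aic(ll_const, 2) <= aic(ll_line, 3):
            aic_hits += 1
        if bic(ll_const, 2, n_obs) <= bic(ll_line, 3, n_obs):
            bic_hits += 1
    return aic_hits / n_trials, bic_hits / n_trials


aic_rate, bic_rate = correct_selection_rates()
```

In this nested two-model setting with 30 observations, BIC charges more per extra parameter than AIC (ln 30 is greater than 2), so it keeps the simpler true model at least as often. That ordering is a property of this toy setup and says nothing about the relative performance reported for Stock Synthesis.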

The Genetic Structure of Slovak Spotted Cattle Based on Genome-wide Analysis

Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis ◽

10.11118/actaun202068010057 ◽

2020 ◽

Vol 68 (1) ◽

pp. 57-61

Author(s):

Kristína Lehocká ◽

Barbora Olšanská ◽

Radovan Kasarda ◽

Ondrej Kadlečík ◽

Anna Trakovická ◽

...

Keyword(s):

Bayesian Analysis ◽

Clustering Algorithm ◽

Information Criterion ◽

Optimal Number ◽

Number Of Clusters ◽

Membership Probability ◽

Production Type ◽

Genome Wide ◽

Genetic Clusters ◽

Optimal Number Of Clusters

The objective of the study was to determine the membership probability and level of admixture among Slovak Spotted cattle and historically related breeds (Ayrshire, Holstein, Swiss Simmental and Slovak Pinzgau). The analysis was based on a panel of 35 934 SNPs that were used for genotyping 423 individuals. The optimal number of clusters was estimated in two ways: by analysis of the Bayesian information criterion and by a Bayesian clustering algorithm. The optimal number of clusters ranged from 3 to 5, depending on the applied approach. Subsequently, the population structure was tested by discriminant analysis of principal components (DAPC) and unsupervised Bayesian analysis based on the correlated allele frequencies model. The first discriminant function revealed three genetic clusters in the population, resulting from the production type and origin of the analysed breeds. The unsupervised Bayesian analysis showed similar results, where the highest level of admixture was found between Slovak Pinzgau and Slovak Spotted cattle (0.6%). Despite that, the results of this study clearly showed that the Slovak Spotted cattle is genetically separated from the other breeds that were involved in its grading-up process.

A Modified Model-Selection Criteria in a Generalised Estimating Equation for Latent Class Regression Models

MATEMATIKA ◽

10.11113/matematika.v35.n2.1175 ◽

2019 ◽

Vol 35 (2) ◽

pp. 201-211

Author(s):

Jerry Dwi Trijoyo Purnomo ◽

Chih-Rung Chen ◽

Guan-Hua Huang

Keyword(s):

Model Selection ◽

Selection Criteria ◽

Latent Class ◽

Information Criterion ◽

Estimating Equation ◽

Modified Model ◽

Model Selection Criteria ◽

Penalty Term ◽

Latent Class Regression ◽

Full Likelihood

In recent years, generalised estimating equations (GEEs) have played an important role in many fields of research, such as biomedicine. In this paper, we use GEEs for latent class regression (LCR) with covariate effects on underlying and measured variables. However, there are only a few model-selection criteria for GEEs. The widely known Akaike information criterion (AIC) cannot be used directly, since AIC is based on the full likelihood, whereas GEEs are not likelihood based. Hence, we propose a modification of AIC in GEEs for LCR models, where the likelihood is replaced by the quasi-likelihood and a proper adjustment is made through a penalty term. The data of the modified hospital elder life program (mHELP) project are used to illustrate our method.
