Assessing Goodness of Fit of Hybrid Choice Models

2015 ◽  
Vol 2495 (1) ◽  
pp. 131-141 ◽  
Author(s):  
Yutaka Motoaki ◽  
Ricardo A. Daziano


1990 ◽  
Vol 17 (2) ◽  
pp. 184-191 ◽  
Author(s):  
Frank R. Wilson ◽  
Sundar Damodaran ◽  
J. David Innes

Disaggregate mode choice models were calibrated for intercity passenger travel in Canada using a data base drawn from the Canadian Travel Survey. Multinomial logit models were calibrated for business and nonbusiness trips in the eastern and the western regions of Canada. The calibrated models produced reliable results in terms of goodness-of-fit measures. The likelihood ratio index, ρ²(c), varied from 0.282 to 0.436. Results obtained were comparable to those of previous studies. The research identified the significance of level-of-service factors in determining mode choice. The findings from the study indicated that the Canadian Travel Survey data could be used for developing disaggregate models for possible use in policy impact analysis. The potential for the use of this data base in the transportation planning process could be enhanced if some relatively minor modifications were made. Key words: models, disaggregate, choice, intercity, passenger, travel time, cost, frequency, service.
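
For readers unfamiliar with the likelihood ratio index reported above, the following is a minimal sketch of how such an index is commonly computed, assuming the usual definition of one minus the ratio of the fitted model's log-likelihood to that of a constants-only (market share) model. The abstract does not state its formula, and the mode counts and fitted log-likelihood below are hypothetical values chosen for illustration, not figures from the study.

```python
import math


def market_share_loglik(choice_counts):
    """Log-likelihood of a constants-only model, which predicts each
    alternative with its observed sample share."""
    total = sum(choice_counts.values())
    return sum(n * math.log(n / total) for n in choice_counts.values() if n > 0)


def likelihood_ratio_index(loglik_model, loglik_reference):
    """Return 1 - LL(model) / LL(reference); both arguments are negative."""
    return 1.0 - loglik_model / loglik_reference


# Hypothetical counts of chosen modes in a sample of 1000 trips (assumption).
counts = {"car": 600, "bus": 150, "rail": 100, "air": 150}
ll_c = market_share_loglik(counts)

# Hypothetical log-likelihood of a fitted multinomial logit model (assumption).
ll_model = -700.0

rho2_c = likelihood_ratio_index(ll_model, ll_c)
print(f"LL(c) = {ll_c:.1f}, rho^2(c) = {rho2_c:.3f}")
```

Using the market share model as the reference, rather than a model with all coefficients set to zero, means the index measures only the improvement due to explanatory variables such as travel time and cost.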


Author(s):  
Jose Apesteguia ◽  
Miguel A Ballester

We propose a novel measure of goodness of fit for stochastic choice models, that is, the maximal fraction of data that can be reconciled with the model. The procedure is to separate the data into two parts: one generated by the best specification of the model and another representing residual behavior. We claim that the three elements involved in a separation are instrumental in understanding the data. We show how to apply our approach to any stochastic choice model and then study the case of four well-known models, each capturing a different notion of randomness. We illustrate our results with an experimental data set.


Marketing ZFP ◽  
2021 ◽  
Vol 43 (3) ◽  
pp. 49-66
Author(s):  
Nils Goeken ◽  
Peter Kurz ◽  
Winfried Steiner

Choice-based conjoint (CBC) is nowadays the most widely used variant of conjoint analysis, a class of methods for measuring consumer preferences. The primary reason for the increasing dominance of the CBC approach over the last 35 years is that it closely mimics real choice behavior of consumers by asking respondents repeatedly to choose their preferred alternative from a set of several offered alternatives (choice sets). Within the framework of CBC analysis, the multinomial logit (MNL) model is the most frequently used discrete choice model due to the existence of closed form solutions for conditional choice probabilities. The popularity of CBC and the MNL model has grown even more since the introduction of hierarchical Bayesian (HB) estimation techniques that accommodate individual consumer heterogeneity in choice data, and which have now become state-of-the-art in marketing theory and practice. Still, researchers and practitioners have to make further decisions under this framework (CBC, MNL, HB estimation), such as how to represent preference heterogeneity. Here, using a normal distribution (and therefore a unimodal distribution) has become the standard approach in the marketing literature. However, the thin tails of the normal distribution suggest that the standard HB-MNL model should not be the “go-to” approach to approximate multimodal preference distributions, because individual preference patterns lying at the tails of the normal distribution (i.e., that do not fit well with the assumption of a unimodal distribution) tend to be shrunk to the population mean. This shrinkage, especially in multimodal data settings, could mask important information (e.g., new or different structures in the data). A mixture of normal distributions avoids the limited flexibility of the simplest continuous approach, which assumes a unimodal prior heterogeneity distribution.
There are currently two prominent HB-CBC modeling approaches embedding the mixture-of-normals (MoN) approach: the more widespread MoN-HB-MNL model, and the Dirichlet process mixture (DPM)-HB-MNL model. In this article, we review the prominent HB-MNL model (with its normal prior), the MoN-HB-MNL model, and the DPM-HB-MNL model and apply them to an empirical multi-country CBC data set. We compare the statistical performance of the three models in terms of goodness-of-fit and predictive accuracy, show how to include consumer background characteristics in the upper level of these models, and illustrate how to interpret the estimation results (with a special focus on cross-country heterogeneity). In sum, our article serves as a kind of user guide to the estimation and interpretation of Hierarchical Bayes Conjoint Choice Models. For our data, we observed that all three choice models (both with and without consumer background characteristics) resulted in a one-component solution. The DPM-HB-MNL model nevertheless yielded a higher cross-validated hit rate compared to the MoN-HB-MNL and the HB-MNL models due to its even more flexible prior assumptions. The two latter models tended to slightly overfit our empirical data, which was reflected by higher goodness-of-fit statistics but a lower predictive accuracy compared to the DPM-HB-MNL model. We showed that this result could be attributed to the weaker extent of Bayesian shrinkage of these two models. The DPM-HB-MNL model showed a stronger shrinkage effect and seems therefore somewhat more robust against overfitting. Including consumer background characteristics in terms of country of origin information for the respondents did not improve the statistical model performance (especially not the predictive performance). Still, using the country of origin information for respondents in a post-hoc segmentation analysis helped us to explain some differences in brand preferences between the five countries.
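
The closed-form MNL choice probabilities and the hit rate mentioned in this abstract can be illustrated with a short sketch. It assumes that the hit rate is the share of holdout choice tasks in which the alternative with the highest predicted probability matches the observed choice; the abstract does not spell out its exact definition. The utilities and choices below are hypothetical, and the sketch does not perform any HB estimation.

```python
import math


def mnl_probabilities(utilities):
    """Closed-form multinomial logit probabilities for one choice set.

    The maximum utility is subtracted before exponentiating for
    numerical stability; this does not change the result.
    """
    highest = max(utilities)
    exps = [math.exp(u - highest) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def hit_rate(utility_sets, chosen):
    """Share of choice sets whose most probable alternative was chosen."""
    hits = 0
    for utilities, choice in zip(utility_sets, chosen):
        probs = mnl_probabilities(utilities)
        predicted = probs.index(max(probs))
        hits += int(predicted == choice)
    return hits / len(chosen)


# Hypothetical utilities for three holdout choice sets (assumption).
holdout_utilities = [
    [1.0, 0.2, -0.5],
    [0.0, 0.3, 0.1],
    [0.4, 0.4, 2.0],
]
# Hypothetical observed choices, as zero-based alternative indices.
holdout_choices = [0, 2, 2]

print(mnl_probabilities(holdout_utilities[0]))
print(hit_rate(holdout_utilities, holdout_choices))
```

In a hierarchical Bayesian setting the utilities would come from respondent-level part-worths, so the same calculation would be repeated per respondent with that respondent's estimates.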


2016 ◽  
Vol 44 (6) ◽  
pp. 1145-1167 ◽  
Author(s):  
Calvin P Tribby ◽  
Harvey J Miller ◽  
Barbara B Brown ◽  
Carol M Werner ◽  
Ken R Smith

Walking is a form of active transportation with numerous benefits, including better health outcomes, lower environmental impacts and stronger communities. Understanding built environmental associations with walking behavior is a key step towards identifying design features that support walking. Human mobility data available through GPS receivers and cell phones, combined with high resolution walkability data, provide a rich source of georeferenced data for analyzing environmental associations with walking behavior. However, traditional techniques such as route choice models have difficulty with highly dimensioned data. This paper develops a novel combination of a data-driven technique with route choice modeling for leveraging walkability audits. Using data from a study in Salt Lake City, UT, USA, we apply the data-driven technique of random forests to select variables for use in walking route choice models. We estimate data-driven route choice models and theory-driven models based on predefined walkability dimensions. Results indicate that the random forest technique selects variables that dramatically improve goodness of fit of walking route choice models relative to models based on predefined walkability dimensions. We compare the theory-driven and data-driven walking route choice models based on interpretability and policy relevance.


Crisis ◽  
2013 ◽  
Vol 34 (6) ◽  
pp. 434-437 ◽  
Author(s):  
Donald W. MacKenzie

Background: Suicide clusters at Cornell University and the Massachusetts Institute of Technology (MIT) prompted popular and expert speculation of suicide contagion. However, some clustering is to be expected in any random process. Aim: This work tested whether suicide clusters at these two universities differed significantly from those expected under a homogeneous Poisson process, in which suicides occur randomly and independently of one another. Method: Suicide dates were collected for MIT and Cornell for 1990–2012. The Anderson-Darling statistic was used to test the goodness-of-fit of the intervals between suicides to the distribution expected under the Poisson process. Results: Suicides at MIT were consistent with the homogeneous Poisson process, while those at Cornell showed clustering inconsistent with such a process (p = .05). Conclusions: The Anderson-Darling test provides a statistically powerful means to identify suicide clustering in small samples. Practitioners can use this method to test for clustering in relevant communities. The difference in clustering behavior between the two institutions suggests that more institutions should be studied to determine the prevalence of suicide clustering in universities and its causes.
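
The method described in this abstract can be sketched as follows. Under a homogeneous Poisson process the intervals between events are independent and exponentially distributed, so the Anderson-Darling statistic can be computed against an exponential distribution. This sketch assumes the rate is estimated from the sample mean, which the abstract does not confirm, and it stops at the statistic: judging significance requires critical values for the exponential case with an estimated scale, which are not reproduced here. The event days are hypothetical.

```python
import math


def intervals_from_days(event_days):
    """Gaps between consecutive events, given event times in days."""
    days = sorted(event_days)
    return [later - earlier for earlier, later in zip(days, days[1:])]


def anderson_darling_exponential(intervals):
    """Anderson-Darling statistic A^2 against an exponential distribution
    whose mean is estimated from the data (an assumption of this sketch)."""
    if any(x <= 0 for x in intervals):
        raise ValueError("intervals must be strictly positive")
    n = len(intervals)
    mean = sum(intervals) / n
    x = sorted(intervals)
    total = 0.0
    for i in range(1, n + 1):
        # ln F(x_(i)) with F(x) = 1 - exp(-x / mean)
        log_cdf = math.log(-math.expm1(-x[i - 1] / mean))
        # ln(1 - F(x_(n+1-i))) simplifies to -x / mean
        log_sf = -x[n - i] / mean
        total += (2 * i - 1) * (log_cdf + log_sf)
    return -n - total / n


# Hypothetical event days counted from an arbitrary start date (assumption).
event_days = [12, 300, 310, 325, 900, 1500, 1510, 2600]
gaps = intervals_from_days(event_days)
print(gaps)
print(anderson_darling_exponential(gaps))
```

Larger values of the statistic indicate a poorer fit to the exponential distribution, which is what tight clusters separated by long gaps tend to produce.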

