Parameter Choice, Stability and Validity for Robust Cluster Weighted Modeling

Stats ◽  
2021 ◽  
Vol 4 (3) ◽  
pp. 602-615
Author(s):  
Andrea Cappozzo ◽  
Luis Angel García-Escudero ◽  
Francesca Greselin ◽  
Agustín Mayo-Iscar

Statistical inference based on the cluster weighted model often requires some subjective judgment from the modeler. Many features influence the final solution, such as the number of mixture components, the shape of the clusters in the explanatory variables, and the degree of heteroscedasticity of the errors around the regression lines. Moreover, to deal with outliers and contamination that may appear in the data, hyper-parameter values ensuring robust estimation are also needed. In principle, this freedom gives rise to a variety of “legitimate” solutions, each derived from a specific set of choices and their implications in modeling. Here we introduce a method for identifying a “set of good models” for clustering a dataset, considering the whole panorama of choices. In this way, we enable the practitioner, or the scientist who needs to cluster the data, to make an educated choice: they will be able to identify the most appropriate solutions for the purposes of their own analysis, in light of their stability and validity.
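The “panorama of choices” the abstract describes can be pictured with a toy sketch. The snippet below is not the authors' estimator (they work with robust cluster weighted models); it merely illustrates the idea of sweeping a grid of hyper-parameters — number of components k and trimming level alpha — and collecting the candidate solutions for inspection, using scikit-learn's GaussianMixture with a crude likelihood-based trimming step:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two tight clusters plus a few uniform outliers.
X = np.vstack([
    rng.normal([0, 0], 0.5, size=(100, 2)),
    rng.normal([4, 4], 0.5, size=(100, 2)),
    rng.uniform(-8, 12, size=(10, 2)),          # contamination
])

def trimmed_fit(X, k, alpha, n_refine=5):
    """Fit a k-component mixture, iteratively discarding the alpha
    fraction of points with lowest likelihood (a crude impartial-
    trimming step, for illustration only)."""
    keep = np.ones(len(X), dtype=bool)
    gm = None
    for _ in range(n_refine):
        gm = GaussianMixture(k, random_state=0).fit(X[keep])
        ll = gm.score_samples(X)
        keep = ll >= np.quantile(ll, alpha)
    return gm, keep

# Panorama of hyper-parameter choices: each (k, alpha) pair yields
# one candidate solution; nearby scores form a "set of good models"
# for the practitioner to compare by stability and validity.
candidates = []
for k in (1, 2, 3, 4):
    for alpha in (0.0, 0.05, 0.1):
        gm, keep = trimmed_fit(X, k, alpha)
        candidates.append((gm.bic(X[keep]), k, alpha))
for bic, k, alpha in sorted(candidates)[:3]:
    print(f"k={k} alpha={alpha:.2f} BIC={bic:.1f}")
```

Note that BIC values computed on differently trimmed subsets are not strictly comparable; the sketch only conveys the grid-of-choices idea, not a principled selection rule.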

Econometrica ◽  
2020 ◽  
Vol 88 (3) ◽  
pp. 1007-1029
Author(s):  
Bo E. Honoré ◽  
Luojia Hu

It is well understood that classical sample selection models are not semiparametrically identified without exclusion restrictions. Lee (2009) developed bounds for the parameters in a model that nests the semiparametric sample selection model. These bounds can be wide. In this paper, we investigate bounds that impose the full structure of a sample selection model with errors that are independent of the explanatory variables but have unknown distribution. The additional structure can significantly reduce the identified set for the parameters of interest. Specifically, we construct the identified set for the parameter vector of interest. It is a one‐dimensional line segment in the parameter space, and we demonstrate that this line segment can be short in practice. We show that the identified set is sharp when the model is correct and empty when there exist no parameter values that make the sample selection model consistent with the data. We also provide non‐sharp bounds under the assumption that the model is correct. These are easier to compute and associated with lower statistical uncertainty than the sharp bounds. Throughout the paper, we illustrate our approach by estimating a standard sample selection model for wages.


1994 ◽  
Vol 88 (2) ◽  
pp. 412-423 ◽  
Author(s):  
Bruce Western ◽  
Simon Jackman

Regression analysis in comparative research suffers from two distinct problems of statistical inference. First, because the data constitute all the available observations from a population, conventional inference based on the long-run behavior of a repeatable data mechanism is not appropriate. Second, the small and collinear data sets of comparative research yield imprecise estimates of the effects of explanatory variables. We describe a Bayesian approach to statistical inference that provides a unified solution to these two problems. This approach is illustrated in a comparative analysis of unionization.
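The stabilising effect of informative priors on small, collinear datasets can be sketched with a minimal conjugate-normal example (this is a generic illustration of the Bayesian shrinkage idea, not the authors' unionization analysis; the error and prior variances are assumed known here for simplicity):

```python
import numpy as np

rng = np.random.default_rng(1)
# Small, collinear "comparative" dataset: 18 units, two nearly
# identical explanatory variables.
n = 18
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)      # near-collinear with x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(scale=0.5, size=n)

sigma2 = 0.25     # error variance (assumed known for this sketch)
tau2 = 1.0        # prior variance on each coefficient, prior mean 0

# OLS: unstable under collinearity.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Conjugate Bayesian posterior mean with a N(0, tau2 I) prior:
# shrinks and stabilises the collinear coefficients.
A = X.T @ X / sigma2 + np.eye(3) / tau2
beta_bayes = np.linalg.solve(A, X.T @ y / sigma2)

print("OLS:  ", np.round(beta_ols, 2))
print("Bayes:", np.round(beta_bayes, 2))
```

With a zero-mean prior this posterior mean coincides with a ridge estimate, so its norm never exceeds that of the OLS solution — exactly the stabilisation that matters when the data are few and collinear.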


Genetics ◽  
2002 ◽  
Vol 162 (4) ◽  
pp. 2025-2035 ◽  
Author(s):  
Mark A Beaumont ◽  
Wenyang Zhang ◽  
David J Balding

We propose a new method for approximate Bayesian statistical inference on the basis of summary statistics. The method is suited to complex problems that arise in population genetics, extending ideas developed in this setting by earlier authors. Properties of the posterior distribution of a parameter, such as its mean or density curve, are approximated without explicit likelihood calculations. This is achieved by fitting a local-linear regression of simulated parameter values on simulated summary statistics, and then substituting the observed summary statistics into the regression equation. The method combines many of the advantages of Bayesian statistical inference with the computational efficiency of methods based on summary statistics. A key advantage of the method is that the nuisance parameters are automatically integrated out in the simulation step, so that the large numbers of nuisance parameters that arise in population genetics problems can be handled without difficulty. Simulation results indicate computational and statistical efficiency that compares favorably with those of alternative methods previously proposed in the literature. We also compare the relative efficiency of inferences obtained using methods based on summary statistics with those obtained directly from the data using MCMC.
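The core mechanics — simulate from the prior, accept draws whose summaries fall near the observed one, then regression-adjust the accepted draws — can be sketched on a deliberately simple problem (estimating a normal mean with the sample mean as summary; the authors' method additionally uses kernel weights, omitted here for brevity):

```python
import numpy as np

rng = np.random.default_rng(2)

# Observed data: n draws from Normal(mu=2, sd=1); summary = sample mean.
n = 50
obs = rng.normal(2.0, 1.0, size=n)
s_obs = obs.mean()

# 1. Simulate parameters from the prior, and the summary statistic of
#    each simulated dataset (the sample mean has sd 1/sqrt(n), so we
#    draw it directly instead of simulating n points).
mu_sim = rng.uniform(-5, 5, size=20000)          # flat prior on mu
s_sim = rng.normal(mu_sim, 1.0 / np.sqrt(n))

# 2. Rejection step: keep simulations whose summaries are closest
#    to the observed summary.
d = np.abs(s_sim - s_obs)
keep = d <= np.quantile(d, 0.05)                 # tolerance = 5% quantile

# 3. Local-linear regression adjustment: regress accepted mu on
#    (s - s_obs) and project each accepted draw to s = s_obs.
s_k, mu_k = s_sim[keep], mu_sim[keep]
b = np.polyfit(s_k - s_obs, mu_k, 1)[0]          # local regression slope
mu_adj = mu_k - b * (s_k - s_obs)

print("posterior mean (rejection only):", round(mu_k.mean(), 2))
print("posterior mean (adjusted):      ", round(mu_adj.mean(), 2))
```

The adjustment tightens the rejection sample around what the posterior would be at exactly s = s_obs, which is what allows the method to tolerate a relatively loose acceptance tolerance.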


2020 ◽  
Vol 20 (1) ◽  
pp. 27
Author(s):  
Florencia Wahyu Ganda Fismaya ◽  
Abduh Riski ◽  
Ahmad Kamsyakawuni

Selling and trading in the Industry 4.0 era can now be done by opening a shop online, and shopping can likewise be done online; online shop owners whose orders do not allow Cash on Delivery (COD) transactions therefore rely on package delivery services. This research discusses finding a good shipping solution, with minimum total mileage for several couriers, at PT. Titipan Kilat in Banyuwangi District, using the artificial fish swarm algorithm (AFSA). The experiments were carried out with several parameter values to determine which parameters affect the final solution. Each parameter setting was tested with a maximum of 1000 iterations; the best results were then tested again with maximum iterations of 2000 and 5000, and compared with the distance originally traveled by the couriers. The final solution is a delivery route for three couriers with a combined total distance (Z) of 87.28 km, the smallest run reaching its local minimum at iteration 1169. Keywords: artificial fish swarm algorithm (AFSA), multiple travelling salesman problem (m-TSP), route, total mileage.
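The m-TSP objective being minimised — the combined mileage Z over several couriers' depot-to-depot tours — can be made concrete with a small sketch. This is only the objective evaluation for one hypothetical candidate (random coordinates, random split of stops among three couriers), not an AFSA implementation; the metaheuristic searches over such assignments and orderings:

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical delivery points (km coordinates) and a single depot.
depot = np.array([0.0, 0.0])
points = rng.uniform(0, 10, size=(12, 2))

def route_length(route):
    """Depot -> stops in the given order -> depot, Euclidean distance."""
    path = np.vstack([depot, points[route], depot])
    return np.sum(np.linalg.norm(np.diff(path, axis=0), axis=1))

# m-TSP objective: split the stops among 3 couriers and sum the
# per-courier tour lengths. AFSA (or any metaheuristic) searches
# over such assignments/orderings to minimise Z.
assignment = np.array_split(rng.permutation(12), 3)
Z = sum(route_length(r) for r in assignment)
print("total mileage Z for this candidate:", round(Z, 2))
```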


1987 ◽  
Vol 19 (2) ◽  
pp. 173-186 ◽  
Author(s):  
C M Guy

A common problem in the use of singly-constrained spatial interaction shopping models has been that of finding optimal parameter values. This problem has been exacerbated where improvements to the model have involved extra parameters to be estimated. In this paper it is shown that calibration of quite complex models can be achieved through modification of the conventional ‘gravity’ model to a generalised linear model with Poisson error structure and logarithmic link function. Data on observed trips between fifteen residential zones and eighty-three shopping destinations in Cardiff are used to test several models through application of the GLIM computing package. Models involving extra explanatory variables, origin-specific distance-decay parameters, and competing-destinations terms are all shown to offer worthwhile improvements in performance over the conventional singly-constrained model. An individual-specific model is also tested for a small sample of shoppers. Finally, some comments are made concerning the relevance of the Cardiff findings and the wider significance of these methodological advances.
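The reformulation the paper exploits — a gravity model calibrated as a generalised linear model with Poisson errors and a log link — can be sketched on simulated trip data. This uses scikit-learn's `PoissonRegressor` rather than the GLIM package, and a toy 5 × 8 zone system in place of the Cardiff data (15 origins × 83 destinations); origin dummies play the role of the balancing factors:

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(3)
n_orig, n_dest = 5, 8                             # toy zone system
d = rng.uniform(1, 10, size=(n_orig, n_dest))     # zone-to-zone distances (km)
logW = rng.normal(1.0, 0.5, size=n_dest)          # log destination attractiveness
a = rng.normal(3.0, 0.3, size=n_orig)             # origin-specific terms

beta_true = -0.3                                  # distance-decay parameter
mu = np.exp(a[:, None] + logW[None, :] + beta_true * d)
T = rng.poisson(mu)                               # observed trip counts

# Design matrix: origin dummies + log attractiveness + distance.
rows = []
for i in range(n_orig):
    for j in range(n_dest):
        x = np.zeros(n_orig + 2)
        x[i] = 1.0
        x[-2] = logW[j]
        x[-1] = d[i, j]
        rows.append(x)
X, y = np.array(rows), T.ravel()

# Poisson GLM with log link: the generalised-linear-model
# reformulation of the singly-constrained gravity model.
glm = PoissonRegressor(alpha=1e-8, fit_intercept=False, max_iter=500).fit(X, y)
print("estimated distance-decay:", round(glm.coef_[-1], 3))
```

Adding extra explanatory variables, origin-specific decay parameters, or competing-destinations terms then amounts to adding columns to the design matrix, which is precisely why the GLM framing makes the more complex models easy to calibrate.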


2019 ◽  
Author(s):  
Bahman Nasseroleslami ◽  
Stefan Dukic ◽  
Teresa Buxo ◽  
Amina Coffey ◽  
Roisin McMackin ◽  
...  

Despite advances in multivariate spectral analysis of neural signals, the statistical inference of measures such as spectral power and coherence in practical, real-life scenarios remains a challenge. The non-normal distribution of the neural signals and the presence of artefactual components make it difficult to use parametric methods for robust estimation of these measures, or to infer the presence of specific spectral components above chance level. Furthermore, the bias of coherence measures and their complex statistical distributions are impediments to robust statistical comparisons between two different levels of coherence. Non-parametric methods based on the median of auto-/cross-spectra have shown promise for robust estimation of spectral power and coherence, but statistical inference based on these non-parametric estimates remains to be formulated and tested. In this report, a set of methods based on non-parametric rank statistics for 1-sample and 2-sample testing of spectral power and coherence is provided. The proposed methods were demonstrated and tested using simulated neural signals in different conditions. The results show that non-parametric methods provide robustness against artefactual components. Moreover, they provide new possibilities for robust 1-sample and 2-sample testing of the complex coherency function, including both magnitude and phase, where existing methods fall short. The utility of the methods was further demonstrated on experimental neural data. The proposed approach provides a new framework for non-parametric spectral analysis of digital signals, especially suited to neuroscience and neural engineering applications given its minimal distributional assumptions, statistical robustness, and the diverse testing scenarios it affords.
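The motivation for median-based, rank-based spectral inference can be illustrated with a small simulation. The sketch below is not the authors' test statistics: it simply computes per-segment power at a target frequency for two conditions, a few segments contaminated by large artefacts, and shows that the across-segment median and a standard rank test (Mann–Whitney, via SciPy) behave sensibly where mean-based estimates would be dragged by the outliers:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(4)
fs, nseg, seglen = 256, 40, 256          # sampling rate, segments, samples/segment
t = np.arange(seglen) / fs

def segment_power(amp10, n, artefacts=0):
    """Per-segment power at 10 Hz for n one-second segments of a noisy
    10 Hz oscillation; the first few segments get large artefacts."""
    p = []
    for k in range(n):
        x = amp10 * np.sin(2 * np.pi * 10 * t) + rng.normal(size=seglen)
        if k < artefacts:
            x += 20 * rng.normal(size=seglen)    # artefactual segment
        f = np.fft.rfft(x)
        bin10 = int(10 * seglen / fs)            # FFT bin at 10 Hz
        p.append(np.abs(f[bin10]) ** 2)
    return np.array(p)

pa = segment_power(1.0, nseg, artefacts=3)   # condition A, contaminated
pb = segment_power(0.3, nseg, artefacts=3)   # condition B, weaker 10 Hz

# Median across segments: robust to the artefactual segments, unlike
# the mean used in classical Welch-style averaging.
print("median power A:", round(np.median(pa), 1))
print("median power B:", round(np.median(pb), 1))

# 2-sample rank test on per-segment power: a non-parametric comparison
# in the spirit of the rank statistics described above.
u, pval = mannwhitneyu(pa, pb, alternative="greater")
print("Mann-Whitney p-value:", pval)
```

The rank test needs no distributional assumptions on the per-segment power values, which is the property that makes this family of methods attractive for artefact-prone neural recordings.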

