Multinomial Data

2005 ◽ pp. 106-123

1990 ◽ Vol 29 (03) ◽ pp. 200-204
Author(s): J. A. Koziol

Abstract: A basic problem of cluster analysis is determining the number of clusters evinced in a set of data. We address this issue for multinomial data using Akaike's information criterion and demonstrate its utility in identifying an appropriate number of clusters of tumor types with similar profiles of cell surface antigens.
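As a rough illustration of the idea (not the paper's own procedure), one can fit multinomial mixtures by EM for a range of cluster counts and choose the count minimizing AIC = 2p − 2 log L. This is a minimal sketch; the function names and settings are assumptions, not taken from the paper:

```python
import numpy as np
from scipy.special import logsumexp

def fit_multinomial_mixture(X, k, n_iter=200, seed=0):
    """EM for a k-component multinomial mixture; returns the final log-likelihood
    (up to the multinomial coefficient, which is constant across k)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = rng.dirichlet(np.ones(d), size=k)   # component category probabilities
    pi = np.full(k, 1.0 / k)                    # mixing weights
    for _ in range(n_iter):
        # E-step: log responsibilities
        log_r = np.log(pi) + X @ np.log(theta).T
        norm = logsumexp(log_r, axis=1)
        r = np.exp(log_r - norm[:, None])
        # M-step (small floors avoid log(0) when a component empties)
        pi = r.mean(axis=0) + 1e-12
        pi /= pi.sum()
        theta = r.T @ X + 1e-12
        theta /= theta.sum(axis=1, keepdims=True)
    return norm.sum()

def select_k_by_aic(X, k_max=5, restarts=5):
    """Fit mixtures for k = 1..k_max; return (best k, AIC per k)."""
    d = X.shape[1]
    aic = {}
    for k in range(1, k_max + 1):
        ll = max(fit_multinomial_mixture(X, k, seed=s) for s in range(restarts))
        n_params = (k - 1) + k * (d - 1)        # free weights + free probabilities
        aic[k] = 2 * n_params - 2 * ll
    return min(aic, key=aic.get), aic
```

With well-separated cluster profiles, the AIC for two components drops far below the one-component value, since the penalty grows only linearly in k while the likelihood gain from real structure is large.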


2019
Author(s): Mark Andrews

A Gibbs sampler for the hierarchical Dirichlet process mixture model (HDPMM) when used with multinomial data.
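The full HDPMM sampler is involved; as a hedged illustration of the central update it shares with simpler models, here is one collapsed Gibbs sweep for a finite Dirichlet–multinomial mixture (the finite-K approximation to a single DP mixture). Function names and hyperparameters are illustrative, not taken from the sampler described above:

```python
import numpy as np
from scipy.special import gammaln, logsumexp

def dm_log_predictive(x, cat_counts, beta):
    """Log Dirichlet-multinomial posterior predictive of count vector x, given a
    cluster's accumulated category counts and symmetric Dirichlet prior beta
    (multinomial coefficient omitted: constant w.r.t. cluster assignment)."""
    a = cat_counts + beta
    return (gammaln(a.sum()) - gammaln(a.sum() + x.sum())
            + np.sum(gammaln(a + x) - gammaln(a)))

def gibbs_sweep(X, z, K, alpha, beta, rng):
    """One collapsed Gibbs sweep: resample every row's cluster assignment."""
    n, d = X.shape
    sizes = np.bincount(z, minlength=K).astype(float)
    counts = np.vstack([X[z == k].sum(axis=0) for k in range(K)]).astype(float)
    for i in range(n):
        k_old = z[i]
        sizes[k_old] -= 1                        # remove row i from its cluster
        counts[k_old] -= X[i]
        # prior term (CRP-like, finite-K) + Dirichlet-multinomial likelihood
        logp = np.array([np.log(sizes[k] + alpha / K)
                         + dm_log_predictive(X[i], counts[k], beta)
                         for k in range(K)])
        p = np.exp(logp - logsumexp(logp))
        p /= p.sum()
        k_new = rng.choice(K, p=p)
        z[i] = k_new                             # reinsert row i
        sizes[k_new] += 1
        counts[k_new] += X[i]
    return z
```

Collapsing out the mixing weights and category probabilities leaves only the assignment vector `z` to sample, which is why each update needs just cluster sizes and pooled category counts.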


Algorithms ◽ 2021 ◽ Vol 14 (10) ◽ pp. 296
Author(s): Lucy Blondell, Mark Z. Kos, John Blangero, Harald H. H. Göring

Statistical analysis of multinomial data in complex datasets often requires estimation of the multivariate normal (MVN) distribution for models whose dimensionality can easily reach 10–1000 and beyond. Few algorithms for estimating the MVN distribution offer robust and efficient performance over such a range of dimensions. We report a simulation-based comparison of two algorithms for the MVN that are widely used in statistical genetic applications. The venerable Mendell-Elston approximation is fast, but its execution time increases rapidly with the number of dimensions, its estimates are generally biased, and it lacks an error bound; the correlation between variables significantly affects its absolute error but not its overall execution time. The Monte Carlo-based approach described by Genz returns unbiased, error-bounded estimates, but its execution time is more sensitive to the correlation between variables. For ultra-high-dimensional problems, however, the Genz algorithm exhibits better scaling characteristics and greater time-weighted efficiency of estimation.
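SciPy's multivariate normal CDF is, per its documentation, based on Genz's quasi-Monte Carlo algorithm, so the error-bounded behavior described above can be probed directly. A minimal sketch, checking the estimate against the closed-form bivariate orthant probability 1/4 + arcsin(ρ)/(2π):

```python
import numpy as np
from scipy.stats import multivariate_normal

rho = 0.5
cov = [[1.0, rho], [rho, 1.0]]

# P(X1 <= 0, X2 <= 0) for a standard bivariate normal with correlation rho;
# the exact orthant probability is 1/4 + arcsin(rho) / (2*pi), here exactly 1/3.
estimate = multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf([0.0, 0.0])
exact = 0.25 + np.arcsin(rho) / (2 * np.pi)
print(estimate, exact)
```

In two dimensions the quasi-Monte Carlo estimate agrees with the analytic value to several digits; the interesting regime contrasted in the abstract begins only when the dimension grows large.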


2013 ◽ Vol 41 (4) ◽ pp. 701-715
Author(s): Andrew Cron, Liang Zhang, Deepak Agarwal


Author(s): Shijia Wang, Liangliang Wang, Tim B. Swartz

2019 ◽ Vol 38 (25) ◽ pp. 4963-4976
Author(s): V. Landsman, D. Landsman, C.S. Li, H. Bang

1997 ◽ Vol 33 (1) ◽ pp. 41-48
Author(s): Marc Aerts, Ilse Augustyns, Paul Janssen

Biometrics ◽ 2019 ◽ Vol 76 (3) ◽ pp. 834-842
Author(s): Farzana Afroz, Matt Parry, David Fletcher

1989 ◽ Vol 6 (1) ◽ pp. 73-95
Author(s): Pascale Rousseau, David Sankoff

2019 ◽ Vol 09 (03) ◽ pp. 2050008
Author(s): Xiaona Yang, Zhaojun Wang, Xuemin Zi

This paper develops an outlier detection procedure for multinomial data in which the number of categories tends to infinity. Most outlier detection methods assume that the observations follow a multivariate normal distribution, whereas in many modern applications the observations are measured on a discrete scale or naturally carry some categorical structure. For such multinomial observations, approaches to outlier detection are rather limited. To overcome the main obstacle, this work introduces the least trimmed distances estimator for multinomial data together with a fast algorithm for identifying the clean subset. A threshold rule based on the asymptotic distribution of the distance measure is used to identify outliers, and a one-step reweighting scheme is proposed to improve the efficiency of the procedure. Finally, the finite-sample performance of the method is evaluated through simulations and compared with that of available outlier detection methods.
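The least trimmed distances estimator itself is specified in the paper; as a rough illustration of the general pattern it builds on (concentration steps over a trimmed clean subset, then a chi-square threshold), here is a simplified sketch. The helper name and all tuning constants are mine, not the authors':

```python
import numpy as np
from scipy.stats import chi2

def trimmed_multinomial_outliers(X, trim=0.75, n_steps=20, level=0.975, seed=0):
    """Flag multinomial rows far from a robustly estimated category profile.
    Simplified sketch: concentration steps keep the h closest rows (as in LTS),
    then Pearson chi-square distances are thresholded at a chi2 quantile."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    h = int(np.ceil(trim * n))
    subset = rng.choice(n, size=h, replace=False)        # random clean-subset start
    for _ in range(n_steps):
        p_hat = X[subset].sum(axis=0) / X[subset].sum()  # pooled category profile
        expected = X.sum(axis=1, keepdims=True) * p_hat
        # Pearson chi-square distance of each row to the pooled profile
        dist = ((X - expected) ** 2 / np.maximum(expected, 1e-12)).sum(axis=1)
        subset = np.argsort(dist)[:h]                    # concentration step
    return dist > chi2.ppf(level, df=d - 1), dist
```

Because outlying rows contribute enormous Pearson distances, a few concentration steps push them out of the subset used to estimate the profile, so the final distances separate gross outliers cleanly from sampling noise.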

