Modelling count data with excessive zeros: The need for class prediction in zero-inflated models and the issue of data generation in choosing between zero-inflated and generic mixture models for dental caries data

Mark S. Gilthorpe; Morten Frydenberg; Yaping Cheng; Vibeke Baelum

doi:10.1002/sim.3699

Statistics in Medicine ◽

10.1002/sim.3699 ◽

2009 ◽

Vol 28 (28) ◽

pp. 3539-3553 ◽

Cited By ~ 24

Author(s):

Mark S. Gilthorpe ◽

Morten Frydenberg ◽

Yaping Cheng ◽

Vibeke Baelum

Keyword(s):

Dental Caries ◽

Mixture Models ◽

Count Data ◽

Data Generation ◽

Class Prediction

Class Prediction Performance Boost Using Gaussian Mixture Models

2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE) ◽

10.1109/eitce47263.2019.9094810 ◽

2019 ◽

Author(s):

Yunfeng Sui ◽

Shixuan Zhao ◽

Weiqian Liu ◽

Zhi Cheng ◽

Lingzhu Deng

Keyword(s):

Mixture Models ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Prediction Performance ◽

Class Prediction

A Wald test for zero inflation and deflation for correlated count data from dental caries research

Statistical Modelling ◽

10.1177/1471082x14535480 ◽

2014 ◽

Vol 14 (6) ◽

pp. 471-488 ◽

Cited By ~ 1

Author(s):

Wei-Wen Hsu ◽

David Todem ◽

KyungMann Kim ◽

Woosung Sohn

Keyword(s):

Dental Caries ◽

Count Data ◽

Wald Test ◽

Zero Inflation

Maximizing the Prediction Accuracy in Tweet Sentiment Extraction using Tensor Flow based Deep Neural Networks

Journal of Ubiquitous Computing and Communication Technologies - December 2019 ◽

10.36548/jucct.2021.2.001 ◽

2021 ◽

Vol 3 (2) ◽

pp. 61-79

Author(s):

S Thivaharan ◽

G Srivatsun

Keyword(s):

Neural Networks ◽

Social Media ◽

Prediction Accuracy ◽

Deep Neural Networks ◽

Modern Technology ◽

The Other ◽

Data Generation ◽

Class Prediction ◽

The Social ◽

Communication Devices

Modern communication devices generate an enormous amount of data, reaching petabytes, and the rate of data generation is increasing at an unprecedented pace. Although modern technology supports storage on a massive scale, industry is reluctant to retain data that is redundant, unformatted or outdated, misleading for prediction, or without impact on class prediction. Social media plays a significant role in this data generation, producing data at a comparatively higher rate than other data generators. Industry and governments are both worried about the circulation of mischievous or malicious content, as these platforms are highly susceptible to misuse and are exploited by criminals. It is therefore high time to develop a model that classifies social media content as fair or unfair, with high accuracy in predicting the class of the content. In this article, TensorFlow-based deep neural networks are deployed with a fixed epoch count of 15 in order to attain 25% more accuracy than other existing models. Activation functions such as ReLU and sigmoid, which are available on the TensorFlow platform, help to attain the improved prediction accuracy.
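To make the role of the two activation functions concrete, the following is a minimal pure-Python sketch of a forward pass through one ReLU hidden layer and a sigmoid output unit. It is not the authors' TensorFlow model: the function names, the three-feature input, the hand-picked weights, and the 0.5 decision threshold are all assumptions chosen for illustration.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Numerically stable logistic function mapping any real to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden ReLU layer followed by a single sigmoid output unit."""
    hidden = [
        relu(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    logit = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return sigmoid(logit)


# Hypothetical input and weights, for illustration only (not trained values).
score = forward(
    features=[1.0, 0.0, 2.0],
    hidden_weights=[[0.5, -1.0, 0.25], [-0.75, 0.5, -0.5]],
    hidden_biases=[0.0, 0.1],
    output_weights=[1.5, -2.0],
    output_bias=-0.5,
)

# Assumed decision rule: scores of at least 0.5 are labelled "unfair".
label = "unfair" if score >= 0.5 else "fair"
```

In a binary classifier of this shape, the sigmoid output can be read as the estimated probability of the positive class, which is why it is a common choice for the final layer, while ReLU is typically used in the hidden layers.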

Disease‐structured N ‐mixture models: A practical guide to model disease dynamics using count data

Ecology and Evolution ◽

10.1002/ece3.4849 ◽

2019 ◽

Vol 9 (2) ◽

pp. 899-909 ◽

Cited By ~ 4

Author(s):

Graziella V. DiRenzo ◽

Christian Che‐Castaldo ◽

Sarah P. Saunders ◽

Evan H. Campbell Grant ◽

Elise F. Zipkin

Keyword(s):

Mixture Models ◽

Count Data ◽

Disease Dynamics ◽

Practical Guide

Semiparametric Mixture Models for Multivariate Count Data, with Application

SSRN Electronic Journal ◽

10.2139/ssrn.517922 ◽

2004 ◽

Author(s):

Giovanni Trovato ◽

Marco Alfò

Keyword(s):

Mixture Models ◽

Count Data ◽

Semiparametric Mixture ◽

Semiparametric Mixture Models

Matching the Statistical Model to the Research Question for Dental Caries Indices with Many Zero Counts

Caries Research ◽

10.1159/000452675 ◽

2017 ◽

Vol 51 (3) ◽

pp. 198-208 ◽

Cited By ~ 4

Author(s):

John S. Preisser ◽

D. Leann Long ◽

John W. Stamm

Keyword(s):

Clinical Trial ◽

Dental Caries ◽

Statistical Model ◽

Randomized Clinical Trial ◽

Count Data ◽

Treatment Effects ◽

Negative Binomial ◽

Research Question ◽

Model Class ◽

Zero Counts

Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two data sets, one consisting of fictional dmft counts in 2 groups and the other on DMFS among schoolchildren from a randomized clinical trial comparing 3 toothpaste formulations to prevent incident dental caries, are analyzed with negative binomial hurdle, zero-inflated negative binomial, and marginalized zero-inflated negative binomial models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the randomized clinical trial were similar despite their distinctive interpretations. The choice of statistical model class should match the study's purpose, while accounting for the broad decline in children's caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts.
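The distinction between the two kinds of incidence rate ratio can be illustrated with a small sketch using a zero-inflated Poisson (ZIP) distribution in place of the negative binomial models analyzed in the article. Under a ZIP model with structural-zero probability pi and Poisson mean lam, the marginal mean is (1 - pi) * lam. The function names and the parameter values for the two groups below are hypothetical and are not taken from the article's data.

```python
import math


def zip_pmf(y, pi, lam):
    """P(Y = y) for a zero-inflated Poisson: a structural zero with
    probability pi, otherwise a Poisson(lam) count."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return (pi if y == 0 else 0.0) + (1.0 - pi) * poisson


def zip_marginal_mean(pi, lam):
    """Overall (marginal) mean count, averaging over both latent classes."""
    return (1.0 - pi) * lam


# Hypothetical parameters: the groups share the same Poisson mean in the
# at-risk class but differ in the proportion of structural zeros.
control = {"pi": 0.2, "lam": 2.0}
treated = {"pi": 0.5, "lam": 2.0}

# IRR among the at-risk latent class (what a traditional ZIP count
# component estimates).
latent_irr = treated["lam"] / control["lam"]

# IRR for the marginal mean (what a marginalized model targets).
marginal_irr = zip_marginal_mean(**treated) / zip_marginal_mean(**control)

# Sanity check: the pmf sums to 1 over a range holding essentially all mass.
total_mass = sum(zip_pmf(y, **control) for y in range(60))
```

With these assumed values the latent-class IRR shows no effect, while the marginal IRR is below 1 because treatment shifts more subjects into the structural-zero class. This is the sense in which the estimated treatment effect depends on the type of IRR the model reports.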

Semiparametric mixture models for multivariate count data, with application

Econometrics Journal ◽

10.1111/j.1368-423x.2004.00138.x ◽

2004 ◽

Vol 7 (2) ◽

pp. 426-454 ◽

Cited By ~ 31

Author(s):

Marco Alfò ◽

Giovanni Trovato

Keyword(s):

Mixture Models ◽

Count Data ◽

Semiparametric Mixture ◽

Semiparametric Mixture Models

Mixture Models for the Analysis of Repeated Count Data

Journal of the Royal Statistical Society Series C (Applied Statistics) ◽

10.2307/2986139 ◽

1995 ◽

Vol 44 (4) ◽

pp. 473 ◽

Cited By ~ 8

Author(s):

Marijtje A. J. van Duijn ◽

Ulf Böckenholt

Keyword(s):

Mixture Models ◽

Count Data

A spatial beta-binomial model for clustered count data on dental caries

Statistical Methods in Medical Research ◽

10.1177/0962280210372453 ◽

2010 ◽

Vol 20 (2) ◽

pp. 85-102 ◽

Cited By ~ 5

Author(s):

Dipankar Bandyopadhyay ◽

Brian J Reich ◽

Elizabeth H Slate

Keyword(s):

Dental Caries ◽

Count Data ◽

Binomial Model

Sex, lies and self-reported counts: Bayesian mixture models for heaping in longitudinal count data via birth–death processes

The Annals of Applied Statistics ◽

10.1214/15-aoas809 ◽

2015 ◽

Vol 9 (2) ◽

pp. 572-596 ◽

Cited By ~ 12

Author(s):

Forrest W. Crawford ◽

Robert E. Weiss ◽

Marc A. Suchard

Keyword(s):

Mixture Models ◽

Count Data ◽

Longitudinal Count Data ◽

Bayesian Mixture ◽

Bayesian Mixture Models ◽

Birth Death
