Modelling count data with excessive zeros: The need for class prediction in zero-inflated models and the issue of data generation in choosing between zero-inflated and generic mixture models for dental caries data

2009 ◽  
Vol 28 (28) ◽  
pp. 3539-3553 ◽  
Author(s):  
Mark S. Gilthorpe ◽  
Morten Frydenberg ◽  
Yaping Cheng ◽  
Vibeke Baelum
2014 ◽  
Vol 14 (6) ◽  
pp. 471-488 ◽  
Author(s):  
Wei-Wen Hsu ◽  
David Todem ◽  
KyungMann Kim ◽  
Woosung Sohn

Author(s):  
S Thivaharan ◽  
G Srivatsun

The amount of data generated by modern communication devices is enormous, reaching petabytes, and the rate of generation is increasing at an unprecedented pace. Although modern technology supports storage at massive scale, the industry is reluctant to retain data with the following characteristics: redundancy, unformatted records with outdated information, data that misleads prediction, and data with no impact on class prediction. Social media accounts for a significant share of this data and generates it at a higher rate than other sources. Industry and governments are both concerned about the circulation of mischievous or malicious content, because such platforms are highly susceptible to misuse and are exploited by criminals. It is therefore high time to develop a model that classifies social media content as fair or unfair, and the model should predict the content class with high accuracy. In this article, TensorFlow-based deep neural networks are deployed with a fixed epoch count of 15 in order to attain 25% higher accuracy than other existing models. Activation functions such as ReLU and sigmoid, as provided by the TensorFlow platform, help attain the improved prediction accuracy.


2019 ◽  
Vol 9 (2) ◽  
pp. 899-909 ◽  
Author(s):  
Graziella V. DiRenzo ◽  
Christian Che‐Castaldo ◽  
Sarah P. Saunders ◽  
Evan H. Campbell Grant ◽  
Elise F. Zipkin

2017 ◽  
Vol 51 (3) ◽  
pp. 198-208 ◽  
Author(s):  
John S. Preisser ◽  
D. Leann Long ◽  
John W. Stamm

Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two data sets, one consisting of fictional dmft counts in 2 groups and the other on DMFS among schoolchildren from a randomized clinical trial comparing 3 toothpaste formulations to prevent incident dental caries, are analyzed with negative binomial hurdle, zero-inflated negative binomial, and marginalized zero-inflated negative binomial models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the randomized clinical trial were similar despite their distinctive interpretations. The choice of statistical model class should match the study's purpose, while accounting for the broad decline in children's caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts.
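
The distinction the abstract draws between latent-class and marginal-mean effects can be illustrated with a minimal sketch. It assumes a zero-inflated Poisson model rather than the negative binomial models used in the article, purely to keep the arithmetic short; the function names and the parameter values for the two groups are hypothetical and are not taken from the article's data.

```python
import math


def zip_pmf(y, psi, mu):
    """P(Y = y) for a zero-inflated Poisson: a structural zero with
    probability psi, otherwise a Poisson(mu) count."""
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    return psi * (y == 0) + (1.0 - psi) * poisson


def marginal_mean(psi, mu):
    """Overall mean count E[Y] = (1 - psi) * mu."""
    return (1.0 - psi) * mu


# Hypothetical parameters for two groups (illustrative only).
control = {"psi": 0.30, "mu": 4.0}
treated = {"psi": 0.50, "mu": 3.0}

# Ratio of means within the at-risk (non-structural-zero) class,
# as a traditional zero-inflated model would report it.
latent_irr = treated["mu"] / control["mu"]

# Ratio of overall means, the quantity a marginalized model targets.
marginal_irr = marginal_mean(**treated) / marginal_mean(**control)

print(f"latent-class IRR: {latent_irr:.3f}")
print(f"marginal IRR:     {marginal_irr:.3f}")
```

With these assumed values the latent-class ratio is 0.75 while the marginal ratio is about 0.54, because the treated group also has a larger share of structural zeros. Which of the two ratios is relevant depends on the research question, which is the point the abstract makes about matching model class to study purpose.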


Author(s):  
Marijtje A. J. van Duijn ◽  
Ulf Bockenholt

2010 ◽  
Vol 20 (2) ◽  
pp. 85-102 ◽  
Author(s):  
Dipankar Bandyopadhyay ◽  
Brian J Reich ◽  
Elizabeth H Slate
