Towards Improving Transparency of Count Data Regression Models for Health Impacts of Air Pollution

Applied Sciences ◽

10.3390/app11083375 ◽

2021 ◽

Vol 11 (8) ◽

pp. 3375

Author(s):

John F. Joseph ◽

Chad Furl ◽

Hatim O. Sharif ◽

Thankam Sunil ◽

Charles G. Macias

Keyword(s):

Air Pollution ◽

Regression Analysis ◽

Count Data ◽

Regression Models ◽

Health Impacts ◽

Environmental Literature ◽

Data Regression ◽

Statistics Course ◽

Classical Linear Regression ◽

Bell Shaped Curve

In studies on the health impacts of air pollution, regression analysis continues to advance far beyond classical linear regression, which many scientists may have become familiar with in an introductory statistics course. With each new level of complexity, regression analysis may become less transparent, even to the analyst working with the data. This may be especially true in count data regression models, where the response variable (typically given the symbol y) is count data (i.e., takes on values of 0, 1, 2, …). In such models, the normal distribution (the familiar bell-shaped curve) no longer applies to the residuals (i.e., the differences between the observed values and the values predicted by the regression model). Unless care is taken to correctly specify just how those residuals are distributed, the tendency to accept untrue hypotheses may be greatly increased. The aim of this paper is to present a simple histogram of predicted and observed count values (POCH), which, although rarely found in the environmental literature, is presented in authoritative statistical texts and can dramatically reduce the risk of accepting untrue hypotheses. POCH can also increase the transparency of count data regression models to analysts themselves and to the scientific community in general.
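The comparison behind a histogram of predicted and observed counts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact construction, so the sketch assumes a Poisson model and takes the predicted frequency of each count value k to be the sum, over observations, of the Poisson probability of k at that observation's fitted mean. The function names (`poisson_pmf`, `predicted_observed_counts`) and the data are hypothetical.

```python
import math
from collections import Counter


def poisson_pmf(k, mu):
    """Probability of count k under a Poisson distribution with mean mu (mu > 0)."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def predicted_observed_counts(y, mu, max_count=None):
    """Return (k, observed frequency, predicted frequency) for k = 0..max_count.

    y  : observed counts, one per observation
    mu : fitted means from a count regression, one per observation
    """
    if max_count is None:
        max_count = max(y)
    observed = Counter(y)
    rows = []
    for k in range(max_count + 1):
        # Expected number of observations equal to k under the fitted model.
        predicted = sum(poisson_pmf(k, m) for m in mu)
        rows.append((k, observed.get(k, 0), predicted))
    return rows


# Hypothetical data, invented for illustration: many zeros and one large count.
y = [0, 0, 0, 0, 0, 0, 1, 1, 2, 6]
# Assumption: an intercept-only Poisson fit, whose fitted mean is the sample mean (1.0).
mu = [sum(y) / len(y)] * len(y)

for k, obs, pred in predicted_observed_counts(y, mu):
    print(f"{k:>2}  observed={obs:>2}  predicted={pred:5.2f}")
```

Plotting the observed and predicted columns side by side as bars gives the histogram. Large gaps between the two, such as more observed zeros than the model predicts, suggest that the assumed distribution may be misspecified.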

Application of zero-truncated count data regression models to air-pollution disease

Journal of Physics Conference Series ◽

10.1088/1742-6596/1988/1/012096 ◽

2021 ◽

Vol 1988 (1) ◽

pp. 012096

Author(s):

Z I Zulki Alwani ◽

A I N Ibrahim ◽

R M Yunus ◽

F Yusof

Keyword(s):

Air Pollution ◽

Count Data ◽

Regression Models ◽

Data Regression

Statistical Investigation of Household Size using Count Data Regression Models

Journal of Advanced Research in Applied Mathematics and Statistics ◽

10.24321/2455.7021.201702 ◽

2017 ◽

Vol 2 (1&2) ◽

pp. 10-21

Author(s):

Brijesh P. Singh ◽

Keyword(s):

Count Data ◽

Regression Models ◽

Statistical Investigation ◽

Household Size ◽

Data Regression

The Economic Impact of Tourism in Xinghai Park, China: A Travel Cost Value Analysis Using Count Data Regression Models

Tourism Economics ◽

10.5367/000000009788254287 ◽

2009 ◽

Vol 15 (2) ◽

pp. 413-425 ◽

Cited By ~ 12

Author(s):

Erda Wang ◽

Zuozhi Li ◽

Bertis B. Little ◽

Yu Yang

Keyword(s):

Economic Impact ◽

Count Data ◽

Regression Models ◽

Travel Cost ◽

Value Analysis ◽

Data Regression

R-Squared Measures for Count Data Regression Models with Applications to Health-Care Utilization

Journal of Business and Economic Statistics ◽

10.2307/1392433 ◽

1996 ◽

Vol 14 (2) ◽

pp. 209 ◽

Cited By ~ 52

Author(s):

A. Colin Cameron ◽

Frank A. G. Windmeijer

Keyword(s):

Health Care ◽

Health Care Utilization ◽

Count Data ◽

Regression Models ◽

Care Utilization ◽

Data Regression

Simple Tests for Exogeneity of a Binary Explanatory Variable in Count Data Regression Models

Communications in Statistics - Simulation and Computation ◽

10.1080/03610910903147789 ◽

2009 ◽

Vol 38 (9) ◽

pp. 1834-1855 ◽

Cited By ~ 8

Author(s):

Kevin E. Staub

Keyword(s):

Count Data ◽

Regression Models ◽

Explanatory Variable ◽

Data Regression

Bayesian approach to errors-in-variables in count data regression models with departures from normality and overdispersion

Journal of Statistical Computation and Simulation ◽

10.1080/00949655.2017.1381845 ◽

2017 ◽

Vol 88 (2) ◽

pp. 203-220 ◽

Cited By ~ 1

Author(s):

Nur Aainaa Rozliman ◽

Adriana Irawati Nur Ibrahim ◽

Rossita Muhamad Yunus

Keyword(s):

Bayesian Approach ◽

Count Data ◽

Regression Models ◽

Errors In Variables ◽

Data Regression

Modelling tick bite risk by combining random forests and count data regression models

PLoS ONE ◽

10.1371/journal.pone.0216511 ◽

2019 ◽

Vol 14 (12) ◽

pp. e0216511 ◽

Cited By ~ 3

Author(s):

Irene Garcia-Marti ◽

Raul Zurita-Milla ◽

Arno Swart

Keyword(s):

Random Forests ◽

Count Data ◽

Regression Models ◽

Tick Bite ◽

Data Regression

A Review of Regression Models in Machine Learning

10.51682/jiscom.00202005.2021 ◽

2021 ◽

Vol 2 (2) ◽

pp. 40-47

Author(s):

Sunil Kumar ◽

Vaibhav Bhatnagar

Keyword(s):

Machine Learning ◽

Regression Analysis ◽

Regression Model ◽

Regression Models ◽

Machine Learning Algorithms ◽

Data Sets ◽

Analysis Model ◽

Data Set ◽

Data Regression ◽

Different Types

Machine learning is one of the active fields and technologies used to realize artificial intelligence (AI). The complexity of machine learning (ML) algorithms makes it difficult to predict which algorithm is best. Because ML offers many complex algorithms, determining the appropriate method for finding regression trends, and thereby establishing the correlation among variables, is very difficult. We therefore review the different types of regression used in machine learning. There are mainly six types of regression model: linear, logistic, polynomial, ridge, Bayesian linear, and lasso. This paper gives an overview of these regression models and compares their suitability for machine learning. Data analysis requires establishing an association among the many variables in a dataset; this association is essential for forecasting and exploring data. Regression analysis is one procedure for establishing association among datasets. This paper focuses mainly on the various regression analysis models and how they are used in the context of different datasets in machine learning. Selecting the right model for exploration is the most challenging task, and hence these models are considered thoroughly in this study. When these models are applied properly and with accurate data, data exploration and forecasting can provide the most accurate outcomes.
