Mixed Generalized Linear Model for Estimating Household Trip Production

Author(s):  
Chang-Jen Lan ◽  
Patricia S. Hu

An innovative modeling framework to estimate household trip rates using 1995 Nationwide Personal Transportation Survey data is presented. A generalized linear model with a mixture of negative binomial probability distribution functions was developed on the basis of characteristics observed in the empirical distribution of household daily trips. This model provides a more flexible framework and a better model specification for analyzing household-specific trip production behavior. Compared with traditional least squares-based regression models, the parameter estimates from the proposed model are more efficient. Although the mean accuracies of the two modeling approaches are comparable, the mixed generalized linear model is more robust in identifying outliers because its asymmetric prediction bounds are derived from a more appropriate model specification.
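
The abstract does not give the model's equations, so the following is only a minimal sketch of the distributional idea: a two-component mixture of negative binomial distributions and the asymmetric prediction bounds it produces. The component weights, means, and dispersions are hypothetical values chosen for illustration, and the covariate-dependent (GLM) part of the model is omitted.

```python
import math


def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu and variance mu + mu**2 / theta."""
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def mixture_pmf(y, components):
    """Pmf of a finite mixture; components are (weight, mu, theta) tuples."""
    return sum(w * nb_pmf(y, mu, theta) for w, mu, theta in components)


def mixture_quantile(components, prob, y_max=1000):
    """Smallest count whose cumulative mixture probability reaches prob."""
    cumulative = 0.0
    for y in range(y_max + 1):
        cumulative += mixture_pmf(y, components)
        if cumulative >= prob:
            return y
    raise ValueError("quantile not reached; increase y_max")


# Hypothetical mixture: (weight, mean daily trips, dispersion).
COMPONENTS = [(0.6, 4.0, 3.0), (0.4, 12.0, 5.0)]

mean_trips = sum(w * mu for w, mu, _ in COMPONENTS)
lower = mixture_quantile(COMPONENTS, 0.025)
upper = mixture_quantile(COMPONENTS, 0.975)
print(f"mean={mean_trips:.1f}, 95% bounds=[{lower}, {upper}]")
```

Because trip counts are non-negative and right-skewed, the upper bound lies farther from the mean than the lower bound, which a symmetric least-squares interval cannot reproduce.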

2021 ◽  
Author(s):  
Ratih Oktri Nanda ◽  
Aldilas Achmad Nursetyo ◽  
Aditya Lia Ramadona ◽  
Muhammad Ali Imron ◽  
Anis Fuad ◽  
...  

Background: Human mobility could act as a vector that facilitates the spread of infectious diseases. In response to the COVID-19 pandemic, Google Community Mobility Reports (CMR) provide the data needed to explore community mobility further. We therefore aimed to examine the relationship between community mobility and COVID-19 dynamics in Jakarta, Indonesia. Methods: We used Google mobility data from February 15 to December 31, 2020, and explored several statistical models to estimate the COVID-19 dynamics in Jakarta. Model 1 was a Poisson regression Generalized Linear Model (GLM), Model 2 was a negative binomial regression GLM, and Model 3 was a Multiple Linear Regression (MLR). Results: MLR with adjustments using Principal Component Analysis (PCA) was the best-fitting model. It explained 52% of the variance in COVID-19 cases in Jakarta (R-Square: 0.52, p<0.05). All mobility variables were significant predictors of COVID-19 cases (p<0.05). More precisely, a change of about 1% in grocery and pharmacy mobility was associated with a 4.12% increase in COVID-19 cases in Jakarta. The corresponding increases for retail and recreation, workplaces, transit stations, and parks were 3.11%, 2.56%, 2.26%, and 1.93%, respectively. Conclusion: Our study indicates that increased mobility contributes to increased COVID-19 cases. This finding can help policymakers develop better outbreak management strategies and anticipate increases in COVID-19 cases at certain public places and during seasonal events such as annual religious holidays or other long holidays.


2019 ◽  
Author(s):  
Christoph Hafemeister ◽  
Rahul Satija

Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation, and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
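
As a rough illustration of the quantity the abstract refers to, the sketch below computes a Pearson residual under a negative binomial model with mean `mu` and dispersion `theta` (variance `mu + mu**2 / theta`). It is not the sctransform implementation: the regression step that would produce `mu` from sequencing depth and the regularization of `theta` across genes are omitted, and the numeric inputs are hypothetical.

```python
import math


def nb_pearson_residual(count, mu, theta):
    """Pearson residual of an observed count under a negative binomial model.

    mu is the expected count (in practice a fitted value from the regression)
    and theta is the dispersion, so the model variance is mu + mu**2 / theta.
    """
    variance = mu + mu * mu / theta
    return (count - mu) / math.sqrt(variance)


# Hypothetical gene in one cell: 8 UMIs observed, 4 expected, theta = 4.
residual = nb_pearson_residual(count=8, mu=4.0, theta=4.0)
print(round(residual, 4))
```

A residual near zero means the observed count matches what sequencing depth alone predicts; large positive or negative values flag counts that the technical covariate does not explain.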


2021 ◽  
Vol 15 (4) ◽  
pp. 607-614
Author(s):  
Feby Seru ◽  
Azizah Azizah ◽  
Agung Dwi Saputro

One of the crucial tasks in the insurance business is determining the amount of incurred but not reported (IBNR) claim reserves. This amount is uncertain, so it must be estimated as accurately as possible. The estimate affects the solvency and sustainability of the company. IBNR claim reserves can be estimated with either deterministic or stochastic approaches. This study uses a stochastic model based on a generalized linear model (GLM) for data assumed to follow an over-dispersed Poisson (ODP) distribution. It also uses two different methods to estimate the model parameters: parameter transformation and the Verbeek algorithm. The study compares the IBNR claim reserve estimates obtained with these two methods. The results indicate that both methods give the same value for the IBNR claim reserves. The advantage of the Verbeek algorithm is that the resulting parameter values have interpretations.
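
The abstract does not spell out the algorithm, so the following is a sketch of the textbook Verbeek recursion for the multiplicative model E[C_ij] = x_i * y_j with the y_j summing to one, under the assumption that this is the variant the study uses. The run-off triangle is made up for illustration. The chain-ladder calculation is included only as a cross-check, since both solve the same marginal-total equations.

```python
from itertools import accumulate


def verbeek(triangle):
    """Estimate row (x) and column (y) parameters from an incremental triangle.

    triangle[i] holds the observed incremental claims of accident year i,
    so row i has n - i entries. The y values sum to one.
    """
    n = len(triangle)
    row_sums = [sum(row) for row in triangle]
    col_sums = [sum(triangle[i][j] for i in range(n - j)) for j in range(n)]
    x = [0.0] * n
    y = [0.0] * n
    for k in range(n):
        unobserved_share = sum(y[n - k:])  # columns not yet seen by row k
        x[k] = row_sums[k] / (1.0 - unobserved_share)
        j = n - 1 - k
        y[j] = col_sums[j] / sum(x[: k + 1])
    return x, y


def reserve_from_parameters(x, y):
    """Sum the expected claims x_i * y_j over the unobserved lower triangle."""
    n = len(x)
    return sum(x[i] * y[j] for i in range(n) for j in range(n - i, n))


def chain_ladder_reserve(triangle):
    """Classical chain-ladder reserve, used here only as a cross-check."""
    n = len(triangle)
    cum = [list(accumulate(row)) for row in triangle]
    factors = []
    for j in range(n - 1):
        rows = range(n - 1 - j)  # rows with columns j and j + 1 both observed
        factors.append(sum(cum[i][j + 1] for i in rows) / sum(cum[i][j] for i in rows))
    reserve = 0.0
    for i in range(n):
        ultimate = cum[i][-1]
        for j in range(n - 1 - i, n - 1):
            ultimate *= factors[j]
        reserve += ultimate - cum[i][-1]
    return reserve


# Hypothetical 3x3 incremental run-off triangle.
TRIANGLE = [[100.0, 50.0, 25.0], [110.0, 60.0], [120.0]]
x_hat, y_hat = verbeek(TRIANGLE)
print(reserve_from_parameters(x_hat, y_hat), chain_ladder_reserve(TRIANGLE))
```

In this parameterization each x_i can be read as the expected ultimate claim amount of accident year i and each y_j as the share of claims paid in development period j, which is one way to understand the interpretability the abstract mentions.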


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Christoph Hafemeister ◽  
Rahul Satija

Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.


2017 ◽  
Vol 13 (4-1) ◽  
pp. 354-361 ◽  
Author(s):  
Aaishah Radziah Jamaludin ◽  
Fadhilah Yusof ◽  
Rahmah Mohd Lokoman ◽  
Zainura Zainoon Noor ◽  
Noreliza Alias ◽  
...  

Four pollution-related diseases, namely asthma, conjunctivitis, upper respiratory tract infection (URTI) and dengue, are studied in terms of their trend, behaviour and association with influential factors such as air pollution and climate variables. Two methods were chosen: the Poisson Generalized Linear Model and the Negative Binomial Model. These methods were used to determine the association between the diseases and their influential factors. This study shows that sulphur dioxide (SO2) is the most prominent contributor to the diseases. Therefore, local authorities such as the Department of Environment need to strengthen enforcement in planning and monitoring SO2 sources, which arise from fuel combustion in mobile sources and motor vehicles.


Author(s):  
Rasaki Olawale Olanrewaju ◽  
Johnson Funminiyi Ojo

This study provides a non-convex penalized estimation procedure, via the Smoothly Clipped Absolute Deviation (SCAD) and Minimax Concave Penalty (MCP), for count data responses to address the problem of covariates exceeding the sample size. The Generalized Linear Model (GLM) approach was adopted to obtain the penalized functions needed by the MCP and SCAD non-convex penalizations of Binomial, Poisson and Negative-Binomial count response regressions. A colorectal cancer case study with six covariates and a sample size of five was subjected to non-convex penalized estimation under the three distributions. The non-convex penalization of Binomial regression via MCP and SCAD best explained the four unpenalized covariates needed to determine whether surgery or therapy is ideal for treating the disease.
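
For reference, the two penalties named in the abstract can be sketched as plain functions of a single coefficient. These follow the standard published definitions of SCAD and MCP, not code from the study; the tuning values `lam`, `a`, and `gamma` used below are illustrative assumptions.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for one coefficient (requires a > 2)."""
    t = abs(beta)
    if t <= lam:
        return lam * t  # lasso-like region near zero
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2  # constant: large coefficients are not shrunk further


def mcp_penalty(beta, lam, gamma=3.0):
    """Minimax concave penalty for one coefficient (requires gamma > 1)."""
    t = abs(beta)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2  # constant beyond gamma * lam


# Compare against the lasso penalty lam * |beta| at a few hypothetical values.
for beta in (0.5, 2.0, 10.0):
    print(beta, 1.0 * abs(beta), scad_penalty(beta, 1.0), mcp_penalty(beta, 1.0))
```

Both penalties behave like the lasso for small coefficients and flatten out for large ones, which is what makes them non-convex and reduces the bias on large effects. In a penalized GLM the sum of these penalties over the coefficients would be added to the negative log-likelihood of the chosen count distribution.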

