Epi Evident: Biosurveillance to Monitor, Compare, and Forecast Disease Case Counts

Natalie Tomaszewski; Meeshu Agnihotri; Huiwen Cheng; Ashutosh Bhadke; Michael Henry; Lauren E. Charles

doi:10.5210/ojphi.v10i1.8324


Online Journal of Public Health Informatics ◽

10.5210/ojphi.v10i1.8324 ◽

2018 ◽

Vol 10 (1) ◽

Author(s):

Natalie Tomaszewski ◽

Meeshu Agnihotri ◽

Huiwen Cheng ◽

Ashutosh Bhadke ◽

Michael Henry ◽

...

Keyword(s):

Model Selection ◽

Model Building ◽

Situational Awareness ◽

Moving Average ◽

Poisson Model ◽

Forecast Model ◽

Data Type ◽

Period Prevalence ◽

Disease Case ◽

Best Fit

Objective: Epi Evident is a web-based application built to empower public health analysts by providing a platform that improves monitoring, comparing, and forecasting of case counts and period prevalence of notifiable diseases for jurisdictions of any scale, whether regional, country, or global. This proof-of-concept application addresses improved visualization, access, situational awareness, and prediction of disease behavior.

Introduction: The Epi Evident application was designed to provide clear and comprehensive visualization for monitoring, comparing, and forecasting notifiable diseases simultaneously across chosen countries. Epi Evident addresses the taxing analytical task of evaluating how diseases behave differently across countries. The application provides a user-friendly platform with easily interpretable analytics, which allows analysts to conduct biosurveillance with minimal user tasks. Developed at the Pacific Northwest National Laboratory (PNNL), Epi Evident uses time-series disease case count data from the Biosurveillance Ecosystem (BSVE) application Epi Archive (1). This diverse data source is filtered through the flexible Epi Evident workflow for forecast model building, which is designed to accept any combination of country and disease. The application aims to quickly inform analysts of anomalies in disease- and location-specific behavior and to aid evidence-based decision making to help control or prevent disease outbreaks.

Methods: A workflow was constructed to define the best disease forecast model for each location based on an adjustable method approach. Visualization of the differences in disease behavior across countries was achieved through a React/Python application with user-friendly output for monitoring and comparing different combinations. The forecast model building workflow consisted of three major steps to determine the best-fit model for a given disease-country pair: data type, model type, and model comparison and selection.
Testing various disease-country combinations allowed direct evaluation of the workflow’s efficiency, flexibility, and criteria for determining the best-fit model. Data type was characterized as seasonal, cyclic, or sporadic. Depending on data type, a specific time-series forecasting model was applied. In general, seasonal or cyclic data required either an Autoregressive Integrated Moving Average (ARIMA) model or a Seasonal Autoregressive Integrated Moving Average (SARIMA) model, while sporadic datasets employed a Poisson model. Several candidate models for a single country and disease combination were then compared to determine the best-fit model. ARIMA and SARIMA model selection criteria included the significance of their respective orders, residual diagnostics, and the lowest possible combination of Akaike Information Criterion (AIC) and Root Mean Square Error (RMSE) values. Poisson model selection criteria involved Poisson or negative binomial distribution and event probability, lag dependency on immediate past events or seasonality, and the lowest possible RMSE. To enhance the user’s monitoring and comparisons across multiple countries and diseases, each forecasted case count was supplied with a corresponding period prevalence. This period prevalence was calculated by dividing the case count by the population of the selected country in the selected timeframe. Population records were obtained from the public World Health Organization database (2).

Results: A variety of visualization tools in Epi Evident allows convenient interpretation of the behavior of diseases spanning multiple countries simultaneously (Figure 1). Countries, diseases, and timeframe are selected and displayed within a matrix alongside their corresponding forecasts for case counts and period prevalence. With this full representation, users can easily interpret and anticipate disease behavior while monitoring, comparing, and forecasting case counts and period prevalence across multiple countries.
For future work, the Epi Evident workflow can be scaled to accommodate any disease-country combination with automated model selection to allow easier and more efficient biosurveillance.

Conclusions: Epi Evident empowers analysts to visualize, monitor, compare, and forecast disease case counts and period prevalence across countries. Epi Evident exemplifies how filtering diverse data through a flexible workflow can scale to output distinctive models for any given country and disease combination, thus providing accurate forecasting and enhanced situational awareness throughout the globe. Implementing this application’s methodology helps enhance and expand biosurveillance efficacy for multiple diseases across multiple countries simultaneously.

References:
1. Generous Nicholas, Fairchild Geoffrey, Khalsa Hari, Tasseff Byron, Arnold James. Epi Archive: An automated data collection of notifiable disease data. Online Journal of Public Health Informatics. 2017. 9(1):e372.
2. World Health Organization. http://apps.who.int/gho/data/view.main.POP2040?lang=en Accessed: 6/20/2017
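
The three workflow steps and the period prevalence calculation described in the Epi Evident abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function and class names are hypothetical, the AIC and RMSE values are assumed to come from whatever library fitted each candidate, and the rule for combining AIC and RMSE (lowest sum of ranks, ties broken by RMSE) is an assumption, since the abstract only says "lowest possible combination". The optional `per` scaling factor is likewise an addition for readability; the abstract describes a plain division.

```python
from dataclasses import dataclass
from math import sqrt


def model_family(data_type: str) -> str:
    """Map the abstract's three data types to a model family."""
    families = {
        "seasonal": "ARIMA/SARIMA",
        "cyclic": "ARIMA/SARIMA",
        "sporadic": "Poisson",
    }
    return families[data_type]


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return sqrt(squared / len(actual))


@dataclass
class Candidate:
    name: str    # hypothetical label; must be unique among candidates
    aic: float   # assumed to be reported by the fitting library
    rmse: float  # e.g. computed with rmse() above


def select_best(candidates):
    """Assumed rule: lowest (AIC rank + RMSE rank), ties broken by RMSE."""
    aic_order = sorted(candidates, key=lambda c: c.aic)
    rmse_order = sorted(candidates, key=lambda c: c.rmse)
    aic_rank = {c.name: r for r, c in enumerate(aic_order)}
    rmse_rank = {c.name: r for r, c in enumerate(rmse_order)}
    return min(
        candidates,
        key=lambda c: (aic_rank[c.name] + rmse_rank[c.name], c.rmse),
    )


def period_prevalence(case_count, population, per=1):
    """Case count divided by population, optionally scaled (e.g. per=100_000)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return case_count * per / population
```

For example, `period_prevalence(50, 1_000_000, per=100_000)` expresses 50 forecast cases in a population of one million as a rate per 100,000 people. A rank-sum rule is only one way to trade off AIC against RMSE; a weighted score or a strict AIC-first ordering would be equally consistent with the abstract.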


An exploratory framework for understanding electronic bill presentment and payment model selection

Human Systems Management ◽

10.3233/hsm-2000-19405 ◽

2000 ◽

Vol 19 (4) ◽

pp. 255-264

Author(s):

Wenhong Luo ◽

David Cook ◽

Jimmie Joseph ◽

Bopana Ganapathy

Keyword(s):

Model Selection ◽

Payment Model ◽

Customer Interaction ◽

Best Fit

Electronic bill presentment and payment (EBPP) provides an opportunity for firms to decrease their billing costs, while increasing their customer interaction. While many models exist, there is a dearth of information for determining which model would best fit customer characteristics and needs. This article examines the three primary models of EBPP, the characteristics of recurring bills, and customer concerns to develop an exploratory framework for determining which EBPP model a bill generating firm should deploy.


Decline in body condition in the Antarctic minke whale (Balaenoptera bonaerensis) in the Southern Ocean during the 1990s

Polar Biology ◽

10.1007/s00300-020-02783-3 ◽

2021 ◽

Vol 44 (2) ◽

pp. 259-273

Author(s):

Céline Cunen ◽

Lars Walløe ◽

Kenji Konishi ◽

Nils Lid Hjort

Keyword(s):

Model Selection ◽

Body Condition ◽

Model Building ◽

Research Question ◽

Information Criterion ◽

The Body ◽

Antarctic Minke Whale ◽

Minke Whales ◽

Balaenoptera Bonaerensis ◽

The Antarctic

Abstract: Changes in the body condition of Antarctic minke whales (Balaenoptera bonaerensis) have been investigated in a number of studies, but remain contested. Here we provide a new analysis of body condition measurements, with particularly careful attention to statistical model building and to model selection issues. We analyse body condition data for a large number (4704) of minke whales caught between 1987 and 2005. The data consist of five different variables related to body condition (fat weight, blubber thickness and girth) and a number of temporal, spatial and biological covariates. The body condition variables are analysed using linear mixed-effects models, for which we provide sound biological motivation. Further, we conduct model selection with the focused information criterion (FIC), reflecting the fact that we have a clearly specified research question, which leads us to a clear focus parameter of particular interest. We find that there has been a substantial decline in body condition over the study period (the net declines are estimated at 10% for fat weight, 7% for blubber thickness and 3% for girth). Interestingly, there seem to be some differences in body condition trends between males and females and between different regions of the Antarctic. The decline in body condition could indicate major changes in the Antarctic ecosystem, in particular increased competition from some larger krill-eating whale species.


Model Building and Model Selection

Handbook of Regression and Modeling - Chapman & Hall/CRC Biostatistics Series ◽

10.1201/9781420017380.ch10 ◽

2006 ◽

Keyword(s):

Model Selection ◽

Model Building


Forecasting Software Vulnerabilities Using Time-Series Techniques

Advances in Business Information Systems and Analytics - Machine Learning Techniques for Improved Business Analytics ◽

10.4018/978-1-5225-3534-8.ch007 ◽

2019 ◽

pp. 125-165

Author(s):

Baidyanath Biswas

Keyword(s):

Time Series ◽

Homeland Security ◽

Moving Average ◽

Primary Objective ◽

Data Set ◽

Software Vulnerabilities ◽

Systems Security ◽

Univariate Time Series ◽

Efficient Prediction ◽

Best Fit

This chapter discusses the concepts of time-series applications and forecasting in the context of information systems security. The primary objective in such formulation is the training of the models followed by efficient prediction. Although economic and financial forecasting problems extensively use time-series, predicting software vulnerabilities is a novel idea. The chapter also provides appropriate guidelines for the implementation and adaptation of univariate time-series for information security. To achieve this, the authors focus on the following techniques: autoregressive (AR), moving average (MA), autoregressive integrated moving average (ARIMA), and exponential smoothing. The analysis considers a unique data set consisting of the publicly exposed software vulnerabilities, available from the U.S. Dept. of Homeland Security. The problem is presented first, followed by a general framework to identify the problem, estimate the best-fit parameters of that model, and conclude with an illustrative example from the above dataset to familiarize readers with the business problem.
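
Of the techniques this chapter lists, simple exponential smoothing is the easiest to show in a few lines. The sketch below is a generic textbook version, not code from the chapter. The function name is hypothetical, and initializing the level to the first observation is an assumption (one common choice among several).

```python
def simple_exponential_smoothing(series, alpha):
    """Return smoothed levels s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    The level is initialized to the first observation (an assumption).
    The last returned value serves as the one-step-ahead forecast.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    smoothed = [level]
    for x in series[1:]:
        # Blend the new observation with the previous level.
        level = alpha * x + (1 - alpha) * level
        smoothed.append(level)
    return smoothed
```

A larger `alpha` makes the forecast react faster to recent counts, while a smaller one averages over a longer history.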


Constitutive model selection for unreinforced masonry cross sections based on best-fit analytical moment–curvature diagrams

Engineering Structures ◽

10.1016/j.engstruct.2015.12.036 ◽

2016 ◽

Vol 111 ◽

pp. 451-466 ◽

Cited By ~ 10

Author(s):

Fulvio Parisi ◽

Giuseppe Sabella ◽

Nicola Augenti

Keyword(s):

Model Selection ◽

Constitutive Model ◽

Cross Sections ◽

Unreinforced Masonry ◽

Selection For ◽

Best Fit


Predictive Modeling of Product Returns for Remanufacturing

Volume 2A: 41st Design Automation Conference ◽

10.1115/detc2015-46875 ◽

2015 ◽

Author(s):

Jungmok Ma ◽

Harrison M. Kim

Keyword(s):

Model Selection ◽

Predictive Model ◽

Mixed Model ◽

Moving Average ◽

Predictive Performance ◽

Performance Measure ◽

Selection Algorithm ◽

Product Returns ◽

Distributed Lag

As awareness of environmental issues increases, pressure from the public and policy makers has forced OEMs to consider remanufacturing as the key product design option. To make remanufacturing operations more profitable, forecasting product returns is critical with regard to the uncertainty in quantity and timing. This paper proposes a predictive model selection algorithm to deal with the uncertainty by identifying better predictive models. Unlike other major approaches in the literature (distributed lag model or DLM), the predictive model selection algorithm focuses on the predictive power over new or future returns. The proposed algorithm extends the set of candidate models that should be considered: autoregressive integrated moving average or ARIMA (previous returns for future returns), DLM (previous sales for future returns), and mixed model (both previous sales and returns for future returns). A prediction performance measure computed on holdout samples is used to find the better model among them. The case study of reusable bottles shows that one of the candidate models, ARIMA, can predict better than the DLM depending on the relationships between returns and sales. The univariate model remains largely unexplored because of the criticism that it cannot use previous sales. Another candidate model, the mixed model, offers a chance to find a better predictive model by combining the ARIMA and DLM. The case study also shows that the DLM in the predictive model selection algorithm can provide good predictive performance when there are relatively strong and static relationships between returns and sales.
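
The holdout-based selection idea in this abstract can be sketched generically as follows. This is not the paper's algorithm. The two forecasters are deliberately trivial stand-ins (not ARIMA, DLM, or the mixed model), all names are hypothetical, and mean absolute error is an assumed performance measure, since the abstract does not name the one used.

```python
def holdout_select(series, candidates, holdout):
    """Fit each candidate on the head of the series, score it on the tail.

    candidates maps a name to a function (train, horizon) -> list of forecasts.
    Returns the best name and all scores (mean absolute error, an assumption).
    """
    if not 0 < holdout < len(series):
        raise ValueError("holdout must leave at least one training point")
    train, test = series[:-holdout], series[-holdout:]
    scores = {}
    for name, forecaster in candidates.items():
        predictions = forecaster(train, holdout)
        errors = [abs(a - p) for a, p in zip(test, predictions)]
        scores[name] = sum(errors) / holdout
    best = min(scores, key=scores.get)
    return best, scores


def last_value(train, horizon):
    """Stand-in forecaster: repeat the last observed value."""
    return [train[-1]] * horizon


def train_mean(train, horizon):
    """Stand-in forecaster: repeat the training mean."""
    mean = sum(train) / len(train)
    return [mean] * horizon
```

Scoring on data the models never saw is what distinguishes this from selecting by in-sample fit, which is the point the abstract makes about predictive power over future returns.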


Zero-Inflated Poisson Model with Group Data

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.569.627 ◽

2012 ◽

Vol 569 ◽

pp. 627-631

Author(s):

Jun Yang ◽

Xin Zhang

Keyword(s):

Parameter Estimation ◽

Maximum Likelihood ◽

Model Selection ◽

Count Data ◽

Maximum Likelihood Estimate ◽

Poisson Model ◽

Cigarette Consumption ◽

Chi Square ◽

Group Data ◽

Chi Square Test

The Zero-inflated Poisson model has been widely used in many fields for count data with excess zeros. In practice, many count data, such as cigarette consumption, are collected as group data. To address this problem, the Zero-inflated Poisson model with group data is investigated in this paper. Parameters are estimated by maximum likelihood, model selection is carried out with the Chi-square test, and a real example is given at the end as an application.
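
For reference, the standard (ungrouped) zero-inflated Poisson probability mass function can be written in a few lines of Python. This sketch does not cover the paper's grouped-data extension. The function name and the parameter names `lam` (Poisson rate) and `pi` (probability of a structural zero) are hypothetical labels.

```python
from math import exp, factorial


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with rate lam and zero-inflation pi.

    P(Y = 0) = pi + (1 - pi) * exp(-lam)
    P(Y = k) = (1 - pi) * exp(-lam) * lam**k / k!   for k >= 1
    """
    if not isinstance(k, int) or k < 0:
        raise ValueError("k must be a non-negative integer")
    if lam <= 0 or not 0 <= pi <= 1:
        raise ValueError("need lam > 0 and 0 <= pi <= 1")
    poisson = exp(-lam) * lam ** k / factorial(k)
    if k == 0:
        # Zeros come from both the structural-zero part and the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson
```

Setting `pi=0` recovers the ordinary Poisson distribution, which makes the extra mass at zero easy to see by comparison.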


A SARIMA forecasting model to predict the number of cases of dengue in Campinas, State of São Paulo, Brazil

Revista da Sociedade Brasileira de Medicina Tropical ◽

10.1590/s0037-86822011000400007 ◽

2011 ◽

Vol 44 (4) ◽

pp. 436-440 ◽

Cited By ~ 31

Author(s):

Edson Zangiacomi Martinez ◽

Elisângela Aparecida Soares da Silva ◽

Amaury Lelis Dal Fabbro

Keyword(s):

Moving Average ◽

Forecasting Model ◽

Time Series Models ◽

R Software ◽

Relative Precision ◽

Sarima Model ◽

Autoregressive Integrated Moving Average ◽

Predicted Values ◽

Best Fit ◽

Monthly Incidence

INTRODUCTION: Forecasting dengue cases in a population by using time-series models can provide useful information to facilitate the planning of public health interventions. The objective of this article was to develop a forecasting model for dengue incidence in Campinas, southeast Brazil, following the Box-Jenkins modeling approach. METHODS: The forecasting model for dengue incidence was fitted with R software using the seasonal autoregressive integrated moving average (SARIMA) model. We fitted a model based on the reported monthly incidence of dengue from 1998 to 2008, and we validated the model using the data collected between January and December of 2009. RESULTS: SARIMA(2,1,2)(1,1,1)12 was the model with the best fit for the data. This model indicated that the number of dengue cases in a given month can be estimated from the number of dengue cases occurring one, two and twelve months prior. The predicted values for 2009 are relatively close to the observed values. CONCLUSIONS: The results of this article indicate that SARIMA models are useful tools for monitoring dengue incidence. We also observe that the SARIMA model is capable of representing with relative precision the number of cases in the following year.
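
For readers unfamiliar with the notation, SARIMA(2,1,2)(1,1,1)12 can be written with the backshift operator B, where B y_t = y_{t-1}. The expression below is the standard multiplicative seasonal ARIMA form, shown under two assumptions: moving-average terms enter with plus signs (the convention documented for R's `arima`), and there is no constant term. The coefficient symbols are placeholders, not estimates reported in the article.

```latex
(1 - \phi_1 B - \phi_2 B^2)\,(1 - \Phi_1 B^{12})\,(1 - B)\,(1 - B^{12})\, y_t
  = (1 + \theta_1 B + \theta_2 B^2)\,(1 + \Theta_1 B^{12})\, \varepsilon_t
```

The factors (1 - B) and (1 - B^{12}) are the non-seasonal and seasonal differences. The autoregressive factors are what link a given month to the counts one, two and twelve months earlier.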


Using Autoregressive Integrated Moving Average (ARIMA) Technique to Forecast the Production of Kharif Cereals in Odisha (India)

Current Journal of Applied Science and Technology ◽

10.9734/cjast/2020/v39i930619 ◽

2020 ◽

pp. 104-113

Author(s):

Abhiram Dash ◽

A. Mangaraju ◽

Pradeep Mishra ◽

H. Nayak

Keyword(s):

Autocorrelation Function ◽

Moving Average ◽

Arima Model ◽

Percentage Error ◽

Arima Models ◽

Training Set ◽

Autoregressive Integrated Moving Average ◽

Future Production ◽

Best Fit ◽

Testing Set

Cereals are the most important kharif season crop in Odisha. The present study was carried out to forecast the production of kharif cereals in Odisha using the forecast values of area and yield of kharif cereals obtained from the selected best-fit Autoregressive Integrated Moving Average (ARIMA) models. The data from 1970-71 to 2010-11 are treated as the training set and used for model building, and the data from 2011-12 to 2015-16 are treated as the testing set and used for cross-validation of the selected model on the basis of the absolute percentage error. The ARIMA models are fitted to stationary data, which may be the original data or the differenced data. The different ARIMA models are evaluated on the basis of the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) at various lags. The possible ARIMA models are selected on the basis of significant coefficients of the autoregressive and moving average components, using the training set data. The best-fitted models are then selected on the basis of residual diagnostic tests and model fit statistics. The ARIMA models found to fit best for area under kharif cereals and yield of kharif cereals are ARIMA(1,1,0) without constant and ARIMA(0,1,2) without constant, respectively, both of which are successfully cross-validated with the testing set data. The respective best-fit ARIMA models have been used to forecast the area and yield of kharif cereals for the years 2016-17, 2017-18 and 2018-19. The forecast values of area show a decrease, whereas the forecast values of yield show an increase. The decrease in area might be the result of limited availability of area for cereals due to a shift towards non-food grain crops. The forecast values of production of kharif cereals, obtained from the forecast values of area and yield, show an increase, which is due to the increase in forecast values of yield. Since there is limited scope for area expansion, the future production of kharif cereals can only be increased by increasing the yield to achieve the goal of food security for the growing population.
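
The cross-validation criterion and the production forecast mentioned in this abstract can be sketched as below. The function names are hypothetical, and two assumptions are made: the absolute percentage errors are averaged into a single score, and production is the product of area and yield, which follows from the usual definition of yield as production per unit area.

```python
def absolute_percentage_errors(actual, forecast):
    """Per-period absolute percentage error of forecasts against test data."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    return [abs(a - f) / abs(a) * 100 for a, f in zip(actual, forecast)]


def mape(actual, forecast):
    """Mean absolute percentage error over the testing set."""
    errors = absolute_percentage_errors(actual, forecast)
    return sum(errors) / len(errors)


def production_forecast(area, yield_per_area):
    """Combine area and yield forecasts period by period (assumed product)."""
    if len(area) != len(yield_per_area):
        raise ValueError("area and yield forecasts must align")
    return [a * y for a, y in zip(area, yield_per_area)]
```

Because production is a product, a rise in forecast yield can outweigh a fall in forecast area, which is the pattern the abstract reports.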


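PERAMALAN JUMLAH PENUMPANG PESAWAT TERBANG DI PINTU KEDATANGAN BANDAR UDARA INTERNASIONAL PATTIMURA AMBON DENGAN MENGGUNAKAN METODE ARIMA BOX-JENKINS [Forecasting the Number of Airplane Passengers at the Arrival Gate of Pattimura Ambon International Airport Using the ARIMA Box-Jenkins Method]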

BAREKENG JURNAL ILMU MATEMATIKA DAN TERAPAN ◽

10.30598/barekengvol13iss3pp135-144ar883 ◽

2019 ◽

Vol 13 (3) ◽

pp. 135-144

Author(s):

Sasmita Hayoto ◽

Yopi Andry Lesnussa ◽

Henry W. M. Patty ◽

Ronald John Djami

Keyword(s):

Time Series ◽

Model Selection ◽

Time Series Data ◽

Moving Average ◽

Arima Model ◽

Series Data ◽

Autoregressive Integrated Moving Average ◽

Significant Parameter ◽

International Airport ◽

Forecast Time

The Autoregressive Integrated Moving Average (ARIMA) model is often used to forecast time series data. In the era of globalization, many fields are advancing rapidly, transportation among them. Aircraft are one mode of transportation that residents can use to support their activities, in both business and tourism. The objective of this research is to forecast the number of airplane passengers at the arrival gate of Pattimura Ambon International Airport using the ARIMA Box-Jenkins method. The best model selected is ARIMA(0,1,3), because its parameters are significant and its MSE value is smaller.
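
Written out, an ARIMA(0,1,3) model has no autoregressive terms, one order of differencing, and three moving-average terms. The form below assumes the plus-sign convention for moving-average terms and no constant; the coefficients are placeholders rather than values from the article.

```latex
(1 - B)\, y_t = (1 + \theta_1 B + \theta_2 B^2 + \theta_3 B^3)\, \varepsilon_t
\quad\Longleftrightarrow\quad
y_t = y_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1}
      + \theta_2 \varepsilon_{t-2} + \theta_3 \varepsilon_{t-3}
```

In other words, each period's passenger count equals the previous count plus a weighted combination of the current and three most recent forecast errors.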
