Data modelling for discrete time series data using Cassandra and MongoDB

Author(s):  
Dharavath Ramesh ◽  
Ashay Sinha ◽  
Suraj Singh
Biosystems ◽  
2008 ◽  
Vol 93 (3) ◽  
pp. 181-190 ◽  
Author(s):  
Markus Durzinsky ◽  
Annegret Wagler ◽  
Robert Weismantel ◽  
Wolfgang Marwan

2008 ◽  
Vol 5 (25) ◽  
pp. 885-897 ◽  
Author(s):  
Simon Cauchemez ◽  
Neil M Ferguson

We present a new statistical approach to analyse epidemic time-series data. A major difficulty for inference is that (i) the latent transmission process is partially observed and (ii) observed quantities are further aggregated temporally. We develop a data augmentation strategy to tackle these problems and introduce a diffusion process that mimics the susceptible–infectious–removed (SIR) epidemic process, but that is more tractable analytically. While methods based on discrete-time models require epidemic and data collection processes to have similar time scales, our approach, based on a continuous-time model, is free of such constraint. Using simulated data, we found that all parameters of the SIR model, including the generation time, were estimated accurately if the observation interval was less than 2.5 times the generation time of the disease. Previous discrete-time TSIR models have been unable to estimate generation times, given that they assume the generation time is equal to the observation interval. However, we were unable to estimate the generation time of measles accurately from historical data. This indicates that simple models assuming homogeneous mixing (even with age structure) of the type which are standard in mathematical epidemiology miss key features of epidemics in large populations.
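The temporal aggregation problem described in this abstract can be illustrated with a minimal sketch. It is not the authors' diffusion approximation or their data augmentation scheme: it integrates the standard deterministic SIR equations with a simple Euler step and then sums daily incidence into coarser observation intervals. All names and parameter values (`simulate_sir`, `aggregate`, `beta`, `gamma`, the population sizes) are assumptions chosen for illustration, and `1 / gamma` (the mean infectious period) is used only as a rough stand-in for the generation time.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Euler integration of the deterministic SIR equations.

    Returns a list with the number of new infections on each day.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r                      # total population, constant over time
    steps_per_day = round(1 / dt)
    daily_incidence = []
    for _ in range(days):
        new_cases = 0.0
        for _ in range(steps_per_day):
            infections = beta * s * i / n * dt   # S -> I during this step
            recoveries = gamma * i * dt          # I -> R during this step
            s -= infections
            i += infections - recoveries
            r += recoveries
            new_cases += infections
        daily_incidence.append(new_cases)
    return daily_incidence


def aggregate(series, interval):
    """Sum consecutive blocks of `interval` values, e.g. daily -> weekly counts."""
    return [sum(series[k:k + interval]) for k in range(0, len(series), interval)]


# Hypothetical parameters: mean infectious period 1 / gamma = 5 days,
# observed through weekly reports (interval = 1.4 times that period).
daily = simulate_sir(beta=0.5, gamma=0.2, s0=9990, i0=10, r0=0, days=140)
weekly = aggregate(daily, 7)
print(len(daily), len(weekly))
```

Comparing `daily` with `weekly` shows how much of the epidemic curve's shape is lost once counts are reported at an interval longer than the infectious period, which is the situation the abstract's 2.5-times threshold refers to.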


2020 ◽  
Vol 30 (5) ◽  
pp. 374-381 ◽  
Author(s):  
Benjamin J. Narang ◽  
Greg Atkinson ◽  
Javier T. Gonzalez ◽  
James A. Betts

The analysis of time series data is common in nutrition and metabolism research for quantifying the physiological responses to various stimuli. The reduction of many data points from a time series into one or more summary statistics can help quantify and communicate the overall response in a more straightforward way and in line with a specific hypothesis. Nevertheless, many different summary statistics have been selected by various researchers, and some approaches remain complex. The time-intensive nature of such calculations can be a burden, especially for large data sets, and may therefore introduce computational errors that are difficult to recognize and correct. In this short commentary, the authors introduce a newly developed tool that automates many of the processes commonly used by researchers for discrete time series analysis, with particular emphasis on how the tool may be implemented within nutrition and exercise science research.
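As a rough illustration of the kind of reduction this abstract describes, the sketch below computes a few summary statistics that are commonly reported for a sampled response curve: total area under the curve by the trapezoidal rule, net incremental area relative to the first (baseline) sample, peak value, and time to peak. It is not the tool the authors introduce; the function name `summarise` and the sample values are hypothetical.

```python
def summarise(times, values):
    """Reduce a discrete time series to a few summary statistics.

    `times` must be increasing; the first sample is treated as the baseline.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two or more paired samples")
    # Trapezoidal rule: each interval contributes width * mean height.
    auc = sum(
        (times[k + 1] - times[k]) * (values[k] + values[k + 1]) / 2
        for k in range(len(times) - 1)
    )
    baseline_area = values[0] * (times[-1] - times[0])
    peak = max(values)
    return {
        "auc": auc,
        "incremental_auc": auc - baseline_area,  # net area relative to baseline
        "peak": peak,
        "time_to_peak": times[values.index(peak)],
    }


# Made-up example: a response sampled at 0, 15, 30, 60 and 120 minutes.
stats = summarise([0, 15, 30, 60, 120], [5.0, 7.0, 8.0, 6.0, 5.0])
print(stats)
```

Which statistic is appropriate depends on the hypothesis being tested; the net incremental area used here lets samples below baseline subtract from the total, which is only one of several conventions.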


2021 ◽  
Vol 10 (2) ◽  
pp. 68
Author(s):  
Juan Bacilio Guerrero Escamilla ◽  
Arquímedes Avilés Vargas

This paper presents the elements involved in building a panel data model along both cross-sectional and time series dimensions, as well as the assumptions required to apply the model. The objective is to focus on the main elements of panel data modelling: how the model is built, how its parameters are estimated, and how they are validated. Following the methodology of operations research, a practical application estimates the number of kidnapping cases in Mexico from several economic indicators. Of the two types of panel data model analyzed in this research, the random-effects model gives the best fit, and the most meaningful variables are gross domestic product growth and the informal employment rate, for the period 2010 to 2019 in each of the states. Thus, it is illustrated that panel data modelling fits the data better than other types of model such as linear regression and time series analysis.


2021 ◽  
Vol 2084 (1) ◽  
pp. 012002
Author(s):  
Utriweni Mukhaiyar ◽  
Dhika Yudistira ◽  
Sapto Wahyu Indratno ◽  
Wan Fairos Wan Yaacob

Nonstationarity in time series data may be caused by interventions, outliers, and heteroscedastic effects. An outlier can represent an intervention and thereby create a heteroscedastic process. This research investigates the involvement of these three factors in time series data modelling. It also reviews how long the effects of the intervention and outlier factors last. The weekly IDR-USD exchange rate for the period May 2015 to April 2020 is evaluated. The ARIMA model with the intervention factor gives the best re-estimation result, with the smallest average squared error. For prediction, meanwhile, the heteroscedastic effect combined with outlier factors gives better results, with the lowest percentage error. One notable intervention in these data is the Covid-19 pandemic, which started in Indonesia in March 2020. The effect of the intervention is found to last less than five months, and the prediction shows that the volatility of the IDR-USD exchange rate is starting to decline. This indicates that the process is beginning to stabilise.


2013 ◽  
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston
