Forecasting the Spread of COVID-19 Pandemic with Prophet

2021 ◽  
Vol 35 (2) ◽  
pp. 115-122
Author(s):  
Mohan Mahanty ◽  
K. Swathi ◽  
K. Sasi Teja ◽  
P. Hemanth Kumar ◽  
A. Sravani

The COVID-19 pandemic shook the whole world with its brutality, and the spread is still rising daily, causing many nations to suffer severely. This paper presents a medical stance on COVID-19 research, in which we estimate a statistical model on time-series data using Prophet to comprehend the trend of the current pandemic beyond July 29, 2020, using data at a global level. Prophet is an open-source framework developed by the Data Science team at Facebook for forecasting tasks. It helps automate the process of producing accurate forecasts and can be customized to the use case being solved. The Prophet model is easy to work with because the official Prophet repository is live on GitHub, is open for contributions, and can be fitted effortlessly. The statistical data presented in the paper cover the number of officially confirmed daily cases for the period January 22, 2020, to July 29, 2020. The estimates produced by the forecast models can then be used by governments and medical care departments of various countries to manage the existing situation and help flatten the curve in various nations, as we believe there is minimal time to do so. The inferences made using the model can be clearly comprehended without much effort. Furthermore, the model gives an understanding of past, present, and future trends through graphical forecasts and statistics. Compared to other models, Prophet holds its own importance and innovativeness, as it is fully automated and generates quick, precise forecasts that can additionally be tuned.
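To make the workflow concrete, here is a minimal sketch of the kind of Prophet pipeline the paper describes; the input file name and column names are hypothetical placeholders, and the 30-day horizon is illustrative.

```python
# Minimal Prophet forecasting sketch; file and column names are hypothetical.
import pandas as pd
from prophet import Prophet  # published as "fbprophet" in older releases

# Prophet expects a dataframe with columns "ds" (date) and "y" (value).
df = pd.read_csv("daily_confirmed_cases.csv")  # hypothetical input file
df = df.rename(columns={"date": "ds", "cases": "y"})

model = Prophet()  # defaults; trend and seasonality settings are tunable
model.fit(df)      # fit on observed data, e.g. up to July 29, 2020

# Extend the frame 30 days past the last observation and forecast.
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)

# yhat is the point forecast; yhat_lower/yhat_upper bound the uncertainty.
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
model.plot(forecast)  # graphical view of past fit and future trend
```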

Stats ◽  
2021 ◽  
Vol 4 (1) ◽  
pp. 71-85
Author(s):  
Hossein Hassani ◽  
Mohammad Reza Yeganegi ◽  
Xu Huang

Fusing nature with computational science has proved to be of paramount importance, and researchers have shown growing enthusiasm for inventing and developing nature-inspired algorithms to solve complex problems across subjects. Inevitably, these advancements have rapidly promoted the development of data science, where nature-inspired algorithms are changing the traditional way of data processing. This paper proposes a hybrid approach, namely SSA-GA, which incorporates the optimization merits of the genetic algorithm (GA) to advance Singular Spectrum Analysis (SSA). The approach further boosts the performance of SSA forecasting via better and more efficient grouping. Given its performance on 100 real time series across various subjects, the newly proposed SSA-GA approach proves to be computationally efficient and robust, with improved forecasting performance.
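For readers unfamiliar with SSA, the sketch below shows a basic NumPy decomposition and reconstruction; in the SSA-GA hybrid, a genetic algorithm would search over the grouping of eigentriples instead of the fixed grouping used here. Window length and group are illustrative.

```python
# Basic SSA decomposition/reconstruction sketch in NumPy.
import numpy as np

def ssa_reconstruct(x, L, group):
    """Reconstruct series x from the eigentriples listed in `group`."""
    N = len(x)
    K = N - L + 1
    # Trajectory (Hankel) matrix: column j is the window x[j:j+L].
    X = np.column_stack([x[j:j + L] for j in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Sum the selected rank-1 components.
    Xg = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in group)
    # Diagonal averaging (Hankelization) back to a series of length N.
    rec = np.zeros(N)
    counts = np.zeros(N)
    for i in range(L):
        for j in range(K):
            rec[i + j] += Xg[i, j]
            counts[i + j] += 1
    return rec / counts

# Toy example: trend + seasonality + noise.
t = np.arange(200)
x = 0.05 * t + np.sin(2 * np.pi * t / 12) + 0.3 * np.random.randn(200)
smooth = ssa_reconstruct(x, L=24, group=[0, 1, 2])  # leading eigentriples
```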


Author(s):  
H. I. Eririogu ◽  
R. N. Echebiri ◽  
E. S. Ebukiba

Aims: This paper assesses the population pressure on land resources in Nigeria, past and projected. Study Design: Time-series data covering 1967 to 2068 were used; these data sets were resorted to because complete national data are lacking. Place and Duration of Study: Past (1967-2017) and projected (2018-2068) five decades in Nigeria. Methodology: The time-series data on population levels and renewable and non-renewable resources in Nigeria were obtained from the United Nations Population Division, Department of Economic and Social Affairs, the National Population Commission, International Energy Statistics, and the Food and Agriculture Organization (FAO). Other parameters, such as transformity, were adapted from Odum (1996) and Odum (2000) for specific objectives. Data collected were analyzed using a modified ecological footprint/carrying capacity approach, descriptive statistics, and Z-statistics. Results: Results showed that the mean annual pressure on land resources in the past five decades (1967-2017) was 9.323 hectares per capita, while the projected pressure for the next five decades (2018-2068) is 213.178 hectares per capita. Results also showed that about 73.08 percent of the per capita pressure in the past five decades emanated from arable land consumption (6.813 ha), while 75.91 percent of the pressure is expected to emanate from fossil land in the projected five decades due to crude oil and mineral resource exploration and exploitation. The carrying capacity of land resources in the past five decades was 6.4091 hectares per capita, while that of the projected five decades is 1.667 hectares per capita, an indication of ecological overshoot in both periods. Conclusion: Population pressure on land resources per capita in both the past and projected five decades is higher than the carrying capacity of these resources in the country. Citizens lived, and are expected to live, unsustainably by depleting and degrading available land resources. Arable land consumption was the major contributor to the total pressure on land resources in the past five decades, while the consumption of fossil land due to exploration and exploitation of crude oil and mineral resources is expected to be the major contributor in the next five decades. Limiting affluence (per capita consumption of resources) and improving technology will not only ensure sustainable use of arable and fossil lands but also keep consumption within the limits of these resources for a sustainable future.
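As a small arithmetic sketch of the overshoot comparison reported in the results (per capita demand against per capita carrying capacity, using the figures quoted above):

```python
# Overshoot check: demand (pressure per capita) vs. carrying capacity.
pressure = {"1967-2017": 9.323, "2018-2068": 213.178}  # ha per capita
capacity = {"1967-2017": 6.4091, "2018-2068": 1.667}   # ha per capita

for period in pressure:
    ratio = pressure[period] / capacity[period]
    status = "overshoot" if ratio > 1 else "within limits"
    print(f"{period}: demand/capacity = {ratio:.2f} ({status})")
# 1967-2017: demand/capacity = 1.45 (overshoot)
# 2018-2068: demand/capacity = 127.88 (overshoot)
```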


2013 ◽  
Vol 280 (1768) ◽  
pp. 20131389 ◽  
Author(s):  
Jiqiu Li ◽  
Andy Fenton ◽  
Lee Kettley ◽  
Phillip Roberts ◽  
David J. S. Montagnes

We propose that delayed predator–prey models may provide superficially acceptable predictions for spurious reasons. Through experimentation and modelling, we offer a new approach: using a model experimental predator–prey system (the ciliates Didinium and Paramecium), we determine the influence of past-prey abundance at a fixed delay (approximately one generation) on both functional and numerical responses (i.e. the influence of present:past prey abundance on ingestion and growth, respectively). We reveal a nonlinear influence of past-prey abundance on both responses, with the two responding differently. Including these responses in a model indicated that delay in the numerical response drives population oscillations, supporting the accepted (but untested) notion that reproduction, not feeding, is highly dependent on the past. We next indicate how delays impact short- and long-term population dynamics. Critically, we show that although superficially the standard (parsimonious) approach to modelling can reasonably fit independently obtained time-series data, it does so by relying on biologically unrealistic parameter values. By contrast, including our fully parametrized delayed density dependence provides a better fit, offering insights into underlying mechanisms. We therefore present a new approach to explore time-series data and a revised framework for further theoretical studies.
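The sketch below is a schematic discrete-time analogue of such a model, in which ingestion (the functional response) depends on present prey while reproduction (the numerical response) depends on prey abundance one delay earlier; parameter values are illustrative, not the authors' fitted estimates.

```python
# Schematic delayed predator-prey simulation: predator births depend on
# prey abundance tau steps in the past; ingestion uses current prey.
import numpy as np

def holling_II(P, a=0.5, h=0.2):
    """Type II functional response: ingestion rate at prey density P."""
    return a * P / (1 + a * h * P)

steps, tau = 500, 10           # tau ~ one predator generation (in steps)
r, Kcap = 0.1, 100.0           # prey growth rate and carrying capacity
eps, m = 0.05, 0.08            # conversion efficiency, predator mortality
P = np.zeros(steps); D = np.zeros(steps)
P[:tau + 1] = 50.0; D[:tau + 1] = 1.0

for t in range(tau, steps - 1):
    feeding = holling_II(P[t]) * D[t]             # depends on present prey
    births = eps * holling_II(P[t - tau]) * D[t]  # depends on past prey
    P[t + 1] = max(P[t] + r * P[t] * (1 - P[t] / Kcap) - feeding, 0.0)
    D[t + 1] = max(D[t] + births - m * D[t], 0.0)
# The delayed numerical response tends to destabilize the dynamics into
# cycles relative to tau = 0, echoing the paper's qualitative conclusion.
```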


2021 ◽  
Vol 5 (5) ◽  
pp. 619-635
Author(s):  
Harya Widiputra

The primary factor contributing to the transmission of COVID-19 infection is human mobility. Positive cases added on a daily basis have a substantial positive association with the pace of human mobility, and vice versa. Thus, the ability to predict human mobility trends during a pandemic is critical for policymakers seeking to decrease the rate of transmission in the future. In this regard, one approach commonly used for time-series prediction is to build an ensemble with the aim of getting the best performance. However, building an ensemble often causes the performance of the model to decrease, due to the increasing number of parameters that are not optimized properly. Consequently, the purpose of this study is to develop and evaluate a deep learning ensemble model, optimized using a genetic algorithm (GA), that incorporates a convolutional neural network (CNN) and a long short-term memory (LSTM) network. The CNN performs feature extraction from mobility time-series data, while the LSTM performs mobility prediction; the parameters of both layers are adjusted using the GA. In experiments conducted with data from the Google Community Mobility Reports in Indonesia, ranging from the beginning of February 2020 to the end of December 2020, the GA-optimized multivariate CNN-LSTM ensemble outperforms stand-alone CNN and LSTM models, as well as the non-optimized CNN-LSTM model, in predicting future human movement. This may be useful in assisting policymakers in anticipating future human mobility trends. DOI: 10.28991/esj-2021-01300
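A minimal Keras sketch of such a CNN-LSTM forecaster is given below; the window length, feature count, and hyperparameter values are illustrative, and the GA search itself is only indicated in comments.

```python
# Sketch of a CNN-LSTM mobility forecaster; a GA would search over
# hyperparameters such as filters, kernel_size, and units.
import tensorflow as tf

def build_cnn_lstm(n_steps, n_features, filters=32, kernel_size=3, units=64):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_steps, n_features)),
        # CNN layer extracts local patterns from the mobility windows.
        tf.keras.layers.Conv1D(filters, kernel_size, activation="relu"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        # LSTM layer models temporal dynamics of the extracted features.
        tf.keras.layers.LSTM(units),
        tf.keras.layers.Dense(n_features),  # next-step mobility per category
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# e.g. 14-day windows over the six Google mobility categories.
model = build_cnn_lstm(n_steps=14, n_features=6)
# A GA loop would evaluate candidate (filters, kernel_size, units) triples
# by validation loss and evolve the population toward better forecasters.
```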


2021 ◽  
Vol 6 (1) ◽  
pp. 1-4
Author(s):  
Bo Yuan Chang ◽  
Mohamed A. Naiel ◽  
Steven Wardell ◽  
Stan Kleinikkink ◽  
John S. Zelek

Over the past years, researchers have proposed various methods to discover causal relationships among time-series data, as well as algorithms to fill in missing entries in time-series data. Little to no work has been done on combining the two strategies for the purpose of learning causal relationships from unevenly sampled multivariate time-series data. In this paper, we examine how the causal parameters learnt from unevenly sampled data (with missing entries) deviate from the parameters learnt using evenly sampled data (without missing entries). Obtaining the causal relationships from a given time series requires evenly sampled data, which suggests filling in the missing values before estimating the causal parameters. Therefore, the proposed method applies a Gaussian Process Regression (GPR) model for missing data recovery, followed by several pairwise Granger causality equations in vector autoregressive (VAR) form to fit the recovered data and obtain the causal parameters. Experimental results show that the causal parameters generated using GPR data filling offer much lower RMSE than a dummy model (fill with the last seen entry) at all missing-value percentages, suggesting that GPR data filling better preserves the causal relationships compared with dummy data filling and should therefore be considered when learning causality from unevenly sampled time series.
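The following sketch illustrates the two-stage pipeline under stated assumptions: synthetic data, an RBF-plus-noise GPR kernel, and statsmodels' pairwise Granger test standing in for the paper's VAR equations.

```python
# Two-stage pipeline sketch: GPR imputation, then Granger causality.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from statsmodels.tsa.stattools import grangercausalitytests

def gpr_fill(t, y):
    """Recover y on the full time grid t from its observed entries."""
    obs = ~np.isnan(y)
    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel())
    gpr.fit(t[obs].reshape(-1, 1), y[obs])
    filled = y.copy()
    filled[~obs] = gpr.predict(t[~obs].reshape(-1, 1))
    return filled

t = np.arange(300, dtype=float)
x = np.sin(t / 10) + 0.1 * np.random.randn(300)
y = np.roll(x, 5) + 0.1 * np.random.randn(300)  # y lags x: x "causes" y
y[np.random.rand(300) < 0.2] = np.nan           # 20% missing entries

y_filled = gpr_fill(t, y)
# Test whether lags of x help predict y (column order: effect, cause).
grangercausalitytests(np.column_stack([y_filled, x]), maxlag=6)
```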


2020 ◽  
pp. 1-7
Author(s):  
Ida Normaya Mohd Nasir ◽  
Mohd Tahir Ismail

Financial time-series data are often affected by various unexpected events, which are known as outliers. The aim of this study is to detect outliers in high-frequency data using the Impulse Indicator Saturation (IIS) approach. Monte Carlo simulations illustrate the ability of IIS to detect outliers using data generated under various simulation settings. For the empirical application, we chose a Malaysian Shariah-compliant index, the FBM EMAS Shariah (FBMS) index. The results of this study reveal the presence of 47 outliers related to several global events, such as the global financial crisis (2008 and 2009), the stock market fall (2011), the United States debt-ceiling crisis (2013), and the decline of international crude oil prices (2014). Keywords: outliers; volatility; stock indices; IIS
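A simplified split-half IIS sketch follows; full IIS implementations (e.g. in gets/Autometrics) use more elaborate block searches, so this only illustrates the core idea of saturating the sample with impulse dummies and retaining the significant ones.

```python
# Simplified split-half Impulse Indicator Saturation (IIS) sketch.
import numpy as np
import statsmodels.api as sm

def iis_outliers(y, alpha=0.01):
    n = len(y)
    keep = []
    for block in (range(0, n // 2), range(n // 2, n)):
        # Saturate one half with impulse dummies; the other half is clean.
        D = np.zeros((n, len(block)))
        for j, i in enumerate(block):
            D[i, j] = 1.0
        X = sm.add_constant(D)
        res = sm.OLS(y, X).fit()
        # Retain dummies significant at alpha (index 0 is the constant).
        keep += [i for j, i in enumerate(block) if res.pvalues[j + 1] < alpha]
    return sorted(keep)

returns = np.random.randn(500) * 0.01
returns[[50, 260]] += 0.08          # two planted outliers
print(iis_outliers(returns))        # indices flagged as outliers
```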


2020 ◽  
Author(s):  
Iain Mathieson

Time series data of allele frequencies are a powerful resource for detecting and classifying natural and artificial selection. Ancient DNA now allows us to observe these trajectories in natural populations of long-lived species such as humans. Here, we develop a hidden Markov model to infer selection coefficients that vary over time. We show through simulations that our approach can accurately estimate both selection coefficients and the timing of changes in selection. Finally, we analyze some of the strongest signals of selection in the human genome using ancient DNA. We show that the European lactase persistence mutation was selected over the past 5,000 years with a selection coefficient of 2-2.5% in Britain, Central Europe and Iberia, but not Italy. In northern East Asia, selection at the ADH1B locus associated with alcohol metabolism intensified around 4,000 years ago, approximately coinciding with the introduction of rice-based agriculture. Finally, a derived allele at the FADS locus was selected in parallel in both Europe and East Asia, as previously hypothesized. Our approach is broadly applicable to both natural and experimental evolution data and shows how time series data can be used to resolve fine-scale details of selection.
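As a rough illustration of the generative model that such an inference inverts, the sketch below simulates a Wright-Fisher allele-frequency trajectory under a time-varying selection coefficient; population size, timings, and s values are illustrative, not the paper's estimates.

```python
# Wright-Fisher trajectory with a time-varying selection coefficient s(t).
import numpy as np

def wright_fisher(p0, s, N=10_000, rng=np.random.default_rng(0)):
    """Simulate allele frequencies given per-generation selection s[t]."""
    p = np.empty(len(s) + 1)
    p[0] = p0
    for t, st in enumerate(s):
        # Deterministic selection step for an allele with fitness 1 + st...
        p_sel = p[t] * (1 + st) / (1 + st * p[t])
        # ...followed by binomial genetic drift across 2N gene copies.
        p[t + 1] = rng.binomial(2 * N, p_sel) / (2 * N)
    return p

# Neutral for 100 generations, then s = 2% for 200 generations, loosely
# mimicking selection that switches on (as inferred at LCT or ADH1B).
s = np.concatenate([np.zeros(100), np.full(200, 0.02)])
traj = wright_fisher(p0=0.05, s=s)
```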


2017 ◽  
Author(s):  
L Paninski ◽  
J.P Cunningham

Modern large-scale multineuronal recording methodologies, including multielectrode arrays, calcium imaging, and optogenetic techniques, produce single-neuron resolution data of a magnitude and precision that were the realm of science fiction twenty years ago. The major bottlenecks in systems and circuit neuroscience no longer lie in simply collecting data from large neural populations, but rather in understanding these data: developing novel scientific questions, with corresponding analysis techniques and experimental designs to fully harness these new capabilities and meaningfully interrogate these questions. Advances in methods for signal processing, network analysis, dimensionality reduction, and optimal control, developed in lockstep with advances in experimental neurotechnology, promise major breakthroughs in multiple fundamental neuroscience problems. These trends are clear in a broad array of subfields of modern neuroscience; this review focuses on recent advances in methods for analyzing neural time-series data with single-neuronal precision.
Figure 1 (caption): The central role of data science in modern large-scale neuroscience; topics reviewed herein are indicated in black.


2018 ◽  
Author(s):  
A.A Adnan ◽  
J. Diels ◽  
J.M. Jibrin ◽  
A.Y. Kamara ◽  
P. Craufurd ◽  
...  

Most crop simulation models require the use of Genotype Specific Parameters (GSPs), which provide the genotype component of G×E×M interactions. Estimation of GSPs is the most difficult aspect of most modelling exercises because it requires expensive and time-consuming field experiments. GSPs can also be estimated using multi-year and multi-locational data from breeder evaluation experiments. This research was set up with the following objectives: i) to determine GSPs of 10 newly released maize varieties for the Nigerian Savannas using data from both calibration experiments and existing data from breeder varietal evaluation trials; ii) to compare the accuracy of the GSPs generated using experimental and breeder data; and iii) to evaluate the CERES-Maize model's simulation of grain and tissue nitrogen contents. For the experimental evaluation, 8 different experiments were conducted during the rainy and dry seasons of 2016 across the Nigerian Savanna. Breeder evaluation data were also collected over 2 years and 7 locations. The calibrated GSPs were evaluated using data from a 4-year experiment conducted under varying nitrogen rates (0, 60 and 120 kg N ha−1). For the model calibration using experimental data, calculated model efficiency (EF) values ranged between 0.86 and 0.92 and the index of agreement (d-index) between 0.92 and 0.98. Calibration on time-series data produced nRMSE below 7%, while all prediction deviations were below 10% of the mean. For breeder experiments, the EF (0.52-0.81) and d-index (0.46-0.83) ranges were lower, and prediction deviations were below 17% of the means for all measured variables. Model evaluation using both experimental and breeder trials resulted in good agreement (low RMSE, high EF and d-index values) between observed and simulated grain yields and tissue and grain nitrogen contents. We conclude that higher calibration accuracy of the CERES-Maize model is achieved with detailed experiments. If these are unavailable, data from breeder evaluation trials collected from many locations and planting dates can be used with lower but acceptable accuracy.
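For reference, the agreement statistics cited above can be computed as follows, using their standard formulations (the paper may use variants):

```python
# EF (Nash-Sutcliffe efficiency), Willmott's d-index, and nRMSE.
import numpy as np

def model_metrics(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    resid = obs - sim
    # Model efficiency: 1 is perfect; 0 means no better than the mean.
    ef = 1 - np.sum(resid**2) / np.sum((obs - obs.mean())**2)
    # Willmott's index of agreement (d-index), bounded in [0, 1].
    d = 1 - np.sum(resid**2) / np.sum(
        (np.abs(sim - obs.mean()) + np.abs(obs - obs.mean()))**2)
    # RMSE normalized by the observed mean, in percent.
    nrmse = 100 * np.sqrt(np.mean(resid**2)) / obs.mean()
    return {"EF": ef, "d": d, "nRMSE_%": nrmse}

print(model_metrics(obs=[5.1, 6.0, 4.8], sim=[5.0, 6.2, 4.9]))
```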


2019 ◽  
Vol 15 (2) ◽  
pp. 43-57
Author(s):  
Seng Hansun ◽  
Vincent Charles ◽  
Christiana Rini Indrati ◽  
Subanar

Time series are one of the most common data types encountered by data scientists and, in the context of today's exponentially increasing data, learning how best to model them to derive meaningful insights is an important skill in the Big Data and Data Science toolbox. As a result, many researchers have dedicated their efforts to developing time-series analysis methods to predict future values based on previously observed values. One of the well-known methods is the Holt-Winters' seasonal method, which is commonly used to capture the seasonality effect in time-series data. In this study, the authors build upon the Holt-Winters' additive method by introducing new formulas for finding the initial values. Obtaining more accurate estimates of the initial values can lead to better forecasting results. The authors use the basic principle of the weighted moving average method to assign more weight to the most recent data and combine it with the original initial conditions of the Holt-Winters' additive method. Based on the experiments performed, the authors conclude that the new formulas for finding the initial values in the Holt-Winters' additive method can give better forecasts, in terms of accuracy, than the traditional Holt-Winters' additive method and the weighted moving average method.
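A sketch of the Holt-Winters' additive recursion is given below. The study's contribution is its new initial-value formulas; the weighted-moving-average initial level here only illustrates the idea and is not the authors' exact formula.

```python
# Holt-Winters additive method with a WMA-style initial level (sketch).
import numpy as np

def holt_winters_additive(y, m, alpha, beta, gamma):
    """m = season length; returns one-step-ahead in-sample forecasts."""
    y = np.asarray(y, float)
    # WMA-style initial level: recent observations of season 1 weigh more.
    w = np.arange(1, m + 1, dtype=float)
    level = np.sum(w * y[:m]) / w.sum()
    trend = (y[m:2 * m] - y[:m]).mean() / m   # average per-step slope
    season = list(y[:m] - y[:m].mean())       # initial seasonal indices
    fcst = []
    for t in range(m, len(y)):
        fcst.append(level + trend + season[t % m])
        last_level = level
        level = alpha * (y[t] - season[t % m]) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * season[t % m]
    return np.array(fcst)

# Toy monthly series with annual seasonality (m = 12).
t = np.arange(120)
y = 10 + 0.05 * t + 2 * np.sin(2 * np.pi * t / 12) + 0.3 * np.random.randn(120)
f = holt_winters_additive(y, m=12, alpha=0.3, beta=0.05, gamma=0.1)
```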

