Short-Term Wind Speed Forecasting Using Statistical and Machine Learning Methods

Lucky O. Daniel; Caston Sigauke; Colin Chibaya; Rendani Mbuvha

doi:10.3390/a13060132

Short-Term Wind Speed Forecasting Using Statistical and Machine Learning Methods

Algorithms ◽

10.3390/a13060132 ◽

2020 ◽

Vol 13 (6) ◽

pp. 132 ◽

Cited By ~ 1

Author(s):

Lucky O. Daniel ◽

Caston Sigauke ◽

Colin Chibaya ◽

Rendani Mbuvha

Keyword(s):

Wind Speed ◽

Quantile Regression ◽

Predictive Performance ◽

Additive Models ◽

Combination Method ◽

Forecast Combination ◽

Primary Input ◽

Stochastic Gradient Boosting ◽

Combination Methods ◽

Forecasts Combination

Wind offers an environmentally sustainable energy resource that has seen increasing global adoption in recent years. However, its intermittent, unstable and stochastic nature hampers its representation among other renewable energy sources. This work addresses the forecasting of wind speed, a primary input needed for wind energy generation, using data obtained from the South African Wind Atlas Project. Forecasting is carried out over a two-day-ahead horizon. We investigate the predictive performance of artificial neural networks (ANN) trained with Bayesian regularisation, decision-tree-based stochastic gradient boosting (SGB) and generalised additive models (GAMs). The comparative analysis suggests that ANN displays superior predictive performance based on root mean square error (RMSE). In contrast, SGB performs best in terms of mean absolute error (MAE) and the related mean absolute percentage error (MAPE). A further comparison of two forecast combination methods, linear and additive quantile regression averaging, shows the latter forecast combination method as yielding lower prediction accuracy. The prediction intervals based on additive quantile regression averaging also perform best in terms of validity, reliability, quality and accuracy. Among interval combination methods, the median method performs better than its pure average counterpart. Point forecast combination and interval forecasting methods are found to improve forecast performance.
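
The abstract above ranks models differently depending on the error measure. A minimal sketch of why that can happen, using the standard definitions of RMSE, MAE (mean absolute error) and MAPE (mean absolute percentage error): the wind speeds and the two "models" below are hypothetical values chosen for illustration, not data or results from the paper.

```python
import math

def rmse(actual, forecast):
    """Root mean square error: squaring penalises large errors more heavily."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean absolute error: every unit of error counts equally."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (undefined if an actual value is 0)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical wind speeds in m/s (illustrative only).
actual = [4.0, 5.0, 8.0, 10.0]
model_a = [5.0, 6.0, 9.0, 11.0]   # small error on every point
model_b = [4.0, 5.0, 8.0, 13.0]   # exact on three points, one large miss

for name, forecast in [("A", model_a), ("B", model_b)]:
    print(name, rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```

In this toy case model A has the lower RMSE while model B has the lower MAE and MAPE, because a single large miss dominates the squared-error measure.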

Day Ahead Hourly Global Horizontal Irradiance Forecasting—Application to South African Data

Energies ◽

10.3390/en12183569 ◽

2019 ◽

Vol 12 (18) ◽

pp. 3569 ◽

Cited By ~ 10

Author(s):

Phathutshedzo Mpfumali ◽

Caston Sigauke ◽

Alphonce Bere ◽

Sophie Mulaudzi

Keyword(s):

Machine Learning ◽

Solar Power ◽

Prediction Interval ◽

Combination Method ◽

Support Vector ◽

Forecast Combination ◽

Learning Methods ◽

Machine Learning Methods ◽

Stochastic Gradient Boosting ◽

Combination Methods

Due to its variability, solar power generation poses challenges to grid energy management. To ensure the economic operation and stability of a national grid, it is important to have accurate forecasts of solar power. The current paper discusses probabilistic forecasting of global horizontal irradiance (GHI) twenty-four hours ahead, using data from the Tellerie radiometric station in South Africa for the period August 2009 to April 2010. Variables are selected using a least absolute shrinkage and selection operator (Lasso) via hierarchical interactions, and the parameters of the developed models are estimated using the Barrodale and Roberts algorithm. Two forecast combination methods are used in this study. The first is a convex forecast combination algorithm in which the average loss suffered by the models is based on the pinball loss function. The second is quantile regression averaging (QRA). The best set of forecasts is selected based on the prediction interval coverage probability (PICP), prediction interval normalised average width (PINAW) and prediction interval normalised average deviation (PINAD). The results demonstrate that QRA gives more robust prediction intervals than the other models. A comparative analysis is done with two machine learning methods, stochastic gradient boosting and support vector regression, which are used as benchmark models. Empirical results show that the QRA model yields the most accurate forecasts compared to the machine learning methods based on the probabilistic error measures. Results on combining prediction interval limits show that the PM is the best prediction limits combination method, as it gives a hit rate of 0.955, which is very close to the target of 0.95. This modelling approach is expected to help in optimising the integration of solar power in the national grid.

Optimal forecasting accuracy using Lp-norm combination

METRON ◽

10.1007/s40300-021-00218-5 ◽

2021 ◽

Author(s):

Massimiliano Giacalone

Keyword(s):

Real Data ◽

Optimization Techniques ◽

Combination Method ◽

Forecast Combination ◽

Forecasting Accuracy ◽

Lp Norm ◽

Combination Methods ◽

Historical Series ◽

Non Gaussian ◽

Standard Regression

A well-known result in statistics is that a linear combination of two point forecasts has a smaller Mean Square Error (MSE) than the two competing forecasts themselves (Bates and Granger in J Oper Res Soc 20(4):451–468, 1969). The only case in which no improvement is possible is when one of the single forecasts is already the optimal one in terms of MSE. Combination methods are varied, ranging from the simple average (SA) to more robust methods such as those based on the median, the Trimmed Average (TA), Least Absolute Deviations or optimization techniques (Stock and Watson in J Forecast 23(6):405–430, 2004). Standard regression-based combination approaches may fail to give realistic results when the forecasts are highly collinear or the data distribution is not Gaussian. Therefore, we propose a forecast combination method based on Lp-norm estimators. These estimators are based on the Generalized Error Distribution, which is a generalization of the Gaussian distribution, and they can be used to handle multicollinearity and non-Gaussianity. To demonstrate the potential of Lp-norms, we conducted a simulation study and an empirical study, comparing their performance with other standard regression-based combination approaches. We carried out the simulation study with different values of the autoregressive parameter, alternating heteroskedasticity and homoskedasticity. The real data application is based on the daily Bitfinex historical series of bitcoin (2014–2020); the 25 historical series relating to companies included in the Dow Jones were subsequently considered. We showed that, by combining different GARCH and ARIMA models, assuming both Gaussian and non-Gaussian distributions, the Lp-norm scheme improves the forecasting accuracy with respect to other regression-based combination procedures.
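
The Bates and Granger result cited above can be illustrated with a short sketch. It computes the MSE-minimising weight for two forecasts from their error second moments, assuming both forecasts are unbiased; this is the classical two-forecast combination, not the Lp-norm estimator the paper proposes, and the error series are hypothetical.

```python
def mse(errors):
    """Mean of squared errors."""
    return sum(e * e for e in errors) / len(errors)

def bates_granger_weight(e1, e2):
    """Weight on forecast 1 that minimises the MSE of w*f1 + (1-w)*f2.

    Uses w = (v2 - c12) / (v1 + v2 - 2*c12), where v1, v2 are the error
    second moments and c12 their cross moment (forecasts assumed unbiased).
    """
    n = len(e1)
    v1, v2 = mse(e1), mse(e2)
    c12 = sum(a * b for a, b in zip(e1, e2)) / n
    return (v2 - c12) / (v1 + v2 - 2 * c12)

# Hypothetical forecast errors (illustrative only); the two series are uncorrelated.
e1 = [1.0, -1.0, 1.0, -1.0]
e2 = [2.0, 2.0, -2.0, -2.0]

w = bates_granger_weight(e1, e2)
combined = [w * a + (1 - w) * b for a, b in zip(e1, e2)]
print(w, mse(e1), mse(e2), mse(combined))
```

Here the weight on the more accurate forecast is 0.8, and the combined MSE (0.8) is below that of either individual forecast (1 and 4). When the denominator is near zero, as with almost perfectly collinear forecasts, the weight becomes unstable, which is the kind of situation the abstract says motivates more robust estimators.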

A Changing Weights Spatial Forecast Combination Approach with an Application to Housing Price Prediction

International Journal of Economics and Finance ◽

10.5539/ijef.v12n4p11 ◽

2020 ◽

Vol 12 (4) ◽

pp. 11

Author(s):

Chuanhua Wei ◽

Chenping Du ◽

Nana Zheng

Keyword(s):

Spatial Data ◽

Time Series Data ◽

House Prices ◽

Housing Price ◽

Combination Method ◽

Series Data ◽

Forecast Combination ◽

Combination Methods ◽

Seminal Article ◽

Combination Approach

Forecast combination has been widely applied in various fields since the seminal article of Bates and Granger (1969). However, this research has focused only on time series data, and few studies consider spatial data. This paper proposes a novel adaptive spatial forecast combination method with varying weights based on the geographically weighted regression technique. The proposed method is applied to the prediction of Boston house prices, and the results indicate that the procedure performs better than the other forecast combination methods.
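
One way to picture location-varying combination weights is a kernel-weighted least squares fit centred on the target location, in the spirit of geographically weighted regression. The sketch below is an assumption-laden illustration, not the authors' procedure: the Gaussian kernel, the no-intercept two-forecast regression, the function names and all data values are hypothetical.

```python
import math

def gaussian_kernel(distance, bandwidth):
    """Kernel weight that decays with distance from the target location."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_combination_weights(target, sites, f1, f2, y, bandwidth):
    """Kernel-weighted least squares of y on (f1, f2), no intercept.

    Returns the pair of combination weights fitted locally around `target`
    by solving the 2x2 weighted normal equations directly.
    """
    k = [gaussian_kernel(math.dist(target, s), bandwidth) for s in sites]
    a11 = sum(w * a * a for w, a in zip(k, f1))
    a12 = sum(w * a * b for w, a, b in zip(k, f1, f2))
    a22 = sum(w * b * b for w, b in zip(k, f2))
    b1 = sum(w * a * t for w, a, t in zip(k, f1, y))
    b2 = sum(w * b * t for w, b, t in zip(k, f2, y))
    det = a11 * a22 - a12 * a12
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det)

# Hypothetical sites: two in the west, two in the east (illustrative only).
sites = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
y = [10.0, 20.0, 30.0, 40.0]    # observed prices
f1 = [10.0, 20.0, 20.0, 50.0]   # model 1: exact in the west, poor in the east
f2 = [14.0, 15.0, 30.0, 40.0]   # model 2: poor in the west, exact in the east

west = local_combination_weights((0.0, 0.5), sites, f1, f2, y, bandwidth=1.0)
east = local_combination_weights((10.0, 0.5), sites, f1, f2, y, bandwidth=1.0)
print(west, east)
```

In this toy setup the fitted weights are close to (1, 0) in the west and (0, 1) in the east, so the combination leans on whichever model is locally more accurate.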

Predicting Malaria Transmission Dynamics in Dangassa, Mali: A Novel Approach Using Functional Generalized Additive Models

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17176339 ◽

2020 ◽

Vol 17 (17) ◽

pp. 6339

Author(s):

François Freddy Ateba ◽

Manuel Febrero-Bande ◽

Issaka Sagara ◽

Nafomon Sogoba ◽

Mahamoudou Touré ◽

...

Keyword(s):

Wind Speed ◽

Malaria Incidence ◽

Generalized Additive Models ◽

Community Health Center ◽

Additive Model ◽

Additive Models ◽

Functional Regression ◽

Novel Approach ◽

Malaria Incidence Rate ◽

Mean Wind Speed

Mali aims to reach the pre-elimination stage of malaria by the next decade. This study used functional regression models to predict the incidence of malaria as a function of past meteorological patterns, in order to better prevent and act proactively against impending malaria outbreaks. All data were collected over a five-year period (2012–2017) from 1400 persons who sought treatment at Dangassa’s community health center. Rainfall, temperature, humidity, and wind speed variables were collected. The Functional Generalized Spectral Additive Model (FGSAM), Functional Generalized Linear Model (FGLM), and Functional Generalized Kernel Additive Model (FGKAM) were used to predict malaria incidence as a function of the pattern of meteorological indicators over a continuum of the 18 weeks preceding the week of interest. Their respective outcomes were compared in terms of predictive ability. The results showed that (1) the highest malaria incidence rate occurred in the village 10 to 12 weeks after we observed a pattern of air humidity levels >65%, combined with two or more consecutive rain episodes and a mean wind speed <1.8 m/s; (2) among the three models, the FGLM obtained the best results in terms of prediction; and (3) FGSAM was shown to be a good compromise between FGLM and FGKAM in terms of flexibility and simplicity. The models showed that some meteorological conditions may provide a basis for detection of future outbreaks of malaria. The models developed in this paper are useful for implementing preventive strategies using past meteorological data and past malaria incidence.

Comparing Charlson and Elixhauser comorbidity indices with different weightings to predict in-hospital mortality: an analysis of national inpatient data

BMC Health Services Research ◽

10.1186/s12913-020-05999-5 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Narayan Sharma ◽

René Schwendimann ◽

Olga Endrich ◽

Dietmar Ausserhofer ◽

Michael Simon

Keyword(s):

Hospital Mortality ◽

Patient Population ◽

Generalized Additive Models ◽

Routine Data ◽

Predictive Performance ◽

Population Based ◽

Mortality Prediction ◽

Additive Models ◽

General Hospitals ◽

Net Reclassification Improvement

Background: Understanding how comorbidity measures contribute to patient mortality is essential both to describe patient health status and to adjust for risks and potential confounding. The Charlson and Elixhauser comorbidity indices are well-established for risk adjustment and mortality prediction. Still, a different set of comorbidity weights might improve the prediction of in-hospital mortality. The present study, therefore, aimed to derive a set of new Swiss Elixhauser comorbidity weightings, and to validate and compare them against the Charlson and Elixhauser-based van Walraven weights in an adult in-patient population-based cohort of general hospitals. Methods: A retrospective analysis was conducted with routine data of 102 Swiss general hospitals (2012–2017) for 6.09 million inpatient cases. To derive the Swiss weightings for the Elixhauser comorbidity index, we randomly halved the inpatient data and validated the results of part 1 alongside the established weighting systems in part 2, to predict in-hospital mortality. Charlson and van Walraven weights were applied to Charlson and Elixhauser comorbidity indices. Derivation and validation of weightings were conducted with generalized additive models adjusted for age, gender and hospital types. Results: Overall, the c-statistics of the Elixhauser indices with Swiss weights (0.867, 95% CI, 0.865–0.868) and van Walraven’s weights (0.863, 95% CI, 0.862–0.864) had a substantial advantage over that with Charlson’s weights (0.850, 95% CI, 0.849–0.851) in both the derivation and validation groups. The net reclassification improvement of the new Swiss weights improved the predictive performance by 1.6% on the Elixhauser-van Walraven and 4.9% on the Charlson weights. Conclusions: All weightings confirmed previous results with the national dataset. The new Swiss weightings model slightly improved the prediction of in-hospital mortality in Swiss hospitals. The newly derived weights support patient population-based analysis of in-hospital mortality and suggest seeking country- or cohort-specific weightings.
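
The c-statistic used above to compare weightings is the probability that a randomly chosen patient who died received a higher risk score than a randomly chosen survivor. A minimal sketch of that pairwise definition follows; the scores and outcomes are hypothetical, and for cohorts of millions of cases a rank-based computation would be used instead of this quadratic loop.

```python
def c_statistic(scores, outcomes):
    """Concordance statistic: share of (event, non-event) pairs ranked correctly.

    Ties in the score count as half a concordant pair.
    """
    events = [s for s, o in zip(scores, outcomes) if o == 1]
    non_events = [s for s, o in zip(scores, outcomes) if o == 0]
    concordant = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in non_events
    )
    return concordant / (len(events) * len(non_events))

# Hypothetical predicted mortality risks and observed outcomes (1 = died).
scores = [0.1, 0.4, 0.35, 0.8]
outcomes = [0, 0, 1, 1]

print(c_statistic(scores, outcomes))  # 3 of 4 pairs are ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect discrimination.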

An improved conflicting-evidence combination method based on the redistribution of the basic probability assignment

Applied Intelligence ◽

10.1007/s10489-021-02404-4 ◽

2021 ◽

Author(s):

Zezheng Yan ◽

Hanping Zhao ◽

Xiaowen Mei

Keyword(s):

Evidence Theory ◽

Combination Method ◽

Probability Assignment ◽

Basic Probability Assignment ◽

Combination Methods ◽

Conflicting Evidence ◽

Basic Probability ◽

The Right ◽

Conflict Intensity ◽

Dempster’s Rule of Combination

Dempster–Shafer evidence theory is widely applied in various fields related to information fusion. However, the results are counterintuitive when highly conflicting evidence is fused with Dempster’s rule of combination. Many improved combination methods have been developed to address conflicting evidence. Nevertheless, all of these approaches have inherent flaws. To solve the existing counterintuitive problem more effectively and less conservatively, an improved combination method for conflicting evidence based on the redistribution of the basic probability assignment is proposed. First, the conflict intensity and the unreliability of the evidence are calculated based on the consistency degree, conflict degree and similarity coefficient among the evidence. Second, the redistribution equation of the basic probability assignment is constructed based on the unreliability and conflict intensity, which realizes the redistribution of the basic probability assignment. Third, to avoid excessive redistribution of the basic probability assignment, the precision degree of the evidence obtained by information entropy is used as the correction factor to modify the basic probability assignment for the second time. Finally, Dempster’s rule of combination is used to fuse the modified basic probability assignment. Several different types of examples and actual data sets are given to illustrate the effectiveness and potential of the proposed method. Furthermore, the comparative analysis reveals the proposed method to be better at obtaining the right results than other related methods.
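
The counterintuitive behaviour mentioned above can be reproduced with a short sketch of the standard Dempster rule of combination. It covers only the classical rule, not the redistribution steps the paper proposes; the two bodies of evidence are the well-known Zadeh-style example, with hypotheses named A, B and C for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two basic probability assignments with Dempster's rule.

    Each assignment maps a frozenset of hypotheses to its mass.
    Returns the normalised combined assignment and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("Evidence is totally conflicting; the rule is undefined.")
    normalised = {k: v / (1.0 - conflict) for k, v in combined.items()}
    return normalised, conflict

# Two sources that almost fully disagree: each gives B only 1% support.
m1 = {frozenset("A"): 0.99, frozenset("B"): 0.01}
m2 = {frozenset("C"): 0.99, frozenset("B"): 0.01}

fused, k = dempster_combine(m1, m2)
print(fused, k)
```

Although both sources consider B very unlikely, normalising by the small non-conflicting mass assigns essentially all belief to B. This is the behaviour that improved combination methods aim to avoid.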

Piecewise Aggregate Approximation and Quantile Regression for Wind Speed Analysis

2017 International Conference on Computational Science and Computational Intelligence (CSCI) ◽

10.1109/csci.2017.262 ◽

2017 ◽

Author(s):

Ronaldo R. B. de Aquino ◽

Helen Barboza da Silva ◽

Jonata C. de Albuquerque ◽

Manuel Herrera ◽

Aida A. Ferreira ◽

...

Keyword(s):

Wind Speed ◽

Quantile Regression

Assessing the effect of meteorological factors on daily children’s respiratory disease hospitalizations: a retrospective study

10.21203/rs.3.rs-20210/v1 ◽

2020 ◽

Author(s):

Wenfang Guo ◽

Letai Yi ◽

Peng Wang ◽

Baojun Wang ◽

Minhui Li

Keyword(s):

Wind Speed ◽

Respiratory Tract ◽

Respiratory Disease ◽

Effective Temperature ◽

Respiratory Tract Infections ◽

Meteorological Factors ◽

Additive Models ◽

Upper Respiratory Tract ◽

Tract Infections ◽

Peak Value

Background: Some previous studies have examined the effects of temperature, humidity, wind speed and atmospheric pressure on children’s morbidity, but few have evaluated the combined health effects of various meteorological factors. The purpose of this study was to assess the effect of daily changes in meteorological factors and their comprehensive effects on children’s respiratory disease hospitalizations for different ages, genders and subtypes in Baotou, China. Methods: Generalized additive models and distributed lag non-linear models were constructed to simultaneously assess the exposure–response associations between daily admission counts of children with respiratory diseases and daily net effective temperature and other meteorological factors, as well as their lag dependencies. Results: In general, the cumulative meteorological factors had greater effects on lower respiratory tract infections than upper respiratory tract infections (RR: temperature [4.2 vs. 2.7]; wind speed [3.1 vs. 2.5]; humidity [1.8 vs. 1.3]). The effects on children over 3 years old were greater than those on children aged 0–3 years (OR: temperature [4.4 vs. 1.3]; wind speed [4.4 vs. 1.5]), while the effects on female children were greater than those on male children (OR: temperature [2.6 vs. 1.8]; wind speed [3.3 vs. 1.6]). However, some differences were observed between groups with regard to the effect of humidity. Hence, the net effective temperature was calculated using comprehensive meteorological factors, and the influence range value and peak value of each group were determined. Conclusions: The influence of meteorological factors on children’s respiratory disease hospitalizations shows different characteristics in different subgroups. Hence, the net effective temperature was calculated using the comprehensive meteorological factors, and the influence range and peak value of each group were determined so as to recommend corresponding measures.

Machine learning as a successful approach for predicting complex spatio–temporal patterns in animal species abundance

Animal Biodiversity and Conservation ◽

10.32800/abc.2021.44.0289 ◽

2021 ◽

pp. 289-301

Author(s):

B. Martín ◽

J. González–Arias ◽

J. A. Vicente–Vírseda

Keyword(s):

Machine Learning ◽

Random Forest ◽

Animal Species ◽

Temporal Patterns ◽

Additive Models ◽

Gradient Boosting ◽

Support Vector ◽

Stochastic Gradient Boosting ◽

Extreme Gradient Boosting ◽

Spatio Temporal

Our aim was to identify an optimal analytical approach for accurately predicting complex spatio–temporal patterns in animal species distribution. We compared the performance of eight modelling techniques (generalized additive models, regression trees, bagged CART, k–nearest neighbors, stochastic gradient boosting, support vector machines, neural network, and random forest, an enhanced form of bootstrap). We also performed extreme gradient boosting, an enhanced form of gradient boosting, to predict spatial patterns in abundance of migrating Balearic shearwaters based on data gathered within eBird. Derived from open–source datasets, proxies of frontal systems and ocean productivity domains that have been previously used to characterize the oceanographic habitats of seabirds were quantified, and then used as predictors in the models. The random forest model showed the best performance according to the parameters assessed (RMSE value and R2). The correlation between observed and predicted abundance with this model was also considerably high. This study shows that the combination of machine learning techniques and massive data provided by open data sources is a useful approach for identifying the long–term spatial–temporal distribution of species at regional spatial scales.

Forecasting Tourism Demand With a New Time-Varying Forecast Averaging Approach

Journal of Travel Research ◽

10.1177/00472875211061206 ◽

2021 ◽

pp. 004728752110612

Author(s):

Yuying Sun ◽

Jian Zhang ◽

Xin Li ◽

Shouyang Wang

Keyword(s):

Structural Changes ◽

Model Averaging ◽

Structural Instability ◽

Time Varying ◽

Forecast Combination ◽

Tourism Demand ◽

Single Model ◽

Forecasting Accuracy ◽

Combination Methods ◽

Out Of Sample

Existing research has shown that combination can effectively improve tourism forecasting accuracy compared with a single model. However, model uncertainty and structural instability in combination for out-of-sample tourism forecasting may influence forecasting performance. This paper proposes a novel forecast combination approach based on time-varying jackknife model averaging (TVJMA), which can more efficiently handle structural changes and nonstationary trends in tourism data. Using Hong Kong tourism demand from five major tourism source regions as an empirical study, we investigate whether our proposed nonparametric TVJMA-based approach can further improve tourism forecasting accuracy. Empirical results show that the proposed TVJMA-based approach outperforms other competitors, including single models and three combination methods, in most cases. Findings indicate that the outstanding performance of our method is robust to various forecasting horizons and different estimation periods.
