Using Bootstrap to Increase Data in Predictive Analytics with Extreme Value Distribution

Author(s):  
Dang Kien Cuong ◽  
Duong Ton Dam ◽  
Duong Ton Thai Duong

The bootstrap, a major tool for studying and evaluating the values of parameters of probability distributions, is the statistical method this article uses. We begin with an overview of the theory of infinite distribution functions. The tool for dealing with the problems raised in the paper is the mathematical machinery of stochastic analysis, namely the theory of random processes and multivariate statistics. Observations that are realisations of a stationary process are not independent, yet a time series is a relatively simple example of dependent data. Through a simulation study we found that the pseudo-data generated by the bootstrap method always show weaker dependence among the observations than the time series they were sampled from; hence we conclude that even by resampling blocks instead of single observations we lose some of the structure of the original sample. A potential difficulty with likelihood methods for the generalized extreme value (GEV) distribution concerns the regularity conditions required for the usual asymptotic properties of the maximum likelihood estimator to be valid. To estimate a GEV parameter we can use classical methods of mathematical statistics, such as the maximum likelihood method or the least squares method, but they all require a certain number of samples. The bootstrap method does not: here we use the limit theorems of probability theory and multivariate statistics to solve the problem even when only one sample of data is available. That is the practical significance our paper wants to convey. In predictive analysis problems where the actual data are incomplete or not long enough, we can use the bootstrap to add data.
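As a rough sketch of this idea (not the authors' exact procedure), the snippet below generates pseudo-series from a single observed series with a moving-block bootstrap and fits a GEV to the block maxima of each replicate; the block length, replicate count, and synthetic Gumbel input are illustrative assumptions.

```python
import numpy as np
from scipy.stats import genextreme

def moving_block_bootstrap(x, block_len, rng):
    """Resample a series by concatenating randomly chosen overlapping blocks."""
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    return np.concatenate([x[s:s + block_len] for s in starts])[:n]

rng = np.random.default_rng(0)
x = rng.gumbel(size=500)                      # stand-in for the one observed series

shape_estimates = []
for _ in range(200):                          # bootstrap replicates
    xb = moving_block_bootstrap(x, block_len=25, rng=rng)
    maxima = xb.reshape(-1, 25).max(axis=1)   # block maxima of the pseudo-series
    c, loc, scale = genextreme.fit(maxima)    # ML fit of the GEV to the maxima
    shape_estimates.append(c)

print("GEV shape:", np.mean(shape_estimates), "+/-", np.std(shape_estimates))
```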

2011 ◽  
Vol 52-54 ◽  
pp. 546-549
Author(s):  
Shi Bo Xin

Using the fact that the sample mean of observations drawn from a normal distribution is itself normally distributed, we give the equations for estimating the parameters of a normal distribution by the bootstrap method. We then carry out a simulation analysis and compare the parameter estimates obtained with the traditional maximum likelihood method and with the bootstrap method.
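A minimal sketch of the comparison (sample size, true parameters, and replicate count are illustrative assumptions, not the paper's equations):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=100)   # hypothetical normal sample

# Closed-form maximum likelihood estimates for a normal sample
mu_ml, sigma_ml = x.mean(), x.std(ddof=0)

# Bootstrap: resample with replacement, average the statistics over resamples
B = 2000
mu_b = np.empty(B)
sigma_b = np.empty(B)
for b in range(B):
    xb = rng.choice(x, size=len(x), replace=True)
    mu_b[b], sigma_b[b] = xb.mean(), xb.std(ddof=0)

print(f"ML:        mu={mu_ml:.3f}, sigma={sigma_ml:.3f}")
print(f"Bootstrap: mu={mu_b.mean():.3f} (se {mu_b.std():.3f}), "
      f"sigma={sigma_b.mean():.3f} (se {sigma_b.std():.3f})")
```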


2021 ◽  
Vol 7 ◽  
pp. e726
Author(s):  
Tianming Yu ◽  
Qunfeng Gan ◽  
Guoliang Feng

Background: A real time series is affected by various combinations of influences and consequently exhibits a variety of variation modalities. It is hard to reflect the variation characteristics of a time series accurately when simulating it with only a single model. Most existing methods focus on the numerical prediction of time series, and the forecast uncertainty of a time series is addressed by interval prediction. However, little research has focused on making the model interpretable and easily comprehended by humans.

Methods: To overcome this limitation, a new prediction modelling methodology based on fuzzy cognitive maps is proposed. The bootstrap method is first adopted to select multiple sub-sequences, so that the variation modalities are contained in these sub-sequences. Fuzzy cognitive maps are then constructed from these sub-sequences, and these fuzzy cognitive map models are merged by means of granular computing. The established model not only performs well in numerical and interval prediction but also has better interpretability.

Results: Experimental studies involving both synthetic and real-life datasets demonstrate the usefulness and satisfactory efficiency of the proposed approach.
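A toy sketch of two of the ingredients, bootstrap sub-sequence selection and a single fuzzy cognitive map update (the sigmoid activation, window length, and random weight matrix are illustrative assumptions, not the paper's trained model):

```python
import numpy as np

rng = np.random.default_rng(2)
series = np.sin(np.linspace(0, 20, 400)) + 0.1 * rng.normal(size=400)

def bootstrap_subsequences(x, n_subs, window, rng):
    """Bootstrap selection of sub-sequences: random starts, fixed window."""
    starts = rng.integers(0, len(x) - window + 1, size=n_subs)
    return np.stack([x[s:s + window] for s in starts])

subs = bootstrap_subsequences(series, n_subs=10, window=50, rng=rng)

def fcm_step(state, weights):
    """One fuzzy cognitive map update: A(t+1) = sigmoid(A(t) @ W)."""
    return 1.0 / (1.0 + np.exp(-(state @ weights)))

W = rng.uniform(-1, 1, size=(5, 5))   # illustrative concept-weight matrix
state = rng.uniform(0, 1, size=5)     # initial activations of five concepts
print(fcm_step(state, W))
```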


2021 ◽  
Author(s):  
Helmut H. Strey ◽  
Rajat Kumar ◽  
Lilianne Mujica-Parodi

In this article, we develop a maximum likelihood (ML) approach to estimate parameters from correlated time traces that originate from coupled Ornstein-Uhlenbeck processes. The most common technique for characterizing the correlation between time series is to calculate the Pearson correlation coefficient. Here we show that for time series with memory (a characteristic relaxation time), our method not only gives more reliable results but also yields the coupling coefficients and their uncertainties given the data. We investigate how these uncertainties depend on the number of samples, the relaxation times, and the sampling time. To validate our analytic results, we performed simulations over a wide range of correlation coefficients, using both our maximum likelihood solutions and Markov chain Monte Carlo (MCMC) simulations. We found that ML and MCMC give the same parameter estimates. We also found that, when analyzing the same data, the ML and MCMC uncertainties are strongly correlated, while ML underestimates the uncertainties by a factor of 1.5 to 3 over a large range of parameters. For large datasets, we can therefore run the less computationally expensive maximum likelihood method over the whole dataset and then use MCMC on a few samples to determine the factor by which the ML method underestimates the uncertainties. To illustrate the application of our method, we apply it to time series of brain activation from fMRI measurements of the human default mode network. We show that our method significantly improves the interpretation of multi-subject measurements of correlations between brain regions by providing parameter confidence intervals for individual measurements, which makes it possible to distinguish variance arising from differences between subjects from variance due to measurement error.
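To make the setting concrete (the coupling convention, parameter values, and Euler-Maruyama discretization below are illustrative assumptions, not the authors' estimator), two coupled Ornstein-Uhlenbeck processes can be simulated and their naive Pearson correlation computed as follows:

```python
import numpy as np

rng = np.random.default_rng(3)

# Euler-Maruyama simulation of two coupled Ornstein-Uhlenbeck processes:
# dx = (-x + k*y)/tau dt + sigma dW, and symmetrically for y.
tau, k, sigma = 1.0, 0.5, 1.0     # relaxation time, coupling, noise amplitude
dt, n = 0.01, 20000
x = np.zeros(n)
y = np.zeros(n)
sq = sigma * np.sqrt(dt)
for i in range(1, n):
    x[i] = x[i-1] + dt * (-x[i-1] + k * y[i-1]) / tau + sq * rng.normal()
    y[i] = y[i-1] + dt * (-y[i-1] + k * x[i-1]) / tau + sq * rng.normal()

# The naive characterization the paper improves on: Pearson correlation
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r = {r:.3f}")
```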


2013 ◽  
Vol 29 (5) ◽  
pp. 920-940 ◽  
Author(s):  
Ngai Hang Chan ◽  
Deyuan Li ◽  
Liang Peng ◽  
Rongmao Zhang

Relevant sample quantities such as the sample autocorrelation function and extremes contain useful information about autoregressive time series with heteroskedastic errors. As these quantities usually depend on the tail index of the underlying heteroskedastic time series, estimating the tail index becomes an important task. Since the tail index of such a model is determined by a moment equation, one can estimate the underlying tail index by solving the sample moment equation with the unknown parameters replaced by their quasi-maximum likelihood estimates. However, constructing a confidence interval for the tail index requires estimating the complicated asymptotic variance of the tail index estimator. In this paper the asymptotic normality of the tail index estimator is first derived, and a profile empirical likelihood method for constructing a confidence interval for the tail index is then proposed. A simulation study shows that the proposed empirical likelihood method works better than the bootstrap method in terms of coverage accuracy, especially when the process is nearly nonstationary.
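The paper's estimator solves a model-specific moment equation with QMLE plug-ins, which is beyond a short sketch. Purely for orientation, the classic Hill estimator, a different and much simpler tail-index estimator, looks like this (the Pareto sample and the choice of k are illustrative):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimator of the tail index from the k largest observations."""
    xs = np.sort(np.abs(x))[::-1]          # descending order statistics
    logs = np.log(xs[:k]) - np.log(xs[k])  # log-spacings above the threshold
    return 1.0 / logs.mean()               # estimated tail index alpha

rng = np.random.default_rng(4)
x = rng.pareto(a=3.0, size=5000)           # heavy-tailed sample, tail index 3
print("Hill estimate:", hill_estimator(x, k=200))
```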


Author(s):  
Qiguo Hu ◽  
Zhan Gao

In order to enhance the reliability of a system subject to dependent competing failures, a reliability evaluation method is proposed for dependent competing failure and multi-parameter degradation failure. The multi-parameter degradation failure process is described with the Wiener stochastic process and the inverse Gaussian stochastic process, and a copula function is used to model the system's multi-degradation failure process. The two-stage maximum likelihood method is used to estimate the degradation failure parameters, and the conditional probability of dependent competing failure as a function of the degree of degradation is established. The Bayes-bootstrap method is then utilized to correct the dependent competing failure parameters obtained by maximum likelihood and to establish the system's dependent competing failure model. Degradation data from an aero-engine are used as an example to analyze reliability under the competition between dependent failure and multi-parameter degradation failure. The analysis results effectively demonstrate the reliability of the aero-engine's performance and verify the validity of the model, which thus has good engineering application value.
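As background for the degradation model (the drift, diffusion, and step values below are illustrative assumptions, not the paper's aero-engine data), a Wiener degradation process X(t) = mu*t + sigma*B(t) can be simulated and its parameters recovered in closed form by maximum likelihood from the increments:

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, dt, n = 0.2, 0.5, 0.1, 1000   # illustrative drift, diffusion, step

# Simulate one degradation path: increments are N(mu*dt, sigma^2*dt)
increments = mu * dt + sigma * np.sqrt(dt) * rng.normal(size=n)
path = np.cumsum(increments)

# Closed-form ML estimates from the observed increments
mu_hat = increments.mean() / dt
sigma_hat = np.sqrt(((increments - mu_hat * dt) ** 2).mean() / dt)
print(f"mu_hat={mu_hat:.3f}, sigma_hat={sigma_hat:.3f}")
```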


2013 ◽  
Vol 1 (5) ◽  
pp. 6001-6024 ◽  
Author(s):  
K. Kochanek ◽  
W. G. Strupczewski ◽  
E. Bogdanowicz ◽  
W. Feluch ◽  
I. Markiewicz

Abstract. The alleged changes in rivers' flow regimes have resulted in a surge of methods for non-stationary flood frequency analysis (NFFA). The maximum likelihood method is said to produce large systematic errors in moments and quantiles, resulting mainly from a bad assumption of the model (model error), unless this model is the normal distribution. Since estimators by the method of linear moments (L-moments) yield much lower model errors than those by maximum likelihood, a new two-stage NFFA methodology based on the concept of L-moments was developed to improve the accuracy of parameters and quantiles in the non-stationary case. Besides taking advantage of the positive characteristics of L-moments, the new technique also keeps the calculations "distribution independent" as long as possible. The two stages consist of (1) least-squares estimation of trends in the mean value and/or the standard deviation and "de-trendisation" of the time series, and (2) estimation of parameters and quantiles from the stationary sample by the L-moments method and "re-trendisation" of the quantiles (a sketch follows below). As a result, time-dependent quantiles for a given time and return period can be calculated. Comparative Monte Carlo simulations confirmed the superiority of the two-stage NFFA methodology over the classical maximum likelihood one. Further analysis of trends in GEV-parent-distributed generic time series by means of both NFFA methods revealed big differences between the classical and two-stage estimators of trends obtained for the same data with the same model (GEV or Gumbel). Additionally, it turned out that the quantiles estimated by traditional stationary flood frequency analysis equal the non-stationary ones only at the exact middle of the time series. This shows that the use of traditional stationary methods under a variable regime is too great a simplification and leads to erroneous results. Therefore, when the phenomenon is non-stationary, so should be the methods used for its interpretation.
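A minimal sketch of the two-stage idea for the Gumbel case (the linear trend in the mean, sample size, and return period are illustrative assumptions; the paper also treats trends in the standard deviation and the GEV distribution):

```python
import numpy as np

EULER = 0.5772156649  # Euler-Mascheroni constant

def gumbel_lmom_fit(x):
    """Fit a Gumbel distribution by the method of L-moments."""
    xs = np.sort(x)
    n = len(xs)
    b0 = xs.mean()
    b1 = np.sum(np.arange(n) / (n - 1) * xs) / n
    l1, l2 = b0, 2 * b1 - b0                 # first two sample L-moments
    alpha = l2 / np.log(2)                   # Gumbel scale
    xi = l1 - EULER * alpha                  # Gumbel location
    return xi, alpha

rng = np.random.default_rng(6)
t = np.arange(100, dtype=float)
x = (10 + 0.05 * t) + rng.gumbel(scale=2.0, size=100)  # trending annual maxima

# Stage 1: least-squares trend in the mean, then de-trend
a, b = np.polyfit(t, x, 1)
resid = x - (a * t + b)

# Stage 2: stationary L-moment fit, then re-trend the quantile
xi, alpha = gumbel_lmom_fit(resid)
F = 0.99                                     # 100-year return period
q_stationary = xi - alpha * np.log(-np.log(F))
q_t = (a * t + b) + q_stationary             # time-dependent quantile
print(q_t[[0, 50, 99]])
```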


2000 ◽  
Vol 03 (03) ◽  
pp. 567-568
Author(s):  
M. Ciogli ◽ 
G. Rotundo ◽ 
B. Tirozzi

A diffusion equation for the price evolution of the Italian share "Olivetti" is found by analysing a series of its price data. The coefficients of this equation are estimated with the maximum likelihood method based on martingale theory. We evaluate a pricing and hedging strategy using the Sornette and Bouchaud approach.
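The note does not reproduce its equations here; purely as generic orientation (synthetic prices rather than Olivetti data, and a plain geometric Brownian motion model rather than the authors' martingale-based estimator), ML estimation of diffusion coefficients from log-returns looks like:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical daily closing prices (stand-in for the share price series)
n, dt = 500, 1 / 252
log_ret = (0.08 - 0.5 * 0.3**2) * dt + 0.3 * np.sqrt(dt) * rng.normal(size=n)
prices = 100 * np.exp(np.cumsum(log_ret))

# ML estimates of the diffusion coefficients from log-returns,
# assuming dS = mu * S dt + sigma * S dW (geometric Brownian motion)
r = np.diff(np.log(prices))
sigma_hat = r.std(ddof=0) / np.sqrt(dt)
mu_hat = r.mean() / dt + 0.5 * sigma_hat**2
print(f"mu_hat={mu_hat:.3f}, sigma_hat={sigma_hat:.3f}")
```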

