Modeling of Extreme Values via Exponential Normalization Compared with Linear and Power Normalization

Symmetry ◽  
2020 ◽  
Vol 12 (11) ◽  
pp. 1876
Author(s):  
Haroon Mohamed Barakat ◽  
Osama Mohareb Khaled ◽  
Nourhan Khalil Rakha

Several new asymmetric distributions that arise naturally in the modeling of extreme values are uncovered and elucidated. The present paper deals with the extreme value theorem (EVT) under exponential normalization. An estimate of the shape parameter of the asymmetric generalized extreme value distributions related to this new extension of the EVT is obtained. Moreover, we develop the mathematical modeling of extreme values by using this new extension of the EVT. We analyze the extreme values by modeling the occurrence of exceedances over high thresholds. The natural distributions of such exceedances, four new generalized Pareto families of asymmetric distributions under exponential normalization (GPDEs), are described and their properties revealed. There is an evident symmetry between the newly obtained GPDEs and the generalized Pareto distributions arising from the EVT under linear and power normalization. Estimates for the extreme value index of the four GPDEs are obtained. In addition, simulation studies are conducted in order to illustrate and validate the theoretical results. Finally, a comparison study between the different extreme value models is carried out on real data sets.
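The paper's GPDE families under exponential normalization are not available in standard libraries, but the classical baseline they are compared against, the generalized Pareto distribution fitted to threshold exceedances under linear normalization, can be sketched with `scipy` (the threshold choice and simulated data below are illustrative assumptions, not the paper's setup):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic heavy-tailed sample standing in for real observations.
sample = stats.pareto.rvs(b=2.5, size=5000, random_state=rng)

# Classical peaks-over-threshold: keep exceedances over a high threshold.
threshold = np.quantile(sample, 0.95)
exceedances = sample[sample > threshold] - threshold

# Fit the (linear-normalization) generalized Pareto distribution.
# For Pareto(b=2.5) data the true shape is 1/2.5 = 0.4.
shape, loc, scale = stats.genpareto.fit(exceedances, floc=0.0)
print(f"estimated shape (extreme value index): {shape:.3f}")
```

The fitted `shape` parameter is the extreme value index that the GPDE estimators of the paper generalize.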

Author(s):  
Szilárd Bozóki ◽  
András Pataricza

Proper timeliness is vital for many real-world computing systems. Understanding the phenomena of extreme workloads is essential because unhandled extreme workloads can cause violations of timeliness requirements, service degradation, and even downtime. Extremity can have multiple roots: (1) service requests can naturally produce extreme workloads; (2) bursts can occur randomly on a probabilistic basis in the case of a mixed workload in multiservice systems; (3) workload spikes typically happen in deadline-bound tasks. Extreme Value Analysis (EVA) is a statistical method for modeling extremely deviant values, i.e., the largest values. The mathematical foundation of EVA, the extreme value theorem, requires the dataset to be independent and identically distributed. However, this is not generally true in practice because real-life processes are usually a mixture of sources with identifiable patterns. For example, seasonality and periodic fluctuations are regularly occurring patterns. Deadlines can be purely periodic, e.g., monthly tax submissions, or variable in time, e.g., university homework submissions with variable semester schedules. We propose to preprocess the data using time series decomposition to separate the stochastic process causing the extreme values. Moreover, we focus on the case where the root cause of the extreme values is the same mechanism: a deadline. We exploit known deadlines, using dynamic time warping to search for recurring, similar workload peak patterns that vary in time and amplitude.
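The proposed pipeline, decompose the series to isolate the spike process, then match peak windows with dynamic time warping, can be sketched as follows; the toy workload, the crude subtract-the-seasonal-model decomposition, and the hand-rolled `dtw_distance` are all illustrative stand-ins, not the authors' implementation:

```python
import numpy as np

def dtw_distance(a, b):
    """Minimal dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Toy workload: weekly seasonality plus deadline-driven spikes shifted in time.
t = np.arange(4 * 7 * 24)                     # hourly samples over four weeks
seasonal = 10 + 3 * np.sin(2 * np.pi * t / (7 * 24))
workload = seasonal.copy()
workload[160:168] += np.array([5, 20, 40, 60, 35, 15, 5, 2], float)  # week-1 peak
workload[500:508] += np.array([4, 18, 45, 55, 30, 12, 4, 1], float)  # shifted, rescaled

# Crude decomposition: subtract the seasonal model, leaving the spike process.
residual = workload - seasonal

# DTW recognizes the two deadline peaks as similar despite timing/amplitude drift,
# while a flat window is far from either peak.
d_peaks = dtw_distance(residual[160:168], residual[500:508])
d_flat = dtw_distance(residual[160:168], residual[0:8])
print(d_peaks, d_flat)
```

In practice the decomposition step would use a proper method (e.g., STL) rather than a known seasonal model.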


Mathematics ◽  
2020 ◽  
Vol 8 (10) ◽  
pp. 1834
Author(s):  
Emanuele Taufer ◽  
Flavio Santi ◽  
Pier Luigi Novi Inverardi ◽  
Giuseppe Espa ◽  
Maria Michela Dickson

A characterizing property of the Zenga (1984) inequality curve is exploited in order to develop an estimator for the extreme value index of a distribution with a regularly varying tail. The approach proposed here has a nice graphical interpretation, which provides a powerful method for the analysis of the tail of a distribution. The properties of the proposed estimation strategy are analysed theoretically and by means of simulations. The usefulness of the method is also tested on real data sets.
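The Zenga-curve estimator itself is not in standard libraries, but the classical benchmark for the extreme value index of a regularly varying tail, against which such estimators are typically compared, is the Hill estimator, sketched here on simulated Pareto data (an illustration, not the paper's method):

```python
import numpy as np

def hill_estimator(sample, k):
    """Hill estimator of the extreme value index from the k largest order statistics."""
    x = np.sort(np.asarray(sample, float))[::-1]   # descending order statistics
    return np.mean(np.log(x[:k])) - np.log(x[k])

rng = np.random.default_rng(1)
# Classical Pareto with tail index 2: true extreme value index is 1/2 = 0.5.
sample = rng.pareto(2.0, size=20000) + 1.0
est = hill_estimator(sample, k=500)
print(est)
```

Plotting `est` against `k` (the Hill plot) gives the same kind of graphical tail diagnostic the abstract attributes to the Zenga-curve approach.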


2019 ◽  
Vol 8 (6) ◽  
pp. 51 ◽  
Author(s):  
Ahmad Alzaghal ◽  
Duha Hamed

In this paper, we propose new families of generalized Lomax distributions named T-Lomax{Y}. Using the methodology of the Transformed-Transformer, known as the T-X framework, the T-Lomax families introduced arise from the quantile functions of the exponential, Weibull, log-logistic, logistic, Cauchy, and extreme value distributions. Various structural properties of the new families are derived, including moments, modes, and Shannon entropies. Several new generalized Lomax distributions are studied. The shapes of these T-Lomax{Y} distributions are very flexible and can be symmetric, skewed to the right, skewed to the left, or bimodal. The method of maximum likelihood is proposed for estimating the distribution parameters, and a simulation study is carried out to assess its performance. Four applications to real data sets are used to demonstrate the flexibility of the T-Lomax{Y} family of distributions in fitting unimodal and bimodal data sets from different disciplines.
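A minimal sketch of the T-X construction with a Lomax baseline, assuming the usual quantile-based form F(x) = R(Q_Y(G(x))), where R is the cdf of the transformer T, Q_Y the quantile function of Y, and G the Lomax cdf; the Weibull/exponential choices and all parameter values below are our own illustrative picks, not the paper's fitted models:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Illustrative components (assumed, not from the paper):
transformer = stats.weibull_min(c=1.5)   # T with cdf R, support (0, inf)
y_dist = stats.expon()                   # Y, whose quantile function is used
baseline = stats.lomax(c=3.0)            # Lomax baseline G

# T-X family: F(x) = R(Q_Y(G(x))).  Inverting the composition gives a sampler:
# X = G^{-1}(F_Y(T)) with T ~ R.
t = transformer.rvs(size=10000, random_state=rng)
x = baseline.ppf(y_dist.cdf(t))

# Sanity check: the empirical cdf of x should match R(Q_Y(G(x0))).
x0 = 1.0
theory = transformer.cdf(y_dist.ppf(baseline.cdf(x0)))
empirical = np.mean(x <= x0)
print(theory, empirical)
```

Swapping `transformer`/`y_dist` for the other quantile functions listed in the abstract yields the other members of the family.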


2021 ◽  
Author(s):  
Suchismit Mahapatra ◽  
Varun Chandola

Streaming adaptations of manifold-learning-based dimensionality reduction methods, such as Isomap, are based on the assumption that a small initial batch of observations is enough for exact learning of the manifold, while the remaining streaming data instances can be cheaply mapped to this manifold. However, there are no theoretical results to show that this core assumption is valid. Moreover, such methods typically assume that the underlying data distribution is stationary and are not equipped to detect, or handle, sudden changes or gradual drifts in the distribution that may occur when the data is streaming. We present theoretical results to show that the quality of the learned manifold asymptotically converges as the size of the data increases. We then show that a Gaussian Process Regression (GPR) model that uses a manifold-specific kernel function and is trained on an initial batch of sufficient size can closely approximate the state-of-the-art streaming Isomap algorithms. The predictive variance obtained from the GPR prediction is then shown to be an effective detector of changes in the underlying data distribution. Results on several synthetic and real data sets show that the resulting algorithm can effectively learn lower-dimensional representations of high-dimensional data in a streaming setting, while identifying shifts in the generative distribution.
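The variance-as-drift-detector idea can be sketched with scikit-learn; a plain RBF kernel stands in for the paper's manifold-specific kernel, and the 1-D arc embedded in 2-D is our own toy manifold:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(3)

# Initial batch from a 1-D manifold embedded in 2-D (an arc); the known
# low-dimensional coordinate theta plays the role of the embedding target.
theta = rng.uniform(0, np.pi, size=200)
batch = np.column_stack([np.cos(theta), np.sin(theta)])

# Stand-in for the paper's manifold-specific kernel: a plain RBF, kept fixed.
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=0.5),
                               alpha=1e-6, optimizer=None)
gpr.fit(batch, theta)

# Streaming points: one on the manifold, one drifted off it.
on_manifold = np.array([[np.cos(1.0), np.sin(1.0)]])
drifted = np.array([[2.0, -2.0]])

_, std_on = gpr.predict(on_manifold, return_std=True)
_, std_off = gpr.predict(drifted, return_std=True)
print(std_on[0], std_off[0])
```

The predictive standard deviation is small near the training manifold and close to the prior for the drifted point, so thresholding it flags distribution shifts.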


2013 ◽  
Vol 10 (1) ◽  
Author(s):  
Helena Penalva ◽  
Manuela Neves

Statistical Extreme Value Theory has grown gradually since the beginning of the 20th century. Its unquestionable importance in applications was definitively recognized after Gumbel's 1958 book, Statistics of Extremes. Nowadays there is a wide range of applied sciences where extreme value statistics are largely used. Accurately modeling extreme events has therefore become more and more important, and the analysis requires tools that are simple to use but that can also handle complex statistical models in order to produce valid inferences. Access to accurate, user-friendly, free, and open-source software is of great value for practitioners and researchers. This paper presents a review of the main steps for initiating a data analysis of extreme values in the R environment. Some well-documented packages are briefly described, and two data sets are considered to illustrate the use of some functions.
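The first step such R packages automate, forming block maxima and fitting a generalized extreme value (GEV) distribution, looks like this in an analogous Python workflow with `scipy` (the simulated Gumbel data are an assumption for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Fifty "years" of daily observations from a Gumbel-type process.
daily = stats.gumbel_r.rvs(size=50 * 365, random_state=rng)

# Block maxima: annual maxima, the classical starting point of an EVT analysis.
annual_max = daily.reshape(50, 365).max(axis=1)

# Fit the GEV distribution.  Note scipy's genextreme parameterizes the shape
# as c = -xi, so c near 0 corresponds to the Gumbel domain of attraction.
c, loc, scale = stats.genextreme.fit(annual_max)
print(c, loc, scale)
```

In R the same analysis would typically use `fgev` (package evd) or `fevd` (package extRemes).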


2013 ◽  
Vol 2013 ◽  
pp. 1-15 ◽  
Author(s):  
A. Naess ◽  
O. Gaidai ◽  
O. Karpa

This paper details a method for extreme value prediction on the basis of a sampled time series. The method is specifically designed to account for statistical dependence between the sampled data points in a precise manner. In fact, if properly used, the new method will provide statistical estimates of the exact extreme value distribution provided by the data in most cases of practical interest. It avoids the problem of having to decluster the data to ensure independence, which is a requisite component in the application of, for example, the standard peaks-over-threshold method. The proposed method also targets the use of subasymptotic data to improve prediction accuracy. The method will be demonstrated by application to both synthetic and real data. From a practical point of view, it seems to perform better than the POT and block extremes methods, and, with an appropriate modification, it is directly applicable to nonstationary time series.
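For context on what the proposed method avoids, here is a sketch of the runs declustering that the standard peaks-over-threshold method requires before fitting, so that retained peaks can be treated as independent (the function and toy series are ours, for illustration only):

```python
import numpy as np

def runs_decluster(series, threshold, run_length):
    """Runs declustering: exceedance clusters separated by at least
    `run_length` consecutive sub-threshold values are treated as
    independent; keep one peak per cluster."""
    peaks = []
    cluster_max, below = None, 0
    for value in series:
        if value > threshold:
            cluster_max = value if cluster_max is None else max(cluster_max, value)
            below = 0
        elif cluster_max is not None:
            below += 1
            if below >= run_length:       # cluster has ended
                peaks.append(cluster_max)
                cluster_max, below = None, 0
    if cluster_max is not None:           # close a trailing cluster
        peaks.append(cluster_max)
    return np.array(peaks)

# Toy dependent series: two bursts of exceedances over threshold 5.
series = [0, 6, 7, 6, 0, 0, 0, 1, 0, 8, 9, 0, 0, 0, 2]
peaks = runs_decluster(series, threshold=5, run_length=3)
print(peaks)   # one peak per burst: [7 9]
```

The method of the paper works on the dependent exceedances directly, so this lossy preprocessing step, and the sensitivity to `run_length`, disappear.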


2019 ◽  
Vol 15 (1) ◽  
pp. 1-24 ◽  
Author(s):  
Hesham Mohamed Reyad ◽  
Morad Alizadeh ◽  
Farrukh Jamal ◽  
Soha Othman ◽  
G G Hamedani

In this paper, we propose a new class of continuous distributions, called the exponentiated generalized Topp Leone-G family, that extends the Topp Leone-G family introduced by Al-Shomrani et al. (2016). We derive explicit expressions for certain mathematical properties of the new family, such as ordinary and incomplete moments, generating functions, reliability analysis, Lorenz and Bonferroni curves, Rényi entropy, the stress-strength model, moments of residual and reversed residual life, order statistics, and extreme values. We discuss the maximum likelihood estimates and the observed information matrix for the model parameters. Two real data sets are used to illustrate the flexibility of the new family.


Author(s):  
Yi Sun ◽  
Iván Ramírez Díaz ◽  
Alfredo Cuesta Infante ◽  
Kalyan Veeramachaneni

In many real-life situations, including job and loan applications, gatekeepers must make justified and fair real-time decisions about a person's fitness for a particular opportunity. In this paper, we aim to accomplish approximate group fairness in an online stochastic decision-making process, where the fairness metric we consider is equalized odds. Our work follows the classical learning-from-experts scheme, assuming a finite set of classifiers (human experts, rules, options, etc.) that cannot be modified. We run separate instances of the algorithm for each label class as well as each sensitive group, where the probability of choosing each instance is optimized for both fairness and regret. Our theoretical results show that approximately equalized odds can be achieved without sacrificing much regret. We also demonstrate the performance of the algorithm on real data sets commonly used by the fairness community.
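The classical learning-from-experts scheme the paper builds on is the multiplicative-weights (Hedge) algorithm; the sketch below runs one instance per (label class, sensitive group) pair as the abstract describes, but omits the paper's fairness coupling between instances, and all losses and parameters are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(5)
n_experts, eta, T = 4, 0.3, 2000

# One Hedge instance per (label class, sensitive group) pair.
instances = {(y, g): np.ones(n_experts) for y in (0, 1) for g in ("A", "B")}

for _ in range(T):
    key = (int(rng.integers(2)), str(rng.choice(["A", "B"])))
    w = instances[key]
    p = w / w.sum()                         # probability of following each expert
    losses = rng.uniform(size=n_experts)    # stand-in expert losses this round
    losses[0] *= 0.5                        # expert 0 is systematically better
    expected_loss = p @ losses              # loss incurred by the mixture
    w *= np.exp(-eta * losses)              # multiplicative-weights update

# After enough rounds, each instance concentrates on the best expert.
best = {k: int(np.argmax(w)) for k, w in instances.items()}
print(best)
```

The paper's contribution is to bias the instance-selection probabilities so that the resulting per-group error rates approximately equalize odds while keeping regret low.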


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Muhammad Farooq ◽  
Qamruz zaman ◽  
Muhammad Ijaz ◽  
Said Farooq Shah ◽  
Mutua Kilai

In practice, data sets with extreme values arise in many fields, such as engineering, lifetime analysis, business, and economics. Many probability distributions have been derived and presented to increase model flexibility in the presence of such values. The current study focuses on deriving a new probability model, the New Flexible Family (NFF) of distributions. The significance of the NFF is demonstrated using the Weibull distribution, yielding the New Flexible Weibull distribution, or NFW for short. Various mathematical properties of the NFW are discussed, including the estimation of parameters and entropy measures. Two real data sets with extreme values and a simulation study are presented to demonstrate the importance of the NFW. Furthermore, the NFW is compared with other existing probability distributions; numerically, it is observed that the new mechanism of producing lifetime probability distributions yields better predictions about the population than the alternatives on data sets with extreme values.

