NONPARAMETRIC ALGORITHM OF IDENTIFICATION OF CLASSES CORRESPONDING TO SINGLE-MODE FRAGMENTS OF THE PROBABILITY DENSITY OF MULTIDIMENSIONAL RANDOM VARIABLES

2019 ◽  
2021 ◽  
Vol 45 (2) ◽  
pp. 253-260
Author(s):  
I.V. Zenkov ◽  
A.V. Lapko ◽  
V.A. Lapko ◽  
S.T. Im ◽  
V.P. Tuboltsev ◽  
...  

A nonparametric algorithm for the automatic classification of large statistical data sets is proposed. The algorithm is based on a procedure for optimal discretization of the range of values of a random variable. A class is a compact group of observations of a random variable corresponding to a unimodal fragment of the probability density. The considered algorithm of automatic classification rests on the "compression" of the initial information through the decomposition of a multidimensional attribute space. As a result, a large statistical sample is transformed into a data array composed of the centers of multidimensional sampling intervals and the corresponding frequencies of random variables. To substantiate the optimal discretization procedure, we use the results of a study of the asymptotic properties of a kernel-type regression estimate of the probability density. An optimal number of sampling intervals for the range of values of one- and two-dimensional random variables is determined from the condition of the minimum root-mean-square deviation of the regression probability density estimate. The results obtained are generalized to the discretization of the range of values of a multidimensional random variable. The optimal discretization formula contains a component that is characterized by a nonlinear functional of the probability density. An analytical dependence of this component on the antikurtosis coefficient of a one-dimensional random variable is established. For independent components of a multidimensional random variable, a methodology is developed for calculating estimates of the optimal number of sampling intervals for random variables and their lengths. On this basis, a nonparametric algorithm for automatic classification is developed.
It is based on a sequential procedure that checks the proximity of the centers of multidimensional sampling intervals and the relationships between the frequencies with which random variables from the original sample fall into these intervals. To further increase the computational efficiency of the proposed automatic classification algorithm, a multithreaded method of its software implementation is used. The practical significance of the developed algorithms is confirmed by the results of their application in processing remote sensing data.
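The "compression" step described above can be sketched as follows: the sample is discretized on a uniform grid and replaced by the centers of the occupied multidimensional intervals together with their frequencies. This is a minimal illustration only; the paper's contribution is deriving the optimal number of intervals, which is simply a fixed parameter here.

```python
import numpy as np

def compress_sample(data, n_bins):
    # Discretize each dimension into n_bins equal-width intervals and keep
    # only the centers of occupied cells together with their frequencies.
    data = np.asarray(data, dtype=float)
    lo, hi = data.min(axis=0), data.max(axis=0)
    width = (hi - lo) / n_bins
    # index of the sampling interval along each dimension
    idx = np.clip(((data - lo) / width).astype(int), 0, n_bins - 1)
    cells, freqs = np.unique(idx, axis=0, return_counts=True)
    centers = lo + (cells + 0.5) * width
    return centers, freqs

rng = np.random.default_rng(0)
sample = rng.normal(size=(10_000, 2))
centers, freqs = compress_sample(sample, n_bins=20)
# the 10,000 observations collapse to at most 20*20 (center, frequency) pairs
```

The resulting array of centers and frequencies is what the classification procedure would then operate on, instead of the full sample.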


2020 ◽  
pp. 9-13
Author(s):  
A. V. Lapko ◽  
V. A. Lapko

An original technique has been justified for the fast bandwidth selection of kernel functions in a nonparametric estimate of the multidimensional probability density of the Rosenblatt–Parzen type. The proposed method makes it possible to significantly increase the computational efficiency of the optimization procedure for kernel probability density estimates under conditions of large-volume statistical data in comparison with traditional approaches. The basis of the proposed approach is the analysis of the optimal parameter formula for the bandwidths of a multidimensional kernel probability density estimate. Dependencies of the nonlinear functional of the probability density and its derivatives, up to the second order inclusive, on the antikurtosis coefficients of the random variables are found. The bandwidth for each random variable is represented as the product of an undefined parameter and that variable's mean square deviation. The influence of the error in restoring the established functional dependencies on the approximation properties of the kernel probability density estimate is determined. The obtained results are implemented as a method of synthesis and analysis of fast bandwidth selection for the kernel estimate of the two-dimensional probability density of independent random variables. This method uses data on the quantitative characteristics of a family of lognormal distribution laws.
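The structure described above, where each bandwidth is a product of one shared parameter and the variable's standard deviation, can be sketched with the classic Rosenblatt–Parzen rate. The constant `c` below stands in for the antikurtosis-dependent factor the paper derives and is a hypothetical placeholder, not the authors' formula.

```python
import numpy as np

def fast_bandwidths(data, c=1.0):
    # Each bandwidth h_i = c * n**(-1/(d+4)) * sigma_i: a single shared
    # parameter times the per-variable standard deviation.  The
    # n**(-1/(d+4)) factor is the classic optimal rate for kernel
    # density estimates; c is a stand-in for the paper's
    # antikurtosis-based correction.
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    sigma = data.std(axis=0, ddof=1)
    return c * n ** (-1.0 / (d + 4)) * sigma

rng = np.random.default_rng(0)
h = fast_bandwidths(rng.normal(size=(500, 2)))
```

The point of this structure is speed: no cross-validation loop over the sample is needed, only the sample moments.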


Geophysics ◽  
2021 ◽  
pp. 1-43
Author(s):  
Dario Grana

Rock physics models are physical equations that map petrophysical properties into geophysical variables, such as elastic properties and density. These equations are generally used in quantitative log and seismic interpretation to estimate the properties of interest from measured well logs and seismic data. Such models are generally calibrated using core samples and well log data and result in accurate predictions of the unknown properties. Because the input data are often affected by measurement errors, the model predictions are often uncertain. Instead of applying rock physics models to deterministic measurements, I propose to apply the models to the probability density function of the measurements. This approach has been previously adopted in the literature using Gaussian distributions, but for petrophysical properties of porous rocks, such as volumetric fractions of solid and fluid components, the standard probabilistic formulation based on Gaussian assumptions is not applicable due to the bounded nature of the properties, their multimodality, and their non-symmetric behavior. The proposed approach is based on the Kumaraswamy probability density function for continuous random variables, which allows modeling double-bounded non-symmetric distributions and is analytically tractable, unlike the Beta or Dirichlet distributions. I present a probabilistic rock physics model applied to double-bounded continuous random variables distributed according to a Kumaraswamy distribution and derive the analytical solution of the posterior distribution of the rock physics model predictions. The method is illustrated for three rock physics models: Raymer's equation, Dvorkin's stiff-sand model, and the Kuster–Toksöz inclusion model.
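The analytical tractability claimed above comes from the Kumaraswamy distribution having closed-form density and inverse CDF on (0, 1), unlike the Beta distribution. A minimal sketch of both, with illustrative shape parameters `a` and `b` (not values from the paper):

```python
import numpy as np

def kumaraswamy_pdf(x, a, b):
    # Kumaraswamy density on (0, 1): f(x) = a*b*x**(a-1) * (1 - x**a)**(b-1)
    return a * b * x ** (a - 1) * (1 - x ** a) ** (b - 1)

def kumaraswamy_sample(a, b, size, rng):
    # Closed-form inverse CDF: x = (1 - (1 - u)**(1/b))**(1/a).
    # This closed form is what makes the distribution analytically
    # tractable, unlike the Beta or Dirichlet distributions.
    u = rng.uniform(size=size)
    return (1 - (1 - u) ** (1.0 / b)) ** (1.0 / a)

s = kumaraswamy_sample(2.0, 3.0, 1000, np.random.default_rng(0))
```

Propagating such a double-bounded density through a rock physics model (e.g., Raymer's equation) is the subject of the paper; only the distribution itself is sketched here.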


2021 ◽  
Author(s):  
Tim C Jenkins

Abstract Superposed wavefunctions in quantum mechanics lead to a squared amplitude that introduces interference into a probability density, which has long been a puzzle because interference between probability densities exists nowhere else in probability theory. In recent years, Man'ko and coauthors have successfully reconciled quantum and classical probability using a symplectic tomographic model. Nevertheless, there remains an unexplained coincidence in quantum mechanics, namely that, mathematically, the interference term in the squared amplitude of superposed wavefunctions gives the squared amplitude the form of a variance of a sum of correlated random variables. We examine whether there could be an archetypical variable behind quantum probability that provides a mathematical foundation observing both quantum and classical probability directly. The properties that would need to be satisfied for this to be the case are identified, and a generic hidden variable that satisfies them is found that would be present everywhere, transforming into a process-specific variable wherever a quantum process is active. Uncovering this variable confirms the possibility that it could be the stochastic archetype of quantum probability.
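The purely mathematical coincidence referred to above is that the cross term in |ψ₁ + ψ₂|² plays the same structural role as the covariance term in Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y). The identity itself can be checked numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
# two correlated random variables
x = rng.normal(size=n)
y = 0.6 * x + 0.8 * rng.normal(size=n)

# variance of the sum vs. the sum of variances plus twice the covariance
lhs = np.var(x + y, ddof=1)
rhs = np.var(x, ddof=1) + np.var(y, ddof=1) + 2 * np.cov(x, y)[0, 1]
# lhs == rhs up to floating-point error: the 2*Cov cross term is the
# structural analogue of the quantum interference term
```

This is only the formal analogy the abstract points to, not the paper's hidden-variable construction.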




Author(s):  
Robert J Marks II

In this Chapter, we present the application of Fourier analysis to probability, random variables, and stochastic processes [1089, 1097, 1387, 1329]. A random variable, X, is the assignment of a number to the outcome of a random experiment. We can, for example, flip a coin and assign an outcome of heads as X = 1 and tails as X = 0. Often the number is equated to the numerical outcome of the experiment, such as the number of dots on the face of a rolled die or the measurement of a voltage in a noisy circuit. The cumulative distribution function is defined by F_X(x) = Pr[X ≤ x]. (4.1) The probability density function is the derivative f_X(x) = (d/dx) F_X(x). Our treatment of random variables focuses on the use of Fourier analysis. Due to this viewpoint, the development we use is unconventional and begins immediately in the next section with a discussion of the properties of the probability density function.
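The definitions above, together with the Fourier-side object the chapter builds on (the characteristic function E[exp(iωX)]), can be illustrated for the rolled-die example:

```python
import numpy as np

# Outcomes and probabilities of a fair die (discrete random variable X)
outcomes = np.arange(1, 7)
probs = np.ones(6) / 6

def cdf(x):
    # F_X(x) = Pr[X <= x], eq. (4.1)
    return probs[outcomes <= x].sum()

def char_fn(omega):
    # Characteristic function E[exp(i*omega*X)] -- the Fourier transform
    # of the probability mass, the chapter's central tool
    return np.sum(probs * np.exp(1j * omega * outcomes))

half = cdf(3)        # Pr[X <= 3] = 1/2 for a fair die
one = char_fn(0.0)   # characteristic function at omega = 0 is always 1
```

For a continuous random variable the sum becomes an integral of f_X, but the Fourier viewpoint is the same.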


2013 ◽  
Vol 135 (5) ◽  
Author(s):  
Baizhan Xia ◽  
Dejie Yu

To calculate the probability density function of the response of a random acoustic field, a change-of-variable perturbation stochastic finite element method (CVPSFEM), which integrates the perturbation stochastic finite element method (PSFEM) and the change-of-variable technique in a unified form, is proposed. In the proposed method, the response of a random acoustic field is approximated as a linear function of the random variables based on a first-order stochastic perturbation analysis. According to the linear relationship between the response and the random variables, the formal expression of the probability density function of the response of a random acoustic field is obtained by the change-of-variable technique. Numerical examples of a two-dimensional (2D) acoustic tube and a three-dimensional (3D) acoustic cavity of an automobile cabin verify the accuracy and efficiency of the proposed method. Hence, the proposed method can be considered an effective way to quantify the effects of the parametric randomness of a random acoustic field on the sound pressure response.
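The change-of-variable step for a linearized response can be sketched in one dimension: if the perturbation analysis yields P = a + bΞ, the response density follows by the standard change-of-variable formula. The coefficients `a` and `b` here are illustrative; in the method they come from the stochastic finite element analysis.

```python
import numpy as np

def linear_response_pdf(p, a, b, xi_pdf):
    # Change of variable for a linear response P = a + b*Xi:
    # f_P(p) = f_Xi((p - a) / b) / |b|
    return xi_pdf((p - a) / b) / abs(b)

def std_normal(x):
    # density of Xi ~ N(0, 1)
    return np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)

# response pdf of P = 2 + 3*Xi evaluated at its mean: N(2, 9) at p = 2
vals = linear_response_pdf(np.array([2.0]), 2.0, 3.0, std_normal)
```

For a vector of random variables the same idea applies with the Jacobian of the linear map in place of |b|.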


Author(s):  
Zhangli Hu ◽  
Xiaoping Du

In traditional reliability problems, the distribution of a basic random variable is usually unimodal; in other words, the probability density of the basic random variable has only one peak. In real applications, some basic random variables may follow bimodal distributions with two peaks in their probability density. When bimodal variables are involved, traditional reliability methods, such as the first-order second-moment (FOSM) method and the first-order reliability method (FORM), will not be accurate. This study investigates the accuracy of using the saddlepoint approximation (SPA) for bimodal variables and then employs SPA-based reliability methods with first-order approximation to predict the reliability. A limit-state function is first approximated with the first-order Taylor expansion so that it becomes a linear combination of the basic random variables, some of which are bimodally distributed. The SPA is then applied to estimate the reliability. Examples show that the SPA-based reliability methods are more accurate than FOSM and FORM.
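Why moment-based methods fail for bimodal inputs can be shown with a Monte Carlo check on a hypothetical two-component Gaussian mixture (parameters invented for illustration, not from the paper): an FOSM-style estimate that sees only the mean and variance misses the two peaks and overstates the failure probability.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)
n = 200_000
# bimodal input: equal-weight mixture of N(-2, 0.5^2) and N(2, 0.5^2)
comp = rng.uniform(size=n) < 0.5
x = np.where(comp, rng.normal(-2.0, 0.5, n), rng.normal(2.0, 0.5, n))

# linearized limit state g(x) = 3 - x; failure when g <= 0
pf_mc = np.mean(3.0 - x <= 0.0)   # reference Monte Carlo estimate

# FOSM-style estimate using only the mean and standard deviation of x
mu, sigma = x.mean(), x.std()
beta = (3.0 - mu) / sigma                  # reliability index
pf_normal = 0.5 * (1 - erf(beta / sqrt(2)))  # Phi(-beta)
# pf_normal substantially overestimates pf_mc because the single
# Gaussian ignores the two well-separated peaks
```

The SPA-based methods in the paper address exactly this gap by working with the full distribution of the linearized limit state rather than its first two moments.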

