Nonparametric estimation for probability mass function with Disake: an R package for discrete associated kernel estimators

2015 ◽  
Vol 19 - 2015 - Special... ◽  
Author(s):  
W.E. Wansouwé ◽  
C.C. Kokonendji ◽  
D.T. Kolyang

Kernel smoothing is one of the most widely used nonparametric data smoothing techniques. We introduce a new R package, Disake, for computing discrete associated kernel estimators of a probability mass function. When working with a kernel estimator, two choices must be made: the kernel function and the smoothing parameter. The Disake package focuses on discrete associated kernels and on cross-validation and local Bayesian techniques for selecting the appropriate bandwidth. Applications to simulated and real data show that the binomial kernel is appropriate for small or moderate count data, while the empirical estimator or the discrete triangular kernel is indicated for large samples.
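As a rough illustration of the kind of estimator described in this abstract, the following Python sketch evaluates a discrete associated kernel estimate with a binomial kernel. It does not use or reproduce the Disake API. It assumes the binomial kernel form commonly given in the discrete associated kernel literature, in which the kernel at target `x` with bandwidth `h` in (0, 1] is the Binomial(x + 1, (x + h)/(x + 1)) probability mass function. The function names and the final normalization step are choices made for this example.

```python
from math import comb


def binomial_kernel(x, h, y):
    """Binomial(x + 1, (x + h) / (x + 1)) pmf evaluated at y (assumed kernel form)."""
    n_trials = x + 1
    if y < 0 or y > n_trials:
        return 0.0
    p = (x + h) / (x + 1)
    return comb(n_trials, y) * p ** y * (1 - p) ** (n_trials - y)


def kernel_pmf_estimate(sample, h, support):
    """Average the kernel over the sample at each target x, then normalize.

    The raw estimate need not sum to one, so it is rescaled over `support`.
    """
    n = len(sample)
    raw = {x: sum(binomial_kernel(x, h, y) for y in sample) / n for x in support}
    total = sum(raw.values())
    return {x: value / total for x, value in raw.items()}


# Illustrative count data (made up for this example, not from the paper).
counts = [0, 1, 1, 2, 3, 1, 0, 2]
estimate = kernel_pmf_estimate(counts, h=0.1, support=range(0, 8))
```

The bandwidth `h` is fixed by hand here; the abstract's cross-validation and local Bayesian selection methods are not sketched.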

2019 ◽  
Author(s):  
Cong Ma ◽  
Carl Kingsford

Mutual information is widely used to characterize dependence between biological signals, such as co-expression between genes or co-evolution between amino acids. However, measurement error of the biological signals is rarely considered in estimating mutual information. Measurement error is widespread and non-negligible in some cases. As a result, the distribution of the signals is blurred, and the mutual information may be biased when estimated using the blurred measurements. We derive a corrected estimator for mutual information that accounts for the distribution of measurement error. Our corrected estimator is based on the correction of the probability mass function (PMF) or probability density function (PDF, based on kernel density estimation). We prove that the corrected estimator is asymptotically unbiased in the (semi-) discrete case when the distribution of measurement error is known. We show that it reduces the estimation bias in the continuous case under certain assumptions. On simulated data, our corrected estimator yields more accurate estimates of mutual information when the sample size is not the limiting factor for estimating the PMF or PDF accurately. We compare the uncorrected and corrected estimators on the gene expression data of TCGA breast cancer samples and show a difference in both the value and the ranking of estimated mutual information between the two estimators.
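For orientation, the uncorrected plug-in estimate that this abstract takes as its starting point can be sketched in a few lines of Python. It computes mutual information in bits directly from a joint PMF table. The measurement-error correction itself is not reproduced, since the abstract does not give its form, and the function name is chosen for this example.

```python
from math import log2


def mutual_information(joint):
    """Plug-in mutual information (in bits) of a joint PMF given as a list of rows."""
    px = [sum(row) for row in joint]          # marginal of the row variable
    py = [sum(col) for col in zip(*joint)]    # marginal of the column variable
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:                       # zero cells contribute nothing
                mi += pxy * log2(pxy / (px[i] * py[j]))
    return mi
```

In practice the joint PMF would be estimated from (possibly noisy) observations, which is where the bias discussed in the abstract enters.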


2022 ◽  
Vol 7 (2) ◽  
pp. 1726-1741
Author(s):  
Ahmed Sedky Eldeeb ◽  
Muhammad Ahsan-ul-Haq ◽  
Mohamed. S. Eliwa ◽  
...  

In this paper, a flexible probability mass function is proposed for modeling count data, especially asymmetric and over-dispersed observations. Some of its distributional properties are investigated. It is found that all its statistical and reliability properties can be expressed in explicit forms, which makes the proposed model useful in time series and regression analysis. Different estimation approaches, including maximum likelihood, moments, least squares, Anderson-Darling, Cramér-von Mises, and maximum product of spacing estimators, are derived to get the best estimator for the real data. The performance of these estimation techniques is assessed via a comprehensive simulation study. The flexibility of the new discrete distribution is assessed using four distinctive real data sets: coronavirus, flood peaks, forest fire, and leukemia. Finally, the new probabilistic model can serve as an alternative to other competitive distributions available in the literature for modeling count data.


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
B. I. Mohammed ◽  
Abdulaziz S. Alghamdi ◽  
Hassan M. Aljohani ◽  
Md. Moyazzem Hossain

This article proposes a novel class of bivariate distributions that are completely defined by stating their conditionals as Poisson exponential distributions. Numerous statistical properties of this distribution are also examined here, including the conditional probability mass function (PMF) and moments of the new class. The techniques of maximum likelihood and pseudolikelihood are used to estimate the model parameters. Additionally, the effectiveness of the bivariate Poisson exponential conditional (BPEC) distribution is compared to that of the bivariate Poisson conditional (BPC), the bivariate Poisson (BP), the bivariate Poisson–Lindley (BPL), and the bivariate negative binomial (BNB) distributions using a real-world dataset. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) values show that the BPEC distribution performs better than the other distributions considered in this study. As a result, the authors claim that this distribution may be used to fit dependent and overdispersed count data.
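The model comparison in this abstract rests on the standard AIC and BIC definitions, sketched below in Python. The function names are chosen for this example, and no values from the paper are reproduced. For both criteria, the model with the lower value is preferred.

```python
from math import log


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * log(n_obs) - 2 * log_likelihood
```

Each candidate distribution would be fitted first, for example by maximum likelihood, and its maximized log-likelihood passed to these functions.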


1996 ◽  
Vol 26 (2) ◽  
pp. 213-224 ◽  
Author(s):  
Karl-Heinz Waldmann

Recursions are derived for a class of compound distributions having a claim frequency distribution of the well-known (a,b)-type. The probability mass function on which the recursions are usually based is replaced by the distribution function in order to obtain increasing iterates. A monotone transformation is suggested to avoid an underflow in the initial stages of the iteration. The faster increase of the transformed iterates is diminished by use of a scaling function. Further, an adaptive weighting depending on the initial value and the increase of the iterates is derived. It enables us to manage an arbitrarily large portfolio. Some numerical results are displayed demonstrating the efficiency of the different methods. The computation of the stop-loss premiums using these methods is indicated. Finally, related iteration schemes based on the cumulative distribution function are outlined.
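The classical PMF-based recursion that this abstract takes as its starting point (usually called the Panjer recursion) can be sketched in Python as follows. This is the standard textbook form for an (a,b)-type claim frequency and a severity distribution on the non-negative integers. It is not the distribution-function variant, transformation, or scaling proposed in the paper. The function name and argument layout are chosen for this example.

```python
from math import exp


def panjer_recursion(a, b, g0, severity, s_max):
    """Aggregate-claims pmf g[0..s_max] for an (a, b)-type claim frequency.

    a, b     : frequency parameters, p_n = (a + b / n) * p_{n-1}
    g0       : P(aggregate claims = 0), supplied by the caller
    severity : severity[j] = P(single claim = j), j = 0, 1, 2, ...
    """
    # Pad the severity pmf with zeros so that f[j] exists for every j <= s_max.
    f = list(severity) + [0.0] * max(0, s_max + 1 - len(severity))
    g = [g0]
    for s in range(1, s_max + 1):
        total = sum((a + b * j / s) * f[j] * g[s - j] for j in range(1, s + 1))
        g.append(total / (1.0 - a * f[0]))
    return g


# Poisson frequency with mean lam corresponds to a = 0, b = lam,
# and g0 = exp(-lam * (1 - f0)).
lam = 2.0
severity = [0.0, 1.0]  # every claim has size 1, so the total is Poisson(lam)
g = panjer_recursion(0.0, lam, exp(-lam * (1.0 - severity[0])), severity, 5)
```

For very large expected claim counts the starting value `g0` underflows to zero in floating point, which is the problem the transformation and scaling in the abstract are meant to avoid.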


Author(s):  
Zixi Han ◽  
Zixian Jiang ◽  
Sophie Ehrt ◽  
Mian Li

The design of a gas turbine compressor vane carrier (CVC) should meet mechanical integrity requirements on, among others, low-cycle fatigue (LCF). The number of cycles to LCF failure is the result of cyclic mechanical and thermal strain effects caused by operating conditions on the components. The conventional LCF assessment is usually based on assumed standard operating cycles, supplemented by predefined extreme operations and safety factors that compensate for a potential underestimate of the LCF damage arising from causes such as non-standard operating cycles. However, real operating cycles can vary significantly from the standard ones considered in the conventional methods. The conventional prediction of LCF life can be very different from real cases, due to the included safety margins. This work presents a probabilistic method to estimate the distributions of the LCF life under varying operating conditions using operational fleet data. Finite element analysis (FEA) results indicate that the first ramp-up loading in each cycle and the turning time before hot-restart cycles are two predominant contributors to the LCF damage. A surrogate model of LCF damage has been built with regard to these two features to reduce the computational cost of FEA. Miner’s rule is applied to calculate the accumulated LCF damage on the component and then obtain the LCF life. The proposed LCF assessment approach has two distinctive features. First, a new data processing technique inspired by the cumulative sum (CUSUM) control chart is proposed to identify the first ramp-up period of each cycle from noisy operational data. Second, the probability mass function of the LCF life for a CVC is estimated using the sequential convolution of the single-cycle damage distribution obtained from operational data.
The result from the proposed method shows that the mean value of the LCF life at a critical location of the CVC is significantly larger than the result of the deterministic assessment, and that the LCF lives of different gas turbines of the same class also differ considerably. Finally, to avoid the high computational cost of sequential convolution, a quick approximation approach for the probability mass function of the LCF life is given. With the capability of dealing with varying operating conditions and noise in the operational data, the enhanced LCF assessment approach proposed in this work provides a probabilistic reference both for reliability analysis in CVC design and for predictive maintenance in after-sales service.
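The idea of obtaining a life PMF by sequential convolution under Miner's rule can be sketched as follows. The abstract does not describe the authors' discretization, so this Python example makes its own assumptions: per-cycle damage is placed on an integer grid, cycles are independent and identically distributed, and failure occurs once the accumulated damage reaches a threshold of `units_to_failure` grid units (Miner's sum equal to one). All names are hypothetical.

```python
def convolve_capped(p, q, cap):
    """Convolve two pmfs on 0, 1, 2, ... and lump all mass at or above `cap` into `cap`."""
    out = [0.0] * (cap + 1)
    for i, pi in enumerate(p):
        if pi == 0.0:
            continue
        for j, qj in enumerate(q):
            out[min(i + j, cap)] += pi * qj
    return out


def life_pmf(damage_pmf, units_to_failure, max_cycles):
    """P(life = n) for n = 1..max_cycles, where life is the first cycle at which
    accumulated damage reaches `units_to_failure` grid units."""
    cap = units_to_failure                # absorbing "failed" state
    acc = [1.0] + [0.0] * cap             # no damage before the first cycle
    survive_prev = 1.0
    life = {}
    for n in range(1, max_cycles + 1):
        acc = convolve_capped(acc, damage_pmf, cap)
        survive = sum(acc[:cap])          # probability of not yet having failed
        life[n] = survive_prev - survive  # failed exactly at cycle n
        survive_prev = survive
    return life


# Toy single-cycle damage: 1 or 2 grid units with equal probability (illustrative only).
toy_life = life_pmf([0.0, 0.5, 0.5], units_to_failure=3, max_cycles=3)
```

Each cycle costs one convolution, so the work grows with the number of cycles and the grid resolution, which is consistent with the abstract's interest in a quicker approximation.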


Author(s):  
Panpan Zhang

In this paper, several properties of plane-oriented recursive trees (PORTs), a class of trees exhibiting the preferential attachment phenomenon, are uncovered. Specifically, we investigate the degree profile of a PORT by determining the exact probability mass function of the degree of a node with a fixed label. We compute the expectation and the variance of the degree variable via a Pólya urn approach. In addition, we study a topological index of this class of trees, the Zagreb index. We calculate the exact first two moments of the Zagreb index of PORTs by using recurrence methods. Lastly, we determine the limiting degree distribution in PORTs that grow in continuous time, where the embedding is done in a Poissonization framework. We show that it is exponential after proper scaling.


Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 163 ◽  
Author(s):  
Qian Pan ◽  
Deyun Zhou ◽  
Yongchuan Tang ◽  
Xiaoyang Li ◽  
Jichuan Huang

Dempster-Shafer evidence theory (DST) has shown great advantages in tackling uncertainty in a wide variety of applications. However, how to quantify the information-based uncertainty of a basic probability assignment (BPA) with belief entropy in the DST framework is still an open issue. The main work of this study is to define a new belief entropy for measuring the uncertainty of a BPA. The proposed belief entropy has two components. The first component is based on the summation of the probability mass function (PMF) of single events contained in each BPA, which are obtained using the plausibility transformation. The second component is the same as the weighted Hartley entropy. The two components can effectively measure the discord uncertainty and the non-specificity uncertainty found in the DST framework, respectively. The proposed belief entropy is proved to satisfy the majority of the desired properties for an uncertainty measure in the DST framework. In addition, when the BPA is a probability distribution, the proposed entropy reduces to Shannon entropy. The feasibility and superiority of the new belief entropy are verified by the results of numerical experiments.
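The two building blocks named in this abstract can be sketched in Python as follows. The plausibility transformation is assumed to be the usual normalized-plausibility form, and the weighted Hartley entropy is the standard sum of m(A) log2 |A|. How the paper combines them into its belief entropy is not stated in the abstract and is not reproduced here. A BPA is represented as a dict from `frozenset` focal elements to masses, which is a choice made for this example.

```python
from math import log2


def plausibility_transform(bpa):
    """Map a BPA to a pmf on single events: Pl(x) normalized over all x."""
    elements = set().union(*bpa.keys())
    # Pl(x) is the total mass of the focal elements that contain x.
    pl = {x: sum(m for focal, m in bpa.items() if x in focal) for x in elements}
    total = sum(pl.values())
    return {x: value / total for x, value in pl.items()}


def weighted_hartley(bpa):
    """Non-specificity: sum over focal elements of m(A) * log2(|A|)."""
    return sum(m * log2(len(focal)) for focal, m in bpa.items())


# Illustrative BPA on the frame {a, b} (made up for this example).
bpa = {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 0.5}
pmf = plausibility_transform(bpa)
nonspecificity = weighted_hartley(bpa)
```

When every focal element is a singleton, the transformed PMF equals the BPA and the weighted Hartley term is zero, which is consistent with the reduction to Shannon entropy mentioned in the abstract.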


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 754 ◽  
Author(s):  
Kaitian Cao ◽  
Ping Qian ◽  
Jing An ◽  
Li Wang

In this study, a novel and exact closed-form expression for the detection probability of energy detection (ED) in terms of Meijer’s G-function over α-μ generalized fading channels was derived. It is more accurate and practical than the existing exact expressions and has wide application prospects in performance evaluation in various areas of wireless communications, especially in wireless sensor networks (WSNs) and cognitive radio networks (CRNs). Furthermore, an exact and simple analytical solution for the sample size meeting the desired detection performance was derived in terms of the probability mass function of a Poisson distribution. Simulations verified the detection performance and accuracy of our derived expressions with a small sample size compared to the existing exact expressions and approximations.

