FREQUENTIST INFERENCE IN INSURANCE RATEMAKING MODELS ADJUSTING FOR MISREPRESENTATION

2019 ◽  
Vol 49 (1) ◽  
pp. 117-146
Author(s):  
Rexford M. Akakpo ◽  
Michelle Xia ◽  
Alan M. Polansky

In insurance underwriting, misrepresentation is a type of insurance fraud in which an applicant deliberately makes a false statement about a risk factor in order to lower his or her cost of insurance. In the insurance ratemaking context, we propose using the expectation-maximization (EM) algorithm to perform maximum likelihood estimation of the regression effects and the prevalence of misrepresentation for the misrepresentation model proposed by Xia and Gustafson [(2016) The Canadian Journal of Statistics, 44, 198–218]. In applying the EM algorithm, the unobserved misrepresentation status is treated as a latent variable in the complete-data likelihood function. We derive the iterative formulas for the EM algorithm and obtain the analytical form of the Fisher information matrix for frequentist inference on the parameters of interest for lognormal losses. We implement the algorithm and demonstrate that valid inference can be obtained on the risk effect despite the unobserved misrepresentation status. Applying the proposed algorithm, we perform a loss severity analysis with the Medical Expenditure Panel Survey data. The analysis reveals not only the potential impact misrepresentation may have on the risk effect but also statistical evidence of the presence of misrepresentation in the self-reported insurance status.
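
The E- and M-steps described above can be sketched for a minimal intercept-plus-risk-effect lognormal model. This is an illustrative sketch, not the authors' implementation: it assumes the unidirectional setup in which applicants whose true status is 1 may report 0, while reports of 1 are taken as truthful, so misrepresentation is latent only among reported zeros. All function and variable names are ours.

```python
import numpy as np

def norm_pdf(x, m, s):
    # normal density, written out to keep the sketch dependency-light
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def em_misrepresentation(y, v, n_iter=500):
    """EM sketch for lognormal losses y with a self-reported binary risk
    factor v.  Parameters: mu, beta (log-scale intercept and risk effect),
    sigma, lam = P(true status 1), p = P(report 0 | true status 1)."""
    ly = np.log(y)
    # crude starting values (illustrative only)
    mu = ly[v == 0].mean()
    beta = ly[v == 1].mean() - mu
    sigma = ly.std()
    lam, p = v.mean(), 0.1
    for _ in range(n_iter):
        # E-step: posterior probability that the true status is 1.
        # For v = 1 it is 1; for v = 0 it is a two-component mixture weight.
        d1 = lam * p * norm_pdf(ly, mu + beta, sigma)
        d0 = (1.0 - lam) * norm_pdf(ly, mu, sigma)
        w = np.where(v == 1, 1.0, d1 / (d1 + d0))
        # M-step: weighted complete-data maximum likelihood updates
        lam = w.mean()
        p = w[v == 0].sum() / w.sum()          # share of true-1s who misreported
        mu = (ly * (1.0 - w)).sum() / (1.0 - w).sum()
        beta = (ly * w).sum() / w.sum() - mu
        sigma = np.sqrt(np.mean((1.0 - w) * (ly - mu) ** 2
                                + w * (ly - mu - beta) ** 2))
    return {"mu": mu, "beta": beta, "sigma": sigma, "lambda": lam, "p": p}
```

Despite the misreported zeros, the mixture structure among reported zeros identifies both the risk effect and the misrepresentation probability, which is the point the abstract makes about valid inference on the risk effect.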

2011 ◽  
Vol 48 (A) ◽  
pp. 277-293 ◽  
Author(s):  
Mogens Bladt ◽  
Luz Judith R. Esparza ◽  
Bo Friis Nielsen

This paper is concerned with statistical inference for both continuous and discrete phase-type distributions. We consider maximum likelihood estimation, where traditionally the expectation-maximization (EM) algorithm has been employed. Certain numerical aspects of this method are revised, and we provide an alternative method for dealing with the E-step. We also compare the EM algorithm to a direct Newton–Raphson optimization of the likelihood function. As one of the main contributions of the paper, we provide formulae for calculating the Fisher information matrix for both the EM algorithm and the Newton–Raphson approach. The inverse of the Fisher information matrix provides the variances and covariances of the estimated parameters.
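
The final step in the abstract — inverting the information matrix to obtain variances and covariances — can be illustrated generically. The sketch below approximates the observed information (the negative Hessian of the log-likelihood) by finite differences; it is a generic numerical stand-in, not the closed-form formulae the paper derives for phase-type distributions.

```python
import numpy as np

def observed_information(loglik, theta_hat, h=1e-4):
    """Approximate the observed information matrix at theta_hat:
    the negative Hessian of the log-likelihood, via central differences."""
    k = len(theta_hat)
    t0 = np.array(theta_hat, dtype=float)
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei, ej = np.eye(k)[i], np.eye(k)[j]
            # mixed central second difference in directions i and j
            H[i, j] = (loglik(t0 + h * ei + h * ej)
                       - loglik(t0 + h * ei - h * ej)
                       - loglik(t0 - h * ei + h * ej)
                       + loglik(t0 - h * ei - h * ej)) / (4.0 * h * h)
    return -H  # invert this matrix for asymptotic variances/covariances
```

For a log-likelihood that is quadratic near its maximum, the inverse of this matrix reproduces the usual asymptotic standard errors of the MLE.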


2015 ◽  
Vol 4 (2) ◽  
pp. 74
Author(s):  
MADE SUSILAWATI ◽  
KARTIKA SARI

Missing data often occur in agricultural and animal husbandry experiments. Missing data in an experimental design make the resulting information less complete. In this research, missing data were estimated with the Yates method and the expectation-maximization (EM) algorithm. The basic concept of the Yates method is to minimize the error sum of squares (JKG), whereas the basic concept of the EM algorithm is to maximize the likelihood function. This research applied a balanced lattice design with 9 treatments, 4 replications and 3 groups in each replication. The estimation results showed that the Yates method performed better for two missing values positioned in a treatment, in a column, or at random, whereas the EM algorithm performed better for a single missing value and for two missing values positioned in a group or a replication. A comparison of the ANOVA results showed that the JKG of the incomplete data was larger than the JKG of the incomplete data augmented with the estimated values. This suggests that the missing data need to be estimated.
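
For the simplest case — a single missing observation in a randomized complete block design — the minimize-the-error-sum-of-squares principle behind the Yates method gives a closed-form estimate. The sketch below illustrates that classical formula; the balanced-lattice version used in the research above follows the same principle but is more involved.

```python
def yates_missing_value(T, B, G, t, b):
    """Yates estimate of a single missing observation in a randomized
    complete block design with t treatments and b blocks.
    T: total of the remaining observations for the affected treatment
    B: total of the remaining observations for the affected block
    G: grand total of all remaining observations
    The returned value minimizes the error sum of squares (JKG)
    of the completed data table."""
    return (t * T + b * B - G) / ((t - 1) * (b - 1))
```

Substituting the estimate and rerunning the ANOVA (with one error degree of freedom removed) is what makes the completed-data JKG smaller than the incomplete-data JKG, as the comparison above reports.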


2002 ◽  
Vol 27 (3) ◽  
pp. 291-317 ◽  
Author(s):  
Natasha Rossi ◽  
Xiaohui Wang ◽  
James O. Ramsay

The methods of functional data analysis are used to estimate item response functions (IRFs) nonparametrically. The EM algorithm is used to maximize the penalized marginal likelihood of the data. The penalty controls the smoothness of the estimated IRFs, and is chosen so that, as the penalty is increased, the estimates converge to shapes closely represented by the three-parameter logistic family. The one-dimensional latent trait model is recast as a problem of estimating a space curve or manifold, and, expressed in this way, the model no longer involves any latent constructs, and is invariant with respect to choice of latent variable. Some results from differential geometry are used to develop a data-anchored measure of ability and a new technique for assessing item discriminability. Functional data-analytic techniques are used to explore the functional variation in the estimated IRFs. Applications involving simulated and actual data are included.
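
The three-parameter logistic family toward which the penalized estimates shrink has a simple closed form. The sketch below just evaluates that target family, with the conventional parameter names (not tied to any code from the paper).

```python
import math

def irf_3pl(theta, a, b, c):
    """Three-parameter logistic item response function.
    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    Returns P(correct response | ability theta)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

At theta = b the curve passes through its midpoint (1 + c) / 2, and as the smoothing penalty grows the nonparametric IRF estimates converge to shapes closely represented by curves of this form.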


Author(s):  
Chandan K. Reddy ◽  
Bala Rajaratnam

In the field of statistical data mining, the Expectation Maximization (EM) algorithm is one of the most popular methods used for solving parameter estimation problems in the maximum likelihood (ML) framework. Compared to traditional methods such as steepest descent, conjugate gradient, or Newton-Raphson, which are often too complicated to use for these problems, EM has become popular because it takes advantage of problem-specific properties (Xu et al., 1996). The EM algorithm converges to a local maximum of the log-likelihood function under very general conditions (Dempster et al., 1977; Redner et al., 1984). Efficient maximization of the likelihood by augmenting it with latent variables, together with guarantees of convergence, are among the important hallmarks of the EM algorithm. EM-based methods have been applied successfully to a wide range of problems in pattern recognition, clustering, information retrieval, computer vision, bioinformatics (Reddy et al., 2006; Carson et al., 2002; Nigam et al., 2000), etc. Given an initial set of parameters, the EM algorithm computes parameter estimates that locally maximize the likelihood function of the data. In spite of its strong theoretical foundations, wide applicability, and usefulness on real-world problems, the standard EM algorithm suffers from certain fundamental drawbacks in practical settings. Some of the main difficulties of using the EM algorithm on a general log-likelihood surface are as follows (Reddy et al., 2008):
• The EM algorithm for mixture modeling converges to a local maximum of the log-likelihood function very quickly.
• Many other promising locally optimal solutions lie in the close vicinity of the solutions obtained from methods that provide good initial guesses.
• Model selection criteria usually assume that the global optimum of the log-likelihood function can be obtained; however, achieving this is computationally intractable.
• Some regions of the search space contain no promising solutions. Promising and non-promising regions co-exist, and it becomes challenging to avoid wasting computational resources searching the non-promising ones.
Of the concerns mentioned above, the fact that most of the local maxima are not distributed uniformly makes it important to develop algorithms that not only avoid inefficient search over low-likelihood regions but also explore promising subspaces more thoroughly (Zhang et al., 2004). Such a subspace search also makes the solution less sensitive to the initial set of parameters. In this chapter, we discuss the theoretical aspects of the EM algorithm and demonstrate its use in obtaining optimal parameter estimates for mixture models. We also discuss some practical concerns of using the EM algorithm and present results on the performance of various algorithms that address these problems.
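
The local-maximum concern is easiest to see in the standard Gaussian mixture setting. A common baseline mitigation, sketched below with illustrative names (this is not one of the chapter's proposed algorithms), is simply to run EM from several random initializations and keep the fit with the highest log-likelihood.

```python
import numpy as np

def em_gmm_1d(x, k=2, n_iter=100, seed=0):
    """Plain EM for a one-dimensional Gaussian mixture; it converges to a
    local maximum that depends on the random initialization."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k, replace=False)    # initialize means at data points
    sigma = np.full(k, x.std())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | x_i)
        dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
               / (sigma * np.sqrt(2.0 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum likelihood updates
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
           / (sigma * np.sqrt(2.0 * np.pi))
    loglik = np.log(dens.sum(axis=1)).sum()
    return mu, sigma, pi, loglik

def em_best_of_restarts(x, k=2, restarts=10):
    # run EM from several random starts; keep the highest log-likelihood
    return max((em_gmm_1d(x, k, seed=s) for s in range(restarts)),
               key=lambda fit: fit[-1])
```

Random restarts spend effort uniformly over the search space; the algorithms discussed in this chapter aim instead to identify and concentrate on the promising subspaces.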


2000 ◽  
Author(s):  
Sergey V. Beiden ◽  
Gregory Campbell ◽  
Kristen L. Meier ◽  
Robert F. Wagner

Author(s):  
Asger Hobolth ◽  
Jens Ledet Jensen

We describe statistical inference in continuous-time Markov processes of DNA sequences related by a phylogenetic tree. The maximum likelihood estimator can be found by the expectation-maximization (EM) algorithm, and an expression for the information matrix is also derived. We provide explicit analytical solutions for the EM algorithm and the information matrix.
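
For a fully observed continuous-time Markov path, the MLE of each transition rate has a simple closed form: jump counts divided by holding times. In the phylogenetic setting only the sequences at the leaves of the tree are observed, and the EM algorithm replaces these sufficient statistics by their conditional expectations given the observed data. A minimal sketch of the complete-data case, with illustrative names:

```python
def ctmc_rate_mle(jump_counts, holding_times):
    """MLE of CTMC transition rates from a fully observed path:
    q_ij = N_ij / T_i, where N_ij counts the observed i -> j jumps and
    T_i is the total time the path spends in state i."""
    return {(i, j): n / holding_times[i] for (i, j), n in jump_counts.items()}
```

The E-step of the corresponding EM algorithm computes E[N_ij | data] and E[T_i | data] along each branch of the tree; the M-step then applies exactly this ratio.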


2003 ◽  
Vol 81 (2) ◽  
pp. 157-163 ◽  
Author(s):  
ZHANG YUAN-MING ◽  
GAI JUN-YI ◽  
YANG YONG-HUA

In this article, a new algorithm for obtaining the maximum likelihood estimators (MLEs) of parameters in the joint segregation analysis (JSA) of multiple generations P1, F1, P2, F2 and F2:3 (MG5) for quantitative traits was set up. Firstly, because the component variance of the heterogeneous genotype in F2:3 involves both the first-order genetic parameters (the means of the component distributions) and the second-order parameters, no simple closed form for the MLEs of the component means exists when the expectation and maximization (EM) algorithm is used. To simplify the estimation, the first partial derivative of this variance with respect to the mean in the sample log-likelihood function was omitted; the omission is remedied by an iterated method. Then, the variances of the component distributions for segregating populations were partitioned into major-gene, polygenic and environmental variances, so that general iterated formulae for estimating the means as well as the polygenic and environmental variances of the component distributions in the maximization step (M-step) of the EM algorithm were obtained. The EM algorithm for estimating parameters in the JSA model for the MG5 was thereby simplified; the result is called the expectation and iterated maximization (EIM) algorithm. Finally, an example on the inheritance of soybean resistance to beanfly showed that the results of the mixed inheritance analysis in this paper coincided with those in both Wang & Gai (2001) and Wei et al. (1989), so the EIM algorithm is appropriate.

