Modified likelihood root in high dimensions

Author(s):  
Yanbo Tang ◽  
Nancy Reid
2011 ◽  
Vol 11 (3) ◽  
pp. 272
Author(s):  
Ivan Gavrilyuk ◽  
Boris Khoromskij ◽  
Eugene Tyrtyshnikov

Abstract In the recent years, multidimensional numerical simulations with tensor-structured data formats have been recognized as the basic concept for breaking the "curse of dimensionality". Modern applications of tensor methods include the challenging high-dimensional problems of material sciences, bio-science, stochastic modeling, signal processing, machine learning, and data mining, financial mathematics, etc. The guiding principle of the tensor methods is an approximation of multivariate functions and operators with some separation of variables to keep the computational process in a low parametric tensor-structured manifold. Tensors structures had been wildly used as models of data and discussed in the contexts of differential geometry, mechanics, algebraic geometry, data analysis etc. before tensor methods recently have penetrated into numerical computations. On the one hand, the existing tensor representation formats remained to be of a limited use in many high-dimensional problems because of lack of sufficiently reliable and fast software. On the other hand, for moderate dimensional problems (e.g. in "ab-initio" quantum chemistry) as well as for selected model problems of very high dimensions, the application of traditional canonical and Tucker formats in combination with the ideas of multilevel methods has led to the new efficient algorithms. The recent progress in tensor numerical methods is achieved with new representation formats now known as "tensor-train representations" and "hierarchical Tucker representations". Note that the formats themselves could have been picked up earlier in the literature on the modeling of quantum systems. Until 2009 they lived in a closed world of those quantum theory publications and never trespassed the territory of numerical analysis. The tremendous progress during the very recent years shows the new tensor tools in various applications and in the development of these tools and study of their approximation and algebraic properties. This special issue treats tensors as a base for efficient numerical algorithms in various modern applications and with special emphases on the new representation formats.


Author(s):  
Russell Cheng

This chapter examines the well-known Box-Cox method, which transforms a sample of non-normal observations into approximately normal form. Two non-standard aspects are highlighted. First, the likelihood of the transformed sample has an unbounded maximum, so that the maximum likelihood estimate is not consistent. The usually suggested remedy is to assume grouped data so that the sample becomes multinomial. An alternative method is described that uses a modified likelihood similar to the spacings function. This eliminates the infinite likelihood problem. The second problem is that the power transform used in the Box-Cox method is left-bounded so that the transformed observations cannot be exactly normal. This biases estimates of observational probabilities in an uncertain way. Moreover, the distributions fitted to the observations are not necessarily unimodal. A simple remedy is to assume the transformed observations have a left-bounded distribution, like the exponential; this is discussed in detail, and a numerical example given.


Author(s):  
David Kozak ◽  
Stephen Becker ◽  
Alireza Doostan ◽  
Luis Tenorio
Keyword(s):  

Author(s):  
Alireza Vafaei Sadr ◽  
Bruce A. Bassett ◽  
M. Kunz

AbstractAnomaly detection is challenging, especially for large datasets in high dimensions. Here, we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. DRAMA is released as a general python package that implements the general framework with a wide range of built-in options. This approach identifies the primary prototypes in the data with anomalies detected by their large distances from the prototypes, either in the latent space or in the original, high-dimensional space. DRAMA is tested on a wide variety of simulated and real datasets, in up to 3000 dimensions, and is found to be robust and highly competitive with commonly used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning, and highly unbalanced datasets. Besides, DRAMA naturally provides clustering of outliers for subsequent analysis.


Sign in / Sign up

Export Citation Format

Share Document