A Comparison of Self-organising Maps and Principal Components Analysis

2016 ◽  
Vol 58 (6) ◽  
pp. 815-834 ◽  
Author(s):  
Gopal Das ◽  
Manojit Chattopadhyay ◽  
Sumeet Gupta

This paper compares self-organising maps (SOM) and principal components analysis (PCA) by applying them to the marketing construct ‘retail store personality’. Data were collected for the retail store personality construct via a validated scale from previous studies that had used the mall intercept technique. A total of 367 people responded, of whom 353 provided responses valid for data analysis. Data were analysed using PCA and SOM; both methods gave comparable clustering results, although the results for SOM were quite conclusive. In addition, we found that SOM complemented PCA by providing visual clustering results far superior to those of PCA. SOM can be used to further analyse PCA data using visual clustering features; both could be used in tandem. Although SOM have been used in a number of studies in marketing, this is the first paper to compare PCA and SOM in terms of application to the marketing construct ‘retail store personality’.

2005 ◽  
Vol 35 (12) ◽  
pp. 2860-2874 ◽  
Author(s):  
Nikos Nanos ◽  
Fernando Pardo ◽  
Jesus Alonso Nager ◽  
José Alberto Pardos ◽  
Luis Gil

Vegetation ordination is usually based on classical data reduction techniques such as principal components analysis, correspondence analysis, or multidimensional scaling. The usual methods do not account for multiscale correlations among species. In this paper, we use a geostatistical method, known as multivariate factorial kriging, for studying multiple-scale correlations. The case study was carried out in a mixed broadleaf forest of central Spain. Six tree species were included in the analysis. Data analysis included (i) experimental variogram calculation and modeling with the use of the linear model of coregionalization, (ii) principal components analysis, and (iii) cokriging. The results indicate that correlations among species are different depending on the spatial scale. We conclude that competition for light is the main factor controlling the spatial distribution of species at the plot-level scale of variation. At larger scales of variation, soil conditions and (or) human intervention are the key factors in determining the observed vegetation pattern. Based on the factor scores for the largest scale of variation, we conducted a cluster analysis to identify plots with similar characteristics. The resulting clusters have the remarkable property of being spatially continuous.
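
Step (i) above, the experimental variogram, can be illustrated with a minimal sketch. It assumes the classical (Matheron) estimator for a single variable, in which the semivariance at a lag is half the mean squared difference between all pairs of observations separated by roughly that distance. The function name, the lag binning, and the transect data are hypothetical and are not taken from the paper; the cross-variograms and the fitting of a linear model of coregionalization used in the study are not covered here.

```python
import math
from collections import defaultdict


def experimental_variogram(coords, values, lag_width, max_lag):
    """Classical semivariance estimate per distance bin.

    Returns a list of (mean pair distance, semivariance, pair count),
    one tuple per non-empty bin, ordered by increasing distance.
    """
    sq_diff = defaultdict(float)   # sum of squared value differences per bin
    dist_sum = defaultdict(float)  # sum of pair distances per bin
    pairs = defaultdict(int)       # number of pairs per bin
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d == 0.0 or d > max_lag:
                continue  # skip co-located points and pairs beyond max_lag
            k = int(d // lag_width)  # bin k covers [k*w, (k+1)*w)
            sq_diff[k] += (values[i] - values[j]) ** 2
            dist_sum[k] += d
            pairs[k] += 1
    return [
        (dist_sum[k] / pairs[k], sq_diff[k] / (2 * pairs[k]), pairs[k])
        for k in sorted(pairs)
    ]


# Made-up transect of four plots, one unit apart (illustration only).
plots = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
density = [1.0, 2.0, 4.0, 7.0]
variogram = experimental_variogram(plots, density, lag_width=1.0, max_lag=3.0)
for lag, gamma, n_pairs in variogram:
    print(f"lag {lag:.1f}: semivariance {gamma:.3f} from {n_pairs} pairs")
```

A semivariance that keeps rising with lag, as in this toy transect, suggests that nearby plots are more alike than distant ones; in practice a model would then be fitted to these points before any kriging step.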


2018 ◽  
Vol 96 (7) ◽  
pp. 738-748 ◽  
Author(s):  
Peter D. Wentzell ◽  
Chelsi C. Wicks ◽  
Jez W.B. Braga ◽  
Liz F. Soares ◽  
Tereza C.M. Pastore ◽  
...  

The analysis of multivariate chemical data is commonplace in fields ranging from metabolomics to forensic classification. Many of these studies rely on exploratory visualization methods that represent the multidimensional data in spaces of lower dimensionality, such as hierarchical cluster analysis (HCA) or principal components analysis (PCA). However, such methods rely on assumptions of independent measurement errors with uniform variance and can fail to reveal important information when these assumptions are violated, as they often are for chemical data. This work demonstrates how two alternative methods, maximum likelihood principal components analysis (MLPCA) and projection pursuit analysis (PPA), can reveal chemical information hidden from more traditional techniques. The experimental data used to compare the methods consist of near-infrared (NIR) reflectance spectra from 108 samples of wood that are derived from four different species of Brazilian trees. The measurement error characteristics of the spectra are examined and it is shown that, by incorporating measurement error information into the data analysis (through MLPCA) or using alternative projection criteria (i.e., PPA), samples can be separated by species. These techniques are proposed as powerful tools for multivariate data analysis in chemistry.


2018 ◽  
Vol 4 (2) ◽  
pp. 100041 ◽  
Author(s):  
Hristo Todorov ◽  
David Fournier ◽  
Susanne Gerber

Advances in computational power have enabled research to generate significant amounts of data related to complex biological problems. Consequently, applying appropriate data analysis techniques has become paramount to tackle this complexity. However, theoretical understanding of statistical methods is necessary to ensure that the correct method is used and that sound inferences are made based on the analysis. In this article, we elaborate on the theory behind principal components analysis (PCA), which has become a favoured multivariate statistical tool in the field of omics-data analysis. We discuss the necessary prerequisites and steps to produce statistically valid results and provide guidelines for interpreting the output. Using PCA on gene expression data from a mouse experiment, we demonstrate that the main distinctive pattern in the data is associated with the transgenic mouse line and is not related to the mouse gender. A weaker association of the pattern with the genotype was also identified.
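
The usual steps of a PCA can be sketched in a few lines, as a rough illustration rather than the procedure used in the article. The sketch assumes mean-centred data, a sample covariance matrix, and power iteration to approximate only the first principal component; all function names and the toy data are hypothetical. Power iteration is used here to keep the example dependency-free, and it can fail if the starting vector happens to be orthogonal to the leading eigenvector.

```python
import math


def center(rows):
    """Subtract each column's mean so every variable has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (divides by n - 1) of centred rows."""
    n = len(centered)
    p = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iterations=500):
    """Approximate the leading eigenvector and eigenvalue by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p  # assumed start; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the variance along the component.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return v, eigenvalue


# Made-up data: two strongly correlated variables (illustration only).
rows = [[2.0, 1.9], [4.0, 4.1], [6.0, 6.2], [8.0, 7.8]]
centered = center(rows)
cov = covariance(centered)
component, eigenvalue = first_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = eigenvalue / total_variance
scores = [sum(x * c for x, c in zip(row, component)) for row in centered]
print(f"PC1 loadings: {component}")
print(f"share of variance explained by PC1: {explained:.4f}")
```

Whether variables should also be scaled to unit variance before this step depends on the data; for variables measured in different units, working from the correlation matrix is a common choice.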


1980 ◽  
Vol 19 (04) ◽  
pp. 205-209
Author(s):  
L. A. Abbott ◽  
J. B. Mitton

Data taken from the blood of 262 patients diagnosed for malabsorption, elective cholecystectomy, acute cholecystitis, infectious hepatitis, liver cirrhosis, or chronic renal disease were analyzed with three numerical taxonomy (NT) methods: cluster analysis, principal components analysis, and discriminant function analysis. Principal components analysis revealed discrete clusters of patients suffering from chronic renal disease, liver cirrhosis, and infectious hepatitis, which could be displayed by NT clustering as well as by plotting, but other disease groups were poorly defined. Sharper resolution of the same disease groups was attained by discriminant function analysis.

