Paper #1380

Title:
Weighted Euclidean biplots
Authors:
Michael Greenacre and Patrick J. F. Groenen
Date:
July 2013
Abstract:
We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, the method leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.
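As a rough illustration of the final step described in the abstract, the following sketch assumes the nonnegative variable weights are already known and shows how a weighted Euclidean distance relates to a biplot obtained from the singular-value decomposition. It does not implement the majorization algorithm that the paper uses to estimate the weights. The function names, the equal weighting of cases, and the choice of principal coordinates for rows and standard coordinates for columns are assumptions made here for illustration, not details taken from the paper.

```python
import numpy as np


def weighted_euclidean_distances(X, w):
    """Pairwise distances d_ij = sqrt(sum_k w_k * (x_ik - x_jk)**2).

    X is a cases-by-variables array; w holds one nonnegative weight per variable.
    """
    Z = X * np.sqrt(w)                      # rescale each variable by sqrt(weight)
    diff = Z[:, None, :] - Z[None, :, :]    # all pairwise row differences
    return np.sqrt((diff ** 2).sum(axis=2))


def weighted_biplot(X, w, ndim=2):
    """Hypothetical helper: biplot coordinates from the SVD of the weighted matrix.

    Assumes equal weights for the cases. Returns row principal coordinates,
    column standard coordinates, and the proportion of variance per axis.
    """
    Xc = X - X.mean(axis=0)                 # centre each variable
    Z = Xc * np.sqrt(w)                     # apply the variable weights
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    rows = U * s                            # row (case) principal coordinates
    cols = Vt.T                             # column (variable) standard coordinates
    explained = s ** 2 / (s ** 2).sum()     # variance decomposition along axes
    return rows[:, :ndim], cols[:, :ndim], explained


# Illustrative data only: 6 cases, 3 variables, arbitrary weights.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 3))
w_demo = np.array([0.5, 2.0, 1.0])
D_demo = weighted_euclidean_distances(X_demo, w_demo)
rows_demo, cols_demo, explained_demo = weighted_biplot(X_demo, w_demo, ndim=3)
```

When all dimensions are kept, the ordinary Euclidean distances between the row coordinates reproduce the weighted distances exactly; a two-dimensional biplot approximates them, and `explained_demo` indicates how much variance each principal axis accounts for.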
Keywords:
biplot, correspondence analysis, distance, majorization, multidimensional scaling, singular-value decomposition, weighted least squares
JEL codes:
C19, C88
Research area:
Statistics, Econometrics and Quantitative Methods
Published in:
Journal of Classification, 2016, 33, 442-459