
Paper #309

Title:
Principal curves and principal oriented points
Author:
Pedro Delicado
Date:
July 1998
Abstract:
Principal curves were defined by Hastie and Stuetzle (JASA, 1989) as smooth curves passing through the middle of a multidimensional data set. They are nonlinear generalizations of the first principal component, a characterization of which is the basis for the principal curves definition. In this paper we propose an alternative approach based on a different property of principal components. Consider a point in the space where a multivariate normal distribution is defined and, for each hyperplane containing that point, compute the total variance of the normal distribution conditional on lying in that hyperplane. Now choose the hyperplane that minimizes this conditional total variance and find the corresponding conditional mean. The first principal component of the original distribution passes through this conditional mean and is orthogonal to that hyperplane. This property is easily generalized to data sets with nonlinear structure. Repeating the search from different starting points yields many points analogous to conditional means. We call them principal oriented points. A one-dimensional curve that runs through the set of these special points is called a principal curve of oriented points. Successive principal curves are defined recursively from a generalization of the total variance.
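
The multivariate normal property stated in the abstract can be checked with a short calculation. The following is a sketch only: the notation (\mu, \Sigma, b, x, \lambda_j, v_j) is assumed here for illustration and is not taken from the paper, and the largest eigenvalue of \Sigma is assumed to be simple.

```latex
% Assumed setup: X ~ N_p(\mu, \Sigma), \Sigma positive definite, with eigenvalues
% \lambda_1 > \lambda_2 \ge \dots \ge \lambda_p > 0 and unit eigenvectors v_1, \dots, v_p.
% Hyperplane through the point x with unit normal b:
%   H(x, b) = \{ y : b^\top (y - x) = 0 \}.
\begin{align*}
\mathbb{E}\left[X \mid b^\top X = b^\top x\right]
  &= \mu + \Sigma b \, \frac{b^\top (x - \mu)}{b^\top \Sigma b}, \\
\operatorname{Cov}\left[X \mid b^\top X = b^\top x\right]
  &= \Sigma - \frac{\Sigma b \, b^\top \Sigma}{b^\top \Sigma b}, \\
\operatorname{TV}(b)
  &= \operatorname{tr}(\Sigma) - \frac{b^\top \Sigma^{2} b}{b^\top \Sigma b}.
\end{align*}
% Substituting c = \Sigma^{1/2} b turns the ratio into a Rayleigh quotient:
\begin{align*}
\frac{b^\top \Sigma^{2} b}{b^\top \Sigma b}
  = \frac{c^\top \Sigma c}{c^\top c} \le \lambda_1,
  \qquad \text{with equality iff } c \propto v_1 \iff b = \pm v_1 .
\end{align*}
% Hence the minimizing hyperplane is orthogonal to v_1, and at b = v_1:
\begin{align*}
\min_b \operatorname{TV}(b) &= \sum_{j=2}^{p} \lambda_j, \\
\mathbb{E}\left[X \mid v_1^\top X = v_1^\top x\right]
  &= \mu + v_1 v_1^\top (x - \mu),
\end{align*}
% which lies on the first principal component line \{ \mu + t v_1 : t \in \mathbb{R} \}.
```

Under these assumptions the conditional mean is the orthogonal projection of x onto the first principal component line, whichever starting point x is chosen.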
Keywords:
Fixed points, generalized total variance, nonlinear multivariate analysis, principal components, smoothing techniques
JEL codes:
C10, C14
Research area:
Statistics, Econometrics and Quantitative Methods
Published in:
Journal of Multivariate Analysis, 77, 84-116, 2001
With the title:
Another look at principal curves and surfaces
