
Paper #309

Title:
Principal curves and principal oriented points
Author:
Pedro Delicado
Date:
July 1998
Abstract:
Principal curves were defined by Hastie and Stuetzle (JASA, 1989) as smooth curves passing through the middle of a multidimensional data set. They are nonlinear generalizations of the first principal component, a characterization of which is the basis for the principal curves definition. In this paper we propose an alternative approach based on a different property of principal components. Consider a point in the space on which a multivariate normal distribution is defined and, for each hyperplane containing that point, compute the total variance of the normal distribution conditioned on belonging to that hyperplane. Now choose the hyperplane that minimizes this conditional total variance and find the corresponding conditional mean. The first principal component of the original distribution passes through this conditional mean and is orthogonal to that hyperplane. This property is easily generalized to data sets with nonlinear structure. Repeating the search from different starting points yields many points analogous to conditional means. We call them principal oriented points. When a one-dimensional curve runs through the set of these special points, it is called a principal curve of oriented points. Successive principal curves are defined recursively from a generalization of the total variance.
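The Gaussian property stated in the abstract can be checked numerically. The sketch below is an illustration only, not the paper's algorithm: it uses the standard closed-form moments of a normal distribution conditioned on a linear constraint, and the names `conditional_moments`, `mu`, `sigma`, `x` and `b`, as well as the numerical values, are assumptions chosen for the example. It searches over hyperplanes (lines, in two dimensions) through `x` and compares the minimizer with the first principal component.

```python
import numpy as np


def conditional_moments(mu, sigma, x, b):
    """Mean and total variance of N(mu, sigma) conditioned on b.(X - x) = 0.

    b is the normal vector of the hyperplane through x (hypothetical helper).
    """
    b = b / np.linalg.norm(b)
    sb = sigma @ b                      # Cov(X, b.X)
    var_b = b @ sb                      # Var(b.X)
    cond_mean = mu + sb * (b @ (x - mu)) / var_b
    cond_cov = sigma - np.outer(sb, sb) / var_b
    return cond_mean, np.trace(cond_cov)


# Illustrative parameters (assumed, not taken from the paper).
mu = np.array([1.0, -2.0])
sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])
x = np.array([3.0, 0.5])                # starting point

# Grid search over unit normal vectors b in the plane.
angles = np.linspace(0.0, np.pi, 1801)
normals = np.column_stack([np.cos(angles), np.sin(angles)])
total_vars = np.array(
    [conditional_moments(mu, sigma, x, b)[1] for b in normals]
)
best_b = normals[np.argmin(total_vars)]
best_mean, best_tv = conditional_moments(mu, sigma, x, best_b)

# First principal component: eigenvector of the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(sigma)
v1 = eigvecs[:, -1]
projection = mu + v1 * (v1 @ (x - mu))  # projection of x onto the first PC line

print("minimizing normal vector:", best_b)
print("first principal direction:", v1)
print("conditional mean:", best_mean)
print("projection of x onto first PC:", projection)
```

Up to the grid resolution and a sign, the minimizing normal vector should coincide with the first principal direction, and the conditional mean should coincide with the orthogonal projection of `x` onto the first principal component line, which is the property the abstract generalizes to nonlinear data.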
Keywords:
Fixed points, generalized total variance, nonlinear multivariate analysis, principal components, smoothing techniques
JEL codes:
C10, C14
Research area:
Statistics, Econometrics and Quantitative Methods
Published in:
Journal of Multivariate Analysis, 77, 84-116, 2001
Under the title:
Another look at principal curves and surfaces
