Paper #1554

Title:
Towards a pragmatic approach to compositional data analysis
Author:
Michael Greenacre
Date:
January 2017
Abstract:
Compositional data are nonnegative data with the property of closure: that is, each set of values on their components, or so-called parts, has a fixed sum, usually 1 or 100%. The approach to compositional data analysis originated by John Aitchison uses ratios of parts as the fundamental starting point for description and modeling. I show that a compositional data set can be effectively replaced by a set of ratios, one less than the number of parts, and that these ratios describe an acyclic connected graph of all the parts. Contrary to recent literature, I show that the additive log-ratio transformation can be an excellent substitute for the original data set, as shown in an archaeological data set as well as in three other examples. I propose further that a smaller set of ratios of parts can be determined, either by expert choice or by automatic selection, which explains as much variance as required for all practical purposes. These part ratios can then be validly summarized and analyzed by conventional univariate methods, as well as multivariate methods, where the ratios are preferably log-transformed.
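As an illustration of the additive log-ratio (ALR) transformation mentioned in the abstract, the following is a minimal sketch and not code from the paper. The function names `closure` and `alr`, the choice of the last part as the default reference, and the sample values are all assumptions made for this example. It assumes strictly positive parts, since the logarithm of zero is undefined.

```python
import math


def closure(parts):
    """Rescale a list of positive values so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def alr(parts, ref=-1):
    """Additive log-ratio transform of one composition (hypothetical helper).

    Returns log(x_j / x_ref) for every part j other than the reference,
    so a composition with D parts yields D - 1 log-ratios.
    Assumes all parts are strictly positive.
    """
    comp = closure(parts)
    ref_index = ref % len(comp)  # allow negative indices such as -1
    log_ref = math.log(comp[ref_index])
    return [
        math.log(p) - log_ref
        for i, p in enumerate(comp)
        if i != ref_index
    ]


# Illustrative values only: a three-part composition, last part as reference.
sample = [0.2, 0.3, 0.5]
log_ratios = alr(sample)
print(len(log_ratios))  # two log-ratios for three parts
```

Because only ratios of parts enter the result, the same composition expressed as percentages (for example `[20, 30, 50]`) gives the same log-ratios, and each log-ratio depends only on the two parts involved.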
Keywords:
compositional data, log-ratio transformation, log-ratio analysis, log-ratio distance, multivariate analysis, ratios, subcompositional coherence, univariate statistics.
JEL codes:
Z32, C19, C38, C55
Research area:
Statistics, Econometrics and Quantitative Methods