Paper #1554

Title:
Towards a pragmatic approach to compositional data analysis
Author:
Michael Greenacre
Date:
January 2017
Abstract:
Compositional data are nonnegative data with the property of closure: that is, each set of values on their components, or so-called parts, has a fixed sum, usually 1 or 100%. The approach to compositional data analysis originated by John Aitchison uses ratios of parts as the fundamental starting point for description and modeling. I show that a compositional data set can be effectively replaced by a set of ratios, one less than the number of parts, and that these ratios describe an acyclic connected graph of all the parts. Contrary to recent literature, I show that the additive log-ratio transformation can be an excellent substitute for the original data set, as shown in an archaeological data set as well as in three other examples. I propose further that a smaller set of ratios of parts can be determined, either by expert choice or by automatic selection, which explains as much variance as required for all practical purposes. These part ratios can then be validly summarized and analyzed by conventional univariate methods, as well as multivariate methods, where the ratios are preferably log-transformed.
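As a minimal sketch of the additive log-ratio (ALR) transformation mentioned in the abstract, the Python below closes a small data set and takes the last part as the reference. It is an illustration only: the function names `closure`, `alr` and `alr_inverse`, the choice of the last part as reference, and the numbers are all assumptions, not taken from the paper.

```python
import numpy as np

def closure(x):
    """Rescale each row so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)

def alr(x):
    """Additive log-ratios: log of each part over the last (reference) part."""
    logs = np.log(closure(x))
    # J parts give J - 1 log-ratios per row.
    return logs[:, :-1] - logs[:, [-1]]

def alr_inverse(y):
    """Recover the closed composition from its ALR coordinates."""
    y = np.asarray(y, dtype=float)
    # The reference part has log-ratio 0 against itself.
    full = np.hstack([y, np.zeros((y.shape[0], 1))])
    return closure(np.exp(full))

# Hypothetical 3-part compositions (illustrative values only).
parts = np.array([[20.0, 30.0, 50.0],
                  [10.0, 60.0, 30.0]])

ratios = alr(parts)               # shape (2, 2): one column fewer than parts
recovered = alr_inverse(ratios)   # equals closure(parts) up to rounding
```

In this sketch every ratio shares the same reference part, so the ratios form a star-shaped tree over the parts, which is one instance of an acyclic connected graph. The round trip through `alr_inverse` shows that, for strictly positive parts, the J - 1 log-ratios carry the same information as the closed composition.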
Keywords:
compositional data, log-ratio transformation, log-ratio analysis, log-ratio distance, multivariate analysis, ratios, subcompositional coherence, univariate statistics.
JEL codes:
Z32, C19, C38, C55
Area of Research:
Statistics, Econometrics and Quantitative Methods