
Order out of Chaos: Geostatistics in Environmental Science

By Everett & Associates on Oct 13, 2012 at 12:30 PM in Environmental Issues

The voluminous and ubiquitous tables of analytical data in environmental reports might just hold the key to answering some of the most common questions that arise in environmental disputes: Where does this contamination come from? Are there different sources of contamination at this site? How much is mine and how much is yours?

In mathematical terms, a table of analytical data is just a matrix describing points in multidimensional space. The objective of eigenvector statistical techniques such as factor analysis and principal components analysis (PCA) is to reveal the simple underlying structure that may exist within a set of multivariate observations. Or, in the words of John Davis, one of the pioneers in geostatistics, they hold the promise of insight when one is faced with “more data than comprehension.”

How can this widely used, peer-reviewed method of multivariate statistics find order out of apparent chaos? No new information is created; rather, the statistical analysis identifies patterns that might otherwise be lost in the numerical noise.

The reason eigenvector analysis is valuable lies in the underlying mathematical structure of environmental data sets. Treated as matrices, environmental data sets could theoretically be plotted in n-dimensional space (with “n” equal to the number of chemicals detected or analyzed). This presents a comprehension problem because most of us cannot visualize more than three dimensions. However, another characteristic of virtually all environmental data sets is that their variables are highly correlated (i.e., samples high in one VOC are usually also high in a number of other VOCs; or samples high in lead are often also high in selenium and zinc). Because of this redundancy, most of the variation in the data can be described with far fewer than n dimensions.
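The effect of this correlation can be sketched in a few lines of Python with NumPy. This is only an illustration on made-up numbers, not the procedure used for any particular site: the six "analytes," the two hidden factors driving them, and the mixing weights are all hypothetical. The sketch computes the correlation matrix, extracts its eigenvectors, and reports how much of the total variance two dimensions retain out of six.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data set: 100 samples (rows) by 6 analytes (columns).
# Values stand in for centered, log-scaled measurements. Two hidden
# factors drive all six columns, so the columns are correlated.
factors = rng.normal(size=(100, 2))
mixing = np.array([
    [1.0, 0.8, 0.6, 0.0, 0.1, 0.2],   # factor 1 dominates analytes 0-2
    [0.0, 0.1, 0.3, 1.0, 0.9, 0.7],   # factor 2 dominates analytes 3-5
])
data = factors @ mixing + rng.normal(scale=0.1, size=(100, 6))

# Correlation matrix of the analytes (columns are the variables).
corr = np.corrcoef(data, rowvar=False)

# Eigen-decomposition of the symmetric correlation matrix.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of the total variance carried by each principal component.
explained = eigvals / eigvals.sum()

# Project the standardized samples onto the first two components.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
scores = z @ eigvecs[:, :2]

print("correlation, analytes 0 and 1:", round(corr[0, 1], 3))
print("variance retained by 2 of 6 dimensions:", round(explained[:2].sum(), 3))
```

Because the synthetic columns were built from only two factors, two components should carry nearly all of the variance. Real data sets are rarely this clean, and deciding how many components to keep is a judgment call.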

The value of eigenvector analysis as a statistical transformation is that it reduces the dimensionality of a data set while retaining most of the compositional information in the data. These methods have been successfully applied to environmental problems in many media, including groundwater, soil, atmospheric particulates, and sediments in lakes, rivers, bays, or the ocean. When conducted with care, eigenvector statistical methods can help answer some of the most common questions in environmental science by estimating:

a) The number of chemically distinct sources of contamination;

b) The composition of each source; and

c) The relative contribution of each source.
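
As a rough illustration of items (a) through (c), the sketch below builds a synthetic data set from two made-up source "fingerprints" and then tries to recover them. Everything in it is an assumption for demonstration purposes: the fingerprints, the sample amounts, the 1% singular-value cutoff, and the helper function nmf are hypothetical. The unmixing step swaps in a basic non-negative matrix factorization (multiplicative updates) as one simple way to turn the eigenvector result into non-negative compositions. It is not necessarily the method used in practice, and its answer is not guaranteed to be unique.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical fingerprints: fraction of each of 5 analytes in 2 sources.
fingerprints = np.array([
    [0.60, 0.30, 0.10, 0.00, 0.00],   # source 1
    [0.00, 0.05, 0.15, 0.40, 0.40],   # source 2
])
# Hypothetical amount of each source present in each of 40 samples.
amounts = rng.uniform(0.0, 10.0, size=(40, 2))
data = amounts @ fingerprints
data = np.clip(data + rng.normal(scale=0.01, size=data.shape), 0.0, None)

# (a) Number of sources: count singular values that stand above the noise.
# The 1% cutoff is an arbitrary choice for this synthetic example.
singular_values = np.linalg.svd(data, compute_uv=False)
n_sources = int(np.sum(singular_values > 0.01 * singular_values[0]))


def nmf(x, k, n_iter=2000, seed=0):
    """Approximate x by w @ h with non-negative w and h.

    Uses multiplicative updates that reduce the squared error.
    """
    r = np.random.default_rng(seed)
    w = r.uniform(0.1, 1.0, size=(x.shape[0], k))
    h = r.uniform(0.1, 1.0, size=(k, x.shape[1]))
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        h *= (w.T @ x) / (w.T @ w @ h + eps)
        w *= (x @ h.T) / (w @ h @ h.T + eps)
    return w, h


w, h = nmf(data, n_sources)

# (b) Composition: rescale each recovered source so its fractions sum to 1.
row_sums = h.sum(axis=1, keepdims=True)
composition = h / row_sums

# (c) Contribution: rescale w so the product w @ h is unchanged, then
# express each source as a share of the total in every sample.
contribution = w * row_sums.T
share = contribution / contribution.sum(axis=1, keepdims=True)

rel_error = (np.linalg.norm(data - contribution @ composition)
             / np.linalg.norm(data))
print("estimated number of sources:", n_sources)
print("estimated compositions:\n", composition.round(2))
print("relative reconstruction error:", rel_error)
```

The recovered compositions may differ from the fingerprints used to build the data, since several non-negative factorizations can fit the same table about equally well. That is one reason results of this kind call for careful review against what is known about the site.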