An improved method for assessing the degree of geochemical similarity (DOGS2) between samples from multi-element geochemical datasets

P. de Caritat, A. Mann
2018 Geochemistry: Exploration, Environment, Analysis  
The multi-element aqua regia National Geochemical Survey of Australia (NGSA) database is used to demonstrate an improved method for quantifying the degree of geochemical similarity (DOGS2) between soil samples. The improvements introduced here address issues relating to compositional data (closure, relative scale). After removing the elements with excessive censored (below detection) values, the rank-based Spearman correlation coefficient (r_s) between samples is calculated for the remaining 51 elements. Each element is given equal weight through the rank-based correlation. The r_s values for pairs of samples of known similar origin (e.g. granitoid-derived) are significantly positive, whereas they are significantly negative for pairs of samples of known dissimilar origin (e.g. granitoid- v. greenstone-derived). Maps of r_s for all samples in the database against various reference samples are used to obtain correlation maps for lithological derivations. Likewise, the distribution of soils having a geochemical fingerprint similar to established mineralized provinces can be mapped, providing a simple, first order mineral prospectivity tool. Sensitivity of results to the removal of up to a dozen elements from the correlation indicates the method to be extremely robust. The new method is compliant with contemporary compositional data analysis principles and is applicable to various digestion methods.

Multi-element databases, often containing in excess of 50 elements, are a common product of modern soil geochemistry programs, primarily due to the quality and quantity of data produced by modern instrumentation, in particular inductively coupled plasma-mass spectrometry (ICP-MS). Presentation of informative statistical analysis derived from such datasets can be challenging and is commonly limited to one or two elements of interest, either because they are sought-after commodities or pathfinders in an exploration program or are potential contaminants in an environmental impact assessment. The alternative approach is to consider a geochemical composition as a multi-dimensional 'whole' and treat the data in a multivariate way. Correlation analysis can thus be used to define elements with common geochemical behaviour. Recently, principal component analysis (PCA) has been used to objectively 'discover' suites of elements with common characteristics (e.g. Caritat & Grunsky 2013; Zhang et al.
2014a), which can then guide follow-up interpretation. Grunsky (2010) provided a comprehensive discussion of multivariate data analysis techniques, including PCA and cluster analysis methods. The systematic and objective examination of databases for pattern recognition or inference of lithology and geology commonly requires advanced statistical and/or coding skills. The widely varying concentrations of elements in a multi-element compositional database and their interdependencies present special problems for statistics-based methods of analysis (Aitchison 1986). Determining quantitatively how geological samples are similar or not based on major, trace element or isotopic geochemistry has applications in many fields. These include: sediment provenance, archaeology, agriculture, environmental investigations, geological
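
The rank-based comparison summarized in the abstract (drop heavily censored elements, then compute r_s between each sample and a reference sample across the remaining elements) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The sample names, element list, numeric values and the 0.5 censoring limit are all invented for illustration, and the values are assumed to be already expressed on a comparable per-element scale, because the preprocessing used in the paper is not described in this excerpt.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman(x, y):
    """Spearman r_s: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

def drop_censored(table, max_censored_fraction):
    """Keep elements whose share of censored (None) values is within the limit."""
    elements = list(next(iter(table.values())))
    n = len(table)
    keep = [e for e in elements
            if sum(row[e] is None for row in table.values()) / n
            <= max_censored_fraction]
    return {name: [row[e] for e in keep] for name, row in table.items()}, keep

def similarity(a, b):
    """r_s over the elements that are uncensored in both samples (an assumption)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    xs, ys = zip(*pairs)
    return spearman(xs, ys)

# Hypothetical per-element scores; None marks a below-detection value.
samples = {
    "ref_granitoid": {"K": 1.2, "Rb": 0.9, "Th": 1.5,
                      "Ni": -1.1, "Cr": -0.8, "Mg": -1.4, "Au": None},
    "soil_A": {"K": 0.8, "Rb": 1.1, "Th": 1.0,
               "Ni": -0.9, "Cr": -1.2, "Mg": -0.7, "Au": None},
    "soil_B": {"K": -1.0, "Rb": -0.6, "Th": -1.3,
               "Ni": 1.4, "Cr": 1.1, "Mg": 0.9, "Au": 0.2},
}

# The 0.5 limit is illustrative; the abstract only says "excessive".
profiles, kept = drop_censored(samples, max_censored_fraction=0.5)

reference = profiles["ref_granitoid"]
scores = {name: similarity(reference, vec)
          for name, vec in profiles.items() if name != "ref_granitoid"}

for name, r_s in scores.items():
    print(f"{name}: r_s = {r_s:+.3f}")
```

In this toy table the hypothetical Au column is censored in two of three samples and is dropped, soil_A scores positively against the reference and soil_B negatively. Plotting such scores at each sample location would give the kind of correlation map the abstract describes.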
doi:10.1144/geochem2018-021 fatcat:7utj6uvfwzda3jarmcomti5azu