Visualization and modeling of sub-populations of compositional data: statistical methods illustrated by means of geochemical data from fumarolic fluids

Vera Pawlowsky-Glahn; Antonella Buccianti

doi:10.1007/s005310100222

Another look at the constant sum problem in geochemistry

Mineralogical Magazine ◽

10.1180/minmag.1992.056.385.03 ◽

1992 ◽

Vol 56 (385) ◽

pp. 469-475 ◽

Cited By ~ 29

Author(s):

H. R. Rollinson

Keyword(s):

Data Analysis ◽

Correlation Analysis ◽

Compositional Data ◽

Statistical Tests ◽

Lava Lake ◽

Geochemical Data ◽

Mathematical Properties ◽

Kīlauea Iki ◽

Log Ratio ◽

Limpopo Belt

Abstract: Compositional data (that is, data where concentrations are expressed as proportions of a whole, such as percentages or parts per million) have a number of peculiar mathematical properties which make standard statistical tests unworkable. In particular, correlation analysis can produce geologically meaningless results. Aitchison (1986) proposed a log-ratio transformation of compositional data which allows inter-element relationships to be investigated. This method was applied to two sets of geochemical data (basalts from Kilauea Iki lava lake and granitic gneisses from the Limpopo Belt) and geologically 'sensible' results were obtained. Geochemists are encouraged to adopt the Aitchison method of data analysis in preference to the traditional but invalid approach which uses compositional data.
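To make the log-ratio idea in this abstract concrete, the following is a minimal sketch of the centred log-ratio (clr) transformation, one of the log-ratio transformations in Aitchison's framework. The abstract does not say which variant the paper applies, so the choice of clr is an assumption, and the function names and sample values are hypothetical illustrations rather than data from the study.

```python
import math

def closure(parts):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centred log-ratio: log of each part divided by the geometric mean."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical four-part composition in wt% (illustrative values only)
sample = [50.0, 15.0, 10.0, 25.0]
scores = clr(sample)

# The same composition expressed as proportions gives identical clr scores,
# because clr depends only on ratios between parts.
scores_closed = clr(closure(sample))
```

The clr scores always sum to zero and do not change when the composition is rescaled (percentages versus proportions), which is why correlations computed on them are not tied to the constant-sum constraint in the way raw percentages are. The transformation requires strictly positive parts, so zeros need separate handling.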


Snow as environmentally low-impact sampling media for mineral exploration - a case study from Northern Finland

10.5194/egusphere-egu21-5174 ◽

2021 ◽

Author(s):

Solveig Pospiech ◽

Anne Taivalkoski ◽

Yann Lahaye ◽

Pertti Sarala ◽

Janne Kinnunen ◽

...

Keyword(s):

Compositional Data ◽

Mineral Exploration ◽

Test Site ◽

Geochemical Data ◽

The European Union ◽

Surface Sampling ◽

Horizon 2020 ◽

The Impact ◽

Log Ratio ◽

Snow Samples

Modern mineral exploration is required to be conducted in a sustainable, environmentally friendly and socially acceptable way. This poses a particular challenge for geochemical exploration in ecologically sensitive areas, because heavy machinery or invasive methods might cause long-lasting damage to nature. One way of reducing the impact of mineral exploration on the environment during the early stages of exploration is to use surface sampling media, such as upper soil horizons, water, plants and, at high latitudes, also snow. Of these options, snow has several advantages: sampling and analysing snow is fast and inexpensive, it has no impact on the environment, and in wintertime it is ubiquitous and available independent of the ecosystem. In the “New Exploration Technologies (NEXT)” project*, snow samples were collected in March-April 2019 to evaluate the use of snow as a sampling material for mineral exploration. The test site was the Rajapalot Au-Co prospect in northern Finland, located 60 km west of Rovaniemi and operated by Mawson Oy. A stratified random sampling strategy was applied to place the sampling stations on the test site. The sampling comprised 94 snow samples and 12 field replicates. The samples were analysed at the GTK Research laboratory using a Nu AttoM single-collector inductively coupled plasma mass spectrometer (SC-ICPMS), which returned analytical results for 52 elements at the ppt level. After quality control was applied to the data, the elements Ba, Ca, Cd, Cr, Cs, Ga, Li, Mg, Rb, Sr, Tl and V showed good quality and were used in the final data analysis. Geochemical data from drill cores were used to train a model to predict bedrock geochemistry based on the 12 available element concentrations of the snow analysis.
Prior to statistical analysis, all geochemical data were transformed to log-ratio scores to ensure that results are independent of the selection of elements and to avoid spurious correlations (compositional data approach). Results show that snow data provide reasonable predictions of bedrock geochemistry for elements such as Ca, Cr, Li and Mg, but also for elements not used in the snow data, such as Mn and Na. This suggests that snow can serve as a lithogeochemical mapping tool for potential geological domains. For the ore-related elements Au, Ag, Co and U, the model provided predictions with higher uncertainty. Yet the pattern of the predicted values of ore-related elements shows that snow can also be used to delineate prospective areas for continuing exploration with more sensitive methods. *) This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 776804.


Multivariate analysis for geochemical process identification using stream sediment geochemical data: A perspective from compositional data

GEOCHEMICAL JOURNAL ◽

10.2343/geochemj.2.0415 ◽

2016 ◽

Vol 50 (4) ◽

pp. 293-314 ◽

Cited By ~ 18

Author(s):

Yue Liu ◽

Qiuming Cheng ◽

Kefa Zhou ◽

Qinglin Xia ◽

Xinqing Wang

Keyword(s):

Multivariate Analysis ◽

Compositional Data ◽

Stream Sediment ◽

Geochemical Data ◽

Geochemical Process ◽

Process Identification


Human Milk Fatty Acid Composition of Allergic and Non-Allergic Mothers: The Ulm SPATZ Health Study

Nutrients ◽

10.3390/nu12061740 ◽

2020 ◽

Vol 12 (6) ◽

pp. 1740

Author(s):

Linda P. Siziba ◽

Leonie Lorenz ◽

Bernd Stahl ◽

Marko Mank ◽

Tamas Marosvölgyi ◽

...

Keyword(s):

Fatty Acid Composition ◽

Fatty Acid ◽

Human Milk ◽

Statistical Methods ◽

Acid Composition ◽

Compositional Data ◽

Health Study ◽

Birth Cohort Study ◽

Milk Fatty Acid ◽

Fatty Acid Data

The aim of this study was to determine the differences in human milk fatty acid composition in relation to maternal allergy within a large birth cohort study using statistical methods accounting for the correlations that exist in compositional data. We observed marginal differences in human milk fatty acid composition of allergic and non-allergic mothers. However, our results do not support the hypothesis that human milk fatty acid composition is influenced by allergy or that it differs between mothers with or without allergy. Observed differences in our results between transformed and untransformed fatty acid data call for re-evaluation of previous, as well as future, studies using statistical methods appropriate for compositionality of fatty acid data.


Application of Data Analytics Techniques to Establish Geometallurgical Relationships to Bond Work Index at the Paracutu Mine, Minas Gerais, Brazil

Minerals ◽

10.3390/min9050302 ◽

2019 ◽

Vol 9 (5) ◽

pp. 302 ◽

Cited By ~ 3

Author(s):

Mahadi Bhuiyan ◽

Kamran Esmaieli ◽

Juan C. Ordóñez-Calderón

Keyword(s):

Compositional Data ◽

Principal Component ◽

Strong Relationship ◽

Mine Planning ◽

Geochemical Data ◽

Circuit Performance ◽

Work Index ◽

Bond Work Index ◽

Log Ratio ◽

Fresh Rock

Analysis of geometallurgical data is essential to building geometallurgical models that capture physical variability in the orebody and can be used for the optimization of mine planning and the prediction of milling circuit performance. However, multivariate complexity and compositional data constraints can make this analysis challenging. This study applies unsupervised and supervised learning to establish relationships between the Bond ball mill work index (BWI) and geomechanical, geophysical and geochemical variables for the Paracatu gold orebody. The regolith and fresh rock geometallurgical domains are established from two cluster sets resulting from K-means clustering of the first three principal component (PC) scores of isometric log-ratio (ilr) coordinates of geochemical data and standardized BWI, geomechanical and geophysical data. The first PC is attributed to weathering and reveals a strong relationship between BWI and rock strength and fracture intensity in the regolith. Random forest (RF) classification of BWI in the fresh rock identifies the greater importance of geochemical ilr balances relative to geomechanical and geophysical variables.
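As an illustration of the isometric log-ratio (ilr) coordinates mentioned in this abstract, here is a minimal sketch that computes ilr coordinates using so-called pivot coordinates. This is only one of many valid orthonormal bases; the abstract does not state which basis or balances the authors used, so the basis choice, the function name `ilr_pivot`, and the sample values are assumptions made for illustration. The subsequent principal component and K-means steps are not shown.

```python
import math

def ilr_pivot(parts):
    """Map a D-part composition to D-1 ilr (pivot) coordinates.

    Coordinate i contrasts part i with the geometric mean of all
    parts that follow it, scaled so the basis is orthonormal.
    """
    logs = [math.log(p) for p in parts]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]
        r = len(rest)
        scale = math.sqrt(r / (r + 1))
        coords.append(scale * (logs[i] - sum(rest) / r))
    return coords

# Hypothetical three-part geochemical composition (illustrative values only)
composition = [0.2, 0.3, 0.5]
coordinates = ilr_pivot(composition)
```

A D-part composition yields D-1 unconstrained real coordinates, so standard multivariate tools such as principal component analysis and K-means clustering can be applied to them without the constant-sum constraint distorting distances.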


APPLICATION OF STATISTICAL METHODS FOR PROCESSING THE RESULTS OF GROUND GEOCHEMICAL SURVEY WITH THE PURPOSE OF OIL AND GAS FORECASTING

Interexpo GEO-Siberia ◽

10.33764/2618-981x-2021-2-1-263-270 ◽

2021 ◽

Vol 2 (1) ◽

pp. 263-270

Author(s):

Rustam I. Timshanov ◽

Sergey A. Sheshukov

Keyword(s):

Statistical Methods ◽

Oil And Gas ◽

Quantitative Description ◽

Statistical Processing ◽

Gas Content ◽

Geochemical Data ◽

High Hydrocarbon ◽

Near Surface ◽

Geological Interpretation ◽

Local Structures

To forecast oil and gas content in one of the local structures of the South Tatar arch (Volzhsko-Kama anteclise), discriminant and neural network analyses, trained on reference wells, were applied when processing the results of geochemical surveys. Comparison with the classical quantitative description of the geochemical field showed that areas of high hydrocarbon concentrations in near-surface sediments largely coincide with the anomalies identified by statistical methods. Based on the integration of the results of statistical processing of geochemical data and their geological interpretation, the structure was characterized as promising.


Geochemistry of Sub-Depositional Environments in Estuarine Sediments: Development of an Approach to Predict Palaeo-Environments from Holocene Cores

Geosciences ◽

10.3390/geosciences12010023 ◽

2022 ◽

Vol 12 (1) ◽

pp. 23

Author(s):

Dahiru D. Muhammed ◽

Naboth Simon ◽

James E. P. Utley ◽

Iris T. E. Verhagen ◽

Robert A. Duller ◽

...

Keyword(s):

Clay Mineral ◽

Depositional Environment ◽

Compositional Data ◽

Classification Tree ◽

Distribution Patterns ◽

Depositional Environments ◽

Estuarine Sediments ◽

Geochemical Data ◽

Mineral Occurrence ◽

Geochemical Classification

In the quest to use modern analogues to understand clay mineral distribution patterns to better predict clay mineral occurrence in ancient and deeply buried sandstones, it has been necessary to define palaeo sub-environments from cores through modern sediment successions. Holocene cores from Ravenglass in the NW of England, United Kingdom, contained metre-thick successions of massive sand that could not be unequivocally interpreted in terms of palaeo sub-environments using conventional descriptive logging facies analysis. We have therefore explored the use of geochemical data from portable X-ray fluorescence analyses, from whole-sediment samples, to develop a tool to uniquely define the palaeo sub-environment based on geochemical data. This work was carried out through mapping and defining sub-depositional environments in the Ravenglass Estuary and collecting 497 surface samples for analysis. Using R statistical software, we produced a classification tree based on surface geochemical data from Ravenglass that can take compositional data for any sediment sample from the core or the surface and define the sub-depositional environment. The classification tree allowed us to geochemically define ten out of eleven of the sub-depositional environments from the Ravenglass Estuary surface sediments. We applied the classification tree to a core drilled through the Holocene succession at Ravenglass, which allowed us to identify the dominant palaeo sub-depositional environments. A texturally featureless (massive) metre-thick succession that had defied interpretation based on core description was successfully related to a palaeo sub-depositional environment using the geochemical classification approach. Calibrated geochemical classification models may prove to be widely applicable to the interpretation of sub-depositional environments from other marginal marine environments and even from ancient and deeply buried estuarine sandstones.


Innovative graphical-numerical methods to investigate compositional changes in groundwater systems

10.5194/egusphere-egu2020-19388 ◽

2020 ◽

Author(s):

Roberta Sauro Graziano ◽

Renguang Zuo ◽

Antonella Buccianti ◽

Orlando Vaselli ◽

Barbara Nisi ◽

...

Keyword(s):

Data Analysis ◽

Numerical Methods ◽

Compositional Data ◽

Spatial Behavior ◽

Cumulative Distribution ◽

Geochemical Data ◽

Compositional Data Analysis ◽

Analysis Theory ◽

Non Linear ◽

Groundwater Systems

Groundwater systems are typical dissipative structures and their evolution can be affected by non-linear dynamics. In this framework, geochemical and hydrological processes are often characterized by random components mixed with intermittency and the presence of positive feedbacks between fluid transport and mineral dissolution. Therefore, in these cases, complex variability structures in the chemical signature of waters are recognized. Large fluctuations in intermittent processes are not rare, as they are in normal and log-normal processes, and contribute significantly to the statistical moments, thus moving the physicochemical data from Euclidean geometry to fractals and multifractals. Since knowledge of the dynamics in water systems has substantial implications for the management of the water resource, groundwater chemistry can be better understood by using innovative graphical and numerical methods in the light of Compositional Data Analysis theory (CoDA, Aitchison, 1986), which is particularly suitable for exploring the whole composition and the relationships between its parts. The whole compositional change, characterizing each sample with respect to some end-members (i.e. rain waters, pristine waters and sea water), is modeled by using the perturbation operator in the simplex geometry (Pawlowsky-Glahn and Buccianti, 2011). Perturbation factors are calculated and then analyzed by investigating their cumulative distribution function (Pr[X ≥ x]) with the aim of registering the presence of power laws (fractal and multifractal dynamics) and forecasting a possible spatial behavior. Results obtained for some aquifers from Tuscany (central Italy) are presented and discussed in the framework of the GEOBASI project (Nisi et al., 2016). Preliminary evaluations indicate that perturbation factors are sensible tools to: 1) identify the different components (random, deterministic, fractal) contributing to the variability of the geochemical data, 2) discriminate the role of additive and multiplicative phenomena in time and/or space, 3) highlight the presence of non-linear dissipation with energy exchanges between different scales.

Aitchison, J., 1986. The statistical analysis of compositional data. Monographs on Statistics and Applied Probability (Reprinted in 2003 by The Blackburn Press), Chapman and Hall, 416 p.
Nisi, B., Buccianti, A., Raco, B., Battaglini, R., 2016. Analysis of complex regional databases and their support in the identification of background/baseline compositional facies in groundwater investigation: developments and application examples. Journal of Geochemical Exploration 164, 3-17.
Pawlowsky-Glahn, V., Buccianti, A., 2011. Compositional Data Analysis: Theory and applications. Chichester, John Wiley & Sons, 378 p.
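The perturbation modelling described in this abstract can be sketched as follows, using the standard definition of perturbation in the simplex (component-wise multiplication followed by closure). The end-member and sample values, and the function names, are hypothetical; the abstract does not give the authors' actual procedure or data, so this is only a sketch of the underlying operation.

```python
def closure(parts):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def perturb(x, p):
    """Perturbation of x by p: component-wise product, then closure."""
    return closure([xi * pi for xi, pi in zip(x, p)])

def perturbation_factors(sample, end_member):
    """Factors that carry end_member onto sample: closed component-wise ratio."""
    return closure([s / e for s, e in zip(sample, end_member)])

# Hypothetical three-part water compositions (illustrative values only)
rain_water = [0.10, 0.30, 0.60]   # assumed end-member
well_water = [0.25, 0.25, 0.50]   # assumed sample

factors = perturbation_factors(well_water, rain_water)

# Applying the factors to the end-member recovers the sample composition.
recovered = perturb(rain_water, factors)
```

The factors play the role that a difference vector plays in ordinary Euclidean space: they describe the whole compositional change from the end-member to the sample in one object, which can then be studied across many samples, for example through its empirical cumulative distribution.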


A review of statistical methods for dietary pattern analysis

Nutrition Journal ◽

10.1186/s12937-021-00692-7 ◽

2021 ◽

Vol 20 (1) ◽

Author(s):

Junkang Zhao ◽

Zhiyao Li ◽

Qian Gao ◽

Haifeng Zhao ◽

Shuting Chen ◽

...

Keyword(s):

Statistical Methods ◽

Dietary Patterns ◽

Dietary Pattern ◽

Pattern Analysis ◽

Compositional Data ◽

Dietary Quality ◽

Future Research ◽

Compositional Data Analysis ◽

Advantages And Disadvantages ◽

Dietary Pattern Analysis

Abstract. Background: Dietary pattern analysis is a promising approach to understanding the complex relationship between diet and health. While many statistical methods exist, the literature predominantly focuses on classical methods such as dietary quality scores, principal component analysis, factor analysis, clustering analysis, and reduced rank regression. There are some emerging methods that have rarely or never been reviewed or discussed adequately. Methods: This paper presents a landscape review of the existing statistical methods used to derive dietary patterns, especially the finite mixture model, treelet transform, data mining, least absolute shrinkage and selection operator and compositional data analysis, in terms of their underlying concepts, advantages and disadvantages, and available software and packages for implementation. Results: While all statistical methods for dietary pattern analysis have unique features and serve distinct purposes, emerging methods warrant more attention. However, future research is needed to evaluate these emerging methods’ performance in terms of reproducibility, validity, and ability to predict different outcomes. Conclusion: Selection of the most appropriate method mainly depends on the research questions. As an evolving subject, there is always scope for deriving dietary patterns through new analytic methodologies.


Geochemical data handling, using multivariate statistical methods for environmental monitoring and pollution studies

Environmental Technology & Innovation ◽

10.1016/j.eti.2020.100645 ◽

2020 ◽

Vol 18 ◽

pp. 100645

Author(s):

Gregory Udie Sikakwe ◽

Arthur Nwachukwu Nwachukwu ◽

Clementina Ukamaka Uwa ◽

God’swill Abam Eyong

Keyword(s):

Environmental Monitoring ◽

Statistical Methods ◽

Geochemical Data ◽

Data Handling ◽

Multivariate Statistical Methods ◽

Multivariate Statistical
