Visualization and modeling of sub-populations of compositional data: statistical methods illustrated by means of geochemical data from fumarolic fluids

2002 ◽  
Vol 91 (2) ◽  
pp. 357-368 ◽  
Author(s):  
Vera Pawlowsky-Glahn ◽  
Antonella Buccianti
1992 ◽  
Vol 56 (385) ◽  
pp. 469-475 ◽  
Author(s):  
H. R. Rollinson

Abstract: Compositional data, that is, data where concentrations are expressed as proportions of a whole, such as percentages or parts per million, have a number of peculiar mathematical properties which make standard statistical tests unworkable. In particular, correlation analysis can produce geologically meaningless results. Aitchison (1986) proposed a log-ratio transformation of compositional data which allows inter-element relationships to be investigated. This method was applied to two sets of geochemical data, basalts from Kilauea Iki lava lake and granitic gneisses from the Limpopo Belt, and geologically 'sensible' results were obtained. Geochemists are encouraged to adopt the Aitchison method of data analysis in preference to the traditional but invalid approach which uses untransformed compositional data.
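
As an illustration of the log-ratio idea described in this abstract, the following is a minimal Python sketch of the centred log-ratio (clr), one of the log-ratio transforms used in compositional data analysis. The abstract does not say which transform the paper applied, so the choice of clr, the function names `closure` and `clr`, and the three-part sample are assumptions made for illustration only.

```python
import math


def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centred log-ratio: log of each part divided by the geometric mean."""
    if any(p <= 0 for p in parts):
        raise ValueError("log-ratios need strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical three-part composition in weight percent (not from the paper).
sample = [50.0, 30.0, 20.0]
coords = clr(sample)
```

Because every part is divided by the geometric mean of the same sample, the result does not depend on whether the data are expressed as percentages, ppm or fractions, and the clr coordinates of one sample always sum to zero.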


2021 ◽  
Author(s):  
Solveig Pospiech ◽  
Anne Taivalkoski ◽  
Yann Lahaye ◽  
Pertti Sarala ◽  
Janne Kinnunen ◽  
...  

Modern mineral exploration is required to be conducted in a sustainable, environmentally friendly and socially acceptable way. For geochemical exploration in ecologically sensitive areas in particular, this poses a challenge because heavy machinery or invasive methods might cause long-lasting damage to nature. One way of reducing the environmental impact of mineral exploration during its early stages is to use surface sampling media, such as upper soil horizons, water, plants and, at high latitudes, also snow. Of these options, snow has several advantages: sampling and analysing snow is fast and inexpensive, it has no impact on the environment, and in wintertime it is ubiquitous and available independent of the ecosystem.
In the “New Exploration Technologies (NEXT)” project*, snow samples were collected in March-April 2019 to evaluate the use of snow as a sampling material for mineral exploration. The test site was the Rajapalot Au-Co prospect in northern Finland, located 60 km west of Rovaniemi and operated by Mawson Oy. A stratified random sampling strategy was applied to place the sampling stations on the test site. The sampling comprised 94 snow samples and 12 field replicates. The samples were analysed at the GTK Research laboratory using a Nu AttoM single collector inductively coupled plasma mass spectrometer (SC-ICPMS), which returned analytical results for 52 elements at the ppt level. After quality control was applied to the data, the elements Ba, Ca, Cd, Cr, Cs, Ga, Li, Mg, Rb, Sr, Tl and V showed good quality and were used in the final data analysis.
Geochemical data of drill cores were used to train a model to predict bedrock geochemistry based on the 12 available element concentrations of snow analysis. Prior to statistical analysis, all geochemical data were transformed to log-ratio scores in order to ensure that results are independent of the selection of elements and to avoid spurious correlations (compositional data approach). Results show that snow data provide reasonable predictions of bedrock geochemistry for elements such as Ca, Cr, Li and Mg, but also for elements not used in snow data, such as Mn and Na. This suggests that snow can serve as a lithogeochemical mapping tool for potential geological domains. For the ore related elements Au, Ag, Co, and U, the model provided predictions with higher uncertainty. Yet the pattern of the predicted values of ore related elements shows that snow can also be used to delineate prospective areas for continuing exploration with more sensitive methods.
*) This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 776804.


Nutrients ◽  
2020 ◽  
Vol 12 (6) ◽  
pp. 1740
Author(s):  
Linda P. Siziba ◽  
Leonie Lorenz ◽  
Bernd Stahl ◽  
Marko Mank ◽  
Tamas Marosvölgyi ◽  
...  

The aim of this study was to determine the differences in human milk fatty acid composition in relation to maternal allergy within a large birth cohort study using statistical methods accounting for the correlations that exist in compositional data. We observed marginal differences in human milk fatty acid composition of allergic and non-allergic mothers. However, our results do not support the hypothesis that human milk fatty acid composition is influenced by allergy or that it differs between mothers with or without allergy. Observed differences in our results between transformed and untransformed fatty acid data call for re-evaluation of previous, as well as future, studies using statistical methods appropriate for compositionality of fatty acid data.


Minerals ◽  
2019 ◽  
Vol 9 (5) ◽  
pp. 302 ◽  
Author(s):  
Mahadi Bhuiyan ◽  
Kamran Esmaieli ◽  
Juan C. Ordóñez-Calderón

Analysis of geometallurgical data is essential to building geometallurgical models that capture physical variability in the orebody and can be used for the optimization of mine planning and the prediction of milling circuit performance. However, multivariate complexity and compositional data constraints can make this analysis challenging. This study applies unsupervised and supervised learning to establish relationships between the Bond ball mill work index (BWI) and geomechanical, geophysical and geochemical variables for the Paracatu gold orebody. The regolith and fresh rock geometallurgical domains are established from two cluster sets resulting from K-means clustering of the first three principal component (PC) scores of isometric log-ratio (ilr) coordinates of geochemical data and standardized BWI, geomechanical and geophysical data. The first PC is attributed to weathering and reveals a strong relationship between BWI and rock strength and fracture intensity in the regolith. Random forest (RF) classification of BWI in the fresh rock identifies the greater importance of geochemical ilr balances relative to geomechanical and geophysical variables.
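
The isometric log-ratio (ilr) coordinates mentioned in this abstract can be sketched as follows. The abstract does not state which ilr basis or balances the authors used, so this minimal Python example assumes one common choice, pivot coordinates; the function name `ilr_pivot` and the sample values are hypothetical.

```python
import math


def ilr_pivot(parts):
    """Map a D-part composition to D-1 ilr (pivot) coordinates.

    Coordinate i contrasts part i with the geometric mean of the parts
    that follow it, scaled so that the basis is orthonormal.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("log-ratios need strictly positive parts")
    logs = [math.log(p) for p in parts]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]
        r = len(rest)
        mean_rest = sum(rest) / r
        coords.append(math.sqrt(r / (r + 1)) * (logs[i] - mean_rest))
    return coords


# Hypothetical three-part composition (not taken from the study).
z = ilr_pivot([0.5, 0.3, 0.2])
```

The resulting coordinates are unconstrained real numbers, so they can be passed to methods that assume ordinary Euclidean geometry, such as the principal component analysis and K-means clustering named in the abstract.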


2021 ◽  
Vol 2 (1) ◽  
pp. 263-270
Author(s):  
Rustam I. Timshanov ◽  
Sergey A. Sheshukov

To forecast oil and gas content on one of the local structures of the South Tatar arch (Volzhsko-Kama anteclise), discriminant and neural network analyses trained on reference wells were applied during processing of the results of geochemical surveys. Comparison with the results of the classical quantitative description of the geochemical field showed that areas of high hydrocarbon concentrations in near-surface sediments mainly coincide with the anomalies identified by statistical methods. Based on the integration of the results of statistical processing of geochemical data and their geological interpretation, the structure was characterized as promising.


Geosciences ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 23
Author(s):  
Dahiru D. Muhammed ◽  
Naboth Simon ◽  
James E. P. Utley ◽  
Iris T. E. Verhagen ◽  
Robert A. Duller ◽  
...  

In the quest to use modern analogues to understand clay mineral distribution patterns to better predict clay mineral occurrence in ancient and deeply buried sandstones, it has been necessary to define palaeo sub-environments from cores through modern sediment successions. Holocene cores from Ravenglass in the NW of England, United Kingdom, contained metre-thick successions of massive sand that could not be unequivocally interpreted in terms of palaeo sub-environments using conventional descriptive logging facies analysis. We have therefore explored the use of geochemical data from portable X-ray fluorescence analyses, from whole-sediment samples, to develop a tool to uniquely define the palaeo sub-environment based on geochemical data. This work was carried out through mapping and defining sub-depositional environments in the Ravenglass Estuary and collecting 497 surface samples for analysis. Using R statistical software, we produced a classification tree based on surface geochemical data from Ravenglass that can take compositional data for any sediment sample from the core or the surface and define the sub-depositional environment. The classification tree allowed us to geochemically define ten of the eleven sub-depositional environments from the Ravenglass Estuary surface sediments. We applied the classification tree to a core drilled through the Holocene succession at Ravenglass, which allowed us to identify the dominant palaeo sub-depositional environments. A texturally featureless (massive) metre-thick succession that had defied interpretation based on core description was successfully related to a palaeo sub-depositional environment using the geochemical classification approach. Calibrated geochemical classification models may prove to be widely applicable to the interpretation of sub-depositional environments from other marginal marine environments and even from ancient and deeply buried estuarine sandstones.


2020 ◽  
Author(s):  
Roberta Sauro Graziano ◽  
Renguang Zuo ◽  
Antonella Buccianti ◽  
Orlando Vaselli ◽  
Barbara Nisi ◽  
...  

Groundwater systems are typical dissipative structures and their evolution can be affected by non-linear dynamics. In this framework, geochemical and hydrological processes are often characterized by random components mixed with intermittency and the presence of positive feedbacks between fluid transport and mineral dissolution. Therefore, in these cases, complex variability structures in the chemical signature of waters are recognized. Large fluctuations in intermittent processes are not rare, as they are in normal and log-normal processes, and contribute significantly to the statistical moments, thus moving the physicochemical data from Euclidean geometry to fractals and multifractals.
Since knowledge of the dynamics in water systems has substantial implications for the management of the water resource, groundwater chemistry can be better understood by using innovative graphical and numerical methods in the light of Compositional Data Analysis theory (CoDA, Aitchison, 1986), which is particularly suitable for exploring the whole composition and the relationships between its parts.
The whole compositional change, characterizing each sample with respect to some end-members (i.e. rain waters, pristine waters and sea water), is modeled by using the perturbation operator in the simplex geometry (Pawlowsky-Glahn and Buccianti, 2011). Perturbation factors are calculated and then analyzed by investigating their cumulative distribution function (Pr[X>=x]) with the aim of registering the presence of power laws (fractal and multifractal dynamics) and forecasting a possible spatial behavior.
Results obtained for some aquifers from Tuscany (central Italy) are presented and discussed in the framework of the GEOBASI project (Nisi et al., 2016). Preliminary evaluations indicate that perturbation factors are sensible tools to: 1) identify the different components (random, deterministic, fractal) contributing to the variability of the geochemical data, 2) discriminate the role of additive and multiplicative phenomena in time and/or space, 3) highlight the presence of non-linear dissipation with the energy exchanges between different scales.
Aitchison, J., 1986. The statistical analysis of compositional data. Monographs on Statistics and Applied Probability (reprinted in 2003 by The Blackburn Press), Chapman and Hall, 416 p.
Nisi, B., Buccianti, A., Raco, B., Battaglini, R., 2016. Analysis of complex regional databases and their support in the identification of background/baseline compositional facies in groundwater investigation: developments and application examples. Journal of Geochemical Exploration 164, 3-17.
Pawlowsky-Glahn, V., Buccianti, A., 2011. Compositional Data Analysis: Theory and applications. Chichester, John Wiley & Sons, 378 p.
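
The perturbation operator and the empirical Pr[X>=x] curve referred to in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`closure`, `perturb`, `perturbation_factors`, `survival`) and the three-part water compositions are hypothetical and are not taken from the GEOBASI data or the authors' workflow.

```python
def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    s = sum(parts)
    return [p / s for p in parts]


def perturb(x, p):
    """Perturbation in the simplex: closed component-wise product."""
    return closure([a * b for a, b in zip(x, p)])


def perturbation_factors(sample, end_member):
    """Factors p such that perturb(end_member, p) equals the closed sample."""
    return closure([s / e for s, e in zip(sample, end_member)])


def survival(values):
    """Empirical Pr[X >= x] evaluated at each distinct observed value."""
    n = len(values)
    return [(x, sum(1 for v in values if v >= x) / n)
            for x in sorted(set(values))]


# Hypothetical three-part compositions (illustrative values only).
end_member = [0.70, 0.20, 0.10]
sample = [0.40, 0.35, 0.25]
factors = perturbation_factors(sample, end_member)
```

Plotting the output of `survival` on log-log axes is one simple way to look for the straight-line behaviour associated with a power law; a sample identical to the end-member gives equal factors, the neutral element of perturbation.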


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Junkang Zhao ◽  
Zhiyao Li ◽  
Qian Gao ◽  
Haifeng Zhao ◽  
Shuting Chen ◽  
...  

Abstract. Background: Dietary pattern analysis is a promising approach to understanding the complex relationship between diet and health. While many statistical methods exist, the literature predominantly focuses on classical methods such as dietary quality scores, principal component analysis, factor analysis, clustering analysis, and reduced rank regression. There are some emerging methods that have rarely or never been reviewed or discussed adequately. Methods: This paper presents a landscape review of the existing statistical methods used to derive dietary patterns, especially the finite mixture model, treelet transform, data mining, least absolute shrinkage and selection operator and compositional data analysis, in terms of their underlying concepts, advantages and disadvantages, and available software and packages for implementation. Results: While all statistical methods for dietary pattern analysis have unique features and serve distinct purposes, emerging methods warrant more attention. However, future research is needed to evaluate these emerging methods’ performance in terms of reproducibility, validity, and ability to predict different outcomes. Conclusion: Selection of the most appropriate method mainly depends on the research questions. As an evolving subject, there is always scope for deriving dietary patterns through new analytic methodologies.

