Automated integration of partially colocated models: Subsurface zonation using a modified fuzzy c-means cluster analysis algorithm

Geophysics ◽  
2010 ◽  
Vol 75 (3) ◽  
pp. P11-P22 ◽  
Author(s):  
Hendrik Paasche ◽  
Jens Tronicke ◽  
Peter Dietrich

Partitioning cluster analyses are powerful tools for rapidly and objectively exploring and characterizing disparate geophysical databases with unknown interrelations between individual data sets or models. Despite their high potential to objectively extract the dominant structural information from suites of disparate geophysical data sets or models, cluster-analysis techniques are underused when analyzing geophysical data or models. This is due to the following limitations of standard partitioning cluster algorithms when applied to geophysical databases: the considered survey or model area must be fully covered by all data sets; cluster algorithms classify data in a multidimensional parameter space while ignoring spatial information present in the databases and are therefore sensitive to high-frequency spatial noise (outliers); and standard cluster algorithms such as fuzzy c-means (FCM) or crisp k-means classify data in an unsupervised manner, potentially ignoring expert knowledge available to the experienced human interpreter. We address all of these issues by considering recent modifications to the standard FCM cluster algorithm that tolerate incomplete databases, i.e., survey or model areas not covered by all available data sets, and that consider the spatial information present in the database. We evaluated the regularized missing-value FCM cluster algorithm in a synthetic study and applied it to a database comprising partially colocated crosshole tomographic P- and S-wave velocity models. Additionally, we demonstrate how further expert knowledge can be incorporated into the cluster analysis to obtain a multiparameter geophysical model that objectively outlines the dominant subsurface units, explaining all available geoscientific information.
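The partial-coverage problem can be illustrated with a small numpy sketch of fuzzy c-means using a partial-distance strategy for missing values. This is one common way to handle incomplete databases, not necessarily the authors' exact regularized formulation; all parameter choices here are assumptions:

```python
import numpy as np

def fcm_missing(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Fuzzy c-means tolerating NaN entries via a partial-distance strategy.

    X: (n_samples, n_features) array; missing values marked as np.nan.
    Returns fuzzy memberships U (n_samples, n_clusters) and centers V.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    mask = ~np.isnan(X)                       # observed entries
    Xf = np.where(mask, X, 0.0)               # zero-fill for safe arithmetic
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m                            # fuzzified memberships (n, c)
        # centers use only the observed entries of each feature
        V = (W.T @ Xf) / (W.T @ mask + 1e-12)             # (c, p)
        # partial squared distance, rescaled by the observed-feature count
        d2 = np.empty((n, n_clusters))
        for i in range(n_clusters):
            diff = (Xf - V[i]) * mask
            d2[:, i] = p * (diff ** 2).sum(axis=1) / mask.sum(axis=1)
        d2 = np.maximum(d2, 1e-12)
        U = 1.0 / (d2 ** (1.0 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)
    return U, V
```

In the paper's setting, the columns of `X` would be the flattened, partially colocated P- and S-wave velocity models, with NaN wherever a model does not cover a cell.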

2021 ◽  
Vol 40 (5) ◽  
pp. 324-334
Author(s):  
Rongxin Huang ◽  
Zhigang Zhang ◽  
Zedong Wu ◽  
Zhiyuan Wei ◽  
Jiawei Mei ◽  
...  

Seismic imaging using full-wavefield data that includes primary reflections, transmitted waves, and their multiples has been the holy grail for generations of geophysicists. Using full-wavefield data effectively requires a forward-modeling process to generate full-wavefield data, an inversion scheme to minimize the difference between modeled and recorded data, and, more importantly, an accurate velocity model to correctly propagate and collapse energy of different wave modes. All of these elements have been embedded in the framework of full-waveform inversion (FWI) since it was proposed three decades ago. However, for a long time, the application of FWI did not find its way into the domain of full-wavefield imaging, mostly owing to the lack of data sets with good constraints to ensure the convergence of inversion, the required compute power to handle large data sets and extend the inversion frequency to the bandwidth needed for imaging, and, most significantly, stable FWI algorithms that could work with different data types in different geologic settings. Recently, with the advancement of high-performance computing and progress in FWI algorithms at tackling issues such as cycle skipping and amplitude mismatch, FWI has found success using different data types in a variety of geologic settings, providing some of the most accurate velocity models for generating significantly improved migration images. Here, we take a step further to modify the FWI workflow to output the subsurface image or reflectivity directly, potentially eliminating the need to go through the time-consuming conventional seismic imaging process that involves preprocessing, velocity model building, and migration.
Compared with a conventional migration image, the reflectivity image directly output from FWI often provides additional structural information with better illumination and higher signal-to-noise ratio naturally as a result of many iterations of least-squares fitting of the full-wavefield data.
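The generic loop that FWI builds on — forward model, data misfit, gradient-based model update — can be sketched with a toy linear forward operator. This is only a schematic of the inversion scheme the abstract describes: real FWI replaces the matrix with a wave-equation solve and computes gradients with the adjoint-state method.

```python
import numpy as np

def invert(G, d_obs, n_iter=500, step=None):
    """Minimize 0.5 * ||G m - d_obs||^2 by steepest descent."""
    m = np.zeros(G.shape[1])
    if step is None:
        step = 1.0 / np.linalg.norm(G.T @ G, 2)   # stable step from spectral norm
    for _ in range(n_iter):
        residual = G @ m - d_obs                  # modeled minus recorded data
        m -= step * (G.T @ residual)              # gradient of the misfit
    return m

# Toy "forward model": three rays crossing two cells, path lengths in km,
# so G @ slowness gives traveltimes (an illustrative stand-in, not FWI).
G = np.array([[1.0, 0.0],
              [0.0, 1.5],
              [1.0, 1.5]])
true_slowness = np.array([0.5, 0.25])             # s/km
d_obs = G @ true_slowness                         # noise-free "recorded" data
m_est = invert(G, d_obs)
```

With clean data and a well-conditioned operator, the descent loop recovers the true model; the convergence difficulties the abstract mentions arise precisely because the real forward map is nonlinear and oscillatory (cycle skipping).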


2021 ◽  
Author(s):  
Xingang Jia ◽  
Qiuhong Han ◽  
Zuhong Lu

Abstract Background: Phages are the most abundant biological entities, but commonly used clustering techniques struggle to separate them from other virus families and tend to group different phage families together. Results: This work uses GI-clusters to separate phages from other virus families and to classify the different phage families. GI-clusters are constructed from GI-features; GI-features are built from F-features together with training data and the MG-Euclidean and Icc-cluster algorithms. F-features are the frequencies of multiple nucleotides generated from virus genomes; the MG-Euclidean algorithm places nearest neighbors in the same mini-groups, whereas the Icc-cluster algorithm assigns distant samples to different mini-clusters. Viruses whose GI-features have their maximum element in the same location are assigned to the same GI-cluster; the families of viruses in the test data are identified by their GI-clusters, and the family of each GI-cluster is defined by the viruses of the training data. Conclusions: From the analysis of four data sets constructed from viruses of different families, we demonstrate that GI-clusters separate phages from other virus families, correctly classify the different phage families, and also correctly predict the families of unknown phages.
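Assuming "frequencies of multiple nucleotides" refers to k-mer frequencies (a standard genome representation; the paper's exact F-feature construction may differ), the feature-extraction step can be sketched as:

```python
from collections import Counter
from itertools import product

def kmer_frequencies(genome, k=3):
    """Frequency vector over all 4**k nucleotide k-mers of a genome string.

    Maps a variable-length genome to a fixed-length feature vector,
    the role the F-features play in the clustering pipeline.
    """
    genome = genome.upper()
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = max(sum(counts.values()), 1)          # avoid division by zero
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[km] / total for km in kmers]
```

Each virus genome then becomes one row of a numeric matrix on which distance-based algorithms such as the MG-Euclidean and Icc-cluster steps can operate.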


Geophysics ◽  
2014 ◽  
Vol 79 (3) ◽  
pp. ID1-ID13 ◽  
Author(s):  
Evan Schankee Um ◽  
Michael Commer ◽  
Gregory A. Newman

Offshore seismic and electromagnetic (EM) imaging for hydrocarbons can require up to tens of millions of parameters to describe the 3D distribution of complex seabed geology and relevant geophysical attributes. The imaging and data volumes for such problems are enormous. Descent-based methods are the only viable imaging approach, but it is often challenging to manage the convergence of stand-alone seismic and EM inversion experiments. When a joint seismic-EM inversion is implemented, convergence problems with descent-based methods are further aggravated. Moreover, resolution mismatches between seismic and EM pose another challenge for joint inversion. To overcome these problems, we evaluated a coupled seismic-EM inversion workflow and applied it to a set of full-wave seismic, magnetotelluric (MT), and controlled-source electromagnetic (CSEM) data for subsalt imaging. In our workflow, we address disparate resolution properties between seismic and EM data by implementing the seismic inversion in the Laplace domain, where the wave equation is transformed into a diffusion equation. The resolution of seismic data thus becomes comparable to that of EM data. To mitigate the convergence problems, the full joint seismic-EM inverse problem is split into manageable components: separate seismic and EM inversions and an intermediate step that enforces structural coupling through a cross-gradient-only inversion and resistivity-velocity crossplots. In this workflow, stand-alone seismic and MT inversions are performed first. The cross-gradient-only inversion and the crossplots are used to precondition the resistivity and velocity models for subsequent stand-alone inversions. By repeating the sequence of the stand-alone seismic, MT, and cross-gradient-only inversions along with the crossplots, we introduce the seismic structural information into the resistivity model, and vice versa, significantly improving the salt geometry in both resistivity and velocity images.
We conclude that the improved salt geometry can then be used to precondition a starting model for CSEM inversions, yielding significant improvement in the resistivity images of hydrocarbon reservoirs adjacent to the salt.
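The structural coupling term used in such workflows, the cross-gradient function, is straightforward to evaluate on a 2D grid. This is a generic numpy sketch of the standard definition, not the authors' implementation; grid spacings are assumed inputs:

```python
import numpy as np

def cross_gradient(m1, m2, dx=1.0, dz=1.0):
    """Scalar 2D cross-gradient t = (dm1/dx)(dm2/dz) - (dm1/dz)(dm2/dx).

    t vanishes wherever the spatial gradients of the two models are
    parallel (structurally consistent); a cross-gradient-only inversion
    updates the models to drive t toward zero everywhere.
    """
    d1z, d1x = np.gradient(m1, dz, dx)   # axis 0 = z, axis 1 = x
    d2z, d2x = np.gradient(m2, dz, dx)
    return d1x * d2z - d1z * d2x
```

For example, a velocity and a resistivity model sharing the same interface geometry give t ≈ 0 even if their parameter values differ, whereas models with crossing structures produce a nonzero coupling term that the joint scheme penalizes.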


Geophysics ◽  
2012 ◽  
Vol 77 (4) ◽  
pp. B167-B176 ◽  
Author(s):  
Detlef G. Eberle ◽  
Hendrik Paasche

Partitioning cluster algorithms have proven to be powerful tools for data-driven integration of large geoscientific databases. We used fuzzy Gustafson-Kessel cluster analysis to integrate Landsat imagery, airborne radiometric, and regional geochemical data to aid in the interpretation of a multimethod database. The survey area extends over [Formula: see text] and is located in the Northern Cape Province, South Africa. We carefully selected five variables for cluster analysis to avoid the clustering results being dominated by spatially highly correlated data sets that were present in our database. Unlike other, more popular cluster algorithms, such as k-means or fuzzy c-means, the Gustafson-Kessel algorithm requires no preclustering data processing, such as scaling or adjustment of histographic data distributions. The outcome of cluster analysis was a classified map that delineates prominent near-to-surface structures. To add value to the classified map, we compared the detected structures to mapped geology and additional geophysical ground-truthing data. We were able to associate the structures detected by cluster analysis with geophysical and geological information, thus obtaining a pseudolithology map. The latter outlined an area with increased mineral potential where manganese mineralization, i.e., psilomelane, had been located.
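A minimal sketch of the Gustafson-Kessel iteration shows why no pre-scaling is needed: each cluster carries its own Mahalanobis-type norm derived from a fuzzy covariance matrix, so differently scaled variables are handled automatically. This is an illustrative implementation with assumed parameter choices, not the authors' code:

```python
import numpy as np

def gustafson_kessel(X, n_clusters, m=2.0, n_iter=50, seed=0, reg=1e-6):
    """Minimal Gustafson-Kessel fuzzy clustering sketch."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m                                       # fuzzified memberships
        V = (W.T @ X) / W.sum(axis=0)[:, None]           # cluster centers
        d2 = np.empty((n, n_clusters))
        for i in range(n_clusters):
            diff = X - V[i]
            # fuzzy covariance of cluster i, regularized to stay invertible
            F = (W[:, i, None] * diff).T @ diff / W[:, i].sum()
            F += reg * np.eye(p)
            # volume-normalized inverse covariance defines the cluster norm
            A = (np.linalg.det(F) ** (1.0 / p)) * np.linalg.inv(F)
            d2[:, i] = np.einsum("nj,jk,nk->n", diff, A, diff)
        d2 = np.maximum(d2, 1e-12)
        U = 1.0 / (d2 ** (1.0 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)
    return U, V
```

Because the norm-inducing matrix `A` is rebuilt from each cluster's own covariance every iteration, a variable measured in counts per second and one measured in ppm can enter the same analysis without prior normalization.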


2017 ◽  
Vol 5 (1) ◽  
pp. T121-T130 ◽  
Author(s):  
Jens Tronicke ◽  
Hendrik Paasche

Near-surface geophysical techniques are extensively used in a variety of engineering, environmental, geologic, and hydrologic applications. While many of these applications ask for detailed, quantitative models of selected material properties, geophysical data are increasingly used to estimate such target properties. Typically, this estimation procedure relies on a two-step workflow including (1) the inversion of geophysical data and (2) the petrophysical translation of the inverted parameter models into the target properties. Standard deterministic implementations of such a quantitative interpretation result in a single best-estimate model, often without considering and propagating the uncertainties related to the two steps. We address this problem by using a rather novel, particle-swarm-based global joint strategy for data inversion and by implementing Monte Carlo procedures for petrophysical property estimation. We apply our proposed workflow to crosshole ground-penetrating radar, P-, and S-wave data sets collected at a well-constrained test site for a detailed geotechnical characterization of unconsolidated sands. For joint traveltime inversion, the chosen global approach results in ensembles of acceptable velocity models, which are analyzed to appraise inversion-related uncertainties. Subsequently, the entire ensembles of inverted velocity models are considered to estimate selected petrophysical properties including porosity, bulk density, and elastic moduli via well-established petrophysical relations implemented in a Monte Carlo framework. Our results illustrate the potential benefit of such an advanced interpretation strategy; i.e., the proposed workflow allows us to study how uncertainties propagate into the finally estimated property models and, concurrently, to assess the impact of uncertainties in the petrophysical relations used (e.g., the influence of uncertain, user-specified parameters).
We conclude that such statistical approaches for the quantitative interpretation of geophysical data can be easily extended and adapted to other applications and geophysical methods and might be an important step toward increasing the popularity and acceptance of geophysical tools in engineering practice.
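The Monte Carlo propagation step can be sketched with an assumed petrophysical relation. Here we use the Wyllie time-average equation with made-up parameter ranges and a synthetic stand-in for the inversion ensemble; the paper's actual relations, sites, and values may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, n_cells, n_draws = 50, 100, 200

# Stand-in for the ensemble of acceptable P-wave velocity models that a
# global (e.g., particle-swarm) inversion would deliver, in m/s.
v_ensemble = rng.normal(2700.0, 100.0, (n_models, n_cells))

def wyllie_porosity(v, v_matrix, v_fluid):
    """Porosity from the Wyllie time-average equation (assumed relation)."""
    return (1.0 / v - 1.0 / v_matrix) / (1.0 / v_fluid - 1.0 / v_matrix)

# Uncertain, user-specified petrophysical parameters (assumed ranges).
v_matrix = rng.normal(5500.0, 200.0, n_draws)   # matrix velocity, m/s
v_fluid = rng.normal(1500.0, 30.0, n_draws)     # pore-fluid velocity, m/s

# Combine every ensemble member with every parameter draw, so both the
# inversion-related and the petrophysical uncertainty reach the estimate.
phi = wyllie_porosity(v_ensemble[:, :, None],
                      v_matrix[None, None, :],
                      v_fluid[None, None, :])

phi_mean = phi.mean(axis=(0, 2))   # per-cell porosity estimate
phi_std = phi.std(axis=(0, 2))     # per-cell uncertainty
```

The result is a porosity distribution per model cell rather than a single best estimate, which is exactly what distinguishes this workflow from a deterministic two-step interpretation.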


Geophysics ◽  
2007 ◽  
Vol 72 (3) ◽  
pp. A35-A39 ◽  
Author(s):  
Hendrik Paasche ◽  
Jens Tronicke

In many near-surface geophysical applications, it is now common practice to use multiple geophysical methods to explore subsurface structures and parameters. Such multimethod-based exploration strategies can significantly reduce uncertainties and ambiguities in geophysical data analysis and interpretation. We propose a novel 2D approach based on fuzzy c-means cluster analysis for the cooperative inversion of disparate data sets. We show that this approach results in a single zonal model of subsurface structures in which each zone is characterized by a set of different parameters. This finding implies that no further structural interpretation of geophysical parameter fields is needed, which is a major advantage compared with conventional inversions that rely on a single input data set and with cooperative inversion approaches.


Author(s):  
Thomas W. Shattuck ◽  
James R. Anderson ◽  
Neil W. Tindale ◽  
Peter R. Buseck

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive X-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multivariate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific Ocean in the summer of 1990. The mid-Pacific aerosol provides information on long-range particle transport, iron deposition, sea salt ageing, and halogen chemistry.

Aerosol particle data sets suffer from a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and in the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero because of finite detection limits. Many of the clusters show considerable overlap because of natural variability, agglomeration, and chemical reactivity.
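One common way to tame zero-inflated, skewed count data of this kind before PCA or clustering is a log(1 + x) transform followed by standardization. This is an assumed preprocessing choice sketched with PCA via SVD, not the authors' exact workflow:

```python
import numpy as np

def preprocess(counts):
    """log(1 + x) transform plus per-column standardization of EDS counts."""
    X = np.log1p(counts.astype(float))   # compresses skew, keeps zeros at 0
    std = X.std(axis=0)
    std[std == 0] = 1.0                  # leave constant columns untouched
    return (X - X.mean(axis=0)) / std

def pca_scores(X, n_components=2):
    """Principal-component scores via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T
```

Standardization also addresses the disparity in variable ranges the text mentions, so no single abundant element dominates the component directions.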

