New Metrics for Validation of Data-Driven Random Process Models in Uncertainty Quantification

Author(s):  
Hongyi Xu ◽  
Zhen Jiang ◽  
Daniel W. Apley ◽  
Wei Chen

Data-driven random process models have become increasingly important for uncertainty quantification (UQ) in science and engineering applications, owing to their ability to capture both the marginal distributions and the correlations of high-dimensional responses. However, the choice of a random process model is neither unique nor straightforward. To quantitatively validate the accuracy of random process UQ models, new metrics are needed to measure how well they capture the statistical information of high-dimensional data collected from simulations or experimental tests. In this work, two goodness-of-fit (GOF) metrics, namely a statistical moment-based metric (SMM) and an M-margin U-pooling metric (MUPM), are proposed for comparing different stochastic models, accounting for their capabilities of capturing the marginal distributions and the correlations in spatial/temporal domains. The effectiveness of the two proposed metrics is demonstrated by comparing the accuracies of four random process models (Gaussian process (GP), Gaussian copula, Hermite polynomial chaos expansion (PCE), and Karhunen–Loève (K–L) expansion) in multiple numerical examples and an engineering example of stochastic analysis of microstructural material properties. Beyond the new metrics, this paper provides insights into the pros and cons of various data-driven random process models in UQ.
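The abstract does not give the SMM's exact definition, so the following is only a minimal illustrative sketch of the idea behind a moment-based validation metric: compare the first four sample moments of model-generated realizations against those of the data, and prefer the model with the smaller aggregate discrepancy. The aggregation rule and moment choice here are assumptions, not the paper's formula.

```python
import math
import random

def sample_moments(xs):
    """First four sample moments: mean, std, skewness, kurtosis."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    skew = sum(((x - mean) / std) ** 3 for x in xs) / n
    kurt = sum(((x - mean) / std) ** 4 for x in xs) / n
    return [mean, std, skew, kurt]

def moment_metric(data, model_samples):
    """Aggregate absolute moment discrepancy (illustrative stand-in
    for the paper's statistical moment-based metric)."""
    md, mm = sample_moments(data), sample_moments(model_samples)
    return sum(abs(a - b) for a, b in zip(md, mm))

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(5000)]
good_model = [random.gauss(0.0, 1.0) for _ in range(5000)]   # matching model
bad_model = [random.expovariate(1.0) for _ in range(5000)]   # mismatched model
```

A smaller metric value indicates a model whose realizations better reproduce the marginal statistics of the data; here the Gaussian surrogate scores far better than the exponential one.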

Author(s):  
Mouhib Alnoukari ◽  
Asim El Sheikh

The Knowledge Discovery (KD) process model was first discussed in 1989. Different models have been suggested since, starting with the process model of Fayyad et al. (1996). The common factor of all data-driven discovery processes is that knowledge is the final outcome. In this chapter, the authors analyze most of the KD process models suggested in the literature, with a detailed discussion of those that have innovative life-cycle steps, and propose a categorization of the existing KD models. The chapter closely analyzes the strengths and weaknesses of the leading KD process models, together with their supporting commercial systems, reported applications, and matrix characteristics.


2012 ◽  
Vol 2012 ◽  
pp. 1-21 ◽  
Author(s):  
Shen Yin ◽  
Xuebo Yang ◽  
Hamid Reza Karimi

This paper presents an approach for the data-driven design of fault diagnosis systems. The proposed fault diagnosis scheme consists of an adaptive residual generator and a bank of isolation observers, whose parameters are identified directly from the process data without identifying a complete process model. To deal with normal variations in the process, the parameters of the residual generator are updated online by a standard adaptive technique to achieve reliable fault detection performance. After a fault is successfully detected, the isolation scheme is activated, in which each isolation observer serves as an indicator of the occurrence of a particular type of fault in the process. The thresholds can be determined analytically or by estimating the probability density function of the related variables. To illustrate the performance of the proposed fault diagnosis approach, a laboratory-scale three-tank system is used. The results show that the proposed data-driven scheme is effective for applications whose analytical process models are unavailable. In particular, for large-scale plants, whose physical models are generally difficult to establish, the proposed approach may offer an effective alternative for process monitoring.
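The residual-based detection logic can be conveyed with a minimal sketch: compare measurements against a nominal prediction and declare a fault when the residual leaves a threshold band. The k-sigma band estimated from a fault-free window is an assumed simplification; the paper's thresholds come from analytical design or density estimation, and its residual generator is itself identified from data.

```python
import math

def detect_fault(measurements, predictions, window=50, k=3.0):
    """Return the first sample index at which the residual leaves a
    k-sigma band estimated from an initial fault-free window, or None."""
    residuals = [m - p for m, p in zip(measurements, predictions)]
    base = residuals[:window]
    mean = sum(base) / window
    std = (sum((r - mean) ** 2 for r in base) / window) ** 0.5
    for t, r in enumerate(residuals):
        if abs(r - mean) > k * std:
            return t
    return None

# deterministic toy data: small oscillation, additive fault from t = 120
pred = [0.0] * 200
meas = [0.05 * math.sin(t) for t in range(200)]
for t in range(120, 200):
    meas[t] += 1.0
```

On this toy trace the detector flags the injected step fault at the sample where it first appears, while the fault-free prefix raises no alarm.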


2021 ◽  
Author(s):  
Yanwen Xu ◽  
Pingfeng Wang

Abstract The Gaussian process (GP) model has become one of the most popular surrogate models and exhibits superior performance in many engineering design applications. However, the standard GP model cannot handle high-dimensional applications well. The root of the problem is the GP model's similarity measure, which relies on the Euclidean distance; this distance becomes uninformative in high-dimensional cases, causing accuracy and efficiency issues. Few studies have explored this issue. In this study, we therefore propose an enhanced squared exponential kernel using the Manhattan distance, which is more effective at preserving the meaningfulness of proximity measures and is preferable for the GP model in high-dimensional cases. Experiments show that the proposed approach achieves superior performance on high-dimensional problems. Based on the analysis and experimental results on similarity metrics, this paper provides a guide to choosing the similarity measures that yield the most accurate and efficient results for the Kriging model with respect to different sample sizes and dimension levels.
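A sketch of the two kernel variants makes the distance swap concrete. The exact functional form of the paper's enhanced kernel is not given in the abstract, so the Manhattan variant below (city-block distance substituted into the same exponential form) is an assumption for illustration only.

```python
import math

def se_euclidean(x, y, ls=1.0):
    """Standard squared exponential kernel with Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * ls ** 2))

def se_manhattan(x, y, ls=1.0):
    """Hypothetical variant: the Manhattan (city-block) distance
    substituted into the same exponential form."""
    d = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-d ** 2 / (2.0 * ls ** 2))

# in 1-D the two coincide; in high dimensions they diverge sharply,
# because the L1 distance grows faster than the L2 distance
x = [0.1] * 100
y = [0.2] * 100
```

For the 100-dimensional pair above, the Euclidean kernel value stays moderate while the Manhattan variant decays steeply, illustrating how the choice of distance reshapes the proximity structure the GP sees.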


2021 ◽  
Vol 13 (2) ◽  
pp. 558-570
Author(s):  
Jiajia Wang ◽  
Ryan J. Harrigan ◽  
Frederic P. Schoenberg

Coccidioidomycosis is an infectious disease of humans and other mammals that has seen a recent increase in occurrence in the southwestern United States, particularly in California. A rise in cases and in risk to public health can serve as the impetus to apply newly developed methods that can quickly and accurately predict future caseloads. Recursive and Hawkes point process models with various triggering functions were fit to the data, and their goodness of fit was evaluated and compared. Although the point process models were largely similar in their fit to the data, the recursive point process model offered a slightly superior fit. We explored forecasting the spread of coccidioidomycosis in California from December 2002 to December 2017 using this recursive model; separating the data into training and testing portions, we achieved a root mean squared error of just 3.62 cases per week.
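The fitted triggering functions are not reproduced in the abstract; as a hedged illustration of the model class, the conditional intensity of a Hawkes process with an exponential triggering kernel can be evaluated as below. All parameter values are hypothetical, and the authors' recursive model additionally lets the productivity vary with the conditional intensity.

```python
import math

def hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity of a Hawkes process with exponential
    triggering: lambda(t) = mu + sum_i alpha*beta*exp(-beta*(t - t_i))
    over past events t_i < t. Parameter values are illustrative."""
    return mu + sum(alpha * beta * math.exp(-beta * (t - ti))
                    for ti in events if ti < t)

events = [1.0, 2.5, 2.7]  # toy event times (e.g., reported case times)
```

Before any event the intensity equals the background rate mu; each event adds a decaying excitation, so the intensity shortly after a burst of events exceeds the intensity during a quiet stretch.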


2016 ◽  
Vol 37 (1) ◽  
Author(s):  
Gintautas Jakimauskas ◽  
Marijus Radavičius ◽  
Jurgis Sušinskas

A simple, data-driven, and computationally efficient procedure for testing the independence of high-dimensional random vectors is proposed. The procedure is based on interpreting goodness-of-fit testing as a classification problem, a special sequential partition procedure, elements of sequential testing, resampling, and randomization. Monte Carlo simulations are carried out to assess the performance of the procedure.
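A much simpler relative of the authors' procedure conveys the resampling-and-randomization idea: randomly permute the pairing between the two samples to build a null distribution for a dependence statistic. The correlation statistic and permutation scheme below are standard textbook choices, not the sequential classification procedure the paper proposes.

```python
import random

def corr_stat(xs, ys):
    """Absolute sample correlation between paired samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    dx = sum((x - mx) ** 2 for x in xs) ** 0.5
    dy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(num / (dx * dy))

def permutation_p_value(xs, ys, n_perm=500, seed=0):
    """Randomize the pairing to approximate the null distribution;
    the p-value is the fraction of permuted statistics that are at
    least as large as the observed one."""
    rng = random.Random(seed)
    observed = corr_stat(xs, ys)
    ys = list(ys)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(ys)
        if corr_stat(xs, ys) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(200)]
dependent = [x + 0.5 * rng.gauss(0, 1) for x in xs]     # linked to xs
independent = [rng.gauss(0, 1) for _ in range(200)]     # unrelated to xs
```

The dependent pairing yields a tiny p-value (independence rejected), while the independent pairing does not.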


2011 ◽  
Vol 23 (11) ◽  
pp. 2731-2745 ◽  
Author(s):  
Sridevi V. Sarma ◽  
David P. Nguyen ◽  
Gabriela Czanner ◽  
Sylvia Wirth ◽  
Matthew A. Wilson ◽  
...  

Characterizing neural spiking activity as a function of intrinsic and extrinsic factors is important in neuroscience. Point process models are valuable for capturing such information; however, the process of fully applying these models is not always obvious. A complete model application has four broad steps: specification of the model, estimation of model parameters given observed data, verification of the model using goodness of fit, and characterization of the model using confidence bounds. Of these steps, only the first three have been applied widely in the literature, suggesting the need to dedicate a discussion to how the time-rescaling theorem, in combination with parametric bootstrap sampling, can be generally used to compute confidence bounds of point process models. In our first example, we use a generalized linear model of spiking propensity to demonstrate that confidence bounds derived from bootstrap simulations are consistent with those computed from closed-form analytic solutions. In our second example, we consider an adaptive point process model of hippocampal place field plasticity for which no analytical confidence bounds can be derived. We demonstrate how to simulate bootstrap samples from adaptive point process models, how to use these samples to generate confidence bounds, and how to statistically test the hypothesis that neural representations at two time points are significantly different. These examples have been designed as useful guides for performing scientific inference based on point process models.
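The time-rescaling step the authors build on can be sketched directly: integrate the conditional intensity over each interspike interval; if the model is correct, the rescaled intervals are i.i.d. Exponential(1), and the transformed values 1 - exp(-tau) should be uniform on (0, 1), which a KS distance can check. The homogeneous Poisson example and numerical integration scheme below are illustrative simplifications.

```python
import math
import random

def rescale(spikes, intensity):
    """Time-rescaling: tau_i is the integral of the intensity over the
    i-th interspike interval (midpoint Riemann sum, step ~1 ms)."""
    taus, prev = [], 0.0
    for t in spikes:
        n = max(1, int((t - prev) / 1e-3))
        step = (t - prev) / n
        taus.append(step * sum(intensity(prev + (k + 0.5) * step)
                               for k in range(n)))
        prev = t
    return taus

def ks_uniform(taus):
    """KS distance of z_i = 1 - exp(-tau_i) from Uniform(0, 1)."""
    zs = sorted(1.0 - math.exp(-tau) for tau in taus)
    n = len(zs)
    return max(max(z - k / n, (k + 1) / n - z) for k, z in enumerate(zs))

# homogeneous Poisson spikes at rate 2, so intensity(t) = 2 is correct
rng = random.Random(0)
spikes, t = [], 0.0
for _ in range(500):
    t += rng.expovariate(2.0)
    spikes.append(t)
```

The correct intensity yields a small KS distance, while a deliberately wrong rate inflates it, which is exactly the goodness-of-fit signal the time-rescaling theorem provides; bootstrap samples for confidence bounds are then generated by simulating new spike trains from the fitted model.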


2011 ◽  
Vol 90-93 ◽  
pp. 1503-1510
Author(s):  
Fu Jun Liu ◽  
Yu Hua Zhu ◽  
Xiao Hui Ma

In this paper, a modified random process model of earthquake ground motion, based on the model proposed by JinPing Ou, is presented. All model parameters except the factor S0 are determined by the least squares method using the power spectral densities of 361 earthquake records; a method for determining the parameter S0 is then proposed. The good performance of the proposed model in representing earthquake ground motion on firm ground is demonstrated by comparing it with other random process models.
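Ou's modified spectrum is not reproduced in the abstract; as a hedged reference point, the classical Kanai-Tajimi power spectral density, the usual starting point for such filtered-white-noise ground-motion models, can be written as below. The site parameters chosen are typical textbook values for firm ground, not the paper's fitted values.

```python
def kanai_tajimi_psd(omega, s0=1.0, omega_g=15.0, zeta_g=0.6):
    """Kanai-Tajimi PSD of ground acceleration: omega_g and zeta_g
    characterize the site filter, s0 scales the white-noise intensity.
    (Illustrative parameter values; Ou's model adds further filtering.)"""
    w2, wg2 = omega ** 2, omega_g ** 2
    num = wg2 ** 2 + 4.0 * zeta_g ** 2 * wg2 * w2
    den = (wg2 - w2) ** 2 + 4.0 * zeta_g ** 2 * wg2 * w2
    return s0 * num / den
```

The spectrum equals s0 at zero frequency, is amplified near the site frequency omega_g, and rolls off at high frequencies, which is the shape least squares fitting against recorded power spectral densities must match.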


SPIEL ◽  
2019 ◽  
Vol 4 (1) ◽  
pp. 121-145
Author(s):  
Larissa Leonhard ◽  
Anne Bartsch ◽  
Frank M. Schneider

This article presents an extended dual-process model of entertainment effects on political information processing and engagement. We suggest that entertainment consumption can either be driven by hedonic, escapist motivations that are associated with a superficial mode of information processing, or by eudaimonic, truth-seeking motivations that prompt more elaborate forms of information processing. This framework offers substantial extensions to existing dual-process models of entertainment by conceptualizing the effects of entertainment on active and reflective forms of information seeking, knowledge acquisition and political participation.


2019 ◽  
Vol 2019 ◽  
pp. 1-15 ◽  
Author(s):  
T. Mesbahzadeh ◽  
M. M. Miglietta ◽  
M. Mirakbari ◽  
F. Soleimani Sardoo ◽  
M. Abdolhoseini

Precipitation and temperature are very important climatic parameters, as their changes may affect living conditions. Predicting temporal trends of precipitation and temperature is therefore very useful for societal and urban planning. In this research, in order to study future trends in precipitation and temperature, we applied the scenarios of the fifth assessment report of the IPCC. The results suggest that both parameters will increase in the study area (Iran) in the future. Since the two climatic parameters are interdependent, analyzing them independently would introduce errors in the interpretation of model simulations. Therefore, in this study, copula theory was used for joint modeling of precipitation and temperature under climate change scenarios. With the joint distribution, we can determine the interdependence structure of precipitation and temperature under current and future climate change conditions, which can assist in the risk assessment of extreme hydrological and meteorological events. Based on the results of the goodness-of-fit test, the Frank copula was selected for modeling the recorded and constructed data under the RCP2.6 scenario, and the Gaussian copula was used for joint modeling of the constructed data under the RCP4.5 and RCP8.5 scenarios.
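The Gaussian-copula step used for the RCP4.5/RCP8.5 data can be sketched as follows: draw correlated standard normals and map each through the normal CDF to obtain uniform margins with the desired dependence; the uniforms are then fed through the inverse CDFs of whatever marginal distributions were fitted to precipitation and temperature. The correlation value below is hypothetical, not the study's fitted parameter.

```python
import math
import random

def gaussian_copula_pairs(n, rho, seed=0):
    """Draw n (u, v) pairs with Uniform(0,1) margins coupled by a
    Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # N(0,1) CDF
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((phi(z1), phi(z2)))
    return pairs

pairs = gaussian_copula_pairs(2000, 0.7)
```

Each margin is uniform regardless of rho, while the pairing carries the dependence, which is what lets the copula separate the interdependence structure from the marginal climate distributions.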

