A Novel Just-in-Time Learning Strategy for Soft Sensing with Improved Similarity Measure Based on Mutual Information and PLS

Sensors ◽  
2020 ◽  
Vol 20 (13) ◽  
pp. 3804
Author(s):  
Yueli Song ◽  
Minglun Ren

In modern industrial process control, just-in-time learning (JITL)-based soft sensors have been widely applied. An accurate similarity measure is crucial in JITL-based soft sensor modeling, since it is not only the basis for selecting the nearest-neighbor samples but also determines the sample weights. In recent years, JITL similarity measures have been greatly enriched, including methods based on the Euclidean distance, the weighted Euclidean distance, correlation, etc. However, because input variables influence the output to different degrees, the input-output relationship is complex and nonlinear, the input variables are collinear, and other complicating factors exist, these similarity measures can become inaccurate. In this paper, a new similarity measure is proposed that combines mutual information (MI) and partial least squares (PLS). A two-stage calculation framework, comprising a training stage and a prediction stage, is designed to reduce the online computational burden. In the prediction stage, the local model is built with an improved locally weighted PLS (LWPLS) in which both variables and samples are weighted. Together, these operations constitute a novel JITL modeling strategy named MI-PLS-LWPLS. The effectiveness of MI-PLS-LWPLS is verified, by comparison with related JITL methods, through case studies on both a synthetic Friedman dataset and a real industrial dataset.
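The variable-weighted similarity idea behind such methods can be sketched in a few lines. This is a hedged illustration only: it weights a Euclidean distance by mutual information scores from scikit-learn and selects nearest neighbors for a local model, whereas the paper's actual measure combines MI with PLS and feeds a double-weighted LWPLS; data and parameters here are made up.

```python
# Sketch of a JITL-style similarity measure: input variables are weighted
# by their mutual information with the output before computing distances
# to a query point. Illustrative stand-in, not the paper's MI-PLS measure.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=200)  # only x0 is informative

# Variable weights from mutual information, normalized to sum to 1.
mi = mutual_info_regression(X, y, random_state=0)
w = mi / mi.sum()

def mi_weighted_distances(X, query, w):
    """MI-weighted Euclidean distance from every sample to the query."""
    return np.sqrt(((X - query) ** 2 * w).sum(axis=1))

query = np.array([0.5, 2.0, -2.0])
d = mi_weighted_distances(X, query, w)
neighbors = np.argsort(d)[:20]  # 20 nearest samples for a local model
```

Because the uninformative variables receive near-zero weight, neighbors are effectively selected along the variable that actually drives the output.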

2013 ◽  
Vol 842 ◽  
pp. 649-653 ◽  
Author(s):  
Hong Liang Liu ◽  
Wei Song ◽  
Peng Yu Na ◽  
Ming Li ◽  
Pei Yang

The similarity measure function is one of the most important factors influencing matching precision in computer vision. This paper surveys the application frequency of distance-based similarity measures and related similarity measures, and reports their statistical characteristics. The significance of the measure functions' variable parameters in image matching is demonstrated. In terms of real-time processing, the Manhattan distance measure is the fastest, the Euclidean distance second, and the correlation coefficient the slowest. In terms of robustness to noise pollution, however, the ordering reverses: the correlation coefficient is the most robust, followed by the Manhattan distance, with the Euclidean distance the least robust.
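The three measures compared above can be written out directly. The data below are illustrative, but the scaled-template case shows why the correlation coefficient is robust to global intensity changes while the L1 and L2 distances are not.

```python
# Minimal sketch of the three similarity measures the survey compares for
# template matching: Manhattan (L1), Euclidean (L2), and the correlation
# coefficient. Template values are made up for illustration.
import numpy as np

def manhattan(a, b):
    return np.abs(a - b).sum()

def euclidean(a, b):
    return np.sqrt(((a - b) ** 2).sum())

def correlation(a, b):
    """Pearson correlation coefficient; higher means more similar."""
    a, b = a - a.mean(), b - b.mean()
    return (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum())

template = np.array([1.0, 2.0, 3.0, 4.0])
noisy = template + np.array([0.1, -0.1, 0.1, -0.1])
scaled = 2.0 * template  # brightness change: distances blow up, correlation stays 1
```

The distance measures treat the rescaled patch as very dissimilar even though its structure is unchanged, which is the robustness trade-off the survey quantifies.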


2020 ◽  
Author(s):  
Cameron Hargreaves ◽  
Matthew Dyer ◽  
Michael Gaultois ◽  
Vitaliy Kurlin ◽  
Matthew J Rosseinsky

It is a core problem in any field to reliably tell how close two objects are to being the same, and once this relation has been established we can use this information to precisely quantify potential relationships, both analytically and with machine learning (ML). For inorganic solids, the chemical composition is a fundamental descriptor, which can be represented by assigning the ratio of each element in the material to a vector. These vectors are a convenient mathematical data structure for measuring similarity, but unfortunately, the standard metric (the Euclidean distance) gives little to no variance in the resultant distances between chemically dissimilar compositions. We present the Earth Mover’s Distance (EMD) for inorganic compositions, a well-defined metric which enables the measure of chemical similarity in an explainable fashion. We compute the EMD between two compositions from the ratio of each of the elements and the absolute distance between the elements on the modified Pettifor scale. This simple metric shows clear strength at distinguishing compounds and is efficient to compute in practice. The resultant distances have greater alignment with chemical understanding than the Euclidean distance, which is demonstrated on the binary compositions of the Inorganic Crystal Structure Database (ICSD). The EMD is a reliable numeric measure of chemical similarity that can be incorporated into automated workflows for a range of ML techniques. We have found that with no supervision the use of this metric gives a distinct partitioning of binary compounds into clear trends and families of chemical property, with future applications for nearest neighbor search queries in chemical database retrieval systems and supervised ML techniques.
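Because each composition is a weighted distribution of elements along a one-dimensional scale, the compositional EMD reduces to a 1-D Wasserstein distance. The sketch below uses SciPy's implementation; the scale positions are hypothetical stand-ins, not the real modified Pettifor numbers.

```python
# Hedged sketch of the compositional Earth Mover's Distance: compositions
# are {element: fraction} distributions over positions on a 1-D elemental
# scale. The positions below are illustrative, NOT the modified Pettifor scale.
from scipy.stats import wasserstein_distance

scale = {"Na": 10, "K": 11, "Cl": 94, "Br": 93}  # hypothetical positions

def composition_emd(comp_a, comp_b):
    """EMD between two compositions given as {element: fraction} dicts."""
    pos_a = [scale[el] for el in comp_a]
    pos_b = [scale[el] for el in comp_b]
    return wasserstein_distance(pos_a, pos_b,
                                list(comp_a.values()), list(comp_b.values()))

# Chemically similar pair (NaCl vs. KBr) vs. a dissimilar pair.
d_close = composition_emd({"Na": 0.5, "Cl": 0.5}, {"K": 0.5, "Br": 0.5})
d_far = composition_emd({"Na": 0.5, "Cl": 0.5}, {"Na": 1.0})
```

With neighboring scale positions for Na/K and Cl/Br, the NaCl-KBr distance is small while a chemically unrelated pair is far apart, which is the behavior the Euclidean distance on raw composition vectors fails to capture.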


2020 ◽  
Vol 14 (2) ◽  
pp. 1-26
Author(s):  
Guohui Li ◽  
Qi Chen ◽  
Bolong Zheng ◽  
Nguyen Quoc Viet Hung ◽  
Pan Zhou ◽  
...  

2005 ◽  
Vol 9 (4) ◽  
pp. 330-343 ◽  
Author(s):  
Matthew Mellor ◽  
Michael Brady

2018 ◽  
Vol 2018 ◽  
pp. 1-17 ◽  
Author(s):  
Hyung-Ju Cho

We investigate the k-nearest neighbor (kNN) join in road networks, which determines the k nearest neighbors (NNs) from a dataset S for every object in another dataset R. The kNN join is a primitive operation widely used in data mining applications, but it is expensive because it combines the kNN query with the join operation. Moreover, most existing methods assume the Euclidean distance metric. We instead consider kNN joins in road networks, where the distance between two points is the length of the shortest path connecting them. We propose a shared-execution approach called the group-nested loop (GNL) method, which efficiently evaluates kNN joins in road networks by exploiting grouping and shared execution, and which can be easily implemented using existing kNN query algorithms. Extensive experiments on several real-life roadmaps confirm the superior performance and effectiveness of the proposed method across a wide range of problem settings.
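A toy nested-loop kNN join over a road network can be sketched as follows. The graph, datasets, and helper names are made up, and the GNL method's grouping and shared-execution optimizations are deliberately omitted; the sketch only shows the primitive being optimized, with one shortest-path search per object in R reused for all of S.

```python
# Naive kNN join in a road network: for every object in R, find its k
# nearest objects in S by shortest-path (Dijkstra) distance.
import heapq

graph = {  # node -> [(neighbor, edge_length), ...]; toy road network
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 2.0), ("d", 5.0)],
    "c": [("a", 4.0), ("b", 2.0), ("d", 1.0)],
    "d": [("b", 5.0), ("c", 1.0)],
}

def dijkstra(source):
    """Shortest-path distances from source to every reachable node."""
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def knn_join(R, S, k):
    """Nested-loop kNN join: maps each r in R to its k nearest objects in S."""
    result = {}
    for r in R:
        dist = dijkstra(r)  # one network search shared by all of S
        result[r] = sorted(S, key=lambda s: dist[s])[:k]
    return result

join = knn_join(R=["a", "d"], S=["b", "c", "d"], k=2)
```

Each object in R triggers a full network expansion here; the GNL method's contribution is precisely to group nearby query objects so such expansions are shared rather than repeated.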


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Huaiping Jin ◽  
Jiangang Li ◽  
Meng Wang ◽  
Bin Qian ◽  
Biao Yang ◽  
...  

The lack of online sensors for Mooney viscosity measurement poses significant challenges for efficient monitoring, control, and optimization of the industrial rubber mixing process. To obtain real-time, accurate estimates of Mooney viscosity, a novel soft sensor method, referred to as multimodal perturbation- (MP-) based ensemble just-in-time learning Gaussian process regression (MP-EJITGPR), is proposed by exploiting ensemble JIT learning. The method applies perturbations to the similarity measure and the input variables to generate diversity among the JIT learners. A set of accurate and diverse JIT learners is then built through evolutionary multiobjective optimization that balances the accuracy and diversity objectives explicitly, and all base JIT learners are combined adaptively using a finite mixture mechanism. The proposed method is applied to an industrial rubber mixing process for Mooney viscosity prediction, and the experimental results demonstrate its effectiveness and superiority over traditional soft sensor methods.
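The perturbation-for-diversity idea can be sketched with heavy simplifications: here each base JIT learner uses a randomly perturbed feature subset and its own neighbor selection, and predictions are simply averaged. The paper's method uses GPR base models, evolutionary multiobjective learner selection, and a finite mixture combination; the local linear models, data, and averaging below are illustrative stand-ins.

```python
# Hedged sketch of perturbation-based ensemble just-in-time learning:
# base learners differ in which input variables their similarity measure
# and local model use, and their predictions are combined.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X[:, 0] + 2.0 * X[:, 1] + 0.05 * rng.normal(size=300)

def jit_predict(query, feat_idx, k=30):
    """One base JIT learner: local linear model on a perturbed feature set."""
    d = np.sqrt(((X[:, feat_idx] - query[feat_idx]) ** 2).sum(axis=1))
    nn = np.argsort(d)[:k]  # nearest neighbors in the perturbed subspace
    model = LinearRegression().fit(X[nn][:, feat_idx], y[nn])
    return model.predict(query[feat_idx].reshape(1, -1))[0]

def ensemble_predict(query, n_learners=5):
    """Average the base learners' predictions (stand-in for the mixture)."""
    preds = [jit_predict(query, np.sort(rng.choice(4, size=3, replace=False)))
             for _ in range(n_learners)]
    return float(np.mean(preds))

yhat = ensemble_predict(np.array([0.2, -0.1, 0.3, 0.0]))
```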


2017 ◽  
Vol 2017 ◽  
pp. 1-15 ◽  
Author(s):  
Leandro Juvêncio Moreira ◽  
Leandro A. Silva

The k-nearest neighbor (kNN) rule is one of the most important and simple procedures for the data classification task. It requires only two parameters: the number of neighbors k and a similarity measure. However, the algorithm has weaknesses that hinder its use in real problems. Because it builds no model, classifying an object requires an exhaustive comparison against the entire training dataset. Another weakness is the choice of an optimal k when the analyzed object lies in an overlap region. To mitigate these drawbacks, this work proposes a hybrid algorithm that combines the Self-Organizing Map (SOM) artificial neural network with a classifier whose similarity measure is based on informativeness. Because the SOM performs vector quantization, it is used as a prototype-generation approach to select a reduced training dataset for the nearest-neighbor classifier with an informativeness measure, named iNN. The SOMiNN combination was experimented with extensively, and the results show that the proposed approach achieves notable accuracy on databases where the object classes in the border region are not well defined.
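The prototype-generation idea can be sketched with stand-ins: KMeans replaces the SOM as the vector quantizer and a plain 1-NN classifier replaces the informativeness-based iNN rule; the dataset and prototype counts are illustrative.

```python
# Sketch of prototype-based nearest-neighbor classification: a vector
# quantizer (KMeans standing in for a SOM) reduces the training set to a
# few labeled prototypes, and new objects are labeled by nearest prototype.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

X, y = make_blobs(n_samples=600, centers=3, random_state=0)

# Quantize each class separately so every prototype carries a clean label.
prototypes, labels = [], []
for c in np.unique(y):
    km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X[y == c])
    prototypes.append(km.cluster_centers_)
    labels.extend([c] * 5)
prototypes = np.vstack(prototypes)

# 1-NN now compares against 15 prototypes instead of 600 training samples.
clf = KNeighborsClassifier(n_neighbors=1).fit(prototypes, labels)
acc = clf.score(X, y)
```

The 40x reduction in comparisons is what makes the nearest-neighbor rule practical on large datasets; the SOM additionally preserves topology, which the iNN rule exploits in overlap regions.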


2005 ◽  
Vol 17 (9) ◽  
pp. 1903-1910 ◽  
Author(s):  
Marc M. Van Hulle

We develop the general, multivariate case of the Edgeworth approximation of differential entropy and show that it can be more accurate than the nearest-neighbor method in the multivariate case and that it scales better with sample size. Furthermore, we introduce mutual information estimation as an application.
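The nearest-neighbor estimator the paper benchmarks against can be sketched in one dimension. This is a hedged Kozachenko-Leonenko-style estimate, checked against the closed-form entropy of a standard Gaussian; sample size and seed are arbitrary.

```python
# 1-NN (Kozachenko-Leonenko) differential entropy estimator in 1-D,
# validated on N(0, 1), whose true entropy is 0.5 * log(2*pi*e) nats.
import numpy as np
from scipy.special import digamma

def nn_entropy_1d(x):
    """Kozachenko-Leonenko 1-NN differential entropy estimate (nats)."""
    x = np.sort(np.asarray(x, dtype=float))
    gaps = np.diff(x)
    # Nearest-neighbor distance: min of adjacent gaps (single gap at the edges).
    eps = np.minimum(np.r_[np.inf, gaps], np.r_[gaps, np.inf])
    n = len(x)
    # H ~= psi(N) - psi(1) + log(V_1) + mean(log eps), with V_1 = 2 in 1-D.
    return digamma(n) - digamma(1) + np.log(2.0) + np.mean(np.log(eps))

rng = np.random.default_rng(0)
sample = rng.normal(size=5000)
est = nn_entropy_1d(sample)
true_h = 0.5 * np.log(2.0 * np.pi * np.e)  # ~1.4189 nats
```

The paper's point is that an Edgeworth expansion can beat this estimator in the multivariate case and scales better with sample size, since it avoids neighbor searches entirely.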

