A Novel Just-in-Time Learning Strategy for Soft Sensing with Improved Similarity Measure Based on Mutual Information and PLS

Sensors ◽  
2020 ◽  
Vol 20 (13) ◽  
pp. 3804
Author(s):  
Yueli Song ◽  
Minglun Ren

In modern industrial process control, just-in-time learning (JITL)-based soft sensors have been widely applied. An accurate similarity measure is crucial in JITL-based soft sensor modeling, since it is not only the basis for selecting the nearest-neighbor samples but also determines the sample weights. In recent years, JITL similarity measures have been greatly enriched, including methods based on the Euclidean distance, the weighted Euclidean distance, correlation, etc. However, because input variables influence the output to different degrees, the input-output relationship is complex and nonlinear, the input variables are collinear, and other complicating factors exist, these similarity measures can become inaccurate. In this paper, a new similarity measure is proposed that combines mutual information (MI) and partial least squares (PLS). A two-stage calculation framework, comprising a training stage and a prediction stage, is designed to reduce the online computational burden. In the prediction stage, the local model is built with an improved locally weighted PLS (LWPLS) in which both variables and samples are weighted. Together, these operations constitute a novel JITL modeling strategy named MI-PLS-LWPLS. The effectiveness of MI-PLS-LWPLS is verified, by comparison with related JITL methods, through case studies on both a synthetic Friedman dataset and a real industrial dataset.
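The variable-weighted similarity idea behind such methods can be sketched in a few lines. This is a hedged illustration only: it weights a Euclidean distance by mutual information scores from scikit-learn and selects nearest neighbors for a local model, whereas the paper's actual measure combines MI with PLS and feeds a double-weighted LWPLS; data and parameters here are made up.

```python
# Sketch of a JITL-style similarity measure: input variables are weighted
# by their mutual information with the output before computing distances
# to a query point. Illustrative stand-in, not the paper's MI-PLS measure.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=200)  # only x0 is informative

# Variable weights from mutual information, normalized to sum to 1.
mi = mutual_info_regression(X, y, random_state=0)
w = mi / mi.sum()

def mi_weighted_distances(X, query, w):
    """MI-weighted Euclidean distance from every sample to the query."""
    return np.sqrt(((X - query) ** 2 * w).sum(axis=1))

query = np.array([0.5, 2.0, -2.0])
d = mi_weighted_distances(X, query, w)
neighbors = np.argsort(d)[:20]  # 20 nearest samples for a local model
```

Because the uninformative variables receive near-zero weight, neighbors are effectively selected along the variable that actually drives the output.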

2013 ◽  
Vol 842 ◽  
pp. 649-653 ◽  
Author(s):  
Hong Liang Liu ◽  
Wei Song ◽  
Peng Yu Na ◽  
Ming Li ◽  
Pei Yang

The similarity measure function is one of the most important factors influencing matching precision in computer vision. This paper surveys the application frequency of distance-based similarity measures and related similarity measures, and reports their statistical characteristics. The significance of the measure functions' variable parameters in image matching is demonstrated. In terms of real-time processing, the Manhattan distance measure is the fastest, the Euclidean distance second, and the correlation coefficient the slowest. In terms of robustness to noise pollution, however, the ordering reverses: the correlation coefficient is the most robust, followed by the Manhattan distance, with the Euclidean distance the least robust.
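The three measures compared above can be written out directly. The data below are illustrative, but the scaled-template case shows why the correlation coefficient is robust to global intensity changes while the L1 and L2 distances are not.

```python
# Minimal sketch of the three similarity measures the survey compares for
# template matching: Manhattan (L1), Euclidean (L2), and the correlation
# coefficient. Template values are made up for illustration.
import numpy as np

def manhattan(a, b):
    return np.abs(a - b).sum()

def euclidean(a, b):
    return np.sqrt(((a - b) ** 2).sum())

def correlation(a, b):
    """Pearson correlation coefficient; higher means more similar."""
    a, b = a - a.mean(), b - b.mean()
    return (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum())

template = np.array([1.0, 2.0, 3.0, 4.0])
noisy = template + np.array([0.1, -0.1, 0.1, -0.1])
scaled = 2.0 * template  # brightness change: distances blow up, correlation stays 1
```

The distance measures treat the rescaled patch as very dissimilar even though its structure is unchanged, which is the robustness trade-off the survey quantifies.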


2020 ◽  
Author(s):  
Cameron Hargreaves ◽  
Matthew Dyer ◽  
Michael Gaultois ◽  
Vitaliy Kurlin ◽  
Matthew J Rosseinsky

It is a core problem in any field to reliably tell how close two objects are to being the same, and once this relation has been established we can use this information to precisely quantify potential relationships, both analytically and with machine learning (ML). For inorganic solids, the chemical composition is a fundamental descriptor, which can be represented by assigning the ratio of each element in the material to a vector. These vectors are a convenient mathematical data structure for measuring similarity, but unfortunately, the standard metric (the Euclidean distance) gives little to no variance in the resultant distances between chemically dissimilar compositions. We present the Earth Mover’s Distance (EMD) for inorganic compositions, a well-defined metric which enables the measure of chemical similarity in an explainable fashion. We compute the EMD between two compositions from the ratio of each of the elements and the absolute distance between the elements on the modified Pettifor scale. This simple metric shows clear strength at distinguishing compounds and is efficient to compute in practice. The resultant distances have greater alignment with chemical understanding than the Euclidean distance, which is demonstrated on the binary compositions of the Inorganic Crystal Structure Database (ICSD). The EMD is a reliable numeric measure of chemical similarity that can be incorporated into automated workflows for a range of ML techniques. We have found that with no supervision the use of this metric gives a distinct partitioning of binary compounds into clear trends and families of chemical property, with future applications for nearest neighbor search queries in chemical database retrieval systems and supervised ML techniques.
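Because each composition is a weighted distribution of elements along a one-dimensional scale, the compositional EMD reduces to a 1-D Wasserstein distance. The sketch below uses SciPy's implementation; the scale positions are hypothetical stand-ins, not the real modified Pettifor numbers.

```python
# Hedged sketch of the compositional Earth Mover's Distance: compositions
# are {element: fraction} distributions over positions on a 1-D elemental
# scale. The positions below are illustrative, NOT the modified Pettifor scale.
from scipy.stats import wasserstein_distance

scale = {"Na": 10, "K": 11, "Cl": 94, "Br": 93}  # hypothetical positions

def composition_emd(comp_a, comp_b):
    """EMD between two compositions given as {element: fraction} dicts."""
    pos_a = [scale[el] for el in comp_a]
    pos_b = [scale[el] for el in comp_b]
    return wasserstein_distance(pos_a, pos_b,
                                list(comp_a.values()), list(comp_b.values()))

# Chemically similar pair (NaCl vs. KBr) vs. a dissimilar pair.
d_close = composition_emd({"Na": 0.5, "Cl": 0.5}, {"K": 0.5, "Br": 0.5})
d_far = composition_emd({"Na": 0.5, "Cl": 0.5}, {"Na": 1.0})
```

With neighboring scale positions for Na/K and Cl/Br, the NaCl-KBr distance is small while a chemically unrelated pair is far apart, which is the behavior the Euclidean distance on raw composition vectors fails to capture.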


2020 ◽  
Vol 14 (2) ◽  
pp. 1-26
Author(s):  
Guohui Li ◽  
Qi Chen ◽  
Bolong Zheng ◽  
Nguyen Quoc Viet Hung ◽  
Pan Zhou ◽  
...  

2005 ◽  
Vol 9 (4) ◽  
pp. 330-343 ◽  
Author(s):  
Matthew Mellor ◽  
Michael Brady

2018 ◽  
Vol 2018 ◽  
pp. 1-17 ◽  
Author(s):  
Hyung-Ju Cho

We investigate the k-nearest neighbor (kNN) join in road networks, which determines the k nearest neighbors (NNs) from a dataset S for every object in another dataset R. The kNN join is a primitive operation widely used in data mining applications, but it is expensive because it combines the kNN query with the join operation. Moreover, most existing methods assume the Euclidean distance metric. We instead consider kNN joins in road networks, where the distance between two points is the length of the shortest path connecting them. We propose a shared-execution approach called the group-nested loop (GNL) method, which efficiently evaluates kNN joins in road networks by exploiting grouping and shared execution, and which can be easily implemented using existing kNN query algorithms. Extensive experiments on several real-life roadmaps confirm the superior performance and effectiveness of the proposed method across a wide range of problem settings.
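A toy nested-loop kNN join over a road network can be sketched as follows. The graph, datasets, and helper names are made up, and the GNL method's grouping and shared-execution optimizations are deliberately omitted; the sketch only shows the primitive being optimized, with one shortest-path search per object in R reused for all of S.

```python
# Naive kNN join in a road network: for every object in R, find its k
# nearest objects in S by shortest-path (Dijkstra) distance.
import heapq

graph = {  # node -> [(neighbor, edge_length), ...]; toy road network
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 2.0), ("d", 5.0)],
    "c": [("a", 4.0), ("b", 2.0), ("d", 1.0)],
    "d": [("b", 5.0), ("c", 1.0)],
}

def dijkstra(source):
    """Shortest-path distances from source to every reachable node."""
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def knn_join(R, S, k):
    """Nested-loop kNN join: maps each r in R to its k nearest objects in S."""
    result = {}
    for r in R:
        dist = dijkstra(r)  # one network search shared by all of S
        result[r] = sorted(S, key=lambda s: dist[s])[:k]
    return result

join = knn_join(R=["a", "d"], S=["b", "c", "d"], k=2)
```

Each object in R triggers a full network expansion here; the GNL method's contribution is precisely to group nearby query objects so such expansions are shared rather than repeated.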


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Huaiping Jin ◽  
Jiangang Li ◽  
Meng Wang ◽  
Bin Qian ◽  
Biao Yang ◽  
...  

The lack of online sensors for Mooney viscosity measurement poses significant challenges for efficient monitoring, control, and optimization of the industrial rubber mixing process. To obtain real-time, accurate estimates of Mooney viscosity, a novel soft sensor method, referred to as multimodal perturbation- (MP-) based ensemble just-in-time learning Gaussian process regression (MP-EJITGPR), is proposed by exploiting ensemble JIT learning. The method applies perturbations to the similarity measure and the input variables to generate diversity among the JIT learners. A set of accurate and diverse JIT learners is then built through evolutionary multiobjective optimization that balances the accuracy and diversity objectives explicitly, and all base JIT learners are combined adaptively using a finite mixture mechanism. The proposed method is applied to an industrial rubber mixing process for Mooney viscosity prediction, and the experimental results demonstrate its effectiveness and superiority over traditional soft sensor methods.
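The perturbation-for-diversity idea can be sketched with heavy simplifications: here each base JIT learner uses a randomly perturbed feature subset and its own neighbor selection, and predictions are simply averaged. The paper's method uses GPR base models, evolutionary multiobjective learner selection, and a finite mixture combination; the local linear models, data, and averaging below are illustrative stand-ins.

```python
# Hedged sketch of perturbation-based ensemble just-in-time learning:
# base learners differ in which input variables their similarity measure
# and local model use, and their predictions are combined.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X[:, 0] + 2.0 * X[:, 1] + 0.05 * rng.normal(size=300)

def jit_predict(query, feat_idx, k=30):
    """One base JIT learner: local linear model on a perturbed feature set."""
    d = np.sqrt(((X[:, feat_idx] - query[feat_idx]) ** 2).sum(axis=1))
    nn = np.argsort(d)[:k]  # nearest neighbors in the perturbed subspace
    model = LinearRegression().fit(X[nn][:, feat_idx], y[nn])
    return model.predict(query[feat_idx].reshape(1, -1))[0]

def ensemble_predict(query, n_learners=5):
    """Average the base learners' predictions (stand-in for the mixture)."""
    preds = [jit_predict(query, np.sort(rng.choice(4, size=3, replace=False)))
             for _ in range(n_learners)]
    return float(np.mean(preds))

yhat = ensemble_predict(np.array([0.2, -0.1, 0.3, 0.0]))
```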


2017 ◽  
Vol 2017 ◽  
pp. 1-15 ◽  
Author(s):  
Leandro Juvêncio Moreira ◽  
Leandro A. Silva

The k-nearest neighbor (kNN) rule is one of the most important and simple procedures for the data classification task. It requires only two parameters: the number of neighbors k and a similarity measure. However, the algorithm has weaknesses that hinder its use in real problems. Because it builds no model, classifying an object requires an exhaustive comparison against the entire training dataset. Another weakness is the choice of an optimal k when the analyzed object lies in an overlap region. To mitigate these drawbacks, this work proposes a hybrid algorithm that combines the Self-Organizing Map (SOM) artificial neural network with a classifier whose similarity measure is based on informativeness. Because the SOM performs vector quantization, it is used as a prototype-generation approach to select a reduced training dataset for the nearest-neighbor classifier with an informativeness measure, named iNN. The SOMiNN combination was experimented with extensively, and the results show that the proposed approach achieves notable accuracy on databases where the object classes in the border region are not well defined.
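The prototype-generation idea can be sketched with stand-ins: KMeans replaces the SOM as the vector quantizer and a plain 1-NN classifier replaces the informativeness-based iNN rule; the dataset and prototype counts are illustrative.

```python
# Sketch of prototype-based nearest-neighbor classification: a vector
# quantizer (KMeans standing in for a SOM) reduces the training set to a
# few labeled prototypes, and new objects are labeled by nearest prototype.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

X, y = make_blobs(n_samples=600, centers=3, random_state=0)

# Quantize each class separately so every prototype carries a clean label.
prototypes, labels = [], []
for c in np.unique(y):
    km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X[y == c])
    prototypes.append(km.cluster_centers_)
    labels.extend([c] * 5)
prototypes = np.vstack(prototypes)

# 1-NN now compares against 15 prototypes instead of 600 training samples.
clf = KNeighborsClassifier(n_neighbors=1).fit(prototypes, labels)
acc = clf.score(X, y)
```

The 40x reduction in comparisons is what makes the nearest-neighbor rule practical on large datasets; the SOM additionally preserves topology, which the iNN rule exploits in overlap regions.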


2005 ◽  
Vol 17 (9) ◽  
pp. 1903-1910 ◽  
Author(s):  
Marc M. Van Hulle

We develop the general, multivariate case of the Edgeworth approximation of differential entropy and show that it can be more accurate than the nearest-neighbor method in the multivariate case and that it scales better with sample size. Furthermore, we introduce mutual information estimation as an application.
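The nearest-neighbor estimator the paper benchmarks against can be sketched in one dimension. This is a hedged Kozachenko-Leonenko-style estimate, checked against the closed-form entropy of a standard Gaussian; sample size and seed are arbitrary.

```python
# 1-NN (Kozachenko-Leonenko) differential entropy estimator in 1-D,
# validated on N(0, 1), whose true entropy is 0.5 * log(2*pi*e) nats.
import numpy as np
from scipy.special import digamma

def nn_entropy_1d(x):
    """Kozachenko-Leonenko 1-NN differential entropy estimate (nats)."""
    x = np.sort(np.asarray(x, dtype=float))
    gaps = np.diff(x)
    # Nearest-neighbor distance: min of adjacent gaps (single gap at the edges).
    eps = np.minimum(np.r_[np.inf, gaps], np.r_[gaps, np.inf])
    n = len(x)
    # H ~= psi(N) - psi(1) + log(V_1) + mean(log eps), with V_1 = 2 in 1-D.
    return digamma(n) - digamma(1) + np.log(2.0) + np.mean(np.log(eps))

rng = np.random.default_rng(0)
sample = rng.normal(size=5000)
est = nn_entropy_1d(sample)
true_h = 0.5 * np.log(2.0 * np.pi * np.e)  # ~1.4189 nats
```

The paper's point is that an Edgeworth expansion can beat this estimator in the multivariate case and scales better with sample size, since it avoids neighbor searches entirely.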

