Anomaly Detection in Automotive Industry Using Clustering Methods—A Case Study

Marcio Trindade Guerreiro; Eliana Maria Andriani Guerreiro; Tathiana Mikamura Barchi; Juliana Biluca; Thiago Antonini Alves; Yara de Souza Tadano; Flávio Trojan; Hugo Valadares Siqueira

doi:10.3390/app11219868

Anomaly Detection in Automotive Industry Using Clustering Methods—A Case Study

Applied Sciences ◽

10.3390/app11219868 ◽

2021 ◽

Vol 11 (21) ◽

pp. 9868

Author(s):

Marcio Trindade Guerreiro ◽

Eliana Maria Andriani Guerreiro ◽

Tathiana Mikamura Barchi ◽

Juliana Biluca ◽

Thiago Antonini Alves ◽

...

Keyword(s):

Anomaly Detection ◽

Spatial Clustering ◽

Clustering Algorithms ◽

Physical Characteristics ◽

Borda Count ◽

Total Production ◽

Clustering Methods ◽

Self Organizing Maps ◽

Detection And Identification

In the automotive industry, pricing anomalies may occur among components of different products despite their similar physical characteristics, which raises the company's total production cost. However, detecting such discrepancies is often neglected, because it requires inspecting thousands of parts, whose specifications from the product engineering team often contain inconsistencies. In this investigation, we propose a solution for a real case study. Our strategy uses a set of clustering algorithms to group components by similarity: K-Means, K-Medoids, Fuzzy C-Means (FCM), Hierarchical, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Self-Organizing Maps (SOM), Particle Swarm Optimization (PSO), Genetic Algorithm (GA) and Differential Evolution (DE). We observed that the methods could automatically group parts by the physical characteristics present in the material master data, allowing anomaly detection and identification, which can in turn lead to cost reduction. The computational results indicate that, considering the Borda count method, the Hierarchical approach performed best on 1 of 6 evaluation metrics and placed second on four other indexes. K-Medoids won on most metrics but was only the second-best positioned because of its poor performance on the SI index. In the end, the proposal made it possible to identify mistakes in the specification and pricing of some of the company's items.
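The Borda count aggregation mentioned in the abstract can be illustrated with a minimal sketch. The rankings below are hypothetical values chosen for illustration, not the paper's results, and the function name `borda_count` is an assumption. The sketch shows how a method that wins most metrics can still finish second overall when it ranks last on one metric.

```python
from collections import defaultdict


def borda_count(rankings):
    """Aggregate per-metric rankings (best first) into Borda scores.

    A method at position p in an n-item ranking earns n - 1 - p points.
    Returns (method, score) pairs sorted by descending score.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, method in enumerate(ranking):
            scores[method] += n - 1 - position
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical rankings on three made-up metrics (NOT the paper's data).
example_rankings = [
    ["K-Medoids", "Hierarchical", "K-Means", "DBSCAN"],
    ["K-Medoids", "Hierarchical", "K-Means", "DBSCAN"],
    ["Hierarchical", "K-Means", "DBSCAN", "K-Medoids"],
]
borda_result = borda_count(example_rankings)
print(borda_result)
# [('Hierarchical', 7), ('K-Medoids', 6), ('K-Means', 4), ('DBSCAN', 1)]
```

Here K-Medoids is first on two of the three metrics, yet its last place on the third leaves it one point behind Hierarchical.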

Measuring Functional Urban Shrinkage with Multi-Source Geospatial Big Data: A Case Study of the Beijing-Tianjin-Hebei Megaregion

Remote Sensing ◽

10.3390/rs12162513 ◽

2020 ◽

Vol 12 (16) ◽

pp. 2513 ◽

Cited By ~ 1

Author(s):

Qiwei Ma ◽

Zhaoya Gong ◽

Jing Kang ◽

Ran Tao ◽

Anrong Dang

Keyword(s):

Big Data ◽

Urban Areas ◽

Spatial Clustering ◽

Landscape Fragmentation ◽

Analytical Framework ◽

Urban Land Use ◽

Urban Systems ◽

Landsat 8 ◽

Clustering Methods

Most shrinking cities experience unbalanced deurbanization across their different urban areas. However, traditional ways of measuring urban shrinkage focus on tracking population loss at the city level and cannot capture the spatially heterogeneous shrinking patterns inside a city. Consequently, the spatial mechanisms and patterns of urban shrinkage inside a city remain poorly understood, which hinders the development of accommodation strategies for shrinkage. Smart city initiatives and practices have provided a rich pool of geospatial big data resources and technologies to tackle the complexity of urban systems. Given this context, we propose a new measure for delineating shrinking areas within cities by introducing the concept of functional urban shrinkage, which aims to capture the mismatch between urban built-up areas and the areas where significantly intensive human activities take place. Using a data fusion approach that integrates multi-source geospatial big data and survey data, a general analytical framework is developed to construct functional shrinkage measures. Specifically, Landsat-8 remote sensing images were used to extract urban built-up areas by supervised neural network classification and Geographic Information System tools, while cellular signaling data from China Unicom Inc. was used to depict human activity areas generated by spatial clustering methods. Combining geospatial big data with urban land-use functions obtained from land surveys and Points-of-Interest data, the framework further enables comparison between cities along dimensions characterized by indices of spatial and urban functional characteristics and landscape fragmentation; thus, it can facilitate an in-depth investigation of the fundamental causes and internal mechanisms of urban shrinkage. With a case study of the Beijing-Tianjin-Hebei megaregion using data from various sources collected for the year 2018, we demonstrate the validity of this approach and its potential generalizability to other spatial contexts in facilitating timely and better-informed planning decision support.

ELECTROFACIES CLASSIFICATION OF PONTA GROSSA FORMATION BY MULTI-RESOLUTION GRAPH-BASED CLUSTERING (MRGC) AND SELF-ORGANIZING MAPS (SOM) METHODS

Brazilian Journal of Geophysics ◽

10.22564/rbgf.v38i1.2032 ◽

2020 ◽

Vol 38 (1) ◽

pp. 52

Author(s):

Felipe Vasconcelos dos Passos ◽

Marco Antonio Braga ◽

Thiago Gonçalves Carelli ◽

Josiane Branco Plantz

Keyword(s):

Gamma Ray ◽

Clustering Algorithms ◽

Clustering Methods ◽

Self Organizing Maps ◽

Resolution Graph ◽

Lithology Prediction ◽

Paraná Basin ◽

Graph Based Clustering ◽

Self Organizing

ABSTRACT. In the Ponta Grossa Formation, the Devonian interval of the Paraná Basin, Brazil, sampling restrictions are frequent, and lithological interpretations from gamma ray logs are common. However, no single log can discriminate lithology unambiguously. One way to reduce the uncertainty of these assessments is to perform multivariate analysis of well logs using data clustering methods. Accordingly, this study applies two different clustering algorithms, trained with gamma ray, sonic and resistivity logs, for lithology prediction. Five electrofacies were differentiated and validated by core data. One of the electrofacies identified by the model was not distinguished by macroscopic descriptions. However, the model developed is sufficiently accurate for lithological predictions.

Keywords: geophysical well logging, lithology prediction, rock-log correlation, Paraná Basin.

Self-organizing maps for anomaly detection in fuel consumption. Case study: Illegal fuel storage in Bolivia

2017 IEEE Latin American Conference on Computational Intelligence (LA-CCI) ◽

10.1109/la-cci.2017.8285697 ◽

2017 ◽

Author(s):

Vanessa Gironda Aquize ◽

Eduardo Emery ◽

Fernando Buarque de Lima Neto

Keyword(s):

Anomaly Detection ◽

Fuel Consumption ◽

Self Organizing Maps ◽

Fuel Storage ◽

Self Organizing

Mobile Anomaly Detection Based on Improved Self-Organizing Maps

Mobile Information Systems ◽

10.1155/2017/5674086 ◽

2017 ◽

Vol 2017 ◽

pp. 1-9 ◽

Cited By ~ 3

Author(s):

Chunyong Yin ◽

Sun Zhang ◽

Kwang-jun Kim

Keyword(s):

Anomaly Detection ◽

Mobile Devices ◽

Clustering Algorithms ◽

Recall Rate ◽

Accuracy Rate ◽

Optimal Method ◽

Self Organizing Maps ◽

Different Characteristics ◽

Precision Rate ◽

Self Organizing

Anomaly detection has long been a focus of research, and the development of mobile devices raises new challenges for it. For example, mobile devices stay connected to the Internet and are rarely turned off, even at night. This means they can attack nodes or be attacked at night without users noticing, and their characteristics differ from those of typical Internet behavior. The introduction of data mining has brought major advances in this field. Self-organizing maps (SOM), a well-known clustering algorithm, are sensitive to their initial weight vectors, which makes the clustering result unstable. The optimal method of selecting initial clustering centers is therefore transplanted from K-means to SOM. To evaluate the improved SOM, we compare it with the traditional one on diverse datasets and on the KDD Cup99 dataset. The experimental results show that the improved SOM achieves a higher accuracy rate on the universal datasets and higher recall and precision rates on the KDD Cup99 dataset.
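The idea of seeding SOM weights with a K-means style center selection can be sketched as follows. The abstract does not say which selection method the authors transplant, so the farthest-point (maximin) seeding used here is an assumption, as are the function names, the toy data, and the 1-D map layout.

```python
import math
import random


def maximin_seeds(data, k):
    """Pick k well-separated samples as initial weights (assumed method).

    Starts from the first sample, then repeatedly adds the sample whose
    distance to its nearest already-chosen seed is largest.
    """
    seeds = [data[0]]
    while len(seeds) < k:
        farthest = max(data, key=lambda x: min(math.dist(x, s) for s in seeds))
        seeds.append(farthest)
    return [list(s) for s in seeds]


def train_som_1d(data, weights, epochs=20, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D SOM (a chain of units) starting from the given weights."""
    rng = random.Random(seed)
    weights = [list(w) for w in weights]
    samples = list(data)
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs           # linear decay schedule
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-3)
        rng.shuffle(samples)
        for x in samples:
            # Best-matching unit: the unit whose weights are closest to x.
            bmu = min(range(len(weights)), key=lambda i: math.dist(x, weights[i]))
            for i, w in enumerate(weights):
                # Gaussian neighbourhood along the chain of units.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


# Toy data with three clumps (hypothetical, for illustration only).
toy_data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
            (20, 0), (20, 1), (21, 0)]
initial_weights = maximin_seeds(toy_data, 3)
trained_weights = train_som_1d(toy_data, initial_weights)
```

With this seeding each initial weight vector starts inside a different clump, whereas random initialization can place several units in the same region, which is one plausible source of the instability the abstract describes.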

Unsupervised Anomaly Detectors to Detect Intrusions in the Current Threat Landscape

ACM/IMS Transactions on Data Science ◽

10.1145/3441140 ◽

2021 ◽

Vol 2 (2) ◽

pp. 1-26

Author(s):

Tommaso Zoppi ◽

Andrea Ceccarelli ◽

Tommaso Capecchi ◽

Andrea Bondavalli

Keyword(s):

Intrusion Detection ◽

Anomaly Detection ◽

Binary Classification ◽

Clustering Algorithms ◽

Good Alternative ◽

Experimental Comparison ◽

Support Vector ◽

Self Organizing Maps ◽

Wide Range ◽

The Individual

Anomaly detection aims at identifying unexpected fluctuations in the expected behavior of a given system. It is acknowledged as a reliable answer to the identification of zero-day attacks, to such an extent that several ML algorithms suited to binary classification have been proposed over the years. However, an experimental comparison of a wide pool of unsupervised algorithms for anomaly-based intrusion detection against a comprehensive set of attack datasets has not yet been carried out. To fill this gap, we exercise 17 unsupervised anomaly detection algorithms on 11 attack datasets. The results allow us to elaborate on a wide range of arguments, from the behavior of individual algorithms to the suitability of the datasets for anomaly detection. We conclude that algorithms such as Isolation Forests, One-Class Support Vector Machines, and Self-Organizing Maps are more effective than their counterparts for intrusion detection, while clustering algorithms represent a good alternative due to their low computational complexity. Further, we detail how attacks with unstable, distributed, or non-repeatable behavior, such as Fuzzing, Worms, and Botnets, are more difficult to detect. Finally, we discuss the capabilities of algorithms in detecting anomalies generated by a wide pool of unknown attacks, showing that the achieved metric scores do not vary with respect to identifying single attacks.

Using Self-Organizing Maps for Rural Territorial Typology

Computational Methods for Agricultural Research ◽

10.4018/978-1-61692-871-1.ch007 ◽

2011 ◽

pp. 107-126 ◽

Cited By ~ 3

Author(s):

Marcos Santos da Silva ◽

Edmar Ramos de Siqueira ◽

Olívio Teixeira ◽

Maria Manos ◽

Antônio Monteiro

Keyword(s):

Neural Network ◽

Temporal Analysis ◽

The Self ◽

Self Organizing Map ◽

Clustering Methods ◽

Self Organizing Maps ◽

Policy Makers ◽

Spatial Temporal Analysis ◽

Self Organizing

This work assessed the capacity of the self-organizing map, an unsupervised artificial neural network, to aid the process of territorial design through visualization and clustering methods applied to a multivariate geospatial temporal dataset. The method was applied in the case study of Sergipe's institutional regional partition (Territories of Identity). Results have shown that the proposed method can improve the exploratory spatial-temporal analysis capacity of policy makers that are interested in territorial typology. A new partition for rural planning was elaborated and confirmed the coherence of the Territories of Identity.

Comparative Analysis of Three Methods for HYSPLIT Atmospheric Trajectories Clustering

Atmosphere ◽

10.3390/atmos12060698 ◽

2021 ◽

Vol 12 (6) ◽

pp. 698

Author(s):

Likai Cui ◽

Xiaoquan Song ◽

Guoqiang Zhong

Keyword(s):

Clustering Analysis ◽

Clustering Algorithms ◽

Principal Component ◽

Clustering Methods ◽

Trajectory Data ◽

Hysplit Model ◽

Self Organizing Maps ◽

Evaluation Indexes ◽

Global Data Assimilation System ◽

Environmental Prediction

A common way to analyze potential sources and transmission paths of atmospheric particulate pollutants is to obtain backward trajectories with the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model and then cluster them. Taking Qingdao (N36 E120) as an example, Global Data Assimilation System (GDAS 1°) data for 2015 to 2018, provided by the National Centers for Environmental Prediction (NCEP), are processed through the HYSPLIT model to produce backward 72 h trajectories at 3 arrival heights (10 m, 100 m, 500 m) with a data interval of 6 h (UTC 0:00, 6:00, 12:00, and 18:00 per day). Three common clustering methods for trajectory data, i.e., K-means, Hierarchical clustering (Hier), and Self-organizing maps (SOM), are used to cluster the trajectory data, and the results are compared with those of the HYSPLIT model released by the National Oceanic and Atmospheric Administration (NOAA). Principal Component Analysis (PCA) is used to analyze the original trajectory data. Four internal evaluation indexes, the Davies–Bouldin Index (DBI), Silhouette Coefficient (SC), Calinski–Harabasz Index (CH), and I index, are used to quantitatively evaluate the three clustering algorithms. The results show that the height data carry little information, so only two-dimensional plane data are used for clustering. According to the clustering indexes, SOM and K-means give better results than Hier and the HYSPLIT model. In addition, DBI and the I index can help to select the number of clusters, with DBI preferred for cluster analysis.
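As an illustration of one of the internal indexes named above, the following is a minimal sketch of the Davies–Bouldin Index using only the Python standard library. The function names and the toy points are assumptions for illustration and have no connection to the trajectory data in the study. Lower DBI values indicate more compact, better separated clusters.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of points).

    For each cluster i, take the worst ratio (s_i + s_j) / d_ij over all
    other clusters j, where s is the mean distance of a cluster's points
    to its centroid and d_ij is the distance between centroids. The index
    is the mean of these worst-case ratios.
    """
    centers = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Two toy partitions (hypothetical points): same cluster shapes,
# but the second pair of clusters sits much closer together.
dbi_far = davies_bouldin([[(0, 0), (0, 2)], [(10, 0), (10, 2)]])
dbi_near = davies_bouldin([[(0, 0), (0, 2)], [(4, 0), (4, 2)]])
print(dbi_far, dbi_near)  # approximately 0.2 and 0.5
```

Computing the index for several candidate numbers of clusters and keeping the partition with the lowest value is one way such an index can guide the choice of cluster count.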

Generating IoT traffic: A Case Study on Anomaly Detection

2020 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN) ◽

10.1109/lanman49260.2020.9153235 ◽

2020 ◽

Cited By ~ 1

Author(s):

Hung Nguyen-An ◽

Thomas Silverston ◽

Taku Yamazaki ◽

Takumi Miyoshi

Keyword(s):

Anomaly Detection

Edge Computing for Data Anomaly Detection of Multi-Sensors in Underground Mining

Electronics ◽

10.3390/electronics10030302 ◽

2021 ◽

Vol 10 (3) ◽

pp. 302

Author(s):

Chunde Liu ◽

Xianli Su ◽

Chuanwen Li

Keyword(s):

Energy Consumption ◽

Anomaly Detection ◽

Underground Mining ◽

Heterogeneous Data ◽

Edge Computing ◽

Sensor Nodes ◽

Detection Methods ◽

Detection Accuracy ◽

Clustering Methods ◽

Safety Warning

There is growing interest in safety warning for underground mining because of the serious threats faced by those who work underground. Sensor data acquisition based on the Internet of Things (IoT) is currently the main method, but anomaly detection and analysis of multi-sensor data is a challenging task: firstly, the data collected by different underground mining sensors are heterogeneous; secondly, safety warning requires anomaly detection in real time. Many anomaly detection methods exist, such as the traditional clustering methods K-means and C-means. Meanwhile, Artificial Intelligence (AI) is widely used in data analysis and prediction. However, K-means and C-means cannot directly process heterogeneous data, and AI algorithms require equipment with high computing and storage capabilities. IoT equipment in underground mining cannot perform complex calculations because of its energy consumption limits. Therefore, many existing methods cannot be directly used for IoT applications in underground mining. In this paper, a multi-sensor data anomaly detection method based on edge computing is proposed. Firstly, an edge computing model is designed, and anomaly detection tasks are migrated to different edge devices according to the computing capabilities of the different device types, which solves the problem of insufficient computing capability on the devices. Secondly, edge anomaly detection algorithms for sensor nodes and for sink nodes are designed according to the requirements of the different anomaly detection tasks. Lastly, an experimental platform is built for performance comparison, and the experimental results show that the proposed algorithm performs better in anomaly detection accuracy, delay, and energy consumption.

Meteorological and human mobility data on predicting COVID-19 cases by a novel hybrid decomposition method with anomaly detection analysis: a case study in the capitals of Brazil

Expert Systems with Applications ◽

10.1016/j.eswa.2021.115190 ◽

2021 ◽

pp. 115190

Author(s):

Tiago Tiburcio da Silva ◽

Rodrigo Francisquini ◽

Mariá C. V. Nascimento

Keyword(s):

Anomaly Detection ◽

Decomposition Method ◽

Human Mobility ◽

Mobility Data ◽

Detection Analysis ◽

Hybrid Decomposition
