A data mining strategy for inductive data clustering: a synergy between self-organising neural networks and K-means clustering techniques

Author(s):  
S.S.R. Abidi ◽  
J. Ong
2020 ◽  
Vol 1 (4) ◽  
pp. 1-6
Author(s):  
Arjun Dutta

This paper presents a concise study of clustering: existing methods and the developments made over time. Clustering is a form of unsupervised learning in which objects are grouped on the basis of some similarity inherent among them. In recent times, we deal with large masses of data including images, video, social text, DNA, gene information, etc. Data clustering analysis has emerged as an efficient technique for categorizing information into sensible groups. Clustering is closely associated with research in several scientific fields. The k-means algorithm was suggested in 1957 and remains the most popular partitional clustering method to date. Clustering techniques are used in many commercial and non-commercial fields. The applications of clustering in areas such as image segmentation, object and role recognition, and data mining are highlighted. In this paper, we present a brief description of the existing types of clustering approaches, followed by a survey of these application areas.
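
As an illustration of the partitional approach mentioned in this abstract, the following is a minimal sketch of k-means in the form of Lloyd's algorithm, using only the Python standard library. The function name `kmeans`, its parameters, and the random initialisation from the input points are assumptions made for this example; they are not taken from the paper.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Minimal Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (centroids, clusters), where clusters[i] holds the points
    assigned to centroids[i] in the last assignment step.
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        # (squared Euclidean distance; ties go to the lower index).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters
```

Calling `kmeans([(0, 0), (0, 1), (9, 9), (9, 8)], k=2)` returns two centroids and the points grouped around each. The result depends on the initial centroids, so practical implementations usually run several random restarts and keep the partition with the lowest within-cluster sum of squares.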


2020 ◽  
Vol 4 (3) ◽  
pp. 744
Author(s):  
Murdiaty Murdiaty ◽  
Angela Angela ◽  
Chatrine Sylvia

Indonesia has fertile soil and abundant natural and marine resources. However, Indonesia is not immune to the risk of natural disasters, which are events that disturb and threaten life and safety and cause material and non-material losses. Indonesia's strategic geological location means that it is frequently hit by earthquakes, volcanic eruptions and other natural disasters. According to the data collected, the natural disasters that occurred in Indonesia fall into several categories, namely earthquakes, volcanic eruptions, floods, landslides, tornadoes, and tsunamis. Many natural disasters in Indonesia have caused casualties, both fatalities and injuries, devastated the surrounding area, destroyed infrastructure and caused property losses. The increasing incidence of natural disasters needs to be investigated further to keep the number of victims from rising. Given the large amount of data available, this information can be obtained through a data mining approach. For natural disaster data, clustering techniques are useful for grouping records that share the same characteristics, so that the grouped data can serve as groundwork for predicting future natural disaster events. This research therefore aims to group natural disaster data with the k-means algorithm into several groups, in terms of natural disaster type, time of disaster, number of victims, and damage to various facilities.


2018 ◽  
Vol 7 (2.32) ◽  
pp. 111
Author(s):  
Y Vijay Bhaskhar Reddy ◽  
Dr L.S.S Reddy ◽  
Dr S.S.N. Reddy

Data extraction, data processing, pattern mining and clustering are important tasks in data mining. Extracting data and forming interesting patterns from huge datasets supports prediction and decision making in further analysis. This increases the need for efficient and effective analysis methods that make use of this data. Clustering is one important technique in data mining. In clustering, a set of items is divided into several clusters such that inter-cluster similarity is minimized and intra-cluster similarity is maximized. Clustering techniques are well suited to identifying classes in large databases. However, applying them to large databases raises the following requirements: minimal domain knowledge needed to determine the input parameters, discovery of clusters of arbitrary shape, and good efficiency on large databases. The existing clustering techniques offer no solution to this combination of requirements. The proposed clustering technique, DBSCAN using KNN, relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape.
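
The abstract does not say how KNN enters the proposed method, so the sketch below should not be read as the authors' algorithm. It shows plain DBSCAN together with a k-nearest-neighbour distance helper, which is one common way to choose the radius parameter by inspecting the sorted k-distance values. The names `knn_distance`, `dbscan`, `eps` and `min_pts` are assumptions made for this example.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def knn_distance(points, k):
    """Sorted distances from each point to its k-th nearest neighbour.

    A sharp jump in this sorted list is often used as a hint for eps.
    """
    result = []
    for i, p in enumerate(points):
        others = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return sorted(result)


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, so keep expanding
    return labels
```

Because clusters grow through chains of dense neighbourhoods instead of around a centroid, this procedure can recover elongated or curved groups, and isolated points are reported as noise. The neighbourhood search here is quadratic in the number of points; large databases would need a spatial index.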


2017 ◽  
Vol 3 (2) ◽  
pp. 735-738
Author(s):  
Wolfgang Doneit ◽  
Jana Lohse ◽  
Kristina Glesing ◽  
Clarissa Simon ◽  
Monika Fischer ◽  
...  

In the project I-CARE, a technical system for tablet devices is developed that captures the personal needs and skills of people with dementia. The system provides activation content such as music videos, biographical photographs and quizzes on various topics of interest to people with dementia, their families and professional caregivers. To adapt the system, the activation content is adjusted to the daily condition of individual users. For this purpose, emotions are automatically detected through facial expressions, motion, and voice. The daily interactions of the users with the tablet devices are documented in log files which can be merged into an event list. In this paper, we propose an advanced format for event lists and a data analysis strategy. A transformation scheme is developed in order to obtain datasets with features and time series for popular methods of data mining. The proposed methods are applied to analysing the interactions of people with dementia with the I-CARE tablet device. We show how the new format of event lists and the innovative transformation scheme can be used to compress the stored data, to identify groups of users, and to model changes of user behaviour. As the I-CARE user studies are still ongoing, simulated benchmark log files are applied to illustrate the data mining strategy. We discuss possible solutions to challenges that appear in the context of I-CARE and that are relevant to a broad range of applications.
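
The abstract does not specify the event-list format or the transformation scheme, so the following is only a sketch of the general idea: turning a flat event list into one feature vector per user that a clustering method could consume. The tuple layout `(user, timestamp, event_type)` and the function name `events_to_features` are hypothetical and are not taken from the I-CARE project.

```python
from collections import Counter, defaultdict


def events_to_features(events, event_types):
    """Aggregate an event list into per-user count features.

    events      -- iterable of (user, timestamp, event_type) tuples
                   (hypothetical layout, assumed for this example)
    event_types -- ordered list of event types; fixes the column order
    Returns a dict mapping each user to a list of counts, one per event type.
    """
    counts = defaultdict(Counter)
    for user, _timestamp, event_type in events:
        counts[user][event_type] += 1
    # A Counter returns 0 for event types a user never triggered.
    return {user: [c[t] for t in event_types] for user, c in counts.items()}
```

For example, with the events `[("u1", 1, "music"), ("u1", 2, "quiz"), ("u2", 3, "music")]` and the column order `["music", "quiz"]`, user `u1` is mapped to one music event and one quiz event, and `u2` to a single music event. Grouping by time window instead of by user would yield time series in the same way.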


In data mining, many techniques use distance-based measures for data clustering. Improving clustering performance is a fundamental goal in clustering-related tasks. Many techniques are available for clustering numerical as well as categorical data. Clustering is an unsupervised learning technique in which objects are grouped based on the similarity among them. This paper proposes a new cluster similarity measure, the cosine-like cluster similarity measure (CLCSM). The proposed measure is used for data classification. Extensive experiments are conducted on UCI machine learning datasets. The experimental results show that the proposed cosine-like cluster similarity measure is superior to many existing cluster similarity measures for data classification.
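
The abstract does not define CLCSM, so the sketch below is not that measure. It only shows ordinary cosine similarity applied to the centroids of two clusters, which may help in picturing what a cosine-style comparison between clusters looks like. The function names `centroid` and `cosine_similarity` are assumptions made for this example.

```python
from math import sqrt


def centroid(cluster):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_cosine(cluster_a, cluster_b):
    """Plain cosine similarity between two cluster centroids (not CLCSM)."""
    return cosine_similarity(centroid(cluster_a), centroid(cluster_b))
```

Cosine similarity depends only on the direction of the vectors and not on their length, so two clusters whose centroids are scaled copies of each other score close to 1, while clusters with orthogonal centroids score 0.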

