An EM Clustering Algorithm which Produces a Dual Representation

Author(s):  
Sun Kim ◽  
W. J. Wilbur
2018 ◽  
Vol 2 (1) ◽  
pp. 36-44
Author(s):  
Sitti Sufiah Atirah Rosly ◽  
Balkiah Moktar ◽  
Muhamad Hasbullah Mohd Razali

Air quality is one of the most widely discussed environmental problems in this era of globalization. Air pollution is harmful air produced by car emissions, smog, open burning, chemicals from factories, and other particles and gases. This polluted air can have adverse effects on human health and the environment. To provide information on which areas are better for residents in Malaysia, cluster analysis is used to determine which areas can be clustered together based on their air quality, as measured through several air quality substances. Monthly data from 37 monitoring stations in Peninsular Malaysia from 2013 to 2015 were used in this study. The K-Means (KM), Expectation Maximization (EM), and Density Based (DB) clustering algorithms were chosen as the techniques for the cluster analysis, using the Waikato Environment for Knowledge Analysis (WEKA) tools. Results show that the K-means clustering algorithm is the best method among the algorithms considered because of its simplicity and the time taken to build the model. The output of the K-means clustering algorithm shows that it groups the areas into two clusters, namely cluster 0 and cluster 1. Cluster 0 consists of 16 monitoring stations and cluster 1 consists of 36 monitoring stations in Peninsular Malaysia.
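
To make the contrast between two of the algorithms named in the abstract above concrete, here is a minimal sketch in plain Python, not the WEKA workflow the study used. It runs hard-assignment K-means and a one-dimensional Gaussian-mixture EM on a small set of made-up readings. The data values, the function names `kmeans_1d` and `em_gmm_1d`, and the evenly spaced initialization are all assumptions chosen for illustration.

```python
import math

def kmeans_1d(xs, k=2, iters=50):
    """Hard-assignment k-means on 1-D data (illustrative sketch)."""
    lo, hi = min(xs), max(xs)
    # Assumption: start the centers evenly spaced across the data range.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def assign(cs):
        # Each point is assigned to the index of its nearest center.
        return [min(range(k), key=lambda j: abs(x - cs[j])) for x in xs]

    for _ in range(iters):
        labels = assign(centers)
        # Move each center to the mean of the points assigned to it.
        for j in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, assign(centers)

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, k=2, iters=100):
    """EM for a 1-D Gaussian mixture; returns soft responsibilities."""
    n = len(xs)
    lo, hi = min(xs), max(xs)
    mus = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    mean = sum(xs) / n
    overall_var = sum((x - mean) ** 2 for x in xs) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            p = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances from the
        # responsibilities. The 1e-6 floor guards against zero variance.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)
            weights[j] = nj / n
    return mus, variances, weights, resp

# Hypothetical pollutant readings, not data from the study.
readings = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
km_centers, km_labels = kmeans_1d(readings)
mus, variances, weights, resp = em_gmm_1d(readings)
print("k-means centers:", km_centers, "labels:", km_labels)
print("EM means:", mus, "mixing weights:", weights)
```

K-means returns one label per reading, while EM returns a probability for each cluster per reading. That difference is one reason the two methods can differ in model-building cost, which the abstract cites as a point in favor of K-means.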


2012 ◽  
Vol 45 (11) ◽  
pp. 3950-3961
Author(s):  
Miin-Shen Yang ◽  
Chien-Yo Lai ◽  
Chih-Ying Lin

Atmospheric science focuses on weather processes and forecasting. Numerical and statistical analysis plays an important role in meteorological research. Meteorological data are used to predict changes in climatic patterns with forecasting models and weather forecasting instruments. Data mining techniques offer further scope for discovering future weather patterns by analyzing past weather dimensions. In our study, two techniques, Multiple Linear Regression (MLR) and the Expectation Maximization (EM) clustering algorithm, are combined for rainfall forecasting. MLR identifies the most important rainfall parameters for the clustering algorithm. The EM clustering algorithm finds correctly and incorrectly clustered instances when applied to the selected partitioned attributes. The model was able to forecast low, medium, and high rainfall by analyzing past meteorological observations. Standard deviation is used as a measure of error correction to improve the cluster results. Data normalization helps to improve model performance. These findings are useful for determining future climate expectations.


Author(s):  
Yulia Ledeneva ◽  
René García Hernández ◽  
Romyna Montiel Soto ◽  
Rafael Cruz Reyes ◽  
Alexander Gelbukh
