Improved KNN Algorithm Based on Preprocessing of Center in Smart Cities

Complexity ◽

10.1155/2021/5524388 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Haiyan Wang ◽

Peidi Xu ◽

Jinghua Zhao

Keyword(s):

Machine Learning ◽

Data Mining ◽

Smart Cities ◽

Training Set ◽

Spherical Region ◽

Random Experiment ◽

Region Division ◽

The Stability

The KNN algorithm is one of the best-known algorithms in machine learning and data mining. It does not preprocess the data before classification, which leads to longer classification times and more errors. To solve these problems, this paper first proposes a PK-means++ algorithm, which better ensures the stability of a random experiment. Then, based on PK-means++ and spherical region division, an improved algorithm, KNNPK+, is proposed. The algorithm selects the centers of the spherical regions appropriately and then constructs an initial classifier for the training set to improve classification accuracy and time.
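The general idea of region-based KNN preprocessing can be sketched as follows. This is only an illustrative sketch, not the paper's method: the abstract does not specify PK-means++ or KNNPK+, so standard k-means++ seeding is swapped in to pick region centers, and the helper names (`kmeanspp_centers`, `build_regions`, `classify`) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def kmeanspp_centers(points, k, rng):
    """Pick k centers with standard k-means++ seeding (assumption: not PK-means++)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        if sum(d2) == 0:  # every point coincides with a chosen center
            break
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

def build_regions(points, labels, centers):
    """Assign each training point to its nearest center and track the region radius."""
    regions = [{"center": c, "radius": 0.0, "members": []} for c in centers]
    for p, y in zip(points, labels):
        reg = min(regions, key=lambda r: math.dist(p, r["center"]))
        reg["members"].append((p, y))
        reg["radius"] = max(reg["radius"], math.dist(p, reg["center"]))
    return regions

def classify(query, regions, k=3):
    """Run KNN only inside the non-empty region whose center is nearest to the query."""
    candidates = [r for r in regions if r["members"]]
    region = min(candidates, key=lambda r: math.dist(query, r["center"]))
    nearest = sorted(region["members"], key=lambda m: math.dist(query, m[0]))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

# Hypothetical toy training set with two well-separated classes.
rng = random.Random(0)
train = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
centers = kmeanspp_centers(train, 2, rng)
regions = build_regions(train, labels, centers)
prediction = classify((4.9, 5.0), regions)
print(prediction)
```

Restricting the neighbor search to one region is what saves time; the trade-off is that neighbors lying just across a region boundary are ignored, which is why the choice of centers matters.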

Towards Behaviour Recognition with Unlabelled Sensor Data

Human Behavior Recognition Technologies ◽

10.4018/978-1-4666-3682-8.ch005 ◽

2013 ◽

pp. 86-110

Author(s):

Sook-Ling Chua ◽

Stephen Marsland ◽

Hans W. Guesgen

Keyword(s):

Machine Learning ◽

Data Mining ◽

Inverse Problem ◽

Sensor Data ◽

Training Set ◽

Learning Methods ◽

Machine Learning Methods ◽

Using Data ◽

Symbolic Approach ◽

Behaviour Recognition

The problem of behaviour recognition based on data from sensors is essentially an inverse problem: given a set of sensor observations, identify the sequence of behaviours that gave rise to them. In a smart home, the behaviours are likely to be the standard human behaviours of living, and the observations will depend upon the sensors that the house is equipped with. There are two main approaches to identifying behaviours from the sensor stream. One is to use a symbolic approach, which explicitly models the recognition process. Another is to use a sub-symbolic approach to behaviour recognition, which is the focus of this chapter, using data mining and machine learning methods. While there have been many machine learning methods for identifying behaviours from the sensor stream, they have generally relied upon a labelled dataset, where a person has manually identified their behaviour at each time. This is particularly tedious to do, resulting in relatively small datasets, and is also prone to significant errors, as people do not correctly pinpoint the end of one behaviour and the commencement of the next. In this chapter, the authors consider methods to deal with unlabelled sensor data for behaviour recognition, and investigate their use. They then consider whether these methods are best used in isolation, or should be used as preprocessing to provide a training set for a supervised method.

A Review on Various Algorithms used in Machine Learning

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1952248 ◽

2019 ◽

pp. 915-920

Author(s):

Divya Chaudhary ◽

Er. Richa Vasuja

Keyword(s):

Machine Learning ◽

Data Mining ◽

New Technologies ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Learning System ◽

Training Set ◽

Learning Data ◽

Do So

In today's world, data is being generated by every one of us, so it has become vital to handle this data. To do so, new technologies such as machine learning and data mining are being developed. This paper presents a study of machine learning (ML). Machine learning algorithms repeatedly produce precise approximations. A machine learning system effectively "learns" how to guess from a training set of completed jobs. The main purpose of the review is to give a rough overview of the most commonly used algorithms in machine learning.

Disease Diagnosis System using Machine Learning

Journal of Pharmaceutical Research International ◽

10.9734/jpri/2021/v33i33b31810 ◽

2021 ◽

pp. 185-194

Author(s):

Shailesh D. Kamble ◽

Pawan Patel ◽

Punit Fulzele ◽

Yash Bangde ◽

Hitesh Musale ◽

...

Keyword(s):

Machine Learning ◽

Data Mining ◽

Disease Diagnosis ◽

Current Data ◽

The Body ◽

Training Set ◽

Use Of Data ◽

Common Illnesses ◽

Health Care ◽

Professional Model

The efficient use of data mining in virtual sectors such as e-commerce and commerce has led to its use in other industries. The medical environment is still rich but weaker in the field of technical analysis. There is a lot of information that can occur within medical systems, and powerful analytics tools can be used to identify the hidden relationships in current data trends. Disease is a term that covers a large number of conditions connected to health care. These medical conditions describe unexpected health conditions that directly control all the organs of the body. Medical data mining methods such as corporate management mines, classification, and integration are used to analyze various types of common physical problems. Separation is an important problem in data mining. Many popular clips make decision trees to produce category models. Data classification is based on the ID3 decision tree algorithm that leads to accuracy; data are estimated to use entropy verification methods based on cross-sectional and segmentation, and the results are compared. The database used for machine learning is divided into 3 parts: training, testing, and finally validation. This approach uses a training set to train a model and define its appropriate parameters. A test set is required to test a professional model and its standard performance. It is estimated that 70% of people in India can catch common illnesses such as viruses, flu, coughs, and colds every two months. Because most people do not realize that common allergies can be symptoms of something very serious, 25% of people suddenly die from ignoring the first normal symptoms. Therefore, identifying or predicting the disease early using machine learning (ML) is very important to avoid any unwanted injuries.
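The entropy-based attribute selection at the core of ID3 can be illustrated with a short sketch. The records, attribute names (`fever`, `cough`), and class labels below are invented for illustration and do not come from the paper; the functions show only the standard information-gain computation, not the authors' full system.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the records on one attribute."""
    n = len(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attrs):
    """ID3 picks the attribute with the highest information gain at each node."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))

# Hypothetical symptom records (illustrative only).
rows = [
    {"fever": "yes", "cough": "yes"},
    {"fever": "yes", "cough": "no"},
    {"fever": "no", "cough": "yes"},
    {"fever": "no", "cough": "no"},
]
labels = ["flu", "flu", "cold", "cold"]
best = best_attribute(rows, labels, ["fever", "cough"])
print(best, information_gain(rows, labels, best))
```

In this toy data `fever` separates the two classes perfectly while `cough` carries no information, so ID3 would split on `fever` first.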

Classification of Imbalanced Data with Random Sets and Mean-Variance Filtering

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies ◽

10.4018/978-1-60566-717-1.ch022 ◽

2011 ◽

pp. 338-354

Author(s):

Nikulin Vladimir

Keyword(s):

Data Mining ◽

Linear Regression ◽

Imbalanced Data ◽

Random Sets ◽

Significant Problem ◽

Training Set ◽

Final Model ◽

The Stability ◽

Mean Variance

Imbalanced data represent a significant problem because the corresponding classifier has a tendency to ignore patterns which have smaller representation in the training set. We propose to consider a large number of balanced training subsets where representatives from the larger pattern are selected randomly. As an outcome, the system will produce a matrix of linear regression coefficients where rows represent random subsets and columns represent features. Based on this matrix, we assess the stability of the influence of particular features. It is proposed to keep in the model only features with stable influence. The final model represents an average of the single models, which are not necessarily linear regressions. This model proved efficient and competitive in the PAKDD-2007 Data Mining Competition.
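The procedure described above can be sketched in a few functions. Several details here are assumptions not stated in the abstract: the minority class is labelled 1 and the majority 0, the fit is plain least squares with a tiny ridge term for numerical safety, and "stable influence" is taken to mean that the absolute mean of a coefficient across subsets exceeds a multiple of its standard deviation. All function names and the synthetic data are hypothetical.

```python
import random
import statistics

def fit_linear(X, y, ridge=1e-6):
    """Least-squares coefficients via the normal equations and Gauss-Jordan elimination."""
    d = len(X[0])
    # Augmented matrix [X^T X + ridge*I | X^T y].
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0) for j in range(d)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(d):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][d] / A[i][i] for i in range(d)]

def coefficient_matrix(X, y, n_subsets, rng):
    """One row of coefficients per balanced subset (all minority + random majority)."""
    minority = [i for i, t in enumerate(y) if t == 1]
    majority = [i for i, t in enumerate(y) if t == 0]
    matrix = []
    for _ in range(n_subsets):
        idx = minority + rng.sample(majority, len(minority))
        matrix.append(fit_linear([X[i] for i in idx], [y[i] for i in idx]))
    return matrix

def stable_features(matrix, threshold=2.0):
    """Keep features with |mean| > threshold * std across subsets (assumed criterion)."""
    keep = []
    for j in range(len(matrix[0])):
        col = [row[j] for row in matrix]
        if abs(statistics.fmean(col)) > threshold * statistics.pstdev(col):
            keep.append(j)
    return keep

# Synthetic data: feature 0 is informative, feature 1 is noise, feature 2 is a bias term.
rng = random.Random(1)
y = [1] * 20 + [0] * 200
X = [[t + rng.gauss(0, 0.3), rng.gauss(0, 1), 1.0] for t in y]
rows = coefficient_matrix(X, y, n_subsets=30, rng=rng)
stable = stable_features(rows)
print(stable)
```

A final predictor in the spirit of the abstract could average the per-subset models after refitting on the retained features only; that step is omitted here to keep the sketch short.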

A Machine Learning SDN-Enabled Big Data Model for IoMT Systems

Electronics ◽

10.3390/electronics10182228 ◽

2021 ◽

Vol 10 (18) ◽

pp. 2228

Author(s):

Khalid Haseeb ◽

Irshad Ahmad ◽

Israr Iqbal Awan ◽

Jaime Lloret ◽

Ignacio Bosch

Keyword(s):

Machine Learning ◽

Big Data ◽

Real Time ◽

Smart Cities ◽

Network Resources ◽

Data Accessibility ◽

Efficient Management ◽

Machine Learning Model ◽

Medical Sensors ◽

The Stability

In recent times, health applications have been gaining rapid popularity in smart cities using the Internet of Medical Things (IoMT). Many real-time solutions benefit both patients and professionals through remote data accessibility and suitable actions. However, timely medical decisions and efficient management of big data using IoT-based resources are pressing research challenges. Additionally, the distributed nature of data processing in many proposed solutions increases the threat of information leakage and damages network integrity. Such solutions impose overhead on medical sensors and decrease the stability of real-time transmission systems. Therefore, this paper presents a machine-learning model with SDN-enabled security to predict the consumption of network resources and improve the delivery of sensor data. Additionally, it offers a centralized software-defined network (SDN) architecture to overcome network threats among deployed sensors with nominal management cost. Firstly, it offers an unsupervised machine learning technique and decreases the communication overhead for IoT networks. Secondly, it predicts the link status using dynamic metrics and refines its strategies using the SDN architecture. Finally, the SDN controller uses a security algorithm that efficiently manages the consumption of the IoT nodes and protects them from unidentified occurrences. The proposed model is verified using simulations and improves system performance in terms of network throughput by 13%, data drop ratio by 39%, data delay by 11%, and faulty packets by 46% compared to the HUNA and CMMA schemes.

Data mining and machine learning methods for sustainable smart cities traffic classification: A survey

Sustainable Cities and Society ◽

10.1016/j.scs.2020.102177 ◽

2020 ◽

Vol 60 ◽

pp. 102177 ◽

Cited By ~ 3

Author(s):

Muhammad Shafiq ◽

Zhihong Tian ◽

Ali Kashif Bashir ◽

Alireza Jolfaei ◽

Xiangzhan Yu

Keyword(s):

Machine Learning ◽

Data Mining ◽

Smart Cities ◽

Traffic Classification ◽

Learning Methods ◽

Machine Learning Methods

Data Mining and Machine Learning to Promote Smart Cities: A Systematic Review from 2000 to 2018

Sustainability ◽

10.3390/su11041077 ◽

2019 ◽

Vol 11 (4) ◽

pp. 1077 ◽

Cited By ~ 20

Author(s):

Jovani Souza ◽

Antonio Francisco ◽

Cassiano Piekarski ◽

Guilherme Prado

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Data Mining ◽

Predictive Analytics ◽

Smart Cities ◽

The Sustainable Development ◽

Governmental Agencies ◽

Common Technique ◽

Development Goals ◽

Sustainable Services

Smart cities (SC) promote economic development, improve the welfare of their citizens, and help people use technologies to build sustainable services. However, computational methods are necessary in creating smart cities because they are fundamental to the decision-making process, assist in policy making, and offer improved services to citizens. As such, the aim of this research is to present a systematic review of the data mining (DM) and machine learning (ML) approaches adopted in the promotion of smart cities. The Methodi Ordinatio was used to find relevant articles, and the VOSviewer software was used to perform a network analysis. Thirty-nine significant articles were identified for analysis from the Web of Science and Scopus databases, in which we analyzed the DM and ML techniques used, as well as the areas that are most engaged in promoting smart cities. Predictive analytics was the most common technique, and the studies focused primarily on the areas of smart mobility and smart environment. This study seeks to encourage approaches that governmental agencies and companies can use to develop smart cities, which is essential to supporting the Sustainable Development Goals.

Data Mining and Machine Learning

10.1017/9781108564175 ◽

2020 ◽

Cited By ~ 2

Author(s):

Mohammed J. Zaki ◽

Wagner Meira, Jr

Keyword(s):

Machine Learning ◽

Data Mining

Scalable Approach to High Coverages on Oxides via Iterative Training of a Machine-Learning Algorithm

10.26434/chemrxiv.10288514.v1 ◽

2019 ◽

Author(s):

Andrew Medford ◽

Shengchun Yang ◽

Fuzhu Liu

Keyword(s):

Machine Learning ◽

Chemical Potential ◽

Learning Algorithm ◽

Absolute Error ◽

Low Energy ◽

Training Data ◽

High Coverage ◽

Metal Compounds ◽

Adsorption Energies ◽

The Stability

Understanding the interaction of multiple types of adsorbate molecules on solid surfaces is crucial to establishing the stability of catalysts under various chemical environments. Computational studies on the high coverage and mixed coverages of reaction intermediates are still challenging, especially for transition-metal compounds. In this work, we present a framework to predict differential adsorption energies and identify low-energy structures under high- and mixed-adsorbate coverages on oxide materials. The approach uses Gaussian process machine-learning models with quantified uncertainty in conjunction with an iterative training algorithm to actively identify the training set. The framework is demonstrated for the mixed adsorption of CHx, NHx and OHx species on the oxygen vacancy and pristine rutile TiO2(110) surface sites. The results indicate that the proposed algorithm is highly efficient at identifying the most valuable training data, and is able to predict differential adsorption energies with a mean absolute error of ~0.3 eV based on <25% of the total DFT data. The algorithm is also used to identify 76% of the low-energy structures based on <30% of the total DFT data, enabling construction of surface phase diagrams that account for high and mixed coverage as a function of the chemical potential of C, H, O, and N. Furthermore, the computational scaling indicates the algorithm scales nearly linearly (N^1.12) as the number of adsorbates increases. This framework can be directly extended to metals, metal oxides, and other materials, providing a practical route toward the investigation of the behavior of catalysts under high-coverage conditions.

Instant medical care and drug suggestion service using data mining and machine learning based intelligent self-diagnosis medical system

International Journal of Advanced Life Sciences ◽

10.26627/ijals/2017/10.03.0022 ◽

2017 ◽

Vol 10 (03) ◽

pp. 318-325

Author(s):

Sudha M

Keyword(s):

Machine Learning ◽

Data Mining ◽

Medical Care ◽

Medical System ◽

Using Data
