Autonomous Fingerprinting and Large Experimental Data Set for Visible Light Positioning

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3256
Author(s):  
Tyrel Glass ◽  
Fakhrul Alam ◽  
Mathew Legg ◽  
Frazer Noble

This paper presents an autonomous method of collecting data for Visible Light Positioning (VLP) and a comprehensive investigation of VLP using a large set of experimental data. Received Signal Strength (RSS) data are efficiently collected using a novel method that utilizes consumer grade Virtual Reality (VR) tracking for accurate ground truth recording. An investigation into the accuracy of the ground truth system showed median and 90th percentile errors of 4.24 and 7.35 mm, respectively. Co-locating a VR tracker with a photodiode-equipped VLP receiver on a mobile robotic platform allows fingerprinting on a scale and accuracy that has not been possible with traditional manual collection methods. RSS data at 7344 locations within a 6.3 × 6.9 m test space fitted with 11 VLP luminaires were collected and have been made available to researchers. The quality and the volume of the data allow for a robust study of Machine Learning (ML)- and channel model-based positioning utilizing visible light. Among the ML-based techniques, ridge regression is found to be the most accurate, outperforming Weighted k Nearest Neighbor, Multilayer Perceptron, and random forest, among others. Model-based positioning is more accurate than ML techniques when a small data set is available for calibration and training. However, if a large data set is available for training, ML-based positioning outperforms its model-based counterparts in terms of localization accuracy.
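The ridge-regression positioning idea described in the abstract can be sketched in a few lines. Everything below is an illustrative assumption, not the paper's data: the room geometry, the inverse-square channel stand-in, the regularization strength `lam`, and the noise level are all invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the fingerprint data: RSS from 11 luminaires
# measured at known (x, y) positions in a 6.3 x 6.9 m space. The channel
# model and all parameters are illustrative assumptions.
n_samples, n_leds = 500, 11
positions = np.column_stack([rng.uniform(0, 6.3, n_samples),
                             rng.uniform(0, 6.9, n_samples)])
anchors = np.column_stack([rng.uniform(0, 6.3, n_leds),
                           rng.uniform(0, 6.9, n_leds)])
dist = np.linalg.norm(positions[:, None, :] - anchors[None, :, :], axis=2)
rss = 1.0 / (1.0 + dist**2) + rng.normal(0, 1e-3, (n_samples, n_leds))

# Ridge regression in closed form: W = (X^T X + lam*I)^{-1} X^T Y,
# mapping an RSS fingerprint (plus a bias term) to an (x, y) estimate.
X = np.hstack([rss, np.ones((n_samples, 1))])
lam = 1e-3
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ positions)

pred = X @ W
median_err = np.median(np.linalg.norm(pred - positions, axis=1))
print(f"median in-sample error: {median_err:.2f} m")
```

With a large fingerprint set, the closed-form solve stays cheap because the normal-equation matrix is only (n_leds + 1) × (n_leds + 1), independent of the number of fingerprints.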

Polymers ◽  
2021 ◽  
Vol 13 (21) ◽  
pp. 3811
Author(s):  
Iosif Sorin Fazakas-Anca ◽  
Arina Modrea ◽  
Sorin Vlase

This paper proposes a new method for calculating the monomer reactivity ratios for binary copolymerization based on the terminal model. The original optimization method involves a numerical integration algorithm and an optimization algorithm based on k-nearest neighbour non-parametric regression. The calculation method has been tested on simulated and experimental data sets, at low (<10%), medium (10–35%) and high conversions (>40%), yielding reactivity ratios in good agreement with the usual methods such as intersection, Fineman–Ross, reverse Fineman–Ross, Kelen–Tüdös, extended Kelen–Tüdös and the error in variable method. The experimental data sets used in this comparative analysis are copolymerization of 2-(N-phthalimido) ethyl acrylate with 1-vinyl-2-pyrrolidone for low conversion, copolymerization of isoprene with glycidyl methacrylate for medium conversion and copolymerization of N-isopropylacrylamide with N,N-dimethylacrylamide for high conversion. The paper also shows how experimental errors can be estimated from a single experimental data set of n points.
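The k-nearest-neighbour non-parametric regression at the core of the proposed optimizer can be sketched generically: predict the response at a query point as the mean of the k nearest training responses. The sine test function, sample size, and `k` below are illustrative assumptions; the authors' numerical-integration and optimization steps are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)

# Generic k-NN non-parametric regression on a noisy 1-D test function.
x_train = np.sort(rng.uniform(0.0, 1.0, 50))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.05, 50)

def knn_regress(x, k=5):
    # Average the responses of the k training points closest to x.
    idx = np.argsort(np.abs(x_train - x))[:k]
    return y_train[idx].mean()

# sin(2*pi*0.25) = 1, so the estimate should land near 1.
print(round(knn_regress(0.25), 2))
```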


Author(s):  
M. Jeyanthi ◽  
C. Velayutham

Brain–Computer Interface (BCI) research plays a vital role in science and technology development. Classification is a data mining technique used to predict group membership for data instances. Analysis of BCI data is challenging because feature extraction and classification are more difficult than for raw data. In this paper, we extract statistical Haralick features from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on a BCI data set using several techniques: naïve Bayes, the k-nearest neighbor classifier, and an SVM classifier. Finally, we propose the SVM classification algorithm for the BCI data set.
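The normalize → bin → classify pipeline can be sketched minimally, assuming synthetic two-class features in place of real EEG Haralick features, a leave-one-out evaluation, and a hand-rolled kNN classifier (the abstract also evaluates naïve Bayes and SVM, which are omitted here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for extracted EEG features: two classes, shifted means.
X = np.vstack([rng.normal(0, 1, (100, 8)), rng.normal(1.5, 1, (100, 8))])
y = np.array([0] * 100 + [1] * 100)

# 1) Normalize to zero mean, unit variance per feature.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# 2) Binning: quantize each feature into 10 uniform bins to suppress noise.
edges = np.linspace(X.min(), X.max(), 11)
Xb = np.digitize(X, edges[1:-1]).astype(float)

# 3) k-nearest-neighbor classification (k=5) with leave-one-out evaluation.
def knn_predict(train_X, train_y, q, k=5):
    d = np.linalg.norm(train_X - q, axis=1)
    votes = train_y[np.argsort(d)[:k]]
    return np.bincount(votes).argmax()

correct = sum(
    knn_predict(np.delete(Xb, i, 0), np.delete(y, i), Xb[i]) == y[i]
    for i in range(len(y))
)
accuracy = correct / len(y)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```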


1997 ◽  
Vol 08 (03) ◽  
pp. 301-315 ◽  
Author(s):  
Marcel J. Nijman ◽  
Hilbert J. Kappen

A Radial Basis Boltzmann Machine (RBBM) is a specialized Boltzmann Machine architecture that combines feed-forward mapping with probability estimation in the input space, and for which very efficient learning rules exist. The hidden representation of the network displays symmetry breaking as a function of the noise in the dynamics. Thus, generalization can be studied as a function of the noise in the neuron dynamics instead of as a function of the number of hidden units. We show that the RBBM can be seen as an elegant alternative to k-nearest neighbor, leading to comparable performance without the need to store all data. We show that the RBBM has good classification performance compared to the MLP. The main advantage of the RBBM is that, simultaneously with the input-output mapping, a model of the input space is obtained, which can be used for learning with missing values. We derive learning rules for the case of incomplete data, and show that they perform better on incomplete data than the traditional learning rules on a 'repaired' data set.


2018 ◽  
Vol 19 (1) ◽  
pp. 144-157
Author(s):  
Mehdi Zekriyapanah Gashti

Exponential growth of medical data and recorded resources from patients with different diseases can be exploited to establish an optimal association between disease symptoms and diagnosis. The main issue in diagnosis is the variability of the features that can be attributed to particular diseases, since some of these features are not essential for the diagnosis and may even lead to a delay in diagnosis. For instance, diabetes, hepatitis, breast cancer, and heart disease, which express multitudes of clinical manifestations as symptoms, are among the diseases with higher morbidity rates. Timely diagnosis of such diseases can play a critical role in decreasing their effect on patients’ quality of life and on the costs of their treatment. Thanks to the large data sets available, computer-aided diagnosis can be an advanced option for early diagnosis of diseases. In this paper, using a Flower Pollination Algorithm (FPA) and K-Nearest Neighbor (KNN), a new method is suggested for diagnosis. The modified model can diagnose diseases more accurately by reducing the number of features. The main purpose of the modified model is that Feature Selection (FS) is done by FPA and data classification is performed using KNN. The results showed higher efficiency of the modified model on diagnosis of diabetes, hepatitis, breast cancer, and heart disease compared to the KNN models.
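The wrapper idea behind the modified model (select a feature subset, score it with a KNN classifier, keep the best) can be sketched as follows. A plain random-mask search stands in for the Flower Pollination Algorithm, whose actual update rules are more elaborate, and synthetic data stands in for the medical data sets:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic diagnosis-style data: 4 informative features + 6 noise features.
n = 200
informative = rng.normal(0, 1, (n, 4))
y = (informative.sum(axis=1) > 0).astype(int)
X = np.hstack([informative, rng.normal(0, 1, (n, 6))])

def knn_accuracy(X, y, k=5):
    # Leave-one-out accuracy of a plain kNN classifier.
    correct = 0
    for i in range(len(y)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                    # exclude the query point itself
        votes = y[np.argsort(d)[:k]]
        correct += np.bincount(votes).argmax() == y[i]
    return correct / len(y)

# Stand-in for the flower-pollination search: evaluate random binary masks
# and keep the best-scoring feature subset.
best_mask, best_acc = np.ones(10, bool), knn_accuracy(X, y)
for _ in range(30):
    mask = rng.random(10) < 0.5
    if mask.any():
        acc = knn_accuracy(X[:, mask], y)
        if acc > best_acc:
            best_mask, best_acc = mask, acc

print(best_mask.sum(), round(float(best_acc), 2))
```

The design point the abstract makes survives even in this toy: dropping noise features can only help a distance-based classifier like KNN, so the selected subset scores at least as well as the full feature set.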


2021 ◽  
Vol 87 (6) ◽  
pp. 445-455
Author(s):  
Yi Ma ◽  
Zezhong Zheng ◽  
Yutang Ma ◽  
Mingcang Zhu ◽  
Ran Huang ◽  
...  

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N×N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained with the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second best performance with a support vector machine.


Author(s):  
Sikha Bagui ◽  
Arup Kumar Mondal ◽  
Subhash Bagui

In this work the authors present a parallel k nearest neighbor (kNN) algorithm using locality sensitive hashing to preprocess the data before it is classified using kNN in Hadoop's MapReduce framework. This is compared with the sequential (conventional) implementation. Using locality sensitive hashing's similarity measure with kNN, the iterative procedure to classify a data object is performed within a hash bucket rather than over the whole data set, greatly reducing the computation time needed for classification. Several experiments showed that the parallel implementation performed better than the sequential implementation on very large data sets. The study also experimented with a few map- and reduce-side optimization features for the parallel implementation and presented some optimum map- and reduce-side parameters. Among the map-side parameters, the block size and input split size were varied, and among the reduce-side parameters, the number of planes was varied, and their effects were studied.
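The bucket-restricted search can be sketched in a single process, assuming random-hyperplane LSH with 8 sign bits and a full-scan fallback for sparse buckets; the MapReduce distribution, data set, and parameters below are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data set with a simple label rule (sign of the first coordinate).
X = rng.normal(0, 1, (5000, 16))
y = (X[:, 0] > 0).astype(int)

# Random-hyperplane LSH: each plane contributes one sign bit to the bucket key.
planes = rng.normal(0, 1, (8, 16))

def bucket_key(v):
    return tuple((planes @ v > 0).astype(int))

buckets = {}
for i, v in enumerate(X):
    buckets.setdefault(bucket_key(v), []).append(i)

def lsh_knn(q, k=5):
    # Search only the query's bucket instead of the whole data set.
    idx = buckets.get(bucket_key(q), [])
    if len(idx) < k:                 # fall back to a full scan if too sparse
        idx = range(len(X))
    idx = np.fromiter(idx, int)
    d = np.linalg.norm(X[idx] - q, axis=1)
    votes = y[idx[np.argsort(d)[:k]]]
    return np.bincount(votes).argmax()

q = rng.normal(0, 1, 16)
print(lsh_knn(q), len(buckets))
```

With 8 sign bits there are at most 256 buckets, so each query compares against roughly 1/256 of the data on average, which is the source of the speedup the abstract reports.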


Diagnostics ◽  
2019 ◽  
Vol 9 (3) ◽  
pp. 104 ◽  
Author(s):  
Ahmed ◽  
Yigit ◽  
Isik ◽  
Alpkocak

Leukemia is a fatal cancer and has two main types: acute and chronic. Each type has two subtypes: lymphoid and myeloid. Hence, in total, there are four subtypes of leukemia. This study proposes a new approach for diagnosis of all subtypes of leukemia from microscopic blood cell images using convolutional neural networks (CNN), which require a large training data set. Therefore, we also investigated the effects of data augmentation for synthetically increasing the number of training samples. We used two publicly available leukemia data sources: ALL-IDB and ASH Image Bank. Next, we applied seven different image transformation techniques as data augmentation. We designed a CNN architecture capable of recognizing all subtypes of leukemia. Besides, we also explored other well-known machine learning algorithms such as naive Bayes, support vector machine, k-nearest neighbor, and decision tree. To evaluate our approach, we set up a set of experiments and used 5-fold cross-validation. The results showed that our CNN model achieves 88.25% and 81.74% accuracy in leukemia-versus-healthy and multiclass classification of all subtypes, respectively. Finally, we also showed that the CNN model performs better than the other well-known machine learning algorithms.
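Augmentation-style image transforms can be sketched directly on arrays. The study applies seven transformations to microscopy images; the three below, and the random grayscale stand-in image, are merely illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in 64x64 grayscale "blood cell" image (the study uses microscopy images).
img = rng.random((64, 64))

def hflip(im):
    # Horizontal flip: reverse the column order.
    return im[:, ::-1]

def rot90(im):
    # 90-degree rotation.
    return np.rot90(im)

def random_crop(im, size=56):
    # Random crop: take a size x size window at a random offset.
    y0 = rng.integers(0, im.shape[0] - size + 1)
    x0 = rng.integers(0, im.shape[1] - size + 1)
    return im[y0:y0 + size, x0:x0 + size]

augmented = [hflip(img), rot90(img), random_crop(img)]
print([a.shape for a in augmented])
```

Each transform yields a label-preserving variant of the original, which is what lets augmentation multiply the effective training set size for the CNN.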


2012 ◽  
Vol 47 (3) ◽  
pp. 81-90 ◽  
Author(s):  
S. Cellmer

On-the-fly Ambiguity Resolution Using an Estimator of the Modified Ambiguity Covariance Matrix for the GNSS Positioning Model Based on Phase Data

On-the-fly ambiguity resolution (OTF AR) is based on a small data set, obtained from a very short observation session or even from a single-epoch observation. In these cases, a classical approach to ambiguity resolution (e.g. the LAMBDA method) can meet some numerical problems. The basis of the LAMBDA method is an integer decorrelation of the positive definite ambiguity covariance matrix (ACM). The necessary condition for properly performing this procedure is the positive definiteness of the ACM. However, this condition is not satisfied in cases of very short observation sessions or single-epoch positioning if phase-only observations are used. The subject of this contribution is such a case, where phase-only observations are used in the final part of the computational process. A modification of the ACM is proposed in order to ensure its positive definiteness. An estimator of the modified ACM is a good ACM approximation for the purpose of performing the LAMBDA method. Another problem of short-session (or single-epoch) positioning is the poor quality of the float solution. In this paper, a cascade adjustment with wide-lane combinations of the L1 and L2 signals is presented as a method of solving this problem.


2014 ◽  
Vol 701-702 ◽  
pp. 110-113
Author(s):  
Qi Rui Zhang ◽  
He Xian Wang ◽  
Jiang Wei Qin

This paper reports a comparative study of feature selection algorithms on a hyperlipidemia data set. Three methods of feature selection were evaluated: document frequency (DF), information gain (IG), and the χ2 statistic (CHI). The classification systems use a vector to represent a document and use tfidfie (term frequency, inverse document frequency, and inverse entropy) to compute term weights. In order to compare the effectiveness of feature selection, we used three classification methods: naïve Bayes (NB), k-nearest neighbor (kNN), and support vector machines (SVM). The experimental results show that IG and CHI significantly outperform DF, and that SVM and NB are more effective than kNN when the macro-averaged F1 measure is used. DF is suitable for the task of large-scale text classification.
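The CHI (χ2) score used for term selection can be sketched from its 2×2 contingency table of term presence versus class membership; the tiny corpus and class labels below are invented purely for illustration:

```python
# Tiny invented corpus: (tokens, class). CHI scores each term against the split.
docs = [
    ("high cholesterol lipid".split(), 1),
    ("lipid profile elevated".split(), 1),
    ("normal diet exercise".split(), 0),
    ("exercise routine normal".split(), 0),
]

def chi2(term, cls):
    # 2x2 contingency counts: term presence vs. class membership.
    a = sum(1 for t, c in docs if term in t and c == cls)
    b = sum(1 for t, c in docs if term in t and c != cls)
    c_ = sum(1 for t, c in docs if term not in t and c == cls)
    d = sum(1 for t, c in docs if term not in t and c != cls)
    n = a + b + c_ + d
    denom = (a + c_) * (b + d) * (a + b) * (c_ + d)
    return n * (a * d - b * c_) ** 2 / denom if denom else 0.0

# Rank vocabulary terms by their CHI score for class 1.
vocab = {w for t, _ in docs for w in t}
ranked = sorted(vocab, key=lambda w: -chi2(w, 1))
print(ranked[:3])
```

Terms that appear in only one class (here "lipid", "normal", "exercise") get the maximum score, while terms that appear in both classes score lower, which is exactly the selectivity DF lacks.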

