Coupling NCA Dimensionality Reduction with Machine Learning in Multispectral Rock Classification Problems

Brian Bino Sinaice; Narihiro Owada; Mahdi Saadat; Hisatoshi Toriya; Fumiaki Inagaki; Zibisani Bagai; Youhei Kawamura

doi:10.3390/min11080846

Minerals ◽

10.3390/min11080846 ◽

2021 ◽

Vol 11 (8) ◽

pp. 846

Author(s):

Brian Bino Sinaice ◽

Narihiro Owada ◽

Mahdi Saadat ◽

Hisatoshi Toriya ◽

Fumiaki Inagaki ◽

...

Keyword(s):

Machine Learning ◽

Dimensionality Reduction ◽

Hyperspectral Imaging ◽

Mining Industry ◽

Integrated System ◽

Classification Problems ◽

Combined System ◽

Rock Classification ◽

Spectral Bands ◽

Current Production

Although many industries depend on the mining industry for resources, this industry has taken hits in terms of declining mineral ore grades and its current use of traditional, time-consuming and computationally costly rock and mineral identification methods. Therefore, this paper proposes integrating Hyperspectral Imaging, Neighbourhood Component Analysis (NCA) and Machine Learning (ML) as a combined system that can identify rocks and minerals. Modestly put, hyperspectral imaging gathers electromagnetic signatures of the rocks in hundreds of spectral bands. However, this data suffers from what is termed the ‘dimensionality curse’, which led to our employment of NCA as a dimensionality reduction technique. NCA, in turn, highlights the most discriminant feature bands, the number of which depends on the intended application(s) of this system. Our envisioned application is rock and mineral classification via unmanned aerial vehicle (UAV) technology. In this study, we performed a 204-band hyperspectral to 5-band multispectral reduction, because current production drones are limited to five-band multispectral sensors. Based on these bands, we applied ML to identify and classify rocks, thereby proving our hypothesis, reducing computational costs, attaining an ML classification accuracy of 71%, and demonstrating the potential mining industry optimisations attainable through this integrated system.
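The band-selection idea in this abstract can be illustrated with a minimal sketch. It assumes a diagonal NCA transform (one weight per band), a toy three-band data set and hand-picked weights, none of which come from the paper; the function names `nca_objective` and `top_k_bands` are hypothetical. It only evaluates the NCA objective for given weights and then keeps the highest-weighted bands; it does not reproduce the authors' optimisation or their 204-band data.

```python
import math


def nca_objective(X, y, w):
    """Mean leave-one-out probability of picking a same-class neighbour
    under a diagonal transform A = diag(w) (one weight per band)."""
    n = len(X)
    total = 0.0
    for i in range(n):
        sims = []
        for j in range(n):
            if j == i:
                sims.append(0.0)  # a sample never selects itself
                continue
            d = sum((wr * (a - b)) ** 2 for wr, a, b in zip(w, X[i], X[j]))
            sims.append(math.exp(-d))
        z = sum(sims)
        if z > 0.0:
            same = sum(s for s, label in zip(sims, y) if label == y[i])
            total += same / z
    return total / n


def top_k_bands(w, k=5):
    """Indices of the k bands with the largest absolute weight, in band order."""
    ranked = sorted(range(len(w)), key=lambda r: abs(w[r]), reverse=True)
    return sorted(ranked[:k])


# Toy data (hypothetical): 4 samples, 3 bands; only band 0 separates the classes.
X = [[0.0, 0.3, 0.9], [0.1, 0.8, 0.1], [1.0, 0.2, 0.8], [0.9, 0.7, 0.2]]
y = [0, 0, 1, 1]
good = nca_objective(X, y, [3.0, 0.0, 0.0])  # weight on the discriminant band
poor = nca_objective(X, y, [0.0, 3.0, 0.0])  # weight on an uninformative band
```

Weights that emphasise the discriminant band give a much higher objective than weights on an uninformative band. In a full pipeline the weights would be learned by maximising this objective, after which `top_k_bands(w, k=5)` would pick the five bands to keep.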

Mahalanobis distance–based kernel supervised machine learning in spectral dimensionality reduction for hyperspectral imaging remote sensing

International Journal of Distributed Sensor Networks ◽

10.1177/1550147720968467 ◽

2020 ◽

Vol 16 (11) ◽

pp. 155014772096846

Author(s):

Jing Liu ◽

Yulong Qiao

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Dimensionality Reduction ◽

Image Classification ◽

Hyperspectral Imaging ◽

Mahalanobis Distance ◽

Hyperspectral Image ◽

Supervised Machine Learning ◽

Practical Applications ◽

Spectral Dimensionality

Spectral dimensionality reduction is a crucial step for hyperspectral image classification in practical applications. Dimensionality reduction strongly influences image classification performance when features are strongly coupled and bands are highly correlated. To solve these issues, we propose the Mahalanobis distance–based kernel supervised machine learning framework for spectral dimensionality reduction. With Mahalanobis distance matrix–based dimensional reduction, the coupling relationship between features and the scale effect are eliminated in the low-dimensional feature space, which benefits image classification. The experimental results show that, compared with other methods, the proposed algorithm demonstrates the best accuracy and efficiency. The Mahalanobis distance–based multiple kernel learning achieves higher classification accuracy than the Euclidean distance kernel function. Accordingly, the proposed Mahalanobis distance–based kernel supervised machine learning method performs well with respect to spectral dimensionality reduction in hyperspectral imaging remote sensing.
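As a rough illustration of why a Mahalanobis distance removes the scale effect that a Euclidean kernel retains, the sketch below compares the two squared distances for a pair of two-band pixels. The covariance matrix, the sample values and the Gaussian kernel form are assumptions chosen for illustration; this is the textbook Mahalanobis distance, not the authors' learned metric or multiple kernel framework.

```python
import math


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def mahalanobis_sq(x, y, cov_inv):
    """Squared Mahalanobis distance (x - y)^T cov_inv (x - y)."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    size = len(diff)
    return sum(diff[i] * cov_inv[i][j] * diff[j]
               for i in range(size) for j in range(size))


def rbf_kernel(dist_sq, gamma=1.0):
    """Gaussian kernel evaluated on a squared distance."""
    return math.exp(-gamma * dist_sq)


# Hypothetical two-band example: band 0 has four times the variance of band 1.
cov = [[4.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
x, y = [2.0, 0.0], [0.0, 0.0]

d_mahal = mahalanobis_sq(x, y, inverse_2x2(cov))  # difference rescaled by variance
d_eucl = mahalanobis_sq(x, y, identity)           # identity matrix = Euclidean
k_mahal = rbf_kernel(d_mahal)
k_eucl = rbf_kernel(d_eucl)
```

Here the same two-unit difference along the high-variance band counts as a squared distance of 1 under the Mahalanobis metric but 4 under the Euclidean one, so the Mahalanobis kernel rates the two pixels as more similar.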

A Low-Rate Video Approach to Hyperspectral Imaging of Dynamic Scenes

Journal of Imaging ◽

10.3390/jimaging5010006 ◽

2018 ◽

Vol 5 (1) ◽

pp. 6 ◽

Cited By ~ 6

Author(s):

Charles Bachmann ◽

Rehman Eon ◽

Christopher Lapszynski ◽

Gregory Badura ◽

Anthony Vodacek ◽

...

Keyword(s):

Data Acquisition ◽

Hyperspectral Imaging ◽

Dynamic Range ◽

Hyperspectral Image ◽

Image Sequences ◽

Integrated System ◽

Imaging Systems ◽

Spectral Bands ◽

Spatial Dimensions ◽

Low Rate

The increased sensitivity of modern hyperspectral line-scanning systems has led to the development of imaging systems that can acquire each line of hyperspectral pixels at very high data rates (in the 200–400 Hz range). These data acquisition rates present an opportunity to acquire full hyperspectral scenes at rapid rates, enabling the use of traditional push-broom imaging systems as low-rate video hyperspectral imaging systems. This paper provides an overview of the design of an integrated system that produces low-rate video hyperspectral image sequences by merging a hyperspectral line scanner, operating in the visible and near infra-red, with a high-speed pan-tilt system and an integrated IMU-GPS that provides system pointing. The integrated unit is operated from atop a telescopic mast, which also allows imaging of the same surface area or objects from multiple view zenith directions, useful for bi-directional reflectance data acquisition and analysis. The telescopic mast platform also enables stereo hyperspectral image acquisition and, therefore, the ability to construct a digital elevation model of the surface. Imaging near the shoreline in a coastal setting, we provide an example of a hyperspectral imagery time series acquired during a field experiment in July 2017 with our integrated system, which produced hyperspectral image sequences with 371 spectral bands, spatial dimensions of 1600 × 212, and 16 bits per pixel, every 0.67 s. A second example time series, acquired during a rooftop experiment conducted on the Rochester Institute of Technology campus in August 2017, illustrates a second application, moving vehicle imaging, with 371 spectral bands, 16-bit dynamic range, and 1600 × 300 spatial dimensions every second.
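The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below assumes that 1600 is the number of cross-track samples per scan line and that 212 (or 300) is the number of scan lines per cube; the abstract gives only the spatial dimensions, so that reading is an assumption, and the helper names are hypothetical.

```python
def cube_bytes(samples, lines, bands, bits_per_pixel):
    """Uncompressed size of one hyperspectral cube in bytes."""
    return samples * lines * bands * bits_per_pixel // 8


def line_rate_hz(lines, seconds_per_cube):
    """Scan lines acquired per second for one cube."""
    return lines / seconds_per_cube


# Coastal experiment: 1600 x 212 pixels, 371 bands, 16 bits, one cube per 0.67 s.
coastal_bytes = cube_bytes(1600, 212, 371, 16)   # 251,686,400 bytes per cube
coastal_rate = line_rate_hz(212, 0.67)           # roughly 316 lines per second

# Rooftop experiment: 1600 x 300 pixels, 371 bands, 16 bits, one cube per second.
rooftop_bytes = cube_bytes(1600, 300, 371, 16)   # 356,160,000 bytes per cube
rooftop_rate = line_rate_hz(300, 1.0)            # 300 lines per second
```

Under that reading, both experiments imply line rates inside the 200–400 Hz range quoted at the start of the abstract, and each uncompressed cube is on the order of a few hundred megabytes.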

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i7/0177 ◽

2017 ◽

Vol 7 (7) ◽

pp. 172

Author(s):

Anantvir Singh Romana

Keyword(s):

Machine Learning ◽

Subsequent Treatment ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Classification Problems ◽

Learning Techniques ◽

Neural Network Classifiers ◽

Diagnostic Detection

Accurate diagnostic detection of a disease in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Different techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, which provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree and neural network classifiers on breast cancer and diabetes datasets.

Forensic analysis of beverage stains using hyperspectral imaging

Scientific Reports ◽

10.1038/s41598-021-85737-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Binu Melit Devassy ◽

Sony George

Keyword(s):

Hyperspectral Imaging ◽

Crime Scene ◽

Forensic Analysis ◽

Classification Model ◽

Support Vector ◽

Additional Information ◽

Non Invasive ◽

Spectral Bands ◽

Data Dimensionality Reduction ◽

Gradient Based

Documentation and analysis of crime scene evidence are of great importance in any forensic investigation. In this paper, we present the potential of hyperspectral imaging (HSI) to detect and analyze beverage stains on a paper towel. To detect the presence and predict the age of drinks commonly found at a crime scene, we leveraged the additional information present in the HSI data. We used 12 different beverages and four types of paper hand towel to create the sample stains in the current study. A support vector machine (SVM) is used to achieve the classification, and a convolutional auto-encoder is used to achieve HSI data dimensionality reduction, which helps in easy perception, processing, and visualization of the data. The SVM classification model was re-established as a lighter and quicker classification model on the basis of the reduced dimension. We employed volume-gradient-based band selection for the identification of relevant spectral bands in the HSI data. Spectral data recorded at different time intervals up to 72 h are analyzed to trace the spectral changes. The results show the efficacy of the HSI techniques for rapid, non-contact, and non-invasive analysis of beverage stains.

FRI0585 HIGH-THROUGHPUT METHODOLOGY FOR EMR-BASED IDENTIFICATION OF CLINICAL SUB-PHENOTYPES IN COMPLEX PATIENT POPULATIONS

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-eular.3489 ◽

2020 ◽

Vol 79 (Suppl 1) ◽

pp. 897.2-897

Author(s):

M. Maurits ◽

T. Huizinga ◽

M. Reinders ◽

S. Raychaudhuri ◽

E. Karlson ◽

...

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Dimensionality Reduction ◽

High Throughput ◽

Brain Cancer ◽

Machine Learning Techniques ◽

Summary Statistics ◽

Medical Problems ◽

Learning Techniques ◽

Icd Codes

Background: Heterogeneity in disease populations complicates discovery of risk factors. To identify risk factors for subpopulations of diseases, we need analytical methods that can deal with unidentified disease subgroups.

Objectives: Inspired by successful approaches from the Big Data field, we developed a high-throughput approach to identify subpopulations within patients with heterogeneous, complex diseases using the wealth of information available in Electronic Medical Records (EMRs).

Methods: We extracted longitudinal healthcare-interaction records coded by 1,853 PheCodes[1] of the 64,819 patients from the Boston’s Partners-Biobank. Through dimensionality reduction using t-SNE[2] we created a 2D embedding of 32,424 of these patients (set A). We then identified distinct clusters post-t-SNE using DBscan[3] and visualized the relative importance of individual PheCodes within them using specialized spectrographs. We replicated this procedure in the remaining 32,395 records (set B).

Results: Summary statistics of both sets were comparable (Table 1).

Table 1. Summary statistics of the total Partners Biobank dataset and the 2 partitions.

                    Set A         Set B         Total
Entries             12,200,311    12,177,131    24,377,442
Patients            32,424        32,395        64,819
Patient years       369,546.33    368,597.92    738,144.2
Unique ICD codes    25,056        24,953        26,305
Unique PheCodes     1,851         1,853         1,853

We found 284 clusters in set A and 295 in set B, of which 63.4% from set A could be mapped to a cluster in set B with a median (range) correlation of 0.24 (0.03 – 0.58). Clusters represented similar yet distinct clinical phenotypes; e.g. patients diagnosed with “other headache syndrome” were separated into four distinct clusters characterized by migraines, neurofibromatosis, epilepsy or brain cancer, all resulting in patients presenting with headaches (Fig. 1 & 2). Though EMR databases tend to be noisy, our method was also able to differentiate misclassification from true cases; SLE patients with RA codes clustered separately from true RA cases.

Figure 1. Two dimensional representation of Set A generated using dimensionality reduction (tSNE) and clustering (DBScan).

Figure 2. Phenotype Spectrographs (PheSpecs) of four clusters characterized by “Other headache syndromes”, driven by codes relating to migraine, epilepsy, neurofibromatosis or brain cancer.

Conclusion: We have shown that EMR data can be used to identify and visualize latent structure in patient categorizations, using an approach based on dimension reduction and clustering machine learning techniques. Our method can identify misclassified patients as well as separate patients with similar problems into subsets with different associated medical problems. Our approach adds a new and powerful tool to aid in the discovery of novel risk factors in complex, heterogeneous diseases.

References:
[1] Denny, J.C. et al. Bioinformatics (2010)
[2] van der Maaten et al. Journal of Machine Learning Research (2008)
[3] Ester, M. et al. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining. (1996)

Disclosure of Interests: Marc Maurits: None declared, Thomas Huizinga Grant/research support from: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Consultant of: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Marcel Reinders: None declared, Soumya Raychaudhuri: None declared, Elizabeth Karlson: None declared, Erik van den Akker: None declared, Rachel Knevel: None declared
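The clustering step described in the Methods of this abstract (density-based clustering of points in a 2D embedding) can be sketched in a few lines. The version below is a plain, quadratic-time DBSCAN written for illustration; the points, `eps` and `min_pts` values are hypothetical, and the t-SNE embedding step is assumed to have been done already. It is not the authors' pipeline or parameter choice.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each 2D point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not yet visited"

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


# Hypothetical embedding: two tight groups of patients and one isolated point.
embedding = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(embedding, eps=1.5, min_pts=3)
```

On this toy embedding the two dense groups receive separate cluster ids and the isolated point is marked as noise, which mirrors how distinct patient clusters would be separated after the embedding step.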

Hyperspectral Imaging for Bloodstain Identification

Sensors ◽

10.3390/s21093045 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3045

Author(s):

Maheen Zulfiqar ◽

Muhammad Ahmad ◽

Ahmed Sohaib ◽

Manuel Mazzara ◽

Salvatore Distefano

Keyword(s):

Hyperspectral Imaging ◽

Dna Analysis ◽

Forensic Sciences ◽

Blood Protein ◽

Chemical Methods ◽

Acrylic Paint ◽

Spectral Bands ◽

Nail Polish ◽

Spatial Dimensions ◽

Non Destructive

Blood is key evidence for reconstructing crime scenes in forensic sciences. Blood identification can help to confirm a suspect, and for that reason several chemical methods are used to reconstruct the crime scene; however, these methods can affect subsequent DNA analysis. Therefore, this study presents a non-destructive method for bloodstain identification using Hyperspectral Imaging (HSI, 397–1000 nm range). The proposed method is based on the visualization of heme-component bands in the 500–700 nm spectral range. For experimental and validation purposes, a total of 225 blood (different donors) and non-blood (protein-based ketchup, rust acrylic paint, red acrylic paint, brown acrylic paint, red nail polish, rust nail polish, fake blood, and red ink) samples were deposited on three substrates (white cotton fabric, white tile, and PVC wall sheet). Each sample is an HSI cube of size 1000 × 512 × 224, in which 1000 × 512 are the spatial dimensions and 224 is the number of spectral bands. The samples are imaged for up to three days to include aging. Savitzky–Golay filtering has been used to highlight the subtle bands of all samples, particularly the aged ones. Based on the derivative spectrum, important spectral bands were selected to train five different classifiers (SVM, ANN, KNN, Random Forest, and Decision Tree). The comparative analysis reveals that the proposed method outperformed several state-of-the-art methods.
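The derivative spectrum mentioned in this abstract can be illustrated with a small Savitzky–Golay sketch. The abstract does not state the window length or polynomial order, so the 5-point, quadratic-fit first-derivative filter used here (coefficients −2, −1, 0, 1, 2 divided by 10) is an assumption, as are the function name and the toy spectrum.

```python
# 5-point Savitzky-Golay first-derivative coefficients for a quadratic fit.
SG_D1_COEFFS = (-2, -1, 0, 1, 2)
SG_D1_NORM = 10.0


def savgol_first_derivative(signal, step=1.0):
    """First derivative at each interior point of an evenly sampled signal.

    `step` is the spacing between samples (for example, the band spacing in nm).
    The two points at each end are dropped because the window does not fit.
    """
    half = len(SG_D1_COEFFS) // 2
    out = []
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        weighted = sum(c * v for c, v in zip(SG_D1_COEFFS, window))
        out.append(weighted / (SG_D1_NORM * step))
    return out


# Toy spectrum (hypothetical): reflectance following x**2 at x = 0..6.
spectrum = [float(x * x) for x in range(7)]
derivative = savgol_first_derivative(spectrum)  # slopes at x = 2, 3, 4
```

Because the filter fits a quadratic in each window, it recovers the slope of this quadratic toy spectrum exactly. On real spectra the same filter smooths noise while estimating the slope, which is what makes subtle absorption bands easier to see.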

IoT Bonet and Network Intrusion Detection using Dimensionality Reduction and Supervised Machine Learning

2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON) ◽

10.1109/uemcon51285.2020.9298146 ◽

2020 ◽

Author(s):

Madhuri Gurunathrao Desai ◽

Yong Shi ◽

Kun Suo

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Dimensionality Reduction ◽

Supervised Machine Learning ◽

Network Intrusion Detection ◽

Network Intrusion

Supervised Machine Learning Methods and Hyperspectral Imaging Techniques Jointly Applied for Brain Cancer Classification

Sensors ◽

10.3390/s21113827 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3827

Author(s):

Gemma Urbanos ◽

Alberto Martín ◽

Guillermo Vázquez ◽

Marta Villanueva ◽

Manuel Villa ◽

...

Keyword(s):

Machine Learning ◽

Blood Vessel ◽

Hyperspectral Imaging ◽

Imaging Techniques ◽

Venous Blood ◽

Healthy Tissue ◽

Supervised Machine Learning ◽

Support Vector ◽

Arterial Blood

Hyperspectral imaging (HSI) techniques do not require contact with patients and are non-ionizing as well as non-invasive. As a consequence, they have been extensively applied in the medical field. HSI is being combined with machine learning (ML) processes to obtain models to assist in diagnosis. In particular, the combination of these techniques has proven to be a reliable aid in the differentiation of healthy and tumor tissue during brain tumor surgery. ML algorithms such as support vector machine (SVM), random forest (RF) and convolutional neural networks (CNN) are used to make predictions and provide in-vivo visualizations that may assist neurosurgeons in being more precise, hence reducing damage to healthy tissue. In this work, thirteen in-vivo hyperspectral images from twelve different patients with high-grade gliomas (grade III and IV) have been selected to train SVM, RF and CNN classifiers. Five different classes have been defined during the experiments: healthy tissue, tumor, venous blood vessel, arterial blood vessel and dura mater. Overall accuracy (OACC) results vary from 60% to 95% depending on the training conditions. Finally, as far as the contribution of each band to the OACC is concerned, the results obtained in this work are 3.81 times greater than those reported in the literature.

Implementation Framework for a Blockchain-Based Federated Learning Model for Classification Problems

Symmetry ◽

10.3390/sym13071116 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1116

Author(s):

Zeba Mahmood ◽

Vacius Jusas

Keyword(s):

Machine Learning ◽

Learning Model ◽

Global Perspective ◽

Learning Technology ◽

Classification Problems ◽

Zero Knowledge ◽

Privacy And Security ◽

Implementation Framework ◽

The Past

This paper introduces a blockchain-based federated learning (FL) framework with incentives for participating nodes to enhance the accuracy of classification problems. Machine learning technology has developed rapidly worldwide over the past few years. The FL framework is based on the Ethereum blockchain and creates an autonomous ecosystem, where nodes compete to improve the accuracy of classification problems. With privacy being one of the biggest concerns, FL makes use of the blockchain-based approach to ensure privacy and security. Another important technology that underlies the FL framework is zero-knowledge proofs (ZKPs), which ensure that data uploaded to the network are accurate and private. Basically, ZKPs allow nodes to compete fairly by submitting only accurate models to the parameter server and being rewarded for doing so. We have conducted an analysis and found that ZKPs can help improve the accuracy of models submitted to the parameter server and facilitate the honest participation of all nodes in FL.

Prediction of Sugar Content in Port Wine Vintage Grapes Using Machine Learning and Hyperspectral Imaging

Processes ◽

10.3390/pr9071241 ◽

2021 ◽

Vol 9 (7) ◽

pp. 1241

Author(s):

Véronique Gomes ◽

Marco S. Reis ◽

Francisco Rovira-Más ◽

Ana Mendes-Ferreira ◽

Pedro Melo-Pinto

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Hyperspectral Imaging ◽

Sugar Content ◽

Hyperspectral Data ◽

Machine Learning Algorithms ◽

Port Wine ◽

Monitoring And Control ◽

High Quality ◽

Harvesting Stage

The high quality of Port wine is the result of a sequence of winemaking operations, such as harvesting, maceration, fermentation, extraction and aging. These stages require proper monitoring and control in order to consistently achieve the desired wine properties. The present work focuses on the harvesting stage, where the sugar content of grapes plays a key role as one of the critical maturity parameters. Our approach makes use of hyperspectral imaging technology to rapidly extract information from wine grape berries; the collected spectra are fed to machine learning algorithms that produce estimates of the sugar level. A consistent predictive capability is important for establishing the harvest date, as well as for selecting the best grapes to produce specific high-quality wines. We compared four different machine learning methods (including deep learning), assessing their generalization capacity for different vintages and varieties not included in the training process. Ridge regression, partial least squares, neural networks and convolutional neural networks were the methods considered to conduct this comparison. The results show that the estimated models can successfully predict the sugar content from hyperspectral data, with the convolutional neural network outperforming the other methods.
