Rapid Classification of Wheat Grain Varieties Using Hyperspectral Imaging and Chemometrics

Yidan Bao; Chunxiao Mi; Na Wu; Fei Liu; Yong He

doi:10.3390/app9194119

Applied Sciences ◽

10.3390/app9194119 ◽

2019 ◽

Vol 9 (19) ◽

pp. 4119 ◽

Cited By ~ 9

Author(s):

Yidan Bao ◽

Chunxiao Mi ◽

Na Wu ◽

Fei Liu ◽

Yong He

Keyword(s):

Hyperspectral Imaging ◽

Large Scale ◽

Scatter Correction ◽

Principal Component ◽

Support Vector ◽

Wheat Grain ◽

Successive Projections Algorithm ◽

Accurate Identification ◽

Wheat Varieties

Classifying wheat grain varieties is of great value because high varietal purity helps guarantee yield and quality. In this study, hyperspectral imaging combined with chemometric methods was applied to classify wheat seed varieties. Hyperspectral images of all samples were collected over the 874–1734 nm range. Exploratory analysis was first carried out using principal component analysis (PCA) and linear discriminant analysis (LDA). Spectral preprocessing methods, including standard normal variate (SNV), multiplicative scatter correction (MSC), and wavelet transform (WT), were introduced to eliminate the interference of instrumental and environmental factors, and their effects on the discriminant models were studied. Because the spectra may contain redundant information, PCA loadings, the successive projections algorithm (SPA), and random frog (RF) were applied to extract feature wavelengths. Classification models were developed based on full wavelengths and feature wavelengths using LDA, support vector machine (SVM), and extreme learning machine (ELM). The optimal model was finally used to generate a visualization map so that classification performance could be observed intuitively. Compared with the other models, ELM based on full wavelengths achieved the best accuracy, up to 91.3%. The overall results suggested that hyperspectral imaging is a potential tool for the rapid and accurate identification of wheat varieties, which could be applied to large-scale seed classification and quality detection in the modern seed industry.
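The two scatter-correction steps named in this abstract, SNV and MSC, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `snv` and `msc`, the use of the sample standard deviation, and the choice of the set's mean spectrum as the MSC reference are assumptions made for illustration, and each spectrum is assumed to be a plain list of reflectance values.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum
    by its own mean and (sample) standard deviation."""
    mu = mean(spectrum)
    sd = stdev(spectrum)  # assumption: n - 1 denominator
    return [(v - mu) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction.

    Each spectrum is regressed on a reference spectrum
    (assumed here to be the mean spectrum of the set), and the fitted
    intercept and slope are removed from it.
    """
    n_bands = len(spectra[0])
    reference = [mean(s[j] for s in spectra) for j in range(n_bands)]
    ref_mean = mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)

    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit: s ≈ intercept + slope * reference
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


# Hypothetical spectra: the same shape with different offset and gain,
# mimicking additive and multiplicative scatter effects.
base = [0.2, 0.4, 0.3, 0.8]
spectra = [base, [2 * v + 1 for v in base], [0.5 * v - 0.2 for v in base]]
snv_spectra = [snv(s) for s in spectra]
msc_spectra = msc(spectra)
```

In this toy case the three spectra differ only by offset and gain, so both corrections map them onto a single common curve. Real spectra also differ chemically, and that residual difference is what the downstream classifiers use.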

Identification of Geographical Origin of Chinese Chestnuts Using Hyperspectral Imaging with 1D-CNN Algorithm

Agriculture ◽

10.3390/agriculture11121274 ◽

2021 ◽

Vol 11 (12) ◽

pp. 1274

Author(s):

Xingpeng Li ◽

Hongzhe Jiang ◽

Xuesong Jiang ◽

Minghong Shi

Keyword(s):

Hyperspectral Imaging ◽

Geographical Origin ◽

Principal Component ◽

Brand Value ◽

Support Vector ◽

Successive Projections Algorithm ◽

One Dimensional ◽

Machine Learning Methods ◽

Origin Identification ◽

Successive Projections

Adulteration of Chinese chestnuts affects their quality, taste, and brand value. The objective of this study was to explore the feasibility of the hyperspectral imaging (HSI) technique for determining the geographical origin of Chinese chestnuts. An HSI system with a spectral range of 400–1000 nm was applied to identify a total of 417 Chinese chestnuts from three different geographical origins. Principal component analysis (PCA) was first used to investigate the differences between the average spectra of samples from different geographical origins. A deep-learning-based model (1D-CNN, one-dimensional convolutional neural network) was developed first, and then models based on full spectra and on optimal wavelengths were established using various machine learning methods, including partial least squares-discriminant analysis (PLS-DA) and particle swarm optimization-support vector machine (PSO-SVM). The optimal results based on full spectra for the 1D-CNN, PLS-DA, and PSO-SVM models were 97.12%, 97.12%, and 95.68%, respectively. Competitive adaptive reweighted sampling (CARS) and the successive projections algorithm (SPA) were individually used for wavelength selection, and the results of the simplified models generally improved. The comparison demonstrated that the prediction accuracies of SPA-PLS-DA and 1D-CNN both reached 97.12%, but 1D-CNN presented a higher Kappa coefficient than SPA-PLS-DA. Meanwhile, the sensitivities and specificities of the SPA-PLS-DA and 1D-CNN models were all above 90% for the samples from each geographical origin. These results indicated that both SPA-PLS-DA and 1D-CNN models combined with HSI have great potential for the geographical origin identification of Chinese chestnuts.
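The abstract reports two models with the same accuracy but different Kappa coefficients. A minimal sketch of Cohen's kappa computed from a confusion matrix shows how that can happen. The function name `cohen_kappa` and both matrices below are hypothetical and are not taken from the paper.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Two hypothetical classifiers, both 90% accurate on 100 samples.
balanced = [[45, 5], [5, 45]]    # classes equally represented
imbalanced = [[85, 5], [5, 5]]   # one class dominates

kappa_balanced = cohen_kappa(balanced)      # about 0.80
kappa_imbalanced = cohen_kappa(imbalanced)  # about 0.44
```

Kappa discounts the agreement expected by chance, so it depends on how the errors are distributed across classes and not only on how many there are.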

Spatial-spectral identification of abnormal leukocytes based on microscopic hyperspectral imaging technology

Journal of Innovative Optical Health Sciences ◽

10.1142/s1793545820500054 ◽

2020 ◽

Vol 13 (02) ◽

pp. 2050005

Author(s):

Xueqi Hu ◽

Jiahua Ou ◽

Mei Zhou ◽

Menghan Hu ◽

Li Sun ◽

...

Keyword(s):

Hyperspectral Imaging ◽

Lymphoblastic Leukemia ◽

Spectral Feature ◽

Support Vector ◽

Watershed Algorithm ◽

Spectral Features ◽

Accurate Identification ◽

Spatial Features ◽

Spectral Identification

Screening and diagnosis of abnormal leukocytes are crucial for diagnosing immune diseases and acute lymphoblastic leukemia (ALL). Because the deterioration of abnormal leukocytes is mainly due to changes in chromatin distribution, which significantly affect the absorption and reflection of light, spectral features have proved important for leukocyte classification and identification. This paper proposes an accurate identification method for healthy and abnormal leukocytes based on microscopic hyperspectral imaging (HSI) technology, which incorporates spectral information. The nucleus and cytoplasm are segmented by the morphological watershed algorithm. The spectral features are then extracted and combined with the spatial features. Based on this, a support vector machine (SVM) is applied to classify five types of leukocytes and abnormal leukocytes. Compared with other classification methods, the proposed method uses spectral features that highlight the differences between healthy and abnormal leukocytes, improving the accuracy of leukocyte classification and identification. This paper tests only one subtype of ALL; the proposed method could be applied to the detection of other leukemias in the future.

Chemometrics in Tandem with Hyperspectral Imaging for Detecting Authentication of Raw and Cooked Mutton Rolls

Foods ◽

10.3390/foods10092127 ◽

2021 ◽

Vol 10 (9) ◽

pp. 2127

Author(s):

Hongzhe Jiang ◽

Yi Yang ◽

Minghong Shi

Keyword(s):

Hyperspectral Imaging ◽

Meat Products ◽

Principal Component ◽

Support Vector ◽

Meat Industry ◽

Successive Projections Algorithm ◽

Classification Rate ◽

Meat Species ◽

Spectral Principal Component Analysis ◽

Cooked Meat

Authentication of meat and meat products is critical in the meat industry. Various methods, including DNA- and protein-based techniques, are accurate for assessing meat authenticity; however, they are destructive, expensive, or laborious. This study explores the feasibility of chemometrics in tandem with hyperspectral imaging (HSI) for identifying the substitution of raw and cooked mutton rolls by pork and duck rolls. Raw or cooked samples (n = 180) of three meat species were prepared, and hyperspectral images were collected in the range of 400–1000 nm. Spectra were extracted from representative regions of interest (ROIs), and spectral principal component analysis (PCA) revealed that PC1 and PC2 were effective for the identification. Different methods, including standard normal variate (SNV), first and second derivatives, and normalization, were individually employed for spectral preprocessing, and partial least squares-discriminant analysis (PLS-DA) and support vector machines (SVM) were individually applied to develop classification models for both the raw and the cooked samples. Results showed that the PLS-DA model developed from raw spectra achieved the highest correct classification rate (CCR), 100%, in all sets. Effective wavelengths selected by the successive projections algorithm (SPA) were then used to build simplified models, whose results matched those of the full-spectrum models regardless of the meat roll state. SPA-PLS-DA models were therefore used to visualize the classification of raw and cooked meat rolls. The meat species of both raw and cooked meat rolls were readily discernible pixel by pixel in the resulting classification maps. The results showed that HSI combined with chemometrics can accurately identify raw and cooked mutton rolls substituted by pork and duck rolls. This promising methodology provides a reference that can be extended to the classification or grading of other meat rolls.

Characterization of Pharmaceutical Tablets Using UV Hyperspectral Imaging as a Rapid Line to Line Analysis Tool

Sensors ◽

10.3390/s21134436 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4436

Author(s):

Mohammad Al Ktash ◽

Mona Stefanakis ◽

Barbara Boldrini ◽

Edwin Ostertag ◽

Marc Brecht

Keyword(s):

Hyperspectral Imaging ◽

Large Scale ◽

Solid Phase ◽

Ultra Violet ◽

Principal Component ◽

CCD Camera ◽

Conveyor Belt ◽

Hyperspectral Data ◽

Analysis Tool ◽

Sample Set

A laboratory prototype for hyperspectral imaging in the ultraviolet (UV) region from 225 to 400 nm was developed and used to rapidly characterize active pharmaceutical ingredients (APIs) in tablets. The APIs are ibuprofen (IBU), acetylsalicylic acid (ASA), and paracetamol (PAR). Two sample sets were used for comparison. Sample set one comprises tablets of 100% API, and sample set two consists of commercially available painkiller tablets. Reference measurements were performed on the pure APIs in liquid solution (transmission) and in the solid phase (reflection) using a commercial UV spectrometer. The spectroscopic part of the prototype is based on a pushbroom imager that contains a spectrograph and a charge-coupled device (CCD) camera. The tablets were scanned on a conveyor belt positioned inside a tunnel made of polytetrafluoroethylene (PTFE) to increase the homogeneity of illumination at the sample position. Principal component analysis (PCA) was used to differentiate the hyperspectral data of the drug samples. The first two PCs are sufficient to completely separate all samples. The rugged design of the prototype opens new possibilities for further development of this technique towards real large-scale application.

Land Cover Classification of Nine Perennial Crops Using Sentinel-1 and -2 Data

Remote Sensing ◽

10.3390/rs12010096 ◽

2019 ◽

Vol 12 (1) ◽

pp. 96 ◽

Cited By ~ 6

Author(s):

James Brinkhoff ◽

Justin Vardanega ◽

Andrew J. Robson

Keyword(s):

Land Cover ◽

Large Scale ◽

Satellite Image ◽

Resource Planning ◽

Support Vector ◽

Perennial Crop ◽

Perennial Crops ◽

Object Based ◽

RBF Kernel

Land cover mapping of intensive cropping areas facilitates an enhanced regional response to biosecurity threats and to natural disasters such as drought and flooding. Such maps also provide information for natural resource planning and for analysis of the temporal and spatial trends in crop distribution and gross production. In this work, 10-meter resolution land cover maps were generated over a 6200 km² area of the Riverina region in New South Wales (NSW), Australia, with a focus on locating the most important perennial crops in the region. The maps discriminated between 12 classes, including nine perennial crop classes. A satellite image time series (SITS) of freely available Sentinel-1 synthetic aperture radar (SAR) and Sentinel-2 multispectral imagery was used. A segmentation technique grouped spectrally similar adjacent pixels together to enable object-based image analysis (OBIA). K-means unsupervised clustering was used to filter training points and classify some map areas, which improved supervised classification of the remaining areas. The support vector machine (SVM) supervised classifier with a radial basis function (RBF) kernel gave the best results among several algorithms trialled. The accuracies of maps generated using several combinations of the multispectral and radar bands were compared to assess the relative value of each combination. An object-based post-classification refinement step was developed, enabling optimization of the tradeoff between producers’ accuracy and users’ accuracy. Accuracy was assessed against randomly sampled segments, and the final map achieved an overall count-based accuracy of 84.8% and an area-weighted accuracy of 90.9%. Producers’ accuracies for the perennial crop classes ranged from 78 to 100%, and users’ accuracies ranged from 63 to 100%. This work develops methods to generate detailed and large-scale maps that accurately discriminate between many perennial crops and can be updated frequently.
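The count-based accuracy measures quoted in this abstract can be sketched from a confusion matrix. The function name `map_accuracies`, the row and column convention, and the example counts are assumptions for illustration; the paper's area-weighted accuracy additionally needs per-class areas and is not reproduced here.

```python
def map_accuracies(confusion):
    """Overall, producers' and users' accuracy from a confusion matrix.

    Assumed convention: confusion[i][j] counts reference samples of
    class i that the map labels as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    overall = sum(confusion[i][i] for i in range(n)) / total
    # Producers' accuracy: share of each reference class mapped correctly
    producers = [confusion[i][i] / sum(confusion[i]) for i in range(n)]
    # Users' accuracy: share of each mapped class that is correct
    users = [confusion[i][i] / sum(row[i] for row in confusion)
             for i in range(n)]
    return overall, producers, users


# Hypothetical two-class example (counts are made up).
overall, producers, users = map_accuracies([[8, 2], [1, 9]])
```

Producers' accuracy corresponds to per-class recall and users' accuracy to per-class precision, which is why a refinement step can trade one against the other.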

Multiclass classification of leukemia cancer data using Fuzzy Support Vector Machine (FSVM) with feature selection using Principal Component Analysis (PCA)

Journal of Physics Conference Series ◽

10.1088/1742-6596/1725/1/012012 ◽

2021 ◽

Vol 1725 ◽

pp. 012012

Author(s):

I R Fauzi ◽

Z Rustam ◽

A Wibowo

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Feature Selection ◽

Principal Component ◽

Component Analysis ◽

Multiclass Classification ◽

Support Vector ◽

Fuzzy Support Vector Machine ◽

Cancer Data

Support Vector Machines in Big Data Classification: A Systematic Literature Review

10.21203/rs.3.rs-663359/v1 ◽

2021 ◽

Author(s):

Mohammad Hassan Almaspoor ◽

Ali Safaei ◽

Afshin Salajegheh ◽

Behrouz Minaei-Bidgoli

Keyword(s):

Machine Learning ◽

Big Data ◽

Large Scale ◽

Support Vector ◽

Research Areas ◽

Large Scale Data ◽

Training Samples ◽

Big Data Classification ◽

Scale Data

Classification is one of the most important and widely used tasks in machine learning; its purpose is to create a rule, based on a set of training samples, for assigning data to pre-existing categories. Employed successfully in many scientific and engineering areas, the Support Vector Machine (SVM) is among the most promising classification methods in machine learning. With the advent of big data, many machine learning methods have been challenged by big data characteristics. The standard SVM was proposed for batch learning, in which all data are available at the same time. The SVM has a high time complexity, i.e., increasing the number of training samples intensifies the need for computational resources and memory. Hence, many attempts have been made to adapt the SVM to online learning conditions and to large-scale data. This paper focuses on the analysis, identification, and classification of existing methods for adapting the SVM to online conditions and large-scale data. These methods might be employed to classify big data and to propose research areas for future studies. Considering its advantages, the SVM can be among the first options for adaptation to big data and for big data classification. For this purpose, appropriate techniques should be developed for data preprocessing in order to convert data into an appropriate form for learning. Existing frameworks for parallel and distributed processing should also be employed so that SVMs can be made scalable and properly online, and thus able to handle big data.

Nondestructive Testing and Visualization of Catechin Content in Black Tea Fermentation Using Hyperspectral Imaging

Sensors ◽

10.3390/s21238051 ◽

2021 ◽

Vol 21 (23) ◽

pp. 8051

Author(s):

Chunwang Dong ◽

Chongshan Yang ◽

Zhongyuan Liu ◽

Rentian Zhang ◽

Peng Yan ◽

...

Keyword(s):

Spectral Data ◽

Prediction Accuracy ◽

Visual Analysis ◽

Population Analysis ◽

Scatter Correction ◽

Principal Component ◽

Black Tea ◽

Support Vector ◽

Epicatechin Gallate ◽

Variable Combination

Catechin is a major reactive substance involved in black tea fermentation. It has a determinant effect on the final quality and taste of made teas. In this study, we applied hyperspectral technology with chemometric methods and used different pretreatment and variable filtering algorithms to reduce noise interference. After the dimensionality of the spectral data was reduced by principal component analysis (PCA), an optimal prediction model for catechin content was constructed, followed by visual analysis of the catechin content of leaves fermented for different periods of time. The results showed that zero mean normalization (Z-score), multiplicative scatter correction (MSC), and standard normal variate (SNV) can effectively improve model accuracy, while the shuffled frog leaping algorithm (SFLA), the variable combination population analysis genetic algorithm (VCPA-GA), and variable combination population analysis iteratively retaining informative variables (VCPA-IRIV) can significantly reduce the spectral data and enhance the calculation speed of the model. We found that nonlinear models performed better than linear ones. The prediction accuracy of the extreme learning machine (ELM), based on optimal variables, for the total amount of catechins and for epicatechin gallate (ECG) reached 0.989 and 0.994, respectively, and the prediction accuracy of the support vector regression (SVR) models for the content of EGC, C, EC, and EGCG reached 0.972, 0.993, 0.990, and 0.994, respectively. The optimal model offers accurate prediction, and visual analysis can determine the distribution of catechin content in leaves fermented for different periods. The findings provide significant reference material for intelligent digital assessment of black tea during processing.

iBitter-Fuse: A Novel Sequence-Based Bitter Peptide Predictor by Fusing Multi-View Features

International Journal of Molecular Sciences ◽

10.3390/ijms22168958 ◽

2021 ◽

Vol 22 (16) ◽

pp. 8958

Author(s):

Phasit Charoenkwan ◽

Chanin Nantasenamat ◽

Md. Mehedi Hasan ◽

Mohammad Ali Moni ◽

Pietro Lio’ ◽

...

Keyword(s):

Machine Learning ◽

Large Scale ◽

De Novo ◽

Predictive Performance ◽

Support Vector ◽

Sufficient Information ◽

Self Assessment ◽

Accurate Identification ◽

Bitter Peptides ◽

Accurate Performance

Accurate identification of bitter peptides is of great importance for better understanding their biochemical and biophysical properties. Machine learning-based methods have become an effective avenue for identifying potential bitter peptides from large-scale protein datasets. Although a few machine learning-based predictors have been developed for identifying the bitterness of peptides, their prediction performance could be improved. In this study, we developed a new predictor (named iBitter-Fuse) for achieving more accurate identification of bitter peptides. In the proposed iBitter-Fuse, we integrated a variety of feature encoding schemes to provide sufficient information from different aspects, namely compositional information and physicochemical properties. To enhance the predictive performance, a customized genetic algorithm utilizing self-assessment-report (GA-SAR) was employed to identify informative features, and the optimal ones were then input into a support vector machine (SVM)-based classifier to develop the final model (iBitter-Fuse). Benchmarking experiments based on both 10-fold cross-validation and independent tests indicated that iBitter-Fuse achieved more accurate performance than state-of-the-art methods. To facilitate the high-throughput identification of bitter peptides, the iBitter-Fuse web server was established and made freely available online. It is anticipated that iBitter-Fuse will be a useful tool for aiding the discovery and de novo design of bitter peptides.

Comparison and Combination of Thermal, Fluorescence, and Hyperspectral Imaging for Monitoring Fusarium Head Blight of Wheat on Spikelet Scale

Sensors ◽

10.3390/s19102281 ◽

2019 ◽

Vol 19 (10) ◽

pp. 2281 ◽

Cited By ~ 16

Author(s):

Anne-Katrin Mahlein ◽

Elias Alisaac ◽

Ali Al Masri ◽

Jan Behmann ◽

Heinz-Wilhelm Dehne ◽

...

Keyword(s):

Fusarium Head Blight ◽

Hyperspectral Imaging ◽

Optical Sensors ◽

Fusarium Culmorum ◽

Maximum Temperature ◽

Support Vector ◽

Head Blight ◽

Maximum Temperature Difference ◽

Simple Ratio

Optical sensors have shown high capabilities to improve the detection and monitoring of plant disease development. This study was designed to compare the feasibility of different sensors to characterize Fusarium head blight (FHB) caused by Fusarium graminearum and Fusarium culmorum. Under controlled conditions, time-series measurements were performed with infrared thermography (IRT), chlorophyll fluorescence imaging (CFI), and hyperspectral imaging (HSI) starting 3 days after inoculation (dai). IRT allowed the visualization of temperature differences within the infected spikelets beginning 5 dai. At the same time, a disorder of the photosynthetic activity was confirmed by CFI via maximal fluorescence yields of spikelets (Fm) 5 dai. The pigment-specific simple ratios PSSRa and PSSRb derived from HSI allowed discrimination between Fusarium-infected and non-inoculated spikelets 3 dai. This effect on assimilation started earlier and was more pronounced with F. graminearum. Except for the maximum temperature difference (MTD), all parameters derived from the different sensors were significantly correlated with each other and with disease severity (DS). A support vector machine (SVM) classification of parameters derived from IRT, CFI, or HSI allowed the differentiation between non-inoculated and infected spikelets 3 dai with an accuracy of 78%, 56%, and 78%, respectively. Combining the IRT-HSI or CFI-HSI parameters improved the accuracy to 89% 30 dai.
