Use of advanced statistical learning methods and principal component analysis in quantitative structure–genotoxicity relationship study of amines

Yueying Ren; Baowei Zhao; Xiaojun Yao

doi:10.1135/cccc2010143

Collection of Czechoslovak Chemical Communications ◽

10.1135/cccc2010143 ◽

2011 ◽

Vol 76 (4) ◽

pp. 243-264 ◽

Cited By ~ 1

Author(s):

Yueying Ren ◽

Baowei Zhao ◽

Xiaojun Yao

Keyword(s):

Principal Component Analysis ◽

Statistical Learning ◽

Linear Models ◽

Nonlinear Models ◽

Nonlinear Modeling ◽

Principal Component ◽

Component Analysis ◽

Support Vector ◽

Learning Methods ◽

Test Set

This paper highlights the use of advanced nonlinear modeling and subset selection techniques to build a predictive model for the genotoxicity of amines. The factors that determine model reliability were each considered carefully. Chemicals were represented by a large number of CODESSA descriptors. The full sample was divided into a training set and a test set by principal component analysis (PCA). Six descriptors selected by the best multi-linear regression (BMLR) method in the CODESSA program were used as inputs to build nonlinear models with advanced statistical learning methods such as support vector machine (SVM) and projection pursuit regression (PPR). The models were validated in three ways: internal cross-validation (CV), a test set, and an independent validation set. The analysis shows that the nonlinear models produced better results than the linear models, with performance ranked as PPR > SVM > linear SVM ≥ BMLR. The relationships between the descriptors and the mutagenic behavior of the compounds are also discussed.
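The abstract does not say how PCA was used to divide the sample, so the following is only a minimal sketch under an assumed scheme: project each compound's descriptor vector onto the first principal component, sort by score, and send every fourth compound to the test set so that both sets span the same region of descriptor space. The function names `first_pc_scores` and `pca_split` and the `every` parameter are hypothetical and do not come from the paper.

```python
def first_pc_scores(rows, iters=200):
    """Scores of mean-centered rows on the first principal component.

    The component is found by power iteration on the covariance matrix.
    Assumes the data are not constant and that the starting vector of
    ones is not orthogonal to the leading component.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


def pca_split(rows, every=4):
    """Return (train_indices, test_indices).

    Hypothetical scheme: rank samples by their first-PC score and take
    one sample out of each consecutive group of `every` as a test sample.
    """
    scores = first_pc_scores(rows)
    order = sorted(range(len(rows)), key=lambda i: scores[i])
    test = sorted(order[every // 2::every])
    train = sorted(set(order) - set(test))
    return train, test


# Toy descriptor table: 20 "compounds" with 2 descriptors each.
rows = [[float(i), 2.0 * i + (i % 3)] for i in range(20)]
train, test = pca_split(rows, every=4)
```

Picking test samples at regular intervals along the score axis is one simple way to keep the test set inside the range covered by the training set; a real study would typically use more than one component.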

Longitudinal Crack Detection Approach Based on Principal Component Analysis and Support Vector Machine for Slab Continuous Casting

steel research international ◽

10.1002/srin.202100168 ◽

2021 ◽

Author(s):

Haiyang Duan ◽

Jingjing Wei ◽

Lin Qi ◽

Xudong Wang ◽

Yu Liu ◽

...

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Continuous Casting ◽

Crack Detection ◽

Longitudinal Crack ◽

Principal Component ◽

Component Analysis ◽

Support Vector ◽

Slab Continuous Casting ◽

Detection Approach

Prediction of China’s Energy Consumption Based on Robust Principal Component Analysis and PSO-LSSVM Optimized by the Tabu Search Algorithm

Energies ◽

10.3390/en12010196 ◽

2019 ◽

Vol 12 (1) ◽

pp. 196 ◽

Cited By ~ 3

Author(s):

Lihui Zhang ◽

Riletu Ge ◽

Jianxue Chai

Keyword(s):

Principal Component Analysis ◽

Energy Consumption ◽

Tabu Search ◽

Industrial Structure ◽

Principal Component ◽

Component Analysis ◽

Support Vector ◽

Forecasting Model ◽

Robust Principal Component Analysis ◽

Consumption Structure

China’s energy consumption issues are closely associated with global climate issues, and the scale of energy consumption, peak energy consumption, and consumption investment are all the focus of national attention. To forecast China’s energy consumption accurately, this article first selected GDP, population, industrial structure and energy consumption structure, energy intensity, total imports and exports, fixed asset investment, energy efficiency, urbanization, the level of consumption, and fixed investment in the energy industry as a preliminary set of factors. Secondly, we corrected the traditional principal component analysis (PCA) algorithm from the perspective of eliminating “bad points”, judging whether a sample is a “bad point” based on signal reconstruction ideas. On this basis, we put forward a robust principal component analysis (RPCA) algorithm and chose the first five principal components as the main factors affecting energy consumption, including GDP, population, industrial structure and energy consumption structure, and urbanization. Then, we applied the Tabu search (TS) algorithm to the least squares support vector machine (LSSVM) optimized by the particle swarm optimization (PSO) algorithm to forecast China’s energy consumption. We collected data from 1996 to 2010 as the training set and from 2010 to 2016 as the test set. For comparison, the sample data were also input into the LSSVM algorithm and the PSO-LSSVM algorithm. We used statistical indicators, including the coefficient of determination (R2), the root mean square error (RMSE), and the mean radial error (MRE), to compare the training results of the three forecasting models, which demonstrated that the proposed TS-PSO-LSSVM forecasting model had higher prediction accuracy, better generalization ability, and higher training speed. Finally, the TS-PSO-LSSVM forecasting model was applied to forecast the energy consumption of China from 2017 to 2030. According to the predictions, China’s energy consumption increases gradually from 2017 to 2030 and will break through 6000 million tons in 2030. However, the growth rate is gradually slowing, and China’s energy consumption economy will move to a state of diminishing returns around 2026, which suggests that China should put more emphasis on the field of energy investment.
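Two of the indicators named in the abstract, R2 and RMSE, have standard definitions that can be sketched in a few lines. This is an illustrative sketch, not the authors' code; the function names `rmse` and `r2` and the toy numbers are assumptions. MRE is left out because the abstract does not define it.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values (not data from the paper): every prediction is off by one unit.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, predicted), r2(observed, predicted))
```

A lower RMSE and an R2 closer to 1 both indicate a better fit, which is the sense in which the three forecasting models are compared.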

Multi-View Face Detection Based on Kernel Principal Component Analysis and Kernel Support Vector Techniques

International Journal on Soft Computing ◽

10.5121/ijsc.2011.2201 ◽

2011 ◽

Vol 2 (2) ◽

pp. 1-13 ◽

Cited By ~ 5

Author(s):

Muzhir Shaban Al Ani ◽

Alaa Sulaiman Al Waisy

Keyword(s):

Principal Component Analysis ◽

Face Detection ◽

Principal Component ◽

Component Analysis ◽

Kernel Principal Component Analysis ◽

Support Vector

Spam Detection Approach Based on C-Support Vector Machine and Kernel Principal-Component Analysis

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing ◽

10.1109/iih-msp.2014.64 ◽

2014 ◽

Author(s):

Shu Geng ◽

Liu Lv ◽

Rongjun Liu

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Principal Component ◽

Component Analysis ◽

Kernel Principal Component Analysis ◽

Support Vector ◽

Spam Detection ◽

Detection Approach

Multimode Monitoring of Oxy-Gas Combustion Through Flame Imaging, Principal Component Analysis, and Kernel Support Vector Machine

Combustion Science and Technology ◽

10.1080/00102202.2016.1250749 ◽

2016 ◽

Vol 189 (5) ◽

pp. 776-792 ◽

Cited By ~ 3

Author(s):

Xiaojing Bai ◽

Gang Lu ◽

Md Moinul Hossain ◽

Yong Yan ◽

Shi Liu

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Principal Component ◽

Component Analysis ◽

Support Vector ◽

Gas Combustion ◽

Kernel Support Vector Machine ◽

Flame Imaging

Multiclass classification of leukemia cancer data using Fuzzy Support Vector Machine (FSVM) with feature selection using Principal Component Analysis (PCA)

Journal of Physics Conference Series ◽

10.1088/1742-6596/1725/1/012012 ◽

2021 ◽

Vol 1725 ◽

pp. 012012

Author(s):

I R Fauzi ◽

Z Rustam ◽

A Wibowo

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Feature Selection ◽

Principal Component ◽

Component Analysis ◽

Multiclass Classification ◽

Support Vector ◽

Fuzzy Support Vector Machine ◽

Cancer Data

Face Recognition Based on Principal Component Analysis and Support Vector Machine Algorithms

10.23919/ccc52363.2021.9550727 ◽

2021 ◽

Author(s):

Yanbang Zhang ◽

Fen Zhang ◽

Lei Guo

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Face Recognition ◽

Principal Component ◽

Component Analysis ◽

Support Vector

Batch process monitoring based on global enhanced multiple neighborhoods preserving embedding

Transactions of the Institute of Measurement and Control ◽

10.1177/01423312211044742 ◽

2021 ◽

pp. 014233122110447

Author(s):

Hongjuan Yao ◽

Xiaoqiang Zhao ◽

Wei Li ◽

Yongyong Hui

Keyword(s):

Principal Component Analysis ◽

Fault Detection ◽

Objective Function ◽

Principal Component ◽

Batch Process ◽

Component Analysis ◽

Support Vector ◽

Support Vector Data Description ◽

Order Information ◽

Multiple Neighborhoods

Batch processes generally have varying dynamic characteristics that cause low fault detection rates and high false alarm rates, so monitoring them is both necessary and urgent. This paper proposes a fault detection strategy for dynamic batch processes based on global enhanced multiple neighborhoods preserving embedding. Firstly, the angle neighbor is defined and selected to compensate for the insufficient expression of spatial similarity between samples when only the distance neighbor is used, and the time neighbor is introduced to describe the time correlations between samples. These three types of neighbors can fully characterize the similarity of the samples in time and space. Secondly, considering the minimum reconstruction error and the order information of the three types of neighbors, an enhanced objective function is constructed to prevent the loss of order information when neighborhood preserving embedding (NPE) calculates the reconstruction weights. Furthermore, the enhanced objective function and a global objective function are combined to extract both global and local features, to describe process dynamics, and to visualize process data in a low-dimensional space. Finally, a monitoring index based on support vector data description is constructed to eliminate the adverse effects of non-Gaussian data on monitoring performance. The advantages of the proposed method over principal component analysis, neighborhood preserving embedding, dynamic principal component analysis, and time NPE are demonstrated by a numerical example and a simulation of the penicillin fermentation process.
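The three neighbor types can be illustrated with a short sketch. The abstract does not give the exact definitions, so the ones below are assumptions: distance neighbors are the k nearest samples by Euclidean distance, angle neighbors are the k samples with the largest cosine similarity, and time neighbors are the k samples immediately before and after in sampling order. All function names are hypothetical.

```python
import math


def distance_neighbors(X, i, k):
    """Indices of the k samples closest to X[i] in Euclidean distance."""
    ranked = sorted((math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i)
    return [j for _, j in ranked[:k]]


def angle_neighbors(X, i, k):
    """Indices of the k samples with the smallest angle to X[i].

    Uses cosine similarity; assumes no sample is the zero vector.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))

    # Sort by negative similarity so the most similar samples come first.
    ranked = sorted((-cosine(X[i], X[j]), j) for j in range(len(X)) if j != i)
    return [j for _, j in ranked[:k]]


def time_neighbors(n, i, k):
    """Indices of up to k samples before and after sample i in time order."""
    return [j for j in range(i - k, i + k + 1) if j != i and 0 <= j < n]


# Toy samples listed in time order (not data from the paper).
X = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [1.1, 0.1], [10.0, 0.0]]
print(distance_neighbors(X, 0, 2))  # nearest in space
print(angle_neighbors(X, 0, 2))     # same direction, any magnitude
print(time_neighbors(len(X), 2, 1)) # adjacent in time
```

In the toy data, sample 4 is far from sample 0 but points in the same direction, so it is an angle neighbor without being a distance neighbor. This is the kind of similarity that distance alone does not capture.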

Research on Recognition Method of Driving Fatigue State Based on Sample Entropy and Kernel Principal Component Analysis

Entropy ◽

10.3390/e20090701 ◽

2018 ◽

Vol 20 (9) ◽

pp. 701 ◽

Cited By ~ 6

Author(s):

Beige Ye ◽

Taorong Qiu ◽

Xiaoming Bai ◽

Ping Liu

Keyword(s):

Principal Component Analysis ◽

Recognition Accuracy ◽

Principal Component ◽

Component Analysis ◽

Sample Entropy ◽

Kernel Principal Component Analysis ◽

Support Vector ◽

Recognition Method ◽

State Recognition ◽

Driving Fatigue

The electroencephalography (EEG) signals collected in driving fatigue state recognition research are nonlinear, and the recognition accuracy of EEG-based driving fatigue recognition methods is still unsatisfactory. To address this, this paper proposes a driving fatigue recognition method based on sample entropy (SE) and kernel principal component analysis (KPCA), which combines the high recognition accuracy of sample entropy with the strengths of KPCA in dimensionality reduction for nonlinear principal components and its strong nonlinear processing capability. Using a support vector machine (SVM) classifier, the proposed method (called SE_KPCA) is tested on EEG data and compared with methods based on fuzzy entropy (FE), combination entropy (CE), and the three kinds of entropy (SE, FE, and CE) merged with KPCA. Experimental results show that the method is effective.
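Sample entropy, the feature used in this method, can be sketched as follows. This is a minimal illustration of the usual definition, not the authors' implementation: the embedding dimension `m` and the absolute tolerance `r` are assumed defaults (in practice `r` is often set to a fraction of the signal's standard deviation), and the function name is hypothetical.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of sequence x: -ln(A / B).

    B counts pairs of length-m templates within tolerance r (Chebyshev
    distance), A counts the same for length m + 1. Self-matches are
    excluded, and both counts use the same n - m starting positions.
    Returns math.inf when no matches are found.
    """
    n = len(x)

    def count_matches(length):
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf
    return -math.log(a / b)


# A perfectly regular signal: every length-2 match extends to length 3.
regular = [0.0, 1.0] * 10
print(sample_entropy(regular))
```

Lower values indicate a more regular, predictable signal. The method described above would compute such a value per EEG channel or segment before reducing the feature set with KPCA.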

Fault diagnosis of modular multilevel converter based on principal component analysis and support vector machine

Journal of Physics Conference Series ◽

10.1088/1742-6596/2030/1/012086 ◽

2021 ◽

Vol 2030 (1) ◽

pp. 012086

Author(s):

Siyu Jiang ◽

Bin Wang ◽

Wanwan Xu

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Fault Diagnosis ◽

Principal Component ◽

Component Analysis ◽

Support Vector ◽

Modular Multilevel Converter ◽

Multilevel Converter
