Dimensionality Reduction of Sensorial Features by Principal Component Analysis for ANN Machine Learning in Tool Condition Monitoring of CFRP Drilling

Detecting disease at an early stage is a major challenge. Datasets for different diseases are available online, each with a different number of features for the disease in question. Many dimensionality reduction and feature extraction techniques are now used to reduce the number of features in a dataset and to find the most relevant ones. This paper explores how the performance of different machine learning models changes when Principal Component Analysis (PCA) is applied as a dimensionality reduction technique to datasets on chronic kidney disease and cardiovascular disease. The authors apply Logistic Regression, K-Nearest Neighbour (KNN), Naïve Bayes, Support Vector Machine and Random Forest models to the datasets and compare the performance of each model with and without PCA. A key challenge in data mining and machine learning is building accurate and computationally efficient classifiers for medical applications. With an accuracy of 100% for chronic kidney disease and 85% for heart disease, the KNN classifier and logistic regression proved to be the best-performing prediction methods for kidney and heart disease, respectively.
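The with/without-PCA comparison described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the data are synthetic (not the kidney or heart disease datasets), the helper names `pca_fit`, `pca_transform` and `knn_predict` are hypothetical, and only a plain k-nearest-neighbour classifier is shown rather than the five models the authors compare.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (as rows)."""
    mean = X.mean(axis=0)
    # SVD of the centred data: rows of Vt are principal axes, ordered by variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, axes):
    """Project data onto the principal axes learned from the training set."""
    return (X - mean) @ axes.T

def knn_predict(X_train, y_train, X_test, k=5):
    """k-nearest-neighbour majority vote (binary labels 0/1, odd k)."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

# Synthetic stand-in for a tabular medical dataset (assumption, not real data).
rng = np.random.default_rng(0)
n, p = 400, 20
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[:, :3] += 2.0 * y[:, None]          # only the first three features carry signal

train, test = np.arange(0, 300), np.arange(300, n)

# Standardise using training statistics only, so no test information leaks in.
mu, sd = X[train].mean(axis=0), X[train].std(axis=0)
Xs = (X - mu) / sd

# Baseline: classifier on all 20 standardised features.
acc_full = np.mean(knn_predict(Xs[train], y[train], Xs[test]) == y[test])

# Same classifier on the first 5 principal components.
mean, axes = pca_fit(Xs[train], n_components=5)
Z_train = pca_transform(Xs[train], mean, axes)
Z_test = pca_transform(Xs[test], mean, axes)
acc_pca = np.mean(knn_predict(Z_train, y[train], Z_test) == y[test])

print(f"k-NN accuracy without PCA: {acc_full:.3f}")
print(f"k-NN accuracy with PCA (5 components): {acc_pca:.3f}")
```

The number of retained components (5 here) is an arbitrary choice for the sketch; in practice it would be tuned, and the accuracies printed say nothing about the figures reported in the abstract.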

Classification of Observations through Combination of the Dimension Reduction and the Cluster Analysis

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i8.13 ◽

2017 ◽

Vol 7 (8) ◽

pp. 30

Author(s):

Hyeuk Kim

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Cluster Analysis ◽

Unsupervised Learning ◽

Principal Component ◽

Component Analysis ◽

Baseball Players ◽

Partitioning Around Medoids ◽

Different Characteristics

Unsupervised learning in machine learning divides data into several groups. Observations in the same group have similar characteristics, and observations in different groups have different characteristics. In this paper, we classify data by partitioning around medoids, which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and draw a graph using two components as the axes. Through this procedure, we interpret the meaning of the clustering graphically. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes it easy to identify the characteristics of each group.
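A rough sketch of the procedure in this abstract, clustering with medoids and then projecting onto two principal components for plotting, might look as follows. The data are synthetic (not baseball statistics), the function name `pam` is hypothetical, and the search is a simplified greedy swap phase with a naive start rather than the full BUILD and SWAP procedure of the original partitioning-around-medoids algorithm.

```python
import numpy as np

def pam(D, k, max_iter=100):
    """Greedy PAM-style swap search on a precomputed distance matrix D."""
    n = D.shape[0]
    # Naive deterministic start: the k points with the smallest total distance.
    medoids = list(np.argsort(D.sum(axis=1))[:k])
    cost = D[:, medoids].min(axis=1).sum()
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for h in range(n):
                if h in medoids:
                    continue
                # Try replacing medoid i by point h; keep the swap if cost drops.
                trial = medoids.copy()
                trial[i] = h
                trial_cost = D[:, trial].min(axis=1).sum()
                if trial_cost < cost - 1e-12:
                    medoids, cost, improved = trial, trial_cost, True
        if not improved:
            break
    labels = D[:, medoids].argmin(axis=1)   # assign each point to nearest medoid
    return np.array(medoids), labels, cost

# Synthetic data: three well-separated groups of 30 observations, 6 features each.
rng = np.random.default_rng(1)
centers = np.array([0.0, 10.0, 20.0])
truth = np.repeat(np.arange(3), 30)
X = centers[truth][:, None] + rng.normal(size=(90, 6))

# Pairwise Euclidean distances, then clustering around medoids.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
medoids, labels, cost = pam(D, k=3)

# PCA scores on the first two components, usable as x/y coordinates of a plot.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T

for c, m in enumerate(medoids):
    print(f"cluster {c}: medoid row {m}, size {np.sum(labels == c)}, "
          f"PC scores {scores[m].round(2)}")
```

Because each medoid is an actual observation, it can be read off directly as a representative member of its cluster. The `scores` array can be passed to any plotting library, coloured by `labels`, to obtain the kind of two-component graph the abstract mentions.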

Analysis of the Bath Motion in the MM-SQC Dynamics Using Unsupervised Machine Learning Dimensionality Reduction Approaches: Principal Component Analysis

10.26434/chemrxiv.13332530 ◽

2020 ◽

Author(s):

Jiawei Peng ◽

Yu Xie ◽

Deping Hu ◽

Zhenggang Lan

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Collective Motion ◽

Principal Component ◽

Component Analysis ◽

Nonadiabatic Dynamics ◽

Trajectory Data ◽

Unsupervised Machine Learning ◽

Physical Knowledge ◽

Vibronic Couplings

The system-plus-bath model is an important tool for understanding nonadiabatic dynamics in large molecular systems. Understanding the collective motion of a huge number of bath modes is essential to reveal their key roles in the overall dynamics. We apply principal component analysis (PCA) to investigate the bath motion, based on the massive data generated from MM-SQC (the symmetrical quasi-classical dynamics method based on the Meyer-Miller mapping Hamiltonian) nonadiabatic dynamics simulations of excited-state energy transfer in a Frenkel-exciton model. The PCA method clearly shows that two types of bath modes, those that display strong vibronic couplings and those whose frequencies are close to the electronic transition, are very important to the nonadiabatic dynamics. These observations are fully consistent with physical insight. This conclusion is obtained purely from the PCA analysis of the trajectory data, without heavy reliance on pre-defined physical knowledge. The results show that the PCA approach, one of the simplest unsupervised machine learning methods, is very powerful for analyzing complicated nonadiabatic dynamics in the condensed phase involving many degrees of freedom.

Online tool condition monitoring for ultrasonic metal welding via sensor fusion and machine learning

Journal of Manufacturing Processes ◽

10.1016/j.jmapro.2020.12.050 ◽

2021 ◽

Vol 62 ◽

pp. 806-816

Author(s):

Qasim Nazir ◽

Chenhui Shao

Keyword(s):

Machine Learning ◽

Sensor Fusion ◽

Condition Monitoring ◽

Tool Condition Monitoring ◽

Online Tool ◽

Ultrasonic Metal Welding ◽

Tool Condition ◽

Metal Welding

Analysis of bath motion in MM-SQC dynamics via dimensionality reduction approach: Principal component analysis

The Journal of Chemical Physics ◽

10.1063/5.0039743 ◽

2021 ◽

Vol 154 (9) ◽

pp. 094122

Author(s):

Jiawei Peng ◽

Yu Xie ◽

Deping Hu ◽

Zhenggang Lan

Keyword(s):

Principal Component Analysis ◽

Dimensionality Reduction ◽

Principal Component ◽

Component Analysis ◽

Reduction Approach

Comparative Analysis of Machine Learning Techniques with Principal Component Analysis on Kidney and Heart Disease

10.1109/icesc51422.2021.9533011 ◽

2021 ◽

Author(s):

Reena Chandra ◽

Manoj Kapil ◽

Avinash Sharma

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Heart Disease ◽

Comparative Analysis ◽

Principal Component ◽

Component Analysis ◽

Machine Learning Techniques ◽

Learning Techniques

Shape principal component analysis as a targetless photogrammetric technique for condition monitoring of rotating machines

Measurement ◽

10.1016/j.measurement.2018.09.065 ◽

2019 ◽

Vol 132 ◽

pp. 408-422 ◽

Cited By ~ 3

Author(s):

Benjamin Gwashavanhu ◽

P. Stephan Heyns ◽

Abrie J. Oberholster

Keyword(s):

Principal Component Analysis ◽

Condition Monitoring ◽

Principal Component ◽

Component Analysis ◽

Rotating Machines

Criteria for choosing the number of dimensions in a principal component analysis: An empirical assessment

10.5753/sbbd.2020.13632 ◽

2020 ◽

Author(s):

Renata Silva ◽

Daniel Oliveira ◽

Davi Pereira Santos ◽

Lucio F.D. Santos ◽

Rodrigo Erthal Wilson ◽

...

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Hypothesis Test ◽

Feature Learning ◽

Principal Component ◽

Component Analysis ◽

Scree Plot ◽

Open Issue ◽

Chained Tasks ◽

High Dimensional Datasets

Principal component analysis (PCA) is an efficient model for the optimization problem of finding d' axes of a subspace R^{d'} ⊆ R^{d} so that the mean squared distances from a given set R of points to the axes are minimal. Steadily employed since 1901 in different scenarios, e.g., mechanics, PCA has also become an important link in machine learning chained tasks, such as feature learning and AutoML designs. A frequent yet open issue in supervised problems is how many PCA axes are needed to tune the performance of machine learning constructs. Accordingly, we investigate the behavior of six independent and uncoupled criteria for estimating the number of PCA axes, namely Scree-Plot %, Scree-Plot Gap, Kaiser-Guttman, Broken-Stick, p-Score, and 2D. In total, we evaluate the performance of those approaches on 20 high-dimensional datasets by using (i) four different classifiers, and (ii) a hypothesis test on the reported F-Measures. Results indicate that the Broken-Stick and Scree-Plot % criteria consistently outperformed the competitors in supervised tasks, whereas the Kaiser-Guttman and Scree-Plot Gap estimators delivered poor performance in the same scenarios.
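Three of the criteria named in this abstract have widely used textbook forms, sketched below on synthetic data. The function names are hypothetical, the 90% threshold for the cumulative-variance rule is an arbitrary assumption, and the exact variants evaluated by the authors may differ from these common definitions.

```python
import numpy as np

def correlation_eigenvalues(X):
    """Eigenvalues of the feature correlation matrix, largest first."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    corr = (Z.T @ Z) / X.shape[0]
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def kaiser_guttman(eigvals):
    """Keep components whose eigenvalue exceeds the average eigenvalue
    (which equals 1 for a correlation matrix)."""
    return int(np.sum(eigvals > eigvals.mean()))

def broken_stick(eigvals):
    """Keep leading components whose variance share exceeds the expected share
    of the k-th longest piece of a stick broken at random into p pieces."""
    p = len(eigvals)
    expected = np.array([np.sum(1.0 / np.arange(k, p + 1)) / p
                         for k in range(1, p + 1)])
    share = eigvals / eigvals.sum()
    keep = 0
    for s, e in zip(share, expected):
        if s <= e:
            break
        keep += 1
    return keep

def cumulative_variance(eigvals, threshold=0.9):
    """Smallest number of components whose cumulative share reaches threshold."""
    ratio = np.cumsum(eigvals) / eigvals.sum()
    return min(int(np.searchsorted(ratio, threshold)) + 1, len(eigvals))

# Synthetic data: 10 features driven by 3 independent latent factors plus noise.
rng = np.random.default_rng(3)
n = 500
factors = rng.normal(size=(n, 3))
block = np.repeat([0, 1, 2], [4, 3, 3])   # which factor drives each feature
X = factors[:, block] + 0.3 * rng.normal(size=(n, 10))

eigvals = correlation_eigenvalues(X)
k_kaiser = kaiser_guttman(eigvals)
k_stick = broken_stick(eigvals)
k_cumvar = cumulative_variance(eigvals, threshold=0.9)

print("eigenvalues:", eigvals.round(2))
print("Kaiser-Guttman:", k_kaiser)
print("Broken-Stick:", k_stick)
print("Cumulative variance (90%):", k_cumvar)
```

On data with such a clean factor structure the rules tend to agree; the disagreements that matter for the study above arise on real high-dimensional datasets where the eigenvalue spectrum decays gradually.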

Physical-oriented and machine learning-based emission modeling in a diesel compression ignition engine: Dimensionality reduction and regression

International Journal of Engine Research ◽

10.1177/14680874211070736 ◽

2022 ◽

pp. 146808742110707

Author(s):

Aran Mohammad ◽

Reza Rezaei ◽

Christopher Hayduk ◽

Thaddaeus Delebinski ◽

Saeid Shahpouri ◽

...

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Factor Analysis ◽

Dimensionality Reduction ◽

Principal Component ◽

Component Analysis ◽

Data Driven ◽

Support Vector ◽

Emission Models ◽

Emission Modeling

The development of internal combustion engines is shaped by exhaust gas emissions legislation and the drive to increase performance. This calls for engine-out emission models that can be used for engine optimization under real driving emission controls. The prediction capability of physical and data-driven engine-out emission models is influenced by the system inputs, which are specified by the user; accuracy can improve as the number of inputs increases. At the same time, irrelevant inputs become more likely; these have a weak functional relation to the emissions and can lead to overfitting. Alternatively, data-driven methods can be used to detect irrelevant and redundant inputs. In this work, thermodynamic states are modeled based on 772 stationary test bench measurements from a commercial vehicle diesel engine. Afterward, 37 measured and modeled variables are fed into a data-driven dimensionality reduction. For this purpose, supervised learning approaches, such as lasso regression and the linear support vector machine, and unsupervised learning methods, such as principal component analysis and factor analysis, are applied to select and extract the relevant features. The selected and extracted features are used for regression by the support vector machine and the feedforward neural network to model the NOx, CO, HC, and soot emissions. This enables an evaluation of the modeling accuracy resulting from the dimensionality reduction. Using the methods in this work, the 37 variables are reduced to 25, 22, 11, and 16 inputs for NOx, CO, HC, and soot emission modeling, respectively, while maintaining accuracy. The features selected using the lasso algorithm allow the regression models to learn more accurately than the features extracted through principal component analysis and factor analysis. This results in test errors (RMSE_Te) for modeling NOx, CO, HC, and soot emissions of 19.22 ppm, 6.46 ppm, 1.29 ppm, and 0.06 FSN, respectively.
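The distinction this abstract draws between feature selection (lasso) and feature extraction (PCA) can be sketched on toy data. Everything here is an assumption for illustration: the data are synthetic (not engine measurements), `lasso_cd` and `ols_rmse` are hypothetical helpers, the penalty `alpha=0.1` is arbitrary, and a plain least-squares refit stands in for the support vector machine and neural network regressors used in the paper.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink x towards zero by t; values within [-t, t] become exactly zero."""
    return np.sign(x) * max(abs(x) - t, 0.0)

def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y - X @ w
    for _ in range(n_iter):
        for j in range(p):
            residual += X[:, j] * w[j]        # remove feature j's contribution
            rho = X[:, j] @ residual / n
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
            residual -= X[:, j] * w[j]        # add the updated contribution back
    return w

# Synthetic regression problem: 12 inputs, only the first three are relevant.
rng = np.random.default_rng(2)
n, p = 200, 12
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2] + 0.1 * rng.normal(size=n)

tr, te = slice(0, 150), slice(150, n)
mu, sd = X[tr].mean(axis=0), X[tr].std(axis=0)
Xs = (X - mu) / sd                  # standardise with training statistics
yc = y - y[tr].mean()               # centre the target with the training mean

def ols_rmse(A_tr, A_te):
    """Fit least squares on the training part, return RMSE on the test part."""
    coef, *_ = np.linalg.lstsq(A_tr, yc[tr], rcond=None)
    return float(np.sqrt(np.mean((A_te @ coef - yc[te]) ** 2)))

# Selection: keep the original inputs whose lasso coefficient is non-zero.
w = lasso_cd(Xs[tr], yc[tr], alpha=0.1)
selected = np.flatnonzero(np.abs(w) > 1e-8)
rmse_lasso = ols_rmse(Xs[tr][:, selected], Xs[te][:, selected])

# Extraction: replace the inputs by their first three principal components.
_, _, Vt = np.linalg.svd(Xs[tr], full_matrices=False)   # Xs[tr] is already centred
rmse_pca = ols_rmse(Xs[tr] @ Vt[:3].T, Xs[te] @ Vt[:3].T)

print("inputs selected by lasso:", selected.tolist())
print(f"test RMSE, lasso-selected inputs: {rmse_lasso:.3f}")
print(f"test RMSE, 3 principal components: {rmse_pca:.3f}")
```

The gap between the two errors is largely built into this toy setup: PCA ignores the target when choosing directions, so with uncorrelated inputs its leading components need not align with the relevant ones. The sketch therefore shows the mechanism only and does not reproduce the reported figures.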

A machine learning approach to medical data identification through principal component analysis

Big Data III: Learning, Analytics, and Applications ◽

10.1117/12.2586038 ◽

2021 ◽

Author(s):

Lorenzo E. Jaques ◽

Arthur C. Depoian ◽

Dong Xie ◽

Colleen P. Bailey ◽

Parthasarathy Guturu

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Medical Data ◽

Learning Approach ◽

Machine Learning Approach
