Multi-objective techniques for feature selection and classification in digital mammography

2021 ◽  
Vol 15 (1) ◽  
pp. 115-125
Author(s):  
Shankar Thawkar ◽  
Law Kumar Singh ◽  
Munish Khanna

Feature selection is a crucial stage in the design of a computer-aided classification system for breast cancer diagnosis. The main objective of the proposed research is to explore the use of multi-objective particle swarm optimization (MOPSO) and non-dominated sorting genetic algorithm III (NSGA-III) for feature selection in digital mammography. The Pareto-optimal fronts generated by MOPSO and NSGA-III for two conflicting objective functions are used to select optimal features. An artificial neural network (ANN) is used to compute the fitness of the objective functions. The importance of the features selected by MOPSO and NSGA-III is also assessed using ANNs. The experimental results show that MOPSO-based optimization is superior to NSGA-III. MOPSO achieves high accuracy with a 55% feature reduction. MOPSO-based feature selection and classification delivers an efficiency of 97.54% with 98.22% sensitivity, 96.82% specificity, a Cohen’s kappa coefficient of 0.9508, and an area under the curve of AZ = 0.983 ± 0.003.
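To illustrate what a Pareto-optimal front over two conflicting objectives looks like, here is a minimal sketch. It assumes the two objectives are classification error and number of selected features, both minimized; the abstract does not name them, so this pairing and the candidate values are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# Assumption: each candidate feature subset is scored as
# (classification_error, n_selected_features), and both are minimized.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Hypothetical scores for six feature subsets (illustrative only).
candidates = [(0.10, 12), (0.05, 20), (0.05, 25), (0.20, 5), (0.12, 12), (0.03, 30)]
front = pareto_front(candidates)
print(front)
```

A final feature subset would then be picked from `front` according to how the trade-off between error and subset size is weighted.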

2022 ◽  
pp. 703-727
Author(s):  
Audu Musa Mabu ◽  
Rajesh Prasad ◽  
Raghav Yadav

With the progression of bioinformatics, the application of gene expression (GE) profiles to cancer diagnosis and classification has become an intriguing subject in the field. GE data hold numerous genes but few samples, which makes them arduous to examine and process. This paper proposes a novel strategy for classifying GE datasets together with clustering-centered feature selection. The proposed technique first preprocesses the dataset using normalization; feature selection is then accomplished with a feature clustering support vector machine (FCSVM), which has two phases: gene clustering and gene representation. To make the chosen top-ranked features suitable for classification, feature reduction is performed using the SVM-recursive feature elimination (SVM-RFE) algorithm. Finally, the feature-reduced dataset is classified using an artificial neural network (ANN) classifier. When compared with some recent swarm intelligence feature reduction approaches, FCSVM-ANN performed favorably.


Author(s):  
Mariana Gomes da Motta Macedo ◽  
Carmelo J. A. Bastos-Filho ◽  
Susana M. Vieira ◽  
João M. C. Sousa

The fish school search (FSS) algorithm has inspired several adaptations for multi-objective problems or binary optimization. However, no existing proposal solves both problems simultaneously. The proposed multi-objective binary fish school search (MOBFSS) approach aims to solve optimization problems with two or three conflicting objective functions and binary decision variables. MOBFSS is based on the dominance concept used in the multi-objective fish school search (MOFSS) and the threshold technique deployed in the binary fish school search (BFSS). Additionally, the authors evaluate the proposal on feature selection for classification in well-known datasets. Moreover, the authors compare the performance of the proposal with a state-of-the-art algorithm called BMOPSO-CDR. MOBFSS presents better results than BMOPSO-CDR, especially for datasets with higher complexity.


Symmetry ◽  
2020 ◽  
Vol 12 (2) ◽  
pp. 271 ◽  
Author(s):  
Md Akizur Rahman ◽  
Ravie Chandren Muniyandi

An artificial neural network (ANN) is a tool that can be utilized to recognize cancer effectively. Nowadays, the risk of cancer is increasing dramatically all over the world. Detecting cancer is very difficult due to a lack of data. Proper data are essential for detecting cancer accurately. Cancer classification has been carried out by many researchers, but there is still a need to improve classification accuracy. For this purpose, in this research, a two-step feature selection (FS) technique with a 15-neuron neural network (NN), which classifies cancer with high accuracy, is proposed. The FS method is utilized to reduce feature attributes, and the 15-neuron network is utilized to classify the cancer. This research utilized the benchmark Wisconsin Diagnostic Breast Cancer (WDBC) dataset to compare the proposed method with other existing techniques, showing a significant improvement of up to 99.4% in classification accuracy. The results produced in this research are more promising and significant than those in existing papers.


2020 ◽  
pp. 2385-2394
Author(s):  
Kamal R. AL-Rawi ◽  
Saifaldeen K. AL-Rawi

The Wisconsin Breast Cancer Dataset (WBCD) was employed to show the performance of Adaptive Resonance Theory (ART), specifically the supervised ART-I Artificial Neural Network (ANN), in building a smart system for breast cancer diagnosis. The network was fed with different learning parameters and sets. The best result was achieved when the model was trained with 50% of the data and tested with the remaining 50%. Classification accuracy was compared to that of other artificial intelligence algorithms, including a fuzzy classifier, MLP-ANN, and SVM. We achieved the highest accuracy even with such a low learning/testing ratio.


2021 ◽  
Author(s):  
Yassmine Soussi ◽  
Nizar Rokbani ◽  
Ali Wali ◽  
Adel Alimi

In this paper, a new technique named Pareto Neighborhood (PN) topology is integrated into the Multi-Objective Particle Swarm Optimization (MOPSO) algorithm to produce the MOPSO-PN algorithm. This technique involves iteratively selecting a set of best solutions from the Pareto-optimal fronts and exploring them in order to find better clustering results in the next iteration. MOPSO-PN was then used as a Multi-Objective Clustering Optimization (MOCO) algorithm and tested on various datasets (real-life and artificial). Two scenarios were used to test the performance of MOPSO-PN for clustering. In the first scenario, MOPSO-PN uses two cluster validity indices (Silhouette-Index and overall-cluster-deviation) as objective functions, three datasets for testing, four algorithms for comparison, and the average Minkowski Score as the metric for evaluating the final clustering result. In the second scenario, MOPSO-PN uses three cluster validity indices (I-index, Con-index and Sym-index) as objective functions, 20 datasets for testing, ten algorithms for comparison, and the F-Measure as the metric for evaluating the final clustering result. In both scenarios, MOPSO-PN provided competitive clustering results and the correct number of clusters for all datasets.
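As a rough illustration of one of the cluster validity indices named above, the sketch below computes the standard mean silhouette index with Euclidean distances. It is a generic textbook definition, not the authors' implementation; the function name, the toy points, and the choice to score singleton clusters as 0 are assumptions made for this example.

```python
import math
from typing import List, Sequence


def silhouette_index(points: Sequence[Sequence[float]], labels: List[int]) -> float:
    """Mean silhouette over all points; assumes at least two clusters."""
    n = len(points)
    clusters = set(labels)
    total = 0.0
    for i in range(n):
        # Collect distances from point i to every other point, grouped by cluster.
        dists = {c: [] for c in clusters}
        for j in range(n):
            if i != j:
                dists[labels[j]].append(math.dist(points[i], points[j]))
        own = dists[labels[i]]
        if not own:
            continue  # singleton cluster: contributes 0 by convention
        a = sum(own) / len(own)  # mean intra-cluster distance
        b = min(sum(d) / len(d)  # mean distance to the nearest other cluster
                for c, d in dists.items() if c != labels[i] and d)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / n


# Toy data (hypothetical): two well-separated pairs of points.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_index(pts, [0, 0, 1, 1])  # natural grouping
bad = silhouette_index(pts, [0, 1, 0, 1])   # grouping that splits each pair
print(good, bad)
```

A multi-objective clustering optimizer would treat a score like this as one objective to maximize, alongside a second index such as overall cluster deviation.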


Author(s):  
A. B Yusuf ◽  
R. M Dima ◽  
S. K Aina

Breast cancer is the second most commonly diagnosed cancer in women throughout the world. It is on the rise, especially in developing countries, where the majority of cases are discovered late. Breast cancer develops when cancerous tumors form on the surface of the breast cells. The absence of accurate prognostic models to help physicians recognize symptoms early makes it difficult to develop a treatment plan that would help patients live longer. However, machine learning techniques have recently been used to improve the accuracy and speed of breast cancer diagnosis. The more accurate the model, the more efficient it is and the better the resulting diagnosis. Nevertheless, the primary difficulty for systems that detect breast cancer using machine learning models is attaining the greatest classification accuracy and picking the most predictive features for increasing accuracy. As a result, breast cancer prognosis remains difficult. This research seeks to address a flaw in an existing technique that is unable to enhance the classification of continuous-valued data, particularly its accuracy and the selection of optimal features for breast cancer prediction. To address these issues, this study examines the impact of outliers and feature reduction on the Wisconsin Diagnostic Breast Cancer Dataset, tested using seven different machine learning algorithms. The results show that Logistic Regression, Random Forest, and AdaBoost classifiers achieved the greatest accuracy, 99.12%, after outliers were removed from the dataset. With feature selection applied to this filtered dataset, Random Forest and Gradient Boosting classifiers achieved the greatest accuracies, 100% and 99.12%, respectively. When compared to other state-of-the-art approaches, the two suggested strategies outperformed the unfiltered data in terms of accuracy. The suggested architecture might be a useful tool for radiologists to reduce the number of false negatives and positives. As a result, the efficiency of breast cancer diagnosis analysis will be increased.
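The abstract does not say how outliers were identified, so the following is only a sketch of one common choice: dropping any row in which some feature falls outside the 1.5 × IQR fences. The function names, the multiplier `k`, and the sample rows are hypothetical and are not taken from the study.

```python
from statistics import quantiles
from typing import List, Sequence, Tuple


def iqr_bounds(values: Sequence[float], k: float = 1.5) -> Tuple[float, float]:
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outlier_rows(rows: List[List[float]], k: float = 1.5) -> List[List[float]]:
    """Drop every row that has at least one feature outside its column's fences."""
    columns = list(zip(*rows))
    bounds = [iqr_bounds(col, k) for col in columns]
    return [
        row for row in rows
        if all(lo <= v <= hi for v, (lo, hi) in zip(row, bounds))
    ]


# Hypothetical two-feature samples; the last row has an extreme first feature.
rows = [
    [1.0, 10.0], [1.1, 11.0], [0.9, 9.5],
    [1.2, 10.5], [1.0, 10.2], [9.0, 10.1],
]
cleaned = remove_outlier_rows(rows)
print(len(rows), "->", len(cleaned))
```

The filtered rows would then be passed to feature selection and to the classifiers being compared.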


Author(s):  
Deepak Singh ◽  
Dilip Singh Sisodia ◽  
Pradeep Singh

Background: Biomedical data is filled with continuous real values; these values in the feature set tend to create problems such as underfitting, the curse of dimensionality, and an increased misclassification rate because of higher variance. In response, preprocessing techniques applied to the dataset minimize these side effects and have shown success in maintaining adequate accuracy. Aims: Feature selection and discretization are two necessary preprocessing steps that have been effectively employed to handle redundancy in biomedical data. However, previous works did not integrate feature selection and discretization into a unified approach to the data redundancy problem, which has left the field disjoint and fragmented. This paper proposes a novel multi-objective dimensionality reduction framework that incorporates both discretization and feature reduction as an ensemble model for performing feature selection and discretization. The selection of optimal features and the categorization of the feature subset into discretized and non-discretized features are governed by a multi-objective genetic algorithm (NSGA-II). The two objectives, minimizing the error rate during feature selection and maximizing the information gain during discretization, serve as the fitness criteria. Methods: The proposed model uses a wrapper-based feature selection algorithm to select the optimal features and categorizes the selected features into two blocks, namely discretized and non-discretized. Features belonging to the discretized block undergo binary discretization, while features in the second block are not discretized and are used in their original form. Results: To establish the acceptability of the proposed ensemble model, experiments are conducted on fifteen medical datasets, and metrics such as accuracy, mean, and standard deviation are computed to evaluate the performance of the classifiers. Conclusion: After extensive experiments on the datasets, it can be said that the proposed model improves the classification rate and outperforms the base learner.
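To make the information-gain objective for binary discretization concrete, here is a small sketch that scores a single cut point on one continuous feature and picks the best midpoint cut. It uses the standard entropy-based definition; the function names and the toy data are assumptions for illustration and are not the paper's code.

```python
import math
from collections import Counter
from typing import List, Sequence


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values: Sequence[float], labels: Sequence[int], cut: float) -> float:
    """Entropy reduction obtained by splitting the feature at `cut`."""
    left = [y for x, y in zip(values, labels) if x <= cut]
    right = [y for x, y in zip(values, labels) if x > cut]
    if not left or not right:
        return 0.0  # a cut that separates nothing gains nothing
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def best_binary_cut(values: Sequence[float], labels: Sequence[int]) -> float:
    """Try midpoints between consecutive distinct values; needs >= 2 distinct values."""
    xs = sorted(set(values))
    cuts: List[float] = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return max(cuts, key=lambda t: information_gain(values, labels, t))


# Hypothetical feature values and binary class labels.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
cut = best_binary_cut(values, labels)
gain = information_gain(values, labels, cut)
print(cut, gain)
```

In a multi-objective setup, a gain like this would be evaluated for each feature assigned to the discretized block and traded off against the classifier's error rate.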

