Breast Cancer Prediction Using Machine Learning Algorithm with Big Data Concept

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1207232 ◽

2020 ◽

pp. 123-127

Author(s):

R. Nirmalan ◽

M. Javith Hussain Khan ◽

V. Sounder ◽

A. Manikkaraja

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Big Data ◽

Learning Algorithm ◽

Support Vector ◽

Data Set ◽

Cancer Prediction ◽

Modern Computer ◽

Huge Data

Advances in modern computer technology produce huge amounts of data. The algorithms traditionally used in machine learning might not support big data. Here we discuss and implement a solution to this problem for predicting breast cancer using big data. DNA methylation (DM) and gene expression (GE) are the two types of data used for the prediction of breast cancer. The main objective is to classify each data set separately. To achieve this, we used the Apache Spark platform. We applied three classification algorithms: decision tree, random forest, and support vector machine (SVM). These three algorithms were used to produce models for breast cancer prediction. We analyzed which algorithm produces the best result, with high accuracy and a low error rate. Additionally, the Weka and Spark platforms are compared to find which performs better when dealing with huge data. The results show that the scalable SVM classifier performed better than the other classifiers, achieving the lowest error rate and the highest accuracy on the GE data set.


Big Data for Health Care Analytics using Extreme Machine Learning Based on Map Reduce

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.c5808.029320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 2758-2762

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Storage ◽

Clinical Data ◽

Disease Risk ◽

Learning Algorithm ◽

Information Storage ◽

Support Vector ◽

Machine Learning Algorithm ◽

Data Set

Large volumes of data are available and stored in various fields; this is called big data. In healthcare, big data comprises the clinical data of every patient's records in huge amounts, maintained in Electronic Health Records (EHR). More than 80% of clinical data is in unstructured format and stored in hundreds of forms. The challenge in data storage and analysis is handling large datasets efficiently and scalably. The Hadoop MapReduce framework is used with big data to store and process any kind of data quickly. It is not solely a storage system but also a platform for data storage as well as processing. It is scalable and fault-tolerant. The prediction on the data sets is handled by a machine learning algorithm. This work focuses on the Extreme Machine Learning algorithm (ELM), which provides an optimized way of predicting disease risk by combining ELM with a Cuckoo Search optimization-based Support Vector Machine (CS-SVM). The proposed work also considers the scalability and accuracy of big data models; the proposed algorithm handles the computing work well and achieves good results in both veracity and efficiency.


Iterative Reweighted Noninteger Norm Regularizing SVM for Gene Expression Data Classification

Computational and Mathematical Methods in Medicine ◽

10.1155/2013/768404 ◽

2013 ◽

Vol 2013 ◽

pp. 1-10 ◽

Cited By ~ 5

Author(s):

Jianwei Liu ◽

Shuang Cheng Li ◽

Xionglin Luo

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Adaptive Learning ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Training Dataset ◽

Support Vector ◽

Data Set ◽

Cancer Data ◽

Public Data

Support vector machine is an effective classification and regression method that uses machine learning theory to maximize the predictive accuracy while avoiding overfitting of data. L2 regularization has been commonly used. If the training dataset contains many noise variables, L1 regularization SVM will provide a better performance. However, both L1 and L2 are not the optimal regularization method when handling a large number of redundant values and only a small amount of data points is useful for machine learning. We have therefore proposed an adaptive learning algorithm using the iterative reweighted p-norm regularization support vector machine for 0 < p ≤ 2. A simulated data set was created to evaluate the algorithm. It was shown that a p value of 0.8 was able to produce better feature selection rate with high accuracy. Four cancer data sets from public data banks were also used for the evaluation. All four evaluations show that the new adaptive algorithm was able to achieve the optimal prediction error using a p value below that of the L1 norm. Moreover, we observe that the proposed Lp penalty is more robust to noise variables than the L1 and L2 penalties.
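The Lp penalty and the idea of iterative reweighting can be illustrated with a minimal sketch. It assumes the penalty has the form sum_j |w_j|^p and uses a common quadratic-surrogate reweighting; the abstract does not give the authors' exact update rule, so the function names, the surrogate, and the smoothing constant `eps` are illustrative assumptions rather than the paper's method.

```python
def lp_penalty(weights, p):
    """Return sum_j |w_j|**p for a weight vector, with 0 < p <= 2."""
    if not 0 < p <= 2:
        raise ValueError("p must satisfy 0 < p <= 2")
    return sum(abs(w) ** p for w in weights)


def reweighting_factors(weights, p, eps=1e-8):
    """Per-feature factors for a quadratic surrogate of the Lp penalty.

    |w_j|**p is approximated by factor_j * w_j**2, where
    factor_j = (w_j**2 + eps) ** (p / 2 - 1) is computed from the
    previous iterate. eps is an assumed smoothing constant that keeps
    the factor finite when w_j == 0.
    """
    return [(w * w + eps) ** (p / 2 - 1) for w in weights]
```

With p = 2 every factor equals 1, which recovers the ordinary L2 penalty. With p below 1, weights that are already small receive large factors, so a reweighted solver would penalize them more heavily in the next iteration.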


Using Support Vector Machine Detection of Breast Cancer in Early stage

International Journal for Research in Engineering Application & Management ◽

10.35291/2454-9150.2020.0465 ◽

2020 ◽

pp. 213-216

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Early Stage ◽

Breast Cancer Diagnosis ◽

Support Vector ◽

SVM Classifier ◽

K Nearest Neighbors ◽

Data Set ◽

Sensitivity Specificity

Breast cancer is a disease whose incidence in women has increased tremendously. Mammography is a low-powered X-ray technique for the detection and diagnosis of cancer at an early stage. The proposed system addresses two problems: first, detecting tumors as suspicious regions with weak contrast to their background; second, extracting features that categorize tumors. This classification can be done with SVM, a statistical learning method that has achieved significant results in various fields. It was discovered in the early 1990s and led to an interest in machine learning. Here, images are classified as benign, malignant, or normal using the SVM classifier. The technique shows how the tumor region in mammogram images can be detected with an accuracy of more than 80% for linear classification using SVM. The proposed system uses 10-fold cross-validation to obtain an accurate outcome. The Wisconsin breast cancer diagnosis data set is taken from the UCI Machine Learning Repository. Accuracy, sensitivity, specificity, false discovery rate, false omission rate, and the Matthews correlation coefficient are evaluated in the proposed system. It gives good results for both the training and testing phases. The technique also shows accuracies of 98.57% and 97.14% using Support Vector Machine and K-Nearest Neighbors.
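The evaluation measures named in this abstract all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the convention of returning 0.0 for an undefined ratio are assumptions made for illustration, and no figures from the paper are reproduced.

```python
import math


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification measures from confusion-matrix counts."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),           # true positive rate
        "specificity": _ratio(tn, tn + fp),           # true negative rate
        "false_discovery_rate": _ratio(fp, tp + fp),  # wrong among predicted positives
        "false_omission_rate": _ratio(fn, fn + tn),   # wrong among predicted negatives
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),    # Matthews correlation coefficient
    }
```

For hypothetical counts such as `binary_metrics(tp=40, fp=10, tn=45, fn=5)`, the accuracy is 0.85, the false discovery rate is 0.2, and the false omission rate is 0.1.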


Big Data based breast cancer prediction using kernel support vector machine with the Gray Wolf Optimization algorithm

Applications of Big Data in Healthcare ◽

10.1016/b978-0-12-820203-6.00003-5 ◽

2021 ◽

pp. 173-194

Author(s):

T. Jayasankar ◽

N.B. Prakash ◽

G.R. Hemalakshmi

Keyword(s):

Breast Cancer ◽

Support Vector Machine ◽

Big Data ◽

Optimization Algorithm ◽

Support Vector ◽

Gray Wolf ◽

Cancer Prediction ◽

Kernel Support Vector Machine


Detection of Breast Cancer Using Machine Learning Support Vector Machine Algorithm

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.7747 ◽

2019 ◽

Vol 16 (2) ◽

pp. 441-444

Author(s):

D. V. Soundari ◽

R. Padmapriya ◽

C. Thirumariselvi ◽

N. Nanthini ◽

K. Priyadharsini

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Support Vector ◽

Learning Support ◽

Support Vector Machine Algorithm ◽

Breast Cancer Data ◽

Data Set ◽

Cancer Data ◽

Hormone Imbalance

Women suffer greatly from breast cancer, which is due to hormone imbalance. It has led to a large number of deaths in recent years. Early detection of breast cancer is important to save lives. Image processing plays an important role in classifying and detecting it. This paper therefore proposes machine learning based cancer classification using a support vector machine with the Wisconsin breast cancer data set.


Breast Cancer Prediction using SVM with PCA Feature Selection Method

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1952277 ◽

2019 ◽

pp. 969-978

Author(s):

Akshya Yadav ◽

Imlikumla Jamir ◽

Raj Rajeshwari Jain ◽

Mayank Sohani

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Feature Selection ◽

Learning Algorithm ◽

Feature Selection Method ◽

Selection Method ◽

Training Dataset ◽

Support Vector ◽

Improved Accuracy

Cancer has been characterized as one of the leading diseases that cause death in humans. Breast cancer, being a subtype of cancer, causes death in one out of every eight women worldwide. The way to counter this is early and accurate diagnosis for faster treatment. Achieving such accuracy in a short span of time proves difficult with existing techniques. Also, the medical tests conducted in hospitals for detecting cancer are expensive and difficult for ordinary people to afford. To counter these problems, in this paper we apply a Support Vector Machine, a machine learning algorithm, to predict whether a person is prone to breast cancer. We evaluate the performance of this algorithm by calculating its accuracy and apply a min-max scaling method to counter the problem of overfitting and outliers. After scaling the dataset, we apply a feature selection method called Principal Component Analysis to improve the algorithm's accuracy by decreasing the number of parameters. The final algorithm has improved accuracy with the absence of overfitting and outliers; thus it can be used to build systems that can be deployed in clinics, hospitals and medical centers for early and quick diagnosis of breast cancer. The training dataset, from the University of Wisconsin, is taken from the UCI Machine Learning Repository and is used to evaluate the performance of the Support Vector Machine by calculating its accuracy.
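The min-max scaling step mentioned in this abstract can be sketched for a single feature column as follows. The function name and the choice to map a constant column to zeros are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map it to zeros (assumed convention).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]
```

For example, `min_max_scale([2, 4, 6])` returns `[0.0, 0.5, 1.0]`. Because the range is set by the smallest and largest values, a single extreme value still determines how the rest of the column is compressed.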


Housing Value Forecasting Based on Machine Learning Methods

Abstract and Applied Analysis ◽

10.1155/2014/648047 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7 ◽

Cited By ~ 8

Author(s):

Jingyi Mu ◽

Fang Wu ◽

Aihua Zhang

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Big Data ◽

Least Squares ◽

Optimal Solution ◽

Support Vector ◽

Learning Methods ◽

Data Set ◽

Machine Learning Methods ◽

The Government

In the era of big data, many urgent issues in all walks of life can be tackled with big data techniques. Compared with the Internet, economy, industry, and aerospace fields, applications of big data in architecture are relatively few. In this paper, on the basis of actual data, the values of Boston suburb houses are forecast by several machine learning methods. According to the predictions, the government and developers can decide whether or not to develop real estate in the corresponding regions. Support vector machine (SVM), least squares support vector machine (LSSVM), and partial least squares (PLS) methods are used to forecast the home values, and the algorithms are compared according to the predicted results. Experiments show that although the data set exhibits serious nonlinearity, SVM and LSSVM are superior to PLS in dealing with nonlinearity. The global optimal solution can be found and the best forecasting effect achieved by SVM because it solves a quadratic programming problem. The computational efficiencies of the algorithms are also compared according to their computing times.


Breast Cancer Detection with Revamped Dataset Using Machine Learning Techniques

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2021.3892 ◽

2021 ◽

Vol 11 (12) ◽

pp. 2996-3009

Author(s):

Sundarambal Balaraman ◽

Ramesh Ramamoorthy ◽

Raja Krishnamoorthi

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Logistic Regression ◽

Learning Algorithm ◽

Machine Learning Techniques ◽

Support Vector ◽

Data Set ◽

Cancer Data ◽

Learning Techniques ◽

Incidence And Mortality

Machine learning is a current topic of interest in research and industry, with novel strategies being implemented all the time. The main purpose of this research is to determine the efficiency of machine learning techniques in the detection of breast cancer. The incidence and mortality of breast cancer in women are increasing day by day. Worldwide, researchers have worked hard to help clinicians by providing the best model for detecting and diagnosing breast cancer. In this work, the Wisconsin breast cancer data set from the UCI Machine Learning Repository is used to model and analyze performance, compared with existing work on the same data set. The dataset is analyzed, and the revamped dataset is constructed by eliminating redundant features and appending new features essential for prediction. Eight machine learning algorithms, namely logistic regression, K nearest neighbors (KNN), support vector machine (SVM), decision trees, random forest, XGBoost, AdaBoost, and artificial neural network (ANN), are applied to the revamped data set to build prediction models. The accuracy rate is analyzed using standard measures. In the experiments, these classifiers have been shown to work for breast cancer with >97% accuracy. Logistic regression, XGBoost, and AdaBoost stand on top with 99.28 percent accuracy. The experiments also show that removing outliers and balancing the data set have a significant impact on the model's prediction performance.


A Reckoning Analysis and Assessment of Different Supervised Machine Learning Algorithm for Breast Cancer Prediction

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i3.8388 ◽

2019 ◽

Vol 7 (3) ◽

pp. 83-88

Author(s):

Pragati Prakash ◽

Nidhi Ekka ◽

Manjit Jaiswal

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Learning Algorithm ◽

Supervised Machine Learning ◽

Machine Learning Algorithm ◽

Cancer Prediction


Misalignment Detection of a Rotating Machine Shaft Using a Support Vector Machine Learning Algorithm

International Journal of Precision Engineering and Manufacturing ◽

10.1007/s12541-020-00462-1 ◽

2021 ◽

Vol 22 (3) ◽

pp. 409-416

Author(s):

Yong Eun Lee ◽

Bok-Kyung Kim ◽

Jun-Hee Bae ◽

Kyung Chun Kim

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Learning Algorithm ◽

Support Vector ◽

Machine Learning Algorithm ◽

Rotating Machine
