Forecasting of Breast Cancer and Diabetes Using Ensemble Learning

2019 ◽  
Vol 1 (1) ◽  
pp. 1-5 ◽  
Author(s):  
Shraboni Rudra ◽  
Minhaz Uddin ◽  
Mohammed Minhajul Alam

Machine learning algorithms play an important role in our lives. Machine learning is a subset of artificial intelligence (AI). Recently, everyone has been trying to use AI, or to invent something related to AI, to make life easier. In the medical field, machine learning is used for the recognition and classification of diseases. It can classify cancer, diabetes, and other diseases more accurately from datasets. We therefore propose a model that combines a support vector machine (SVM) with AdaBoost; this combined method is known as an ensemble learner. In this paper, we predict diabetes and breast cancer: we use SVM for classification and then apply AdaBoost for boosting. The number of diabetes patients is increasing very rapidly. Diabetes causes many other conditions, such as kidney failure and eye disorders, and no medicine has been invented that fully prevents it. Breast cancer is also increasing very rapidly among women, and the cost of its treatment is very high. Much research is under way on diabetes and breast cancer. We propose our model to predict these diseases more accurately than previous models.
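
To illustrate the boosting step this abstract refers to, here is a minimal sketch of a single AdaBoost reweighting round for binary labels in {-1, +1}. The abstract does not describe the authors' implementation, so the function name `adaboost_round` and the toy data are assumptions for illustration only. The base learner's predictions (an SVM in the paper) are simply passed in as a list.

```python
import math

def adaboost_round(y_true, y_pred, weights):
    """One AdaBoost round: return (alpha, updated normalized weights).

    y_true, y_pred: sequences of -1/+1 labels.
    weights: non-negative sample weights that sum to 1.
    """
    # Weighted error of the base learner under the current sample weights.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if not 0.0 < err < 0.5:
        raise ValueError("base learner must be imperfect but better than chance")
    # Vote weight of this learner in the final ensemble.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Up-weight misclassified samples and down-weight correct ones.
    raw = [w * math.exp(-alpha * t * p)
           for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [r / total for r in raw]

# Hypothetical toy data: the base learner misclassifies only the last sample.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]
weights = [0.2] * 5
alpha, new_weights = adaboost_round(y_true, y_pred, weights)
```

After this round the single misclassified sample carries half of the total weight, so the next base learner is pushed to get it right.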

2021 ◽  
Vol 11 (2) ◽  
pp. 61
Author(s):  
Jiande Wu ◽  
Chindo Hicks

Background: Breast cancer is a heterogeneous disease defined by molecular types and subtypes. Advances in genomic research have enabled use of precision medicine in clinical management of breast cancer. A critical unmet medical need is distinguishing triple negative breast cancer, the most aggressive and lethal form of breast cancer, from non-triple negative breast cancer. Here we propose use of a machine learning (ML) approach for classification of triple negative breast cancer and non-triple negative breast cancer patients using gene expression data. Methods: We performed analysis of RNA-Sequence data from 110 triple negative and 992 non-triple negative breast cancer tumor samples from The Cancer Genome Atlas to select the features (genes) used in the development and validation of the classification models. We evaluated four different classification models, including Support Vector Machines, K-nearest neighbor, Naïve Bayes and Decision tree, using features selected at different threshold levels to train the models for classifying the two types of breast cancer. For performance evaluation and validation, the proposed methods were applied to independent gene expression datasets. Results: Among the four ML algorithms evaluated, the Support Vector Machine algorithm classified breast cancer into triple negative and non-triple negative breast cancer more accurately, and had fewer misclassification errors, than the other three algorithms. Conclusions: The prediction results show that ML algorithms are efficient and can be used for classification of breast cancer into triple negative and non-triple negative breast cancer types.


2012 ◽  
Vol 468-471 ◽  
pp. 2916-2919
Author(s):  
Fan Yang ◽  
Yu Chuan Wu

This paper describes how to use a posture sensor to validate human daily activity, and builds an outstanding model with the machine learning algorithm Support Vector Machine (SVM). The optimal parameters σ and c of the RBF-kernel SVM were obtained by automatic search. The kinematic data were processed in three major steps: wavelet transformation, Principal Component Analysis (PCA)-based dimensionality reduction, and k-fold cross-validation, followed by implementing the best classifier to distinguish 6 different actions. With SVM as the activity classifier, we achieved a mean accuracy of over 94.5% in detecting the different actions. This shows that the verification approach based on human activity recognition is valuable, and it will be explored further in the near future.
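
The k-fold cross-validation step mentioned in this abstract can be sketched as follows. The abstract does not say how the folds were formed, so this is only an assumed, unshuffled contiguous split. The helper name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Example: 10 samples split into 3 folds of sizes 4, 3 and 3.
folds = list(k_fold_indices(10, 3))
```

In a parameter search such as the one for σ and c, each candidate pair would be scored by training on `train_idx` and testing on `test_idx` for every fold, then averaging the accuracies. In practice the sample order is usually shuffled first.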


Author(s):  
Xiaoming Li ◽  
Yan Sun ◽  
Qiang Zhang

In this paper, we focus on developing a novel method to extract sea ice cover (i.e., discrimination/classification of sea ice and open water) using Sentinel-1 (S1) cross-polarization (vertical-horizontal, VH or horizontal-vertical, HV) data in extra wide (EW) swath mode based on the machine learning algorithm support vector machine (SVM). The classification basis includes the S1 radar backscatter coefficients and texture features that are calculated from S1 data using the gray level co-occurrence matrix (GLCM). Different from previous methods where appropriate samples are manually selected to train the SVM to classify sea ice and open water, we propose a method of unsupervised generation of the training samples based on two GLCM texture features, i.e., entropy and homogeneity, that have contrasting characteristics on sea ice and open water. We eliminate most of the uncertainty of selecting training samples in machine learning and achieve automatic classification of sea ice and open water by using S1 EW data. The comparison shows good agreement between the SAR-derived sea ice cover using the proposed method and a visual inspection, with an accuracy of approximately 90% to 95% based on a few cases. In addition, compared with the analyzed sea ice cover data of the Ice Mapping System (IMS) based on 728 S1 EW images, the accuracy of the sea ice cover extracted from S1 data is more than 80%.
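
As a rough illustration of the two GLCM texture features named in this abstract, the sketch below computes a normalized co-occurrence matrix for one pixel offset and derives entropy and homogeneity from it. The offset, the number of gray levels, the base-2 logarithm, and the homogeneity form 1 / (1 + (i - j)^2) are assumptions, since definitions vary and the abstract does not specify them. The toy images are hypothetical.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def entropy(p):
    """Shannon entropy (bits) of the co-occurrence probabilities."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)

def homogeneity(p):
    """Weights each entry by its closeness to the matrix diagonal."""
    return sum(v / (1 + (i - j) ** 2)
               for i, row in enumerate(p) for j, v in enumerate(row))

# Hypothetical 2-level checkerboard: neighbouring pixels always differ.
checker = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
checker_entropy = entropy(glcm(checker, 2))
checker_homogeneity = homogeneity(glcm(checker, 2))
```

A perfectly flat patch gives entropy 0 and homogeneity 1, while the checkerboard gives entropy 1 and homogeneity 0.5. This is the kind of contrast the abstract relies on to separate two surface types.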


Author(s):  
Jebasonia Jebamony ◽  
Dheeba Jacob

Background: Breast cancer is one of the leading causes of cancer deaths among women. Early detection of cancer increases the survival rate of the affected women. Machine learning approaches used for classification of breast cancer usually take a lot of processing time during training. This paper proposes a machine learning approach for breast cancer detection in mammograms that does not depend on the number of training samples. Objective: The paper aims to develop a core vector machine-based diagnosis system for breast cancer detection using data from MIAS. The main motivation for using this system is to reduce the computational and memory requirements for large training data and to improve classification accuracy. Methods: The proposed method has four stages: 1) pre-processing to extract the breast region using global thresholding, with enhancement using histogram equalization; 2) identification of potential masses using Otsu thresholding; 3) feature extraction using Laws texture energy measures; and 4) mass detection using a core vector machine (CVM) classifier. Results: Comparative analysis was done with existing algorithms: Artificial Neural Network (ANN), Support Vector Machine (SVM), and Fuzzy Support Vector Machine (FSVM). The results illustrate that the proposed CVM classifier produced a promising result in terms of sensitivity (96.9%), misclassification rate (0.0443) and accuracy (95.89%). The time taken for the training process is 0.0443, which is less than that of the other machine learning algorithms. Conclusion: Performance analysis shows that the CVM classifier is superior to other classifiers such as ANN, SVM and FSVM. The computational time of the CVM classifier during training was also analysed and found to be better than that of the other algorithms discussed. The results achieved show that the CVM classifier is the best algorithm for breast mass detection in mammograms.
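
Stage 2 of the method above relies on Otsu thresholding. A minimal sketch of that technique on a flat list of integer gray values is given below. It is not the authors' code: the function name `otsu_threshold` and the toy pixel values are assumptions for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the lower class, values > t the upper class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

# Hypothetical toy data: a dark cluster (10, 12) and a bright cluster (200, 210).
pixels = [10] * 5 + [12] * 5 + [200] * 5 + [210] * 5
threshold = otsu_threshold(pixels)
```

For this toy list the function returns 12, the lowest threshold that cleanly separates the dark cluster from the bright one.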


2019 ◽  
Vol 8 (2) ◽  
pp. 5401-5405

Breast cancer is an alarming disease which takes millions of lives every year. In 2018, an estimated 627,000 women died of breast cancer, which is around 15% of all cancer deaths among women. Currently, the risk factors of breast cancer cannot be avoided, and early detection is the only way to survive. Automated detection of breast cancer using image processing methods and machine learning algorithms helps give more accurate results with less human effort. In the proposed system, multiple features are extracted using HSV histogram, LBP, GLCM, and 2-D DWT. Support vector machine and LIBSVM classifiers are used to classify mammogram images as benign or malignant. For classification, the INbreast dataset is used, which includes 115 cases containing 410 images. The dataset is divided into benign and malignant categories based on the BI-RADS scale. According to this partition, the dataset contains 243 benign images and 100 malignant images. A feature matrix of 595 features in total is generated for each of the balanced and unbalanced datasets and fed into SVM and LIBSVM to distinguish the data. The balanced datasets on LIBSVM gave the best results, with 92% accuracy, 84% sensitivity, 100% specificity and 91.30% F1 score, followed by SVM, which gave 75% accuracy, 73.61% sensitivity, 76.66% specificity and 75.8% F1 score.
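
The four reported measures all follow from a binary confusion matrix. The sketch below shows the standard formulas. The abstract does not give the confusion matrix, so the counts used here (21 true positives, 4 false negatives, 25 true negatives, 0 false positives) are hypothetical. They were chosen only because they are consistent with the balanced LIBSVM figures quoted above.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and F1 from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # recall on the positive (malignant) class
    specificity = tn / (tn + fp)  # recall on the negative (benign) class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts for a balanced test set of 25 malignant and 25 benign images.
m = classification_metrics(tp=21, fn=4, tn=25, fp=0)
```

With these assumed counts the function yields 92% accuracy, 84% sensitivity, 100% specificity and an F1 score of about 91.30%. This shows how a perfect specificity and a lower sensitivity can combine into the quoted F1 value.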


2018 ◽  
Vol 19 (8) ◽  
pp. 2358 ◽  
Author(s):  
Yunyi Wu ◽  
Guanyu Wang

Toxicity prediction is very important to public health. Among its many applications, toxicity prediction is essential to reduce the cost and labor of a drug’s preclinical and clinical trials, because a lot of drug evaluations (cellular, animal, and clinical) can be spared due to the predicted toxicity. In the era of Big Data and artificial intelligence, toxicity prediction can benefit from machine learning, which has been widely used in many fields such as natural language processing, speech recognition, image recognition, computational chemistry, and bioinformatics, with excellent performance. In this article, we review machine learning methods that have been applied to toxicity prediction, including deep learning, random forests, k-nearest neighbors, and support vector machines. We also discuss the input parameter to the machine learning algorithm, especially its shift from chemical structural description only to that combined with human transcriptome data analysis, which can greatly enhance prediction accuracy.


2021 ◽  
Vol 12 (4) ◽  
pp. 117-137
Author(s):  
Mazen Mobtasem El-Lamey ◽  
Mohab Mohammed Eid ◽  
Muhammad Gamal ◽  
Nour-Elhoda Mohamed Bishady ◽  
Ali Wagdy Mohamed

There are many cancer patients, especially breast cancer patients, as breast cancer is the most common type of cancer. Because of the huge number of breast cancer patients, many breast cancer-focused hospitals are unable to process them all, which may leave some women undiagnosed until the late stages of cancer. Automating the process can therefore help these hospitals speed up cancer detection. In this paper, the authors test several machine learning models, such as k-nearest neighbours (KNN), support vector machine (SVM), and artificial neural network (ANN). They then compare the models' accuracies and losses with one another and with models developed by other researchers, to see whether their approach is efficient and to decide which machine learning algorithm is best to use.


2014 ◽  
Vol 687-691 ◽  
pp. 2693-2697
Author(s):  
Li Ding ◽  
Li Mao ◽  
Xiao Feng Wang

A single machine learning algorithm shows shortcomings when the data environment changes during application. This article puts forward a heteromorphic ensemble learning model made up of Bayes, support vector machine (SVM) and decision tree classifiers, which classifies P2P traffic by the voting principle. The experiment shows that the model can significantly improve classification accuracy and has good stability.
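
The voting principle described in this abstract can be sketched as a plain majority vote over the labels predicted by each classifier. The abstract does not state whether the votes are weighted, so an unweighted vote is assumed here. The function name `majority_vote` and the toy predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list of voted labels.

    predictions: list of label lists, one per classifier, all the same length.
    """
    voted = []
    for labels in zip(*predictions):
        # most_common(1) picks the label with the highest count; on a tie it
        # returns the label seen first, i.e. the earliest classifier's vote.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Hypothetical outputs of the three base classifiers on four traffic flows.
bayes_pred = ["p2p", "web", "p2p", "web"]
svm_pred = ["p2p", "p2p", "web", "web"]
tree_pred = ["web", "p2p", "p2p", "web"]
ensemble_pred = majority_vote([bayes_pred, svm_pred, tree_pred])
```

Here each of the first three flows is labelled "p2p" by two of the three classifiers, so the ensemble output is `["p2p", "p2p", "p2p", "web"]` even though no single classifier produced exactly that list.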


Author(s):  
Akshya Yadav ◽  
Imlikumla Jamir ◽  
Raj Rajeshwari Jain ◽  
Mayank Sohani

Cancer has been characterized as one of the leading diseases that cause death in humans. Breast cancer, a subtype of cancer, causes death in one out of every eight women worldwide. The way to counter this is early and accurate diagnosis for faster treatment. Achieving such accuracy in a short span of time is difficult with existing techniques. In addition, the medical tests conducted in hospitals for detecting cancer are expensive and difficult for ordinary people to afford. To counter these problems, in this paper we apply the Support Vector Machine, a machine learning algorithm, to predict whether a person is prone to breast cancer. We evaluate the performance of this algorithm by calculating its accuracy, and apply a min-max scaling method to counter the problems of overfitting and outliers. After scaling the dataset, we apply a feature selection method called principal component analysis to improve the algorithm's accuracy by decreasing the number of parameters. The final algorithm has improved accuracy without overfitting and outliers, so it can be used to build systems that can be deployed in clinics, hospitals and medical centers for early and quick diagnosis of breast cancer. The training dataset, used to evaluate the performance of the Support Vector Machine by calculating its accuracy, is the University of Wisconsin data from the UCI Machine Learning Repository.
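
The min-max scaling step mentioned in this abstract maps each feature to the range [0, 1]. A minimal sketch for a single feature column follows. The function name `min_max_scale`, the handling of constant columns, and the sample values are assumptions, not details from the paper.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly so min -> 0.0 and max -> 1.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical feature values.
scaled = min_max_scale([10, 15, 20])
```

For the sample column this gives `[0.0, 0.5, 1.0]`. Applying the same transform to every feature puts them on a common scale before PCA and the SVM are applied.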


Author(s):  
R. Nirmalan ◽  
M. Javith Hussain Khan ◽  
V. Sounder ◽  
A. Manikkaraja

The evolution of modern computer technology, with its many inventions, produces a huge amount of data. The algorithms traditionally used in machine learning might not support big data. Here we discuss and implement a solution to this problem in the context of predicting breast cancer using big data. DNA methylation (DM) and gene expression (GE) are the two types of data used for the prediction of breast cancer. The main objective is to classify each data set separately. To achieve this, we use the Apache Spark platform. We apply three classification algorithms: decision tree, random forest, and support vector machine (SVM). These three algorithms are used to produce models for breast cancer prediction. We analyze which algorithm produces the better result, with good accuracy and a lower error rate. Additionally, the Weka and Spark platforms are compared to find which performs better when dealing with huge data. The results indicate that the scalable Support Vector Machine classifier gives better performance than all the other classifiers, achieving the lowest error rate and the highest accuracy on the GE data set.

