Machine Learning Based Diabetes Classification and Prediction for Healthcare Applications

2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Umair Muneer Butt ◽  
Sukumar Letchmunan ◽  
Mubashir Ali ◽  
Fadratul Hafinaz Hassan ◽  
Anees Baqir ◽  
...  

Remarkable advancements in biotechnology and public healthcare infrastructure have led to the production of large volumes of critical and sensitive healthcare data. Intelligent data analysis techniques can identify many interesting patterns in such data, supporting the early detection and prevention of several fatal diseases. Diabetes mellitus is an extremely life-threatening disease because it contributes to other lethal conditions, such as heart disease, kidney disease, and nerve damage. In this paper, a machine learning based approach is proposed for the classification, early-stage identification, and prediction of diabetes. The paper also presents a hypothetical IoT-based diabetes monitoring system that lets both healthy and affected people monitor their blood glucose (BG) level. For diabetes classification, three classifiers are employed: random forest (RF), multilayer perceptron (MLP), and logistic regression (LR). For predictive analysis, we employ long short-term memory (LSTM), moving averages (MA), and linear regression (LR). The benchmark PIMA Indian Diabetes dataset is used for experimental evaluation. In the analysis, MLP outperforms the other classifiers with 86.08% accuracy, and LSTM gives the best diabetes prediction with 87.26% accuracy. Moreover, a comparative analysis of the proposed approach against existing state-of-the-art techniques is performed, demonstrating the adaptability of the proposed approach to many public healthcare applications.

2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in these machine learning methods, it can be shown that the prediction accuracy is improved.


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3678
Author(s):  
Dongwon Lee ◽  
Minji Choi ◽  
Joohyun Lee

In this paper, we propose a prediction algorithm that combines Long Short-Term Memory (LSTM) with an attention model to predict the vision coordinates of a user watching 360-degree videos in a Virtual Reality (VR) or Augmented Reality (AR) system. Predicting the vision coordinates during video streaming is important when the network condition is degraded. However, traditional prediction models such as Moving Average (MA) and Autoregressive Moving Average (ARMA) are linear, so they cannot capture nonlinear relationships. For this reason, machine learning models based on deep learning have recently been used for nonlinear predictions. We use the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) neural network methods, which originate from Recurrent Neural Networks (RNN), to predict the head position in 360-degree videos. We then add the attention model to the LSTM to obtain more accurate results. We also compare the performance of the proposed model with other machine learning models, such as Multi-Layer Perceptron (MLP) and RNN, using the root mean squared error (RMSE) between predicted and real coordinates. We demonstrate that our model predicts the vision coordinates more accurately than the other models across various videos.
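As a rough illustration of the linear baseline and the error metric named in this abstract, the sketch below forecasts the next coordinate with a simple moving average and scores it with RMSE. The trace values, the window size, the function names, and the choice to measure per-sample error as squared Euclidean distance are all assumptions made for illustration; the abstract does not specify them.

```python
import math

def moving_average_forecast(history, window=3):
    """Predict the next (x, y) point as the mean of the last `window` points."""
    recent = history[-window:]
    n = len(recent)
    return (sum(p[0] for p in recent) / n, sum(p[1] for p in recent) / n)

def rmse(predicted, actual):
    """Root mean squared error over paired (x, y) coordinates.

    Assumption: the per-sample error is the squared Euclidean distance.
    """
    squared = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical head-position trace (yaw, pitch) in degrees, moving steadily.
trace = [(0.0, 0.0), (2.0, 1.0), (4.0, 2.0), (6.0, 3.0), (8.0, 4.0)]
window = 3

# Forecast each point from the points that precede it.
predictions = [
    moving_average_forecast(trace[:i], window) for i in range(window, len(trace))
]
targets = trace[window:]
error = rmse(predictions, targets)
print(f"RMSE of the moving-average baseline: {error:.3f}")
```

On this steadily moving trace the moving average always lags behind the true position, which is one simple way to see why an averaging baseline can struggle when the viewer's head keeps moving.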


2020 ◽  
Vol 27 (3) ◽  
pp. 373-389 ◽  
Author(s):  
Ashesh Chattopadhyay ◽  
Pedram Hassanzadeh ◽  
Devika Subramanian

Abstract. In this paper, the performance of three machine-learning methods for predicting short-term evolution and for reproducing the long-term statistics of a multiscale spatiotemporal Lorenz 96 system is examined. The methods are an echo state network (ESN, which is a type of reservoir computing; hereafter RC–ESN), a deep feed-forward artificial neural network (ANN), and a recurrent neural network (RNN) with long short-term memory (LSTM; hereafter RNN–LSTM). This Lorenz 96 system has three tiers of nonlinearly interacting variables representing slow/large-scale (X), intermediate (Y), and fast/small-scale (Z) processes. For training or testing, only X is available; Y and Z are never known or used. We show that RC–ESN substantially outperforms ANN and RNN–LSTM for short-term predictions, e.g., accurately forecasting the chaotic trajectories for hundreds of the numerical solver's time steps, equivalent to several Lyapunov timescales. The RNN–LSTM outperforms ANN, and both methods show some prediction skill too. Furthermore, even after losing the trajectory, data predicted by RC–ESN and RNN–LSTM have probability density functions (pdfs) that closely match the true pdf, even at the tails. The pdf of the data predicted using ANN, however, deviates from the true pdf. Implications, caveats, and applications to data-driven and data-assisted surrogate modeling of complex nonlinear dynamical systems, such as weather and climate, are discussed.
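For readers unfamiliar with the Lorenz 96 system, the sketch below integrates the standard single-tier form, dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F, with a fourth-order Runge-Kutta step. This is a deliberate simplification: the abstract describes a three-tier (X, Y, Z) variant whose coupling terms are not given here, and the forcing F = 8, the eight variables, and the step size are assumptions chosen for illustration only.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Single-tier Lorenz 96 tendency with periodic indexing.

    Negative indices wrap around in Python, which supplies x[k-1] and x[k-2]
    for the first two variables; the modulo handles x[k+1] for the last one.
    """
    n = len(x)
    return [
        (x[(k + 1) % n] - x[k - 2]) * x[k - 1] - x[k] + forcing
        for k in range(n)
    ]

def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)], forcing)
    k3 = lorenz96_tendency([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)], forcing)
    k4 = lorenz96_tendency([xi + dt * ki for xi, ki in zip(x, k3)], forcing)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]

# Start near the fixed point X_k = F and perturb one variable slightly.
state = [8.0] * 8
state[0] += 0.01

trajectory = [state]
for _ in range(100):  # 100 steps of size 0.01 (assumed values)
    state = rk4_step(state, dt=0.01)
    trajectory.append(state)

print(trajectory[-1])
```

A trajectory generated this way is the kind of series on which a data-driven forecaster could be trained and then compared against the solver over successive time steps.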


Computers ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 4 ◽  
Author(s):  
Jurgita Kapočiūtė-Dzikienė ◽  
Robertas Damaševičius ◽  
Marcin Woźniak

We describe sentiment analysis experiments performed on a Lithuanian Internet comment dataset using traditional machine learning approaches, namely Naïve Bayes Multinomial (NBM) and Support Vector Machine (SVM), and deep learning approaches, namely Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). The traditional machine learning techniques were used with features based on lexical, morphological, and character information. The deep learning approaches were applied on top of two types of word embeddings (Word2Vec continuous bag-of-words with negative sampling, and FastText). Both traditional and deep learning approaches had to solve the positive/negative/neutral sentiment classification task on the balanced and full versions of the dataset. The best deep learning results (an accuracy of 0.706) were achieved on the full dataset with CNN applied on top of the FastText embeddings, replaced emoticons, and eliminated diacritics. The traditional machine learning approaches demonstrated the best performance (an accuracy of 0.735) on the full dataset with the NBM method, replaced emoticons, restored diacritics, and lemma unigrams as features. Although traditional machine learning approaches were superior to the deep learning methods, deep learning demonstrated good results when applied to the small datasets.


2021 ◽  
Vol 1 (1) ◽  
pp. 199-218
Author(s):  
Mostofa Ahsan ◽  
Rahul Gomes ◽  
Md. Minhaz Chowdhury ◽  
Kendall E. Nygard

Machine learning algorithms are becoming very efficient in intrusion detection systems thanks to their real-time response and adaptive learning process. A robust machine learning model can be deployed for anomaly detection by using a comprehensive dataset with multiple attack types. Datasets today contain many attributes, and this high dimensionality poses a significant challenge to information extraction in terms of time and space complexity. Moreover, having so many attributes may hinder the creation of a decision boundary because of noise in the dataset. Large-scale data with redundant or insignificant features increases the computational time and often decreases goodness of fit, which is a critical issue in cybersecurity. In this research, we propose and implement an efficient feature selection algorithm to filter out insignificant variables. Our proposed Dynamic Feature Selector (DFS) uses statistical analysis and feature importance tests to reduce model complexity and improve prediction accuracy. To evaluate DFS, we conducted experiments on two datasets used for cybersecurity research, namely Network Security Laboratory (NSL-KDD) and University of New South Wales (UNSW-NB15). In the meta-learning stage, four algorithms were compared for accuracy estimation: Bidirectional Long Short-Term Memory (Bi-LSTM), Gated Recurrent Units, Random Forest, and a proposed Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM). For NSL-KDD, experiments revealed an increase in accuracy from 99.54% to 99.64% while reducing the number of one-hot encoded features from 123 to 50. For UNSW-NB15, we observed an increase in accuracy from 90.98% to 92.46% while reducing the number of features from 196 to 47. The proposed approach is thus able to achieve higher accuracy while significantly lowering the number of features required for processing.
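To make the idea of shrinking a one-hot encoded feature set concrete, the sketch below encodes a categorical column and then drops near-constant indicator columns with a plain variance threshold. This is not the DFS algorithm, whose statistical and feature-importance tests are not detailed in the abstract; the variance filter is a stand-in, and the protocol values, the threshold, and the function names are hypothetical.

```python
def one_hot(values):
    """One-hot encode a categorical column; returns (categories, rows)."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows

def variance(column):
    """Population variance of a numeric column."""
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / len(column)

def drop_low_variance(rows, threshold):
    """Keep only the columns whose variance exceeds `threshold`.

    A stand-in filter for illustration, not the paper's selection procedure.
    """
    columns = list(zip(*rows))
    kept = [i for i, col in enumerate(columns) if variance(col) > threshold]
    return kept, [[row[i] for i in kept] for row in rows]

# Hypothetical protocol field from ten network connection records.
protocols = ["tcp", "udp", "tcp", "tcp", "icmp", "tcp", "tcp", "tcp", "tcp", "tcp"]

categories, encoded = one_hot(protocols)
kept, reduced = drop_low_variance(encoded, threshold=0.1)

print("encoded columns:", categories)
print("kept columns:", [categories[i] for i in kept])
```

In this toy case the rarely active `icmp` and `udp` indicators fall below the threshold and only the `tcp` indicator is kept. A real selector would also need to check that the dropped columns do not carry attack-specific signal, which a variance test alone cannot tell.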


2020 ◽  
Author(s):  
Frederik Kratzert ◽  
Daniel Klotz ◽  
Günter Klambauer ◽  
Grey Nearing ◽  
Sepp Hochreiter

Simulation accuracy among traditional hydrological models usually degrades significantly when going from single basin to regional scale. Hydrological models perform best when calibrated for specific basins, and do worse when a regional calibration scheme is used.
One reason for this is that these models do not (have to) learn hydrological processes from data. Rather, they have a predefined model structure and only a handful of parameters adapt to specific basins. This often yields less-than-optimal parameter values when the loss is not determined by a single basin, but by many through regional calibration.
The opposite is true for data-driven approaches, where models tend to get better with more and more diverse training data. We examine whether this holds true when modeling rainfall-runoff processes with deep learning, or if, like their process-based counterparts, data-driven hydrological models degrade when going from basin to regional scale.
Recently, Kratzert et al. (2018) showed that the Long Short-Term Memory network (LSTM), a special type of recurrent neural network, achieves comparable performance to the SAC-SMA at basin scale. In follow-up work, Kratzert et al. (2019a) trained a single LSTM for hundreds of basins in the continental US, which significantly outperformed a set of hydrological models, even when compared to basin-calibrated hydrological models. On average, a single LSTM is even better in out-of-sample predictions (ungauged) than the SAC-SMA in-sample (gauged) or the US National Water Model (Kratzert et al. 2019b).
LSTM-based approaches usually involve tuning a large number of hyperparameters, such as the number of neurons, number of layers, and learning rate, that are critical for the predictive performance. Therefore, a large-scale hyperparameter search has to be performed to obtain a proficient LSTM network.
However, in the abovementioned studies, hyperparameter optimization was not conducted at large scale; e.g., in Kratzert et al. (2018) the same network hyperparameters were used in all basins, instead of tuning hyperparameters for each basin separately. It is still unclear whether LSTMs follow the same trend as traditional hydrological models, degrading in performance from basin to regional scale.
In the current study, we performed a computationally expensive, basin-specific hyperparameter search to explore how site-specific LSTMs differ in performance from regionally calibrated LSTMs. We compared our results to the mHM and VIC models, once calibrated per basin and once using an MPR regionalization scheme. These benchmark models were calibrated by individual research groups, to eliminate bias in our study. We analyse whether differences in basin-specific vs regional model performance can be linked to basin attributes or data set characteristics.
References:
Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, https://doi.org/10.5194/hess-22-6005-2018, 2018.
Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089–5110, https://doi.org/10.5194/hess-23-5089-2019, 2019a.
Kratzert, F., Klotz, D., Herrnegger, M., Sampson, A. K., Hochreiter, S., and Nearing, G. S.: Toward improved predictions in ungauged basins: Exploiting the power of machine learning, Water Resources Research, 55, https://doi.org/10.1029/2019WR026065, 2019b.


2020 ◽  
Vol 17 (8) ◽  
pp. 3449-3452
Author(s):  
M. S. Roobini ◽  
Y. Sai Satwick ◽  
A. Anil Kumar Reddy ◽  
M. Lakshmi ◽  
D. Deepa ◽  
...  

Diabetes is one of the major health challenges in India today. It is a group of syndromes that result in too much sugar in the blood. It is a protracted condition that affects the way the body processes blood sugar. The prevention and prediction of diabetes mellitus are gaining increasing interest in the medical sciences. The aim is to predict diabetes at an early stage using different machine learning techniques. In this paper, we use four well-known classification techniques: Decision Tree, K-Nearest Neighbors, Support Vector Machine, and Random Forest. These classification techniques are applied to the Pima Indians diabetes dataset. We predict diabetes at different stages and analyze the performance of the different classification techniques. We also propose a conceptual model for the prediction of diabetes mellitus using different machine learning techniques, and we compare the accuracy of these techniques in detecting diabetes mellitus at an early stage.

