DAuGAN: An Approach for Augmenting Time Series Imbalanced Datasets via Latent Space Sampling Using Adversarial Techniques

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Andrei Bratu ◽  
Gabriela Czibula

Data augmentation is a commonly used technique in data science for improving the robustness and performance of machine learning models. The purpose of this paper is to study the feasibility of generating synthetic data points of a temporal nature towards this end. A general approach named DAuGAN (Data Augmentation using Generative Adversarial Networks) is presented for identifying poorly represented sections of a time series, synthesizing and integrating new data points, and measuring the resulting performance improvement on a benchmark machine learning model. The problem is studied and applied in the domain of algorithmic trading, whose constraints are presented and taken into consideration. The experimental results highlight an improvement in the performance of a benchmark reinforcement learning agent trained to trade a financial instrument on a dataset enhanced with DAuGAN.
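The pipeline the abstract describes (flag poorly represented sections, sample the GAN's latent space, decode new points) can be sketched in outline. Everything below is a hypothetical illustration, not the authors' DAuGAN code: the segment-frequency heuristic, the latent dimension, and especially the stand-in linear "generator" (which in DAuGAN would be a trained GAN generator network) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def underrepresented_segments(labels, n_segments=10, threshold=0.5):
    """Split a label sequence into segments and flag those whose
    positive-class frequency falls below `threshold` of the global rate."""
    segments = np.array_split(labels, n_segments)
    global_rate = labels.mean()
    return [i for i, seg in enumerate(segments)
            if seg.mean() < threshold * global_rate]

def synthesize(generator, n_samples, latent_dim=8):
    """Sample latent vectors and decode them with a (trained) generator."""
    z = rng.standard_normal((n_samples, latent_dim))
    return generator(z)

# Stand-in "generator": a fixed linear decoder mapping latent vectors to
# 4-dimensional data points; a real GAN generator would replace this.
W = rng.standard_normal((8, 4))
generator = lambda z: z @ W

labels = rng.binomial(1, 0.3, size=1000)
labels[100:200] = 0  # artificially deplete one region of the series
sparse = underrepresented_segments(labels)
new_points = synthesize(generator, n_samples=50)
print(sparse, new_points.shape)
```

The flagged segment indices would then drive how many synthetic points to inject into each sparse region before retraining the downstream model.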

Author(s):  
Md. Mehedi Hasan Shawon ◽  
Sumaiya Akter ◽  
Md. Kamrul Islam ◽  
Sabbir Ahmed ◽  
Md. Mosaddequr Rahman

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3726
Author(s):  
Ivan Vaccari ◽  
Vanessa Orani ◽  
Alessia Paglialonga ◽  
Enrico Cambiaso ◽  
Maurizio Mongelli

The application of machine learning and artificial intelligence techniques in the medical world is growing, with a range of purposes: from the identification and prediction of possible diseases to patient monitoring and clinical decision support systems. Furthermore, the widespread use of remote monitoring medical devices, under the umbrella of the “Internet of Medical Things” (IoMT), has simplified the retrieval of patient information as they allow continuous monitoring and direct access to data by healthcare providers. However, due to possible issues in real-world settings, such as loss of connectivity, irregular use, misuse, or poor adherence to a monitoring program, the data collected might not be sufficient to implement accurate algorithms. For this reason, data augmentation techniques can be used to create synthetic datasets sufficiently large to train machine learning models. In this work, we apply the concept of generative adversarial networks (GANs) to perform data augmentation on patient data obtained through IoMT sensors for Chronic Obstructive Pulmonary Disease (COPD) monitoring. We also apply an explainable AI algorithm to demonstrate the accuracy of the synthetic data by comparing it to the real data recorded by the sensors. The results obtained demonstrate how synthetic datasets created through a well-structured GAN are comparable with a real dataset, as validated by a novel approach based on machine learning.
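One generic way to check that synthetic sensor data is "comparable with a real dataset", as the abstract claims, is a classifier two-sample test: train a classifier to tell real from synthetic, and read an in-sample AUC near 0.5 as indistinguishability. The sketch below illustrates that idea only; the Gaussian stand-in "sensor" data and the hand-rolled logistic regression are assumptions, not the paper's validation approach.

```python
import numpy as np

rng = np.random.default_rng(1)

def c2st_auc(real, synth, epochs=200, lr=0.1):
    """Classifier two-sample test: fit a logistic regression to tell real
    from synthetic; an AUC near 0.5 suggests the sets are indistinguishable."""
    X = np.vstack([real, synth])
    y = np.r_[np.zeros(len(real)), np.ones(len(synth))]
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):  # plain gradient descent on cross-entropy
        p = 1 / (1 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    scores = X @ w + b
    # AUC via the rank-sum identity
    order = scores.argsort()
    ranks = np.empty(len(y))
    ranks[order] = np.arange(1, len(y) + 1)
    n_pos = int(y.sum())
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

real = rng.normal(0, 1, (300, 4))
good_synth = rng.normal(0, 1, (300, 4))  # well-matched generator
bad_synth = rng.normal(2, 1, (300, 4))   # mismatched generator
print(c2st_auc(real, good_synth), c2st_auc(real, bad_synth))
```

A badly mismatched generator is easy to detect (AUC near 1), while a well-matched one hovers near 0.5.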


2021 ◽  
pp. 190-200
Author(s):  
Lesia Mochurad ◽  
Yaroslav Hladun

The paper considers a method for analyzing a person's psychophysical state from a psychomotor indicator, the finger tapping test. A mobile phone app that generalizes the classic tapping test was developed for the experiments. The tool allows collecting samples and analyzing them both as individual experiments and as a dataset as a whole. Using statistical methods and hyperparameter optimization, the data were examined for anomalies, and an algorithm for reducing their number was developed. A machine learning model is used to predict different features of the dataset. These experiments reveal the structure of the data obtained with the finger tapping test and show how to conduct experiments for better generalization of the model in the future. The anomaly-removal method can be used in further research to increase the model's accuracy. The developed model is a multilayer recurrent neural network, which works well for time-series classification. The model's error is 1.5% on a synthetic dataset and 5% on real data from a similar distribution.
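An anomaly-reduction step of the kind the abstract mentions might look like the following robust outlier filter on inter-tap intervals. The median/MAD rule, the cutoff `k`, and the sample values are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def remove_anomalies(intervals, k=3.0):
    """Drop inter-tap intervals more than k median-absolute-deviations from
    the median -- a robust stand-in for a statistical anomaly filter."""
    x = np.asarray(intervals, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med)) or 1e-9  # guard against zero MAD
    keep = np.abs(x - med) / mad <= k
    return x[keep]

# Hypothetical inter-tap intervals in seconds: one long pause (distraction)
# and one impossibly short double-registration.
taps = [0.21, 0.19, 0.22, 0.20, 1.80, 0.21, 0.18, 0.02, 0.20]
clean = remove_anomalies(taps)
print(clean)
```

Filtering like this before training keeps the recurrent network from fitting sensor glitches and lapses of attention rather than the motor signal itself.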


2017 ◽  
Author(s):  
Aymen A. Elfiky ◽  
Maximilian J. Pany ◽  
Ravi B. Parikh ◽  
Ziad Obermeyer

ABSTRACT
Background: Cancer patients who die soon after starting chemotherapy incur costs of treatment without benefits. Accurately predicting mortality risk from chemotherapy is important, but few patient data-driven tools exist. We sought to create and validate a machine learning model predicting mortality for patients starting new chemotherapy.
Methods: We obtained electronic health records for patients treated at a large cancer center (26,946 patients; 51,774 new regimens) over 2004-14, linked to Social Security data for date of death. The model was derived using 2004-11 data, and performance measured on non-overlapping 2012-14 data.
Findings: 30-day mortality from chemotherapy start was 2.1%. Common cancers included breast (21.1%), colorectal (19.3%), and lung (18.0%). Model predictions were accurate for all patients (AUC 0.94). Predictions for patients starting palliative chemotherapy (46.6% of regimens), for whom prognosis is particularly important, remained highly accurate (AUC 0.92). To illustrate model discrimination, we ranked patients initiating palliative chemotherapy by model-predicted mortality risk, and calculated observed mortality by risk decile. 30-day mortality in the highest-risk decile was 22.6%; in the lowest-risk decile, no patients died. Predictions remained accurate across all primary cancers, stages, and chemotherapies—even for clinical trial regimens that first appeared in years after the model was trained (AUC 0.94). The model also performed well for prediction of 180-day mortality (AUC 0.87; mortality 74.8% in the highest risk decile vs. 0.2% in the lowest). Predictions were more accurate than data from randomized trials of individual chemotherapies, or SEER estimates.
Interpretation: A machine learning algorithm accurately predicted short-term mortality in patients starting chemotherapy using EHR data. Further research is necessary to determine generalizability and the feasibility of applying this algorithm in clinical settings.
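The decile analysis the Findings section describes — rank patients by model-predicted risk, split into tenths, and tabulate observed mortality in each — is straightforward to reproduce schematically. The simulated risks and outcomes below are synthetic stand-ins for the EHR data, not the study's cohort.

```python
import numpy as np

def mortality_by_decile(risk, died):
    """Rank patients by predicted risk (highest first), split into deciles,
    and report observed mortality in each decile."""
    order = np.argsort(risk)[::-1]
    deciles = np.array_split(np.asarray(died)[order], 10)
    return [d.mean() for d in deciles]

rng = np.random.default_rng(2)
risk = rng.uniform(0, 1, 1000)          # hypothetical model scores
died = rng.binomial(1, risk * 0.25)     # toy outcome: probability tracks risk
rates = mortality_by_decile(risk, died)
print([round(r, 3) for r in rates])
```

A well-discriminating model produces a steep gradient across deciles, as in the paper's 22.6% top-decile vs. 0% bottom-decile 30-day mortality.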


2020 ◽  
Vol 8 (7_suppl6) ◽  
pp. 2325967120S0036
Author(s):  
Audrey Wright ◽  
Jaret Karnuta ◽  
Bryan Luu ◽  
Heather Haeberle ◽  
Eric Makhni ◽  
...  

Objectives: With the accumulation of big data surrounding the National Hockey League (NHL) and the advent of advanced computational processors, machine learning (ML) is ideally suited to develop a predictive algorithm capable of imbibing historical data to accurately project a future player’s availability to play based on prior injury and performance. To the end of leveraging available analytics to permit data-driven injury prevention strategies and informed decisions for NHL franchises beyond static logistic regression (LR) analysis, the objective of this study of NHL players was to (1) characterize the epidemiology of publicly reported NHL injuries from 2007-17, (2) determine the validity of a machine learning model in predicting next-season injury risk for both goalies and non-goalies, and (3) compare the performance of modern ML algorithms versus LR analyses. Methods: Hockey player data were compiled for the years 2007 to 2017 from two publicly reported databases in the absence of an official NHL-approved database. Attributes acquired from each NHL player for each professional year included: age, 85 player metrics, and injury history. A total of 5 ML algorithms were created for both non-goalie and goalie data: Random Forest, K-Nearest Neighbors, Naive Bayes, XGBoost, and Top 3 Ensemble. Logistic regression was also performed for both non-goalie and goalie data. Area under the receiver operating characteristic curve (AUC) primarily determined validation. Results: Player data were generated from 2,109 non-goalies and 213 goalies with an average follow-up of 4.5 years. The results are shown below in Table 1. For models predicting following-season injury risk for non-goalies, XGBoost performed the best with an AUC of 0.948, compared to an AUC of 0.937 for logistic regression. For models predicting following-season injury risk for goalies, XGBoost had the highest AUC with 0.956, compared to an AUC of 0.947 for LR.
Conclusion: Advanced ML models such as XGBoost outperformed LR and demonstrated good to excellent capability of predicting whether a publicly reportable injury is likely to occur the next season. As more player-specific data become available, algorithm refinement may be possible to strengthen predictive insights and allow ML to offer quantitative risk management for franchises, present opportunity for targeted preventative intervention by medical personnel, and replace regression analysis as the new gold standard for predictive modeling. [Figure: see text]
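Since the comparison between XGBoost and logistic regression rests entirely on AUC, it may help to recall what that number means: the probability that a randomly chosen positive case (injured next season) outranks a randomly chosen negative one. A minimal pure-Python computation of that rank statistic (not the study's code) is:

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity:
    the probability a random positive outranks a random negative,
    with ties counted as half-wins."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy check: a model that scores all injured players above all healthy
# players achieves perfect discrimination.
print(auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # → 1.0
```

On this scale, the reported XGBoost figures (0.948 and 0.956) indicate that an injured player outranks a healthy one roughly 95% of the time.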


Author(s):  
Akshata Kulkarni

Abstract: Officials around the world are using several COVID-19 outbreak prediction models to make educated decisions and enact necessary control measures. In this study, we developed a machine learning model which predicts and forecasts the COVID-19 outbreak in India, with the goal of determining the best regression model for an in-depth examination of the novel coronavirus. Based on data available from January 31 to October 31, 2020, collected from Kaggle, the model predicts the number of confirmed cases in Maharashtra. A machine learning model is used to forecast the future trend of cases. The project has the potential to demonstrate the importance of information dissemination in improving response time and planning ahead to help reduce risk.
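A minimal version of this kind of case-count forecasting is a linear regression on log-transformed counts, i.e., an exponential growth fit. The synthetic daily series below stands in for the Kaggle data; nothing here is the study's actual model or its comparison of regression families.

```python
import numpy as np

# Hypothetical daily confirmed-case counts following clean exponential
# growth; the study's real Maharashtra data is not reproduced here.
days = np.arange(30)
cases = 50 * np.exp(0.1 * days)

# Fit a line to log-cases (an exponential growth model), then forecast
# the next week by extrapolating and exponentiating back.
slope, intercept = np.polyfit(days, np.log(cases), 1)
forecast = np.exp(intercept + slope * np.arange(30, 37))
print(forecast.round())
```

In practice one would compare such a fit against polynomial and other regression models on held-out days, which is the model-selection question the study examines.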


2018 ◽  
Author(s):  
Elijah Bogart ◽  
Richard Creswell ◽  
Georg K. Gerber

Abstract: Longitudinal studies are crucial for discovering causal relationships between the microbiome and human disease. We present the Microbiome Interpretable Temporal Rule Engine (MITRE), the first machine learning method specifically designed for predicting host status from microbiome time-series data. Our method maintains interpretability by learning predictive rules over automatically inferred time periods and phylogenetically related microbes. We validate MITRE’s performance on semi-synthetic data and five real datasets measuring microbiome composition over time in infant and adult cohorts. Our results demonstrate that MITRE performs on par with or outperforms “black box” machine learning approaches, providing a powerful new tool enabling discovery of biologically interpretable relationships between the microbiome and the human host.
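The kind of interpretable rule the abstract describes — predict host status when a microbe group's average abundance over an inferred time window crosses a threshold — can be written down directly. The window, threshold, and abundance values below are illustrative assumptions, not MITRE's learned rules or its Bayesian inference machinery.

```python
def rule_predict(series, window, threshold):
    """Schematic MITRE-style rule: 'if the average abundance of a microbe
    group over time window [lo, hi) exceeds a threshold, predict the
    outcome'. The rule is fully human-readable, unlike a black-box model."""
    lo, hi = window
    return sum(series[lo:hi]) / (hi - lo) > threshold

# Hypothetical per-timepoint relative abundances for one subject.
subject = [0.01, 0.02, 0.15, 0.20, 0.18, 0.03]
print(rule_predict(subject, window=(2, 5), threshold=0.1))  # True
```

MITRE's contribution is learning which windows, microbe groups, and thresholds to use from data; the payoff is that the final classifier remains a short list of statements like this one.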


Author(s):  
Md Golam Moula Mehedi Hasan ◽  
Douglas A. Talbert

Counterfactual explanations are gaining popularity as a way of explaining machine learning models. Counterfactual examples are generally created to help interpret the decision of a model: if a model makes a certain decision for an instance, the counterfactual examples of that instance reverse the decision of the model. Counterfactual examples can be created by carefully changing particular feature values of the instance. Though counterfactual examples are generated to explain the decisions of machine learning models, in this work we explore another potential application area: whether counterfactual examples are useful for data augmentation. We demonstrate the efficacy of this approach on the widely used “Adult-Income” dataset. We consider several scenarios where we do not have enough data and use counterfactual examples to augment the dataset. We compare our approach with a Generative Adversarial Network (GAN) approach for dataset augmentation. The experimental results show that our proposed approach can be an effective way to augment a dataset.
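The augmentation idea can be sketched as: perturb a feature until the model's decision flips, then add the flipped point to the training set with the flipped label. The toy "income" model and the single-feature line search below are assumptions for illustration; the paper works with the Adult-Income dataset and more general counterfactual generation.

```python
def make_counterfactual(x, predict, feature, step, max_iter=100):
    """Nudge `feature` in steps until the model's decision flips; the
    flipped point and its new label form a synthetic training example
    (a schematic search, not a production counterfactual generator)."""
    x = dict(x)  # leave the caller's instance untouched
    original = predict(x)
    for _ in range(max_iter):
        x[feature] += step
        if predict(x) != original:
            return x, 1 - original  # counterfactual gets the flipped label
    return None  # no flip found within the search budget

# Toy income model: approve (1) when weekly hours exceed 40.
predict = lambda x: int(x["hours"] > 40)
cf = make_counterfactual({"hours": 35}, predict, "hours", step=1)
print(cf)  # ({'hours': 41}, 1)
```

Because each counterfactual lies just across the decision boundary, augmenting with such points concentrates new training data exactly where the model is least certain.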

