DeepMicro: deep representation learning for disease prediction based on microbiome data

2019 ◽  
Author(s):  
Min Oh ◽  
Liqing Zhang

Abstract
Human microbiota play a key role in human health, and growing evidence supports the potential use of the microbiome as a predictor of various diseases. However, the high dimensionality of microbiome data, often on the order of hundreds of thousands of features, combined with low sample sizes poses a great challenge for machine-learning-based prediction algorithms. This imbalance makes the data highly sparse and prevents learning a better prediction model. Moreover, there has been little work applying deep learning to microbiome data with a rigorous evaluation scheme. To address these challenges, we propose DeepMicro, a deep representation learning framework that enables an effective representation of microbiome profiles. DeepMicro transforms high-dimensional microbiome data into a robust low-dimensional representation using various autoencoders and applies machine-learning classification algorithms to the learned representation. In disease prediction, DeepMicro outperforms the current best approaches based on the strain-level marker profile in five different datasets. In addition, by significantly reducing the dimensionality of the marker profile, DeepMicro accelerates model training and hyperparameter optimization with an 8x-30x speedup over the basic approach. DeepMicro is freely available at https://github.com/minoh0201/DeepMicro.
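The DeepMicro pipeline described above can be sketched minimally: train an autoencoder to compress a high-dimensional profile, then fit a classifier on the learned representation. The sketch below uses scikit-learn's `MLPRegressor` as a shallow autoencoder and random Poisson counts as a stand-in for a sparse microbiome marker profile; the data, dimensions, and choice of logistic regression are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Toy stand-in for a high-dimensional, sparse microbiome profile:
# 200 samples x 500 features, with a synthetic binary disease label.
X = rng.poisson(0.2, size=(200, 500)).astype(float)
y = (X[:, :10].sum(axis=1) > 2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A shallow autoencoder: an MLP trained to reconstruct its own input.
ae = MLPRegressor(hidden_layer_sizes=(32,), activation="relu",
                  max_iter=500, random_state=0)
ae.fit(X_tr, X_tr)

def encode(X):
    # Forward pass through the first (encoder) layer only.
    return np.maximum(0.0, X @ ae.coefs_[0] + ae.intercepts_[0])

# Train a downstream classifier on the 32-dimensional representation.
clf = LogisticRegression(max_iter=1000).fit(encode(X_tr), y_tr)
acc = accuracy_score(y_te, clf.predict(encode(X_te)))
print(encode(X_tr).shape, round(acc, 2))
```

The 500-dimensional input is reduced to 32 dimensions before classification, which is the source of the training-speed gains the abstract reports.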

Author(s):  
P.Priya Et. al.

The health sector produces a massive quantity of data, yet this data is often underutilized. Using this large volume of data, a disease can be detected, predicted, or even cured. Diseases such as heart disease, cancer, tumours, and Alzheimer's disease pose a large threat to humankind. Using machine learning techniques, heart disease can be predicted. Clinical data such as blood pressure, hypertension, diabetes, and the number of cigarettes smoked daily are used as input, and these attributes are modeled to make predictions on future clinical records. Algorithms such as Decision Tree, k-Nearest Neighbor, and Support Vector Machine are applied, the accuracy of the model under each algorithm is calculated, and the algorithm with the best accuracy is taken as the model for predicting heart disease.
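The model-selection procedure this abstract describes, training Decision Tree, k-Nearest Neighbor, and SVM classifiers and keeping the most accurate one, can be sketched as below. The clinical features and risk label are synthetic placeholders (the paper's dataset is not public here), so only the selection logic is illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 300
# Hypothetical clinical features: blood pressure, hypertension flag,
# diabetes flag, cigarettes smoked per day.
bp = rng.normal(130, 20, n)
htn = (bp > 140).astype(float)
diab = rng.integers(0, 2, n).astype(float)
cigs = rng.poisson(5, n).astype(float)
X = np.column_stack([bp, htn, diab, cigs])
y = ((bp > 145) | (cigs > 8)).astype(int)  # toy risk label

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "svm": SVC(),
}
# Cross-validated accuracy per algorithm; keep the best-scoring one.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
best = max(scores, key=scores.get)
print(best, {k: round(v, 2) for k, v in scores.items()})
```

Cross-validation is used here instead of a single train/test split so the "best accuracy" comparison is less sensitive to one lucky partition.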


2021 ◽  
Vol 7 ◽  
pp. e526
Author(s):  
Ilya Makarov ◽  
Mikhail Makarov ◽  
Dmitrii Kiselev

Today, increased attention is drawn towards network representation learning, a technique that maps nodes of a network into vectors of a low-dimensional embedding space. A network embedding constructed this way aims to preserve nodes similarity and other specific network properties. Embedding vectors can later be used for downstream machine learning problems, such as node classification, link prediction and network visualization. Naturally, some networks have text information associated with them. For instance, in a citation network, each node is a scientific paper associated with its abstract or title; in a social network, all users may be viewed as nodes of a network and posts of each user as textual attributes. In this work, we explore how combining existing methods of text and network embeddings can increase accuracy for downstream tasks and propose modifications to popular architectures to better capture textual information in network embedding and fusion frameworks.
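The fusion idea in this abstract, combining a structural network embedding with a text embedding of each node's attributes, can be sketched with simple stand-ins: truncated SVD of the adjacency matrix in place of node2vec/DeepWalk-style methods, and TF-IDF in place of a learned text encoder. The tiny citation-style graph below is invented for illustration.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny citation-style network: adjacency matrix plus one title per node.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
texts = [
    "deep learning for graphs",
    "graph neural networks survey",
    "representation learning on networks",
    "natural language processing with transformers",
]

# Structural embedding: truncated SVD of the adjacency matrix
# (a simple stand-in for random-walk-based node embeddings).
struct_emb = TruncatedSVD(n_components=2, random_state=0).fit_transform(A)

# Text embedding: TF-IDF reduced to the same dimensionality.
tfidf = TfidfVectorizer().fit_transform(texts)
text_emb = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Late fusion by concatenation: one vector per node for downstream tasks
# such as node classification or link prediction.
fused = np.hstack([struct_emb, text_emb])
print(fused.shape)
```

Concatenation is the simplest fusion strategy; the architectures the paper modifies learn a joint representation instead, but the input/output shapes are the same.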


2020 ◽  
Vol 34 (04) ◽  
pp. 3809-3816
Author(s):  
Xin Du ◽  
Yulong Pei ◽  
Wouter Duivesteijn ◽  
Mykola Pechenizkiy

While recent advances in machine learning place much focus on the fairness of algorithmic decision making, the fairness of representations, especially network representations, remains underexplored. Network representation learning learns a function mapping nodes to low-dimensional vectors, preserving structural properties such as communities and roles in the latent embedding space. In this paper, we argue that latent structural heterogeneity in the observational data can bias classical network representation models. The unknown heterogeneous distribution across subgroups raises new challenges for fairness in machine learning: pre-defined groups based on sensitive attributes cannot properly capture the potential unfairness of a network representation. We propose a method that automatically discovers subgroups that are treated unfairly by the network representation model, along with a fairness measure that can evaluate complex targets with multi-degree interactions. We conduct randomized controlled experiments on synthetic datasets and verify our methods on real-world datasets. Both quantitative and qualitative results show that our method effectively recovers the fairness of network representations. Our research provides insight into how structural heterogeneity across attribute-restricted subgroups affects the fairness of network representation learning.
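The core idea, that subgroups unfairly served by an embedding can be discovered from the data rather than predefined, can be illustrated with a deliberately simplified sketch: embed a graph with two structural regimes, measure each node's reconstruction error, and cluster nodes by error to surface a worst-served subgroup. The graph, the SVD embedding, and the error-based clustering are all assumptions for illustration, not the paper's actual method or measure.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
# Two structural subgroups: a dense community and a sparse periphery.
A = np.zeros((60, 60))
A[:30, :30] = rng.random((30, 30)) < 0.5   # dense block
A[30:, 30:] = rng.random((30, 30)) < 0.05  # sparse block
A = np.triu(A, 1)
A = A + A.T                                # symmetric, no self-loops

# Embed via truncated SVD and measure per-node reconstruction error.
svd = TruncatedSVD(n_components=4, random_state=0)
Z = svd.fit_transform(A)
A_hat = Z @ svd.components_
err = ((A - A_hat) ** 2).sum(axis=1)

# Discover subgroups from the error profile instead of predefining them,
# then report the disparity between best- and worst-served groups.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    err.reshape(-1, 1))
group_err = [err[labels == g].mean() for g in (0, 1)]
disparity = max(group_err) - min(group_err)
print(round(disparity, 3))
```

A large disparity signals that the embedding represents one structural subgroup much less faithfully than the other, which is the kind of unfairness the paper's measure is designed to detect.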


Author(s):  
Derek Reiman ◽  
Yang Dai

Abstract
The microbiome of the human body has been shown to have profound effects on physiological regulation and disease pathogenesis. However, association analysis based on statistical modeling of microbiome data remains a challenge due to inherent noise, the complexity of the data, and the high cost of collecting large numbers of samples. To address this challenge, we employed a deep learning framework to construct a data-driven simulation of microbiome data using a conditional generative adversarial network. Conditional generative adversarial networks train two models against each other while leveraging side information learned from a given dataset to generate larger simulated datasets that are representative of the original data. In our study, we used a cohort of patients with inflammatory bowel disease to show not only that the generative adversarial network can generate samples representative of the original data across multiple diversity metrics, but also that training machine learning models on the synthetic samples can improve disease prediction through data augmentation. In addition, we show that the synthetic samples generated from this cohort can boost disease prediction in a different external cohort.
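The augmentation experiment this abstract reports can be sketched end-to-end if we substitute the trained cGAN generator with something trivially runnable. Below, a per-class Gaussian sampler stands in for the generator (a deliberate simplification: a real conditional GAN learns this conditional distribution adversarially), and the abundance profiles and IBD labels are synthetic placeholders. Only the augmentation workflow, train on real data versus real-plus-synthetic data and compare, reflects the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
# Toy stand-in for microbiome abundance profiles with a disease label.
n, d = 80, 40
y = rng.integers(0, 2, n)
X = rng.normal(0, 1, (n, d)) + y[:, None] * 0.8

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

def sample_synthetic(X, y, n_new):
    """Per-class Gaussian sampler, standing in for the trained cGAN
    generator (which would be conditioned on the class label)."""
    Xs, ys = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        mu, sd = Xc.mean(axis=0), Xc.std(axis=0) + 1e-6
        Xs.append(rng.normal(mu, sd, (n_new, X.shape[1])))
        ys.append(np.full(n_new, c))
    return np.vstack(Xs), np.concatenate(ys)

X_syn, y_syn = sample_synthetic(X_tr, y_tr, n_new=200)
X_aug = np.vstack([X_tr, X_syn])
y_aug = np.concatenate([y_tr, y_syn])

# Compare a classifier trained on real data only vs. real + synthetic.
base = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
aug = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
print(round(accuracy_score(y_te, base.predict(X_te)), 2),
      round(accuracy_score(y_te, aug.predict(X_te)), 2))
```

The synthetic samples are only useful to the extent the generator captures the real conditional distribution, which is why the paper validates the generated samples against multiple diversity metrics before using them for augmentation.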


Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical, as it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems due to their accurate prediction performance. Different techniques provide different accuracies, so it is imperative to use the method that yields the best results. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.
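The four-classifier comparison on a breast cancer dataset can be sketched with scikit-learn, using the Wisconsin breast cancer data bundled with the library. `DecisionTreeClassifier` stands in for Weka's J48 (both are C4.5-style trees), and the cross-validation setup and scaling choices are assumptions rather than the paper's exact protocol.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier   # analogue of Weka's J48
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

models = {
    "svm": make_pipeline(StandardScaler(), SVC()),
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "neural_net": make_pipeline(StandardScaler(),
                                MLPClassifier(max_iter=1000, random_state=0)),
}
# Mean 5-fold cross-validated accuracy per classifier, best first.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} {acc:.3f}")
```

Scaling is applied inside the SVM and neural-network pipelines because both are sensitive to feature magnitudes, while the tree and Naïve Bayes are not.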


Author(s):  
Matthew N. O. Sadiku ◽  
Chandra M. M Kotteti ◽  
Sarhan M. Musa

Machine learning is an emerging field of artificial intelligence that can be applied to the agriculture sector. It refers to the automated detection of meaningful patterns in given data. Modern agriculture seeks ways to conserve water, use nutrients and energy more efficiently, and adapt to climate change. Machine learning in agriculture allows for more accurate disease diagnosis and crop disease prediction. This paper briefly introduces what machine learning can do in the agriculture sector.

