DeepMicro: deep representation learning for disease prediction based on microbiome data

2019 ◽  
Author(s):  
Min Oh ◽  
Liqing Zhang

Abstract
Human microbiota play a key role in human health, and growing evidence supports the potential use of the microbiome as a predictor of various diseases. However, the high dimensionality of microbiome data, often on the order of hundreds of thousands of features, combined with low sample sizes poses a great challenge for machine-learning-based prediction algorithms. This imbalance makes the data highly sparse and prevents learning a better prediction model. Moreover, there has been little work applying deep learning to microbiome data with a rigorous evaluation scheme. To address these challenges, we propose DeepMicro, a deep representation learning framework that enables an effective representation of microbiome profiles. DeepMicro transforms high-dimensional microbiome data into a robust low-dimensional representation using various autoencoders and applies machine-learning classification algorithms to the learned representation. In disease prediction, DeepMicro outperforms the current best approaches based on the strain-level marker profile in five different datasets. In addition, by significantly reducing the dimensionality of the marker profile, DeepMicro accelerates model training and hyperparameter optimization with an 8x-30x speedup over the basic approach. DeepMicro is freely available at https://github.com/minoh0201/DeepMicro.
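The DeepMicro pipeline described above can be sketched minimally: train an autoencoder to compress a high-dimensional profile, then fit a classifier on the learned representation. The sketch below uses scikit-learn's `MLPRegressor` as a shallow autoencoder and random Poisson counts as a stand-in for a sparse microbiome marker profile; the data, dimensions, and choice of logistic regression are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Toy stand-in for a high-dimensional, sparse microbiome profile:
# 200 samples x 500 features, with a synthetic binary disease label.
X = rng.poisson(0.2, size=(200, 500)).astype(float)
y = (X[:, :10].sum(axis=1) > 2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A shallow autoencoder: an MLP trained to reconstruct its own input.
ae = MLPRegressor(hidden_layer_sizes=(32,), activation="relu",
                  max_iter=500, random_state=0)
ae.fit(X_tr, X_tr)

def encode(X):
    # Forward pass through the first (encoder) layer only.
    return np.maximum(0.0, X @ ae.coefs_[0] + ae.intercepts_[0])

# Train a downstream classifier on the 32-dimensional representation.
clf = LogisticRegression(max_iter=1000).fit(encode(X_tr), y_tr)
acc = accuracy_score(y_te, clf.predict(encode(X_te)))
print(encode(X_tr).shape, round(acc, 2))
```

The 500-dimensional input is reduced to 32 dimensions before classification, which is the source of the training-speed gains the abstract reports.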

Author(s):  
P.Priya Et. al.

The health sector produces a massive quantity of data, yet this data is often underutilized. Using this large volume of data, a disease can be detected, predicted, or even cured. Diseases such as heart disease, cancer, tumours, and Alzheimer's disease pose a large threat to humankind. Using machine learning techniques, heart disease can be predicted. Clinical data such as blood pressure, hypertension, diabetes, and the number of cigarettes smoked daily are used as input, and these attributes are modeled to make predictions on future clinical records. Algorithms such as Decision Tree, k-Nearest Neighbor, and Support Vector Machine are applied, the accuracy of the model under each algorithm is calculated, and the algorithm with the best accuracy is taken as the model for predicting heart disease.
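The model-selection procedure this abstract describes, training Decision Tree, k-Nearest Neighbor, and SVM classifiers and keeping the most accurate one, can be sketched as below. The clinical features and risk label are synthetic placeholders (the paper's dataset is not public here), so only the selection logic is illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 300
# Hypothetical clinical features: blood pressure, hypertension flag,
# diabetes flag, cigarettes smoked per day.
bp = rng.normal(130, 20, n)
htn = (bp > 140).astype(float)
diab = rng.integers(0, 2, n).astype(float)
cigs = rng.poisson(5, n).astype(float)
X = np.column_stack([bp, htn, diab, cigs])
y = ((bp > 145) | (cigs > 8)).astype(int)  # toy risk label

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "svm": SVC(),
}
# Cross-validated accuracy per algorithm; keep the best-scoring one.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
best = max(scores, key=scores.get)
print(best, {k: round(v, 2) for k, v in scores.items()})
```

Cross-validation is used here instead of a single train/test split so the "best accuracy" comparison is less sensitive to one lucky partition.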


2021 ◽  
Vol 7 ◽  
pp. e526
Author(s):  
Ilya Makarov ◽  
Mikhail Makarov ◽  
Dmitrii Kiselev

Today, increased attention is drawn towards network representation learning, a technique that maps nodes of a network into vectors of a low-dimensional embedding space. A network embedding constructed this way aims to preserve nodes similarity and other specific network properties. Embedding vectors can later be used for downstream machine learning problems, such as node classification, link prediction and network visualization. Naturally, some networks have text information associated with them. For instance, in a citation network, each node is a scientific paper associated with its abstract or title; in a social network, all users may be viewed as nodes of a network and posts of each user as textual attributes. In this work, we explore how combining existing methods of text and network embeddings can increase accuracy for downstream tasks and propose modifications to popular architectures to better capture textual information in network embedding and fusion frameworks.
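The fusion idea in this abstract, combining a structural network embedding with a text embedding of each node's attributes, can be sketched with simple stand-ins: truncated SVD of the adjacency matrix in place of node2vec/DeepWalk-style methods, and TF-IDF in place of a learned text encoder. The tiny citation-style graph below is invented for illustration.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny citation-style network: adjacency matrix plus one title per node.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
texts = [
    "deep learning for graphs",
    "graph neural networks survey",
    "representation learning on networks",
    "natural language processing with transformers",
]

# Structural embedding: truncated SVD of the adjacency matrix
# (a simple stand-in for random-walk-based node embeddings).
struct_emb = TruncatedSVD(n_components=2, random_state=0).fit_transform(A)

# Text embedding: TF-IDF reduced to the same dimensionality.
tfidf = TfidfVectorizer().fit_transform(texts)
text_emb = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Late fusion by concatenation: one vector per node for downstream tasks
# such as node classification or link prediction.
fused = np.hstack([struct_emb, text_emb])
print(fused.shape)
```

Concatenation is the simplest fusion strategy; the architectures the paper modifies learn a joint representation instead, but the input/output shapes are the same.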


2020 ◽  
Vol 34 (04) ◽  
pp. 3809-3816
Author(s):  
Xin Du ◽  
Yulong Pei ◽  
Wouter Duivesteijn ◽  
Mykola Pechenizkiy

While recent advances in machine learning place much focus on the fairness of algorithmic decision making, the fairness of representations, especially network representations, remains underexplored. Network representation learning learns a function mapping nodes to low-dimensional vectors, preserving structural properties such as communities and roles in the latent embedding space. In this paper, we argue that latent structural heterogeneity in the observational data can bias classical network representation models. The unknown heterogeneous distribution across subgroups raises new challenges for fairness in machine learning: pre-defined groups based on sensitive attributes cannot properly capture the potential unfairness of a network representation. We propose a method that automatically discovers subgroups that are treated unfairly by the network representation model, along with a fairness measure that can evaluate complex targets with multi-degree interactions. We conduct randomized controlled experiments on synthetic datasets and verify our methods on real-world datasets. Both quantitative and qualitative results show that our method effectively recovers the fairness of network representations. Our research provides insight into how structural heterogeneity across attribute-restricted subgroups affects the fairness of network representation learning.
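The core idea, that subgroups unfairly served by an embedding can be discovered from the data rather than predefined, can be illustrated with a deliberately simplified sketch: embed a graph with two structural regimes, measure each node's reconstruction error, and cluster nodes by error to surface a worst-served subgroup. The graph, the SVD embedding, and the error-based clustering are all assumptions for illustration, not the paper's actual method or measure.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
# Two structural subgroups: a dense community and a sparse periphery.
A = np.zeros((60, 60))
A[:30, :30] = rng.random((30, 30)) < 0.5   # dense block
A[30:, 30:] = rng.random((30, 30)) < 0.05  # sparse block
A = np.triu(A, 1)
A = A + A.T                                # symmetric, no self-loops

# Embed via truncated SVD and measure per-node reconstruction error.
svd = TruncatedSVD(n_components=4, random_state=0)
Z = svd.fit_transform(A)
A_hat = Z @ svd.components_
err = ((A - A_hat) ** 2).sum(axis=1)

# Discover subgroups from the error profile instead of predefining them,
# then report the disparity between best- and worst-served groups.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    err.reshape(-1, 1))
group_err = [err[labels == g].mean() for g in (0, 1)]
disparity = max(group_err) - min(group_err)
print(round(disparity, 3))
```

A large disparity signals that the embedding represents one structural subgroup much less faithfully than the other, which is the kind of unfairness the paper's measure is designed to detect.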


Author(s):  
Derek Reiman ◽  
Yang Dai

Abstract
The microbiome of the human body has been shown to have profound effects on physiological regulation and disease pathogenesis. However, association analysis based on statistical modeling of microbiome data remains a challenge due to inherent noise, the complexity of the data, and the high cost of collecting large numbers of samples. To address this challenge, we employed a deep learning framework to construct a data-driven simulation of microbiome data using a conditional generative adversarial network. Conditional generative adversarial networks train two models against each other while leveraging side information learned from a given dataset to generate larger simulated datasets that are representative of the original data. In our study, we used a cohort of patients with inflammatory bowel disease to show not only that the generative adversarial network can generate samples representative of the original data across multiple diversity metrics, but also that training machine learning models on the synthetic samples can improve disease prediction through data augmentation. In addition, we show that the synthetic samples generated from this cohort can boost disease prediction in a different external cohort.
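The augmentation experiment this abstract reports can be sketched end-to-end if we substitute the trained cGAN generator with something trivially runnable. Below, a per-class Gaussian sampler stands in for the generator (a deliberate simplification: a real conditional GAN learns this conditional distribution adversarially), and the abundance profiles and IBD labels are synthetic placeholders. Only the augmentation workflow, train on real data versus real-plus-synthetic data and compare, reflects the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
# Toy stand-in for microbiome abundance profiles with a disease label.
n, d = 80, 40
y = rng.integers(0, 2, n)
X = rng.normal(0, 1, (n, d)) + y[:, None] * 0.8

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

def sample_synthetic(X, y, n_new):
    """Per-class Gaussian sampler, standing in for the trained cGAN
    generator (which would be conditioned on the class label)."""
    Xs, ys = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        mu, sd = Xc.mean(axis=0), Xc.std(axis=0) + 1e-6
        Xs.append(rng.normal(mu, sd, (n_new, X.shape[1])))
        ys.append(np.full(n_new, c))
    return np.vstack(Xs), np.concatenate(ys)

X_syn, y_syn = sample_synthetic(X_tr, y_tr, n_new=200)
X_aug = np.vstack([X_tr, X_syn])
y_aug = np.concatenate([y_tr, y_syn])

# Compare a classifier trained on real data only vs. real + synthetic.
base = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
aug = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
print(round(accuracy_score(y_te, base.predict(X_te)), 2),
      round(accuracy_score(y_te, aug.predict(X_te)), 2))
```

The synthetic samples are only useful to the extent the generator captures the real conditional distribution, which is why the paper validates the generated samples against multiple diversity metrics before using them for augmentation.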


Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical, as it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems due to their accurate prediction performance. Different techniques provide different accuracies, so it is imperative to use the method that yields the best results. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.
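The four-classifier comparison on a breast cancer dataset can be sketched with scikit-learn, using the Wisconsin breast cancer data bundled with the library. `DecisionTreeClassifier` stands in for Weka's J48 (both are C4.5-style trees), and the cross-validation setup and scaling choices are assumptions rather than the paper's exact protocol.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier   # analogue of Weka's J48
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

models = {
    "svm": make_pipeline(StandardScaler(), SVC()),
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "neural_net": make_pipeline(StandardScaler(),
                                MLPClassifier(max_iter=1000, random_state=0)),
}
# Mean 5-fold cross-validated accuracy per classifier, best first.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} {acc:.3f}")
```

Scaling is applied inside the SVM and neural-network pipelines because both are sensitive to feature magnitudes, while the tree and Naïve Bayes are not.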


Author(s):  
Matthew N. O. Sadiku ◽  
Chandra M. M Kotteti ◽  
Sarhan M. Musa

Machine learning is an emerging field of artificial intelligence that can be applied to the agriculture sector. It refers to the automated detection of meaningful patterns in given data. Modern agriculture seeks ways to conserve water, use nutrients and energy more efficiently, and adapt to climate change. Machine learning in agriculture allows for more accurate disease diagnosis and crop disease prediction. This paper briefly introduces what machine learning can do in the agriculture sector.

