Predicting Disease Related microRNA Based on Similarity and Topology

Zhihua Chen; Xinke Wang; Peng Gao; Hongju Liu; Bosheng Song

doi:10.3390/cells8111405

Cells ◽

10.3390/cells8111405 ◽

2019 ◽

Vol 8 (11) ◽

pp. 1405 ◽

Cited By ~ 2

Author(s):

Zhihua Chen ◽

Xinke Wang ◽

Peng Gao ◽

Hongju Liu ◽

Bosheng Song

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Lung Neoplasm ◽

Area Under The Curve ◽

Usual Method ◽

Machine Learning Method ◽

And Topology ◽

Machine Learning Model ◽

Topology Information

It is known that many diseases are caused by mutations or abnormalities in microRNA (miRNA). The usual method to predict miRNA disease relationships is to build a high-quality similarity network of diseases and miRNAs. All unobserved associations are ranked by their similarity scores, such that a higher score indicates a greater probability of a potential connection. However, this approach does not utilize information within the network. Therefore, in this study, we propose a machine learning method, called STIM, which uses network topology information to predict disease–miRNA associations. In contrast to the conventional approach, STIM constructs features according to information on similarity and topology in networks and then uses a machine learning model to predict potential associations. To verify the reliability and accuracy of our method, we compared STIM to other classical algorithms. The results of fivefold cross validation demonstrated that STIM outperforms many existing methods, particularly in terms of the area under the curve. In addition, the top 30 candidate miRNAs recommended by STIM in a case study of lung neoplasm have been confirmed in previous experiments, which proved the validity of the method.
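The contrast drawn in this abstract, between ranking candidates by similarity alone and building per-pair features from both similarity and network topology, can be illustrated with a small sketch. Everything below is a hypothetical toy: the node names, similarity values, and the three chosen features (best similarity to an already linked miRNA, disease degree, miRNA degree) are assumptions for illustration, not the features STIM actually uses.

```python
# Toy, hypothetical data; none of it comes from the paper.
diseases = ["d1", "d2"]
mirnas = ["m1", "m2", "m3"]

# Known disease-miRNA associations (edges of the bipartite network).
known = {("d1", "m1"), ("d2", "m1"), ("d2", "m2")}

# Symmetric miRNA-miRNA similarity scores (assumed values in [0, 1]).
mirna_sim = {("m1", "m2"): 0.8, ("m1", "m3"): 0.3, ("m2", "m3"): 0.6}


def sim(a, b):
    """Look up the similarity of two miRNAs regardless of argument order."""
    if a == b:
        return 1.0
    return mirna_sim.get((a, b), mirna_sim.get((b, a), 0.0))


def features(disease, mirna):
    """Return [similarity, disease degree, miRNA degree] for one candidate pair."""
    linked = [m for (d, m) in known if d == disease]
    # Similarity feature: best match between the candidate and linked miRNAs.
    best_sim = max((sim(mirna, m) for m in linked), default=0.0)
    # Topology features: how many known edges touch each endpoint.
    disease_degree = len(linked)
    mirna_degree = sum(1 for (d, m) in known if m == mirna)
    return [best_sim, disease_degree, mirna_degree]


# Unobserved pairs are the candidates to score.
candidates = [(d, m) for d in diseases for m in mirnas if (d, m) not in known]

# Conventional baseline: rank candidates by the similarity score alone.
ranked = sorted(candidates, key=lambda pair: features(*pair)[0], reverse=True)
```

In a feature-based approach, the full vectors returned by `features` (rather than the first entry alone) would be passed to a classifier trained on known and unknown pairs.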

Breath can discriminate tuberculosis from other lower respiratory illness in children

Scientific Reports ◽

10.1038/s41598-021-80970-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Carly A. Bobak ◽

Lili Kang ◽

Lesley Workman ◽

Lindy Bateman ◽

Mohammad S. Khan ◽

...

Keyword(s):

Machine Learning ◽

Lower Respiratory Tract ◽

Cross Validation ◽

Pediatric Patients ◽

Respiratory Illness ◽

Health Crisis ◽

Machine Learning Model ◽

Childhood Tb ◽

Pediatric Tb ◽

Pediatric Tuberculosis

Abstract Pediatric tuberculosis (TB) remains a global health crisis. Despite progress, pediatric patients remain difficult to diagnose, with approximately half of all childhood TB patients lacking bacterial confirmation. In this pilot study (n = 31), we identify a 4-compound breathprint and subsequent machine learning model that accurately classifies children with confirmed TB (n = 10) from children with another lower respiratory tract infection (LRTI) (n = 10) with a sensitivity of 80% and specificity of 100% observed across cross validation folds. Importantly, we demonstrate that the breathprint identified an additional nine of eleven patients who had unconfirmed clinical TB and whose symptoms improved while treated for TB. While more work is necessary to validate the utility of using patient breath to diagnose pediatric TB, it shows promise as a triage instrument or paired as part of an aggregate diagnostic scheme.
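As a quick check on how the reported sensitivity and specificity relate to group sizes, the sketch below computes both from confusion-matrix counts. The counts (8 true positives, 2 false negatives, 10 true negatives, 0 false positives) are an assumption: they are what 80% and 100% would imply for two groups of 10 if predictions were pooled across folds, which the abstract does not state explicitly.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true cases that were detected
    specificity = tn / (tn + fp)  # share of non-cases correctly ruled out
    return sensitivity, specificity


# Assumed counts for 10 confirmed-TB and 10 LRTI children (pooled folds).
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=10, fp=0)
```

With groups this small, a single reclassified child moves either metric by 10 percentage points.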

A Machine Learning Method for Identifying Lung Cancer Based on Routine Blood Indices: Qualitative Feasibility Study

JMIR Medical Informatics ◽

10.2196/13476 ◽

2019 ◽

Vol 7 (3) ◽

pp. e13476 ◽

Cited By ~ 3

Author(s):

Jiangpeng Wu ◽

Xiangyi Zan ◽

Liping Gao ◽

Jianhong Zhao ◽

Jing Fan ◽

...

Keyword(s):

Machine Learning ◽

Lung Cancer ◽

Cross Validation ◽

Clinical Symptoms ◽

Machine Learning Method ◽

Learning Method ◽

Liquid Biopsies ◽

Blood Indices ◽

Routine Blood ◽

Identification Model

Background Liquid biopsies based on blood samples have been widely accepted as a diagnostic and monitoring tool for cancers, but extremely high sensitivity is frequently needed due to the very low levels of the specially selected DNA, RNA, or protein biomarkers that are released into blood. However, routine blood indices tests are frequently ordered by physicians, as they are easy to perform and are cost effective. In addition, machine learning is broadly accepted for its ability to decipher complicated connections between multiple sets of test data and diseases. Objective The aim of this study is to discover the potential association between lung cancer and routine blood indices and thereby help clinicians and patients to identify lung cancer based on these routine tests. Methods The machine learning method known as Random Forest was adopted to build an identification model between routine blood indices and lung cancer that would determine if they were potentially linked. Ten-fold cross-validation and further tests were utilized to evaluate the reliability of the identification model. Results In total, 277 patients with 49 types of routine blood indices were included in this study, including 183 patients with lung cancer and 94 patients without lung cancer. Over the course of the study, a correlation was found between the combination of 19 types of routine blood indices and lung cancer. Lung cancer patients could be distinguished from other patients, especially those with tuberculosis (which usually has similar clinical symptoms to lung cancer), with a sensitivity, specificity, and total accuracy of 96.3%, 94.97%, and 95.7%, respectively, in the cross-validation results. This identification method is called the routine blood indices model for lung cancer, and it promises to be of help as a tool for both clinicians and patients for the identification of lung cancer based on routine blood indices.
Conclusions Lung cancer can be identified based on the combination of 19 types of routine blood indices, which implies that artificial intelligence can find the connections between a disease and the fundamental indices of blood, which could reduce the necessity of costly, elaborate blood test techniques for this purpose. It may also be possible that the combination of multiple indices obtained from routine blood tests may be connected to other diseases as well.
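To make the ten-fold cross-validation step concrete, here is a minimal sketch of how 277 samples split into ten folds. The helper `k_fold_indices` is a hypothetical name, and the contiguous, unshuffled split is an assumption for simplicity; the abstract does not describe how the study partitioned its patients, and in practice the indices would usually be shuffled (and often stratified by class) first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(277, 10))
fold_sizes = [len(test) for _, test in folds]
```

Each patient appears in exactly one test fold, so every prediction used for the reported metrics comes from a model that did not see that patient during training.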

Application of the XGBoost Machine Learning Method in PM2.5 Prediction: A Case Study of Shanghai

Aerosol and Air Quality Research ◽

10.4209/aaqr.2019.08.0408 ◽

2020 ◽

Vol 20 (1) ◽

pp. 128-138 ◽

Cited By ~ 6

Author(s):

Jinghui Ma ◽

Zhongqi Yu ◽

Yuanhao Qu ◽

Jianming Xu ◽

Yu Cao

Keyword(s):

Machine Learning ◽

Machine Learning Method ◽

Learning Method

Survey on Stress Detection Using Multiple Sensors through Wearable Devices

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/461022021 ◽

2021 ◽

Vol 10 (2) ◽

pp. 787-790

Keyword(s):

Machine Learning ◽

Wearable Devices ◽

Anxiety And Depression ◽

Accuracy Score ◽

Machine Learning Method ◽

Stress Detection ◽

Multiple Sensors ◽

Machine Learning Model ◽

Wide Range ◽

Principle Objective

An individual's way of living directly influences overall health. Stress is a significant ailment of the human body, like depression, heart attack, and mental illness. The WHO says, “Globally, more than 264 million people of all ages suffer from depression.” [8] The report also says that most of the time people are stressed because of their work. 10.7% of people have disorders involving stress, anxiety, and depression [8]. There are different methods for detecting stress, e.g., smart watches, chest belts, and specialized machines. Our principal objective is to detect stress in real time using smart watches through their sensors. Different kinds of sensors are available to detect stress, such as PPG, GSR, HRV, ECG, and temperature. Smart watches collect a wide range of data through their various sensors. The gathered data are applied to various machine learning methods, such as linear regression, SVM, KNN, and decision trees. The techniques differ, so we compare their accuracy and choose the best machine learning model. This paper presents different analyses to find and compare the accuracy obtained from various sensors' data. It also checks whether using one sensor or multiple sensors, such as HRV, ECG, or GSR and PPG, gives the better accuracy score for stress detection.

Bidders Recommender for Public Procurement Auctions Using Machine Learning: Data Analysis, Algorithm, and Case Study with Tenders from Spain

Complexity ◽

10.1155/2020/8858258 ◽

2020 ◽

Vol 2020 ◽

pp. 1-20

Author(s):

Manuel J. García Rodríguez ◽

Vicente Rodríguez Montequín ◽

Francisco Ortega Fernández ◽

Joaquín M. Villanueva Balsera

Keyword(s):

Machine Learning ◽

Public Procurement ◽

Public Information ◽

Random Forest Classifier ◽

Procurement Auctions ◽

Machine Learning Method ◽

Test Conditions ◽

A Company ◽

Learning Data

Recommending the identity of bidders in public procurement auctions (tenders) has a significant impact in many areas of public procurement, but it has not yet been studied in depth. A bidders recommender would be a very beneficial tool because a supplier (company) can search appropriate tenders and, vice versa, a public procurement agency can discover automatically unknown companies which are suitable for its tender. This paper develops a pioneering algorithm to recommend potential bidders using a machine learning method, particularly a random forest classifier. The bidders recommender is described theoretically, so it can be implemented or adapted to any particular situation. It has been successfully validated with a case study: an actual Spanish tender dataset (free public information) which has 102,087 tenders from 2014 to 2020 and a company dataset (nonfree public information) which has 1,353,213 Spanish companies. Quantitative, graphical, and statistical descriptions of both datasets are presented. The results of the case study were satisfactory: the winning bidding company is within the recommended companies group, from 24% to 38% of the tenders, according to different test conditions and scenarios.

MEWS++: Enhancing the Prediction of Clinical Deterioration in Admitted Patients through a Machine Learning Model

Journal of Clinical Medicine ◽

10.3390/jcm9020343 ◽

2020 ◽

Vol 9 (2) ◽

pp. 343 ◽

Cited By ~ 4

Author(s):

Arash Kia ◽

Prem Timsina ◽

Himanshu N. Joshi ◽

Eyal Klang ◽

Rohit R. Gupta ◽

...

Keyword(s):

Machine Learning ◽

At Risk ◽

Area Under The Curve ◽

Learning Model ◽

Clinical Deterioration ◽

Early Warning Score ◽

Support Vector ◽

Adult Age ◽

Machine Learning Model ◽

Patients At Risk

Early detection of patients at risk for clinical deterioration is crucial for timely intervention. Traditional detection systems rely on a limited set of variables and are unable to predict the time of decline. We describe a machine learning model called MEWS++ that enables the identification of patients at risk of escalation of care or death six hours prior to the event. A retrospective single-center cohort study was conducted from July 2011 to July 2017 of adult (age > 18) inpatients excluding psychiatric, parturient, and hospice patients. Three machine learning models were trained and tested: random forest (RF), linear support vector machine, and logistic regression. We compared the models’ performance to the traditional Modified Early Warning Score (MEWS) using sensitivity, specificity, and Area Under the Curve for Receiver Operating Characteristic (AUC-ROC) and Precision-Recall curves (AUC-PR). The primary outcome was escalation of care from a floor bed to an intensive care or step-down unit, or death, within 6 h. A total of 96,645 patients with 157,984 hospital encounters and 244,343 bed movements were included. Overall rate of escalation or death was 3.4%. The RF model had the best performance with sensitivity 81.6%, specificity 75.5%, AUC-ROC of 0.85, and AUC-PR of 0.37. Compared to traditional MEWS, sensitivity increased 37%, specificity increased 11%, and AUC-ROC increased 14%. This study found that using machine learning and readily available clinical data, clinical deterioration or death can be predicted 6 h prior to the event. The model we developed can warn of patient deterioration hours before the event, thus helping make timely clinical decisions.
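Since this abstract compares models by AUC-ROC, a short sketch of what that number measures may help: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name, labels, and scores below are hypothetical toy values, not data from the study.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 2 positives and 3 negatives with made-up risk scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]
auc = auc_roc(labels, scores)
```

The pairwise loop is quadratic in the number of cases, so it suits small illustrations only; rank-based formulations compute the same quantity more efficiently on large cohorts.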

Factors Influencing Matching of Ride-Hailing Service Using Machine Learning Method

Sustainability ◽

10.3390/su11205615 ◽

2019 ◽

Vol 11 (20) ◽

pp. 5615 ◽

Cited By ~ 2

Author(s):

Myungsik Do ◽

Wanhee Byun ◽

Doh Kyoum Shin ◽

Hyeryun Jin

Keyword(s):

Machine Learning ◽

Success Rate ◽

Cross Validation ◽

Average Distance ◽

Machine Learning Method ◽

Land Uses ◽

Taxi Drivers ◽

Factors Influencing ◽

Taxi Service ◽

The City

It is common to call a taxi through taxi apps in Korea, and it was believed that an app-taxi service would provide customers with more convenience. However, customers’ requests can often be denied, as taxi drivers can decide whether to take calls from customers or not. Therefore, studies on factors that determine whether taxi drivers refuse or accept calls from customers are needed. This study investigated why taxi drivers might refuse calls from customers and factors that influence the success of matching within the service. This study used origin-destination data in Seoul and Daejeon obtained from T-map Taxis, which was analyzed via a decision tree using machine learning. Cross-validation was also performed. Results showed that distance, socio-economic features, and land uses affected matching success rate. Furthermore, distance was the most important factor in both Seoul and Daejeon. The matching success rate in Seoul was lowest for trips shorter than the average at midnight. In Daejeon, the rate was lowest when the calls were made for trips either shorter or longer than the average distance. This study showed that the matching success for ride-hailing services can be differentiated particularly by the distance of the requested trip depending on the size of the city.

Deep learning based DNA:RNA triplex forming potential prediction

BMC Bioinformatics ◽

10.1186/s12859-020-03864-0 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Yu Zhang ◽

Yahui Long ◽

Chee Keong Kwoh

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Cross Validation ◽

Roc Curves ◽

Triplex Dna ◽

Triplex Formation ◽

Machine Learning Model ◽

Non Coding Rnas ◽

High Level ◽

Integrated Program

Abstract Background Long non-coding RNAs (lncRNAs) can exert functions by forming triplexes with DNA. Current methods for predicting triplex formation mainly rely on mathematical statistics according to base-pairing rules. However, these methods have two main limitations: (1) they identify a large number of triplex-forming lncRNAs, but the limited number of experimentally verified triplex-forming lncRNAs indicates that not all of them may form triplexes in practice, and (2) their predictions consider only the theoretical relationship and lack features from experimentally verified data. Results In this work, we develop an integrated program named TriplexFPP (Triplex Forming Potential Prediction), which is the first machine learning model in DNA:RNA triplex prediction. TriplexFPP predicts the most likely triplex-forming lncRNAs and DNA sites based on experimentally verified data, where the high-level features are learned by convolutional neural networks. In the fivefold cross validation, the average values of the areas under the ROC and PRC curves for the redundancy-removed triplex-forming lncRNA dataset with threshold 0.8 are 0.9649 and 0.9996, and these two values for triplex DNA site prediction are 0.8705 and 0.9671, respectively. We also briefly summarize the cis and trans targeting of triplex lncRNAs. Conclusions TriplexFPP is able to predict the most likely triplex-forming lncRNAs from all the lncRNAs with computationally defined triplex-forming capacities, as well as the potential of a DNA site to become a triplex. It may provide insights into the exploration of lncRNA functions.

P1060 USING ARTIFICIAL INTELLIGENCE TO PREDICT HOME THERAPY CANDIDATES

Nephrology Dialysis Transplantation ◽

10.1093/ndt/gfaa142.p1060 ◽

2020 ◽

Vol 35 (Supplement_3) ◽

Author(s):

Jerry Yu ◽

Andrew Long ◽

Maria Hanson ◽

Aleetha Ellis ◽

Michael Macarthur ◽

...

Keyword(s):

Machine Learning ◽

Predictive Model ◽

Feedback Loop ◽

Area Under The Curve ◽

Patient Characteristics ◽

Training Data ◽

Quality Improvement Initiative ◽

Home Therapy ◽

Test Dataset ◽

Machine Learning Model

Abstract Background and Aims There are many benefits of performing dialysis at home including more flexibility and more frequent treatments. A possible barrier to election of home therapy (HT) by in-center patients is a lack of adequate HT education. To aid efficient education efforts, a predictive model was developed to help identify patients who are more likely to switch from in-center and succeed on HT. Method We developed a model using machine learning to predict which patients who are treated in-center without prior HT history are most likely to switch to HT in the next 90 days and stay on HT for at least 90 days. Training data was extracted from 2016–2019 for approximately 300,000 patients. We randomly sampled one in-center treatment date per patient and determined if the patient would switch and succeed on HT. The input features consisted of treatment vitals, laboratories, absence history, comprehensive assessments, facility information, county-level housing, and patient characteristics. Patients were excluded if they had fewer than 30 days on dialysis due to lack of data. A machine learning model (XGBoost classifier) was deployed monthly in a pilot with a team of HT educators to investigate the model’s utility for identifying HT candidates. Results There were approximately 1,200 patients starting a home therapy per month in a large dialysis provider, with approximately one-third being in-center patients. The prevalence of switching and succeeding on HT in this population was 2.54%. The predictive model achieved an area under the curve of 0.87, sensitivity of 0.77, and a specificity of 0.80 on a hold-out test dataset. The pilot was successfully executed for several months and two major lessons were learned: 1) some patients who reappeared on each month’s list should be removed from the list after expressing no interest in HT, and 2) a data collection mechanism should be put in place to capture the reasons why patients are not interested in HT.
Conclusion This quality-improvement initiative demonstrates that predictive modeling can be used to identify patients likely to switch and succeed on home therapy. Integration of the model in existing workflows requires creating a feedback loop which can help improve future worklists.
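Because the outcome is rare (2.54% prevalence), it can help to translate the reported sensitivity and specificity into a positive predictive value using Bayes' rule. The sketch below does that arithmetic. It assumes the hold-out test set has the same prevalence as the quoted population and that 0.77 and 0.80 describe the operating point used for the worklist; the abstract states neither, so the result is illustrative only.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of flagged patients who are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Figures quoted in the abstract; equal test-set prevalence is assumed.
ppv = positive_predictive_value(0.77, 0.80, 0.0254)
```

Under these assumptions, roughly 9% of flagged patients would go on to switch and succeed on HT, compared with 2.54% in the unfiltered population.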

North American Hardwoods Identification Using Machine-Learning

Forests ◽

10.3390/f11030298 ◽

2020 ◽

Vol 11 (3) ◽

pp. 298 ◽

Cited By ~ 2

Author(s):

Dercilio Junior Verly Lopes ◽

Greg W. Burgreen ◽

Edward D. Entsminger

Keyword(s):

Machine Learning ◽

North American ◽

Mobile Application ◽

Cross Validation ◽

Data Augmentation ◽

Technical Note ◽

Machine Learning Method ◽

Training Set ◽

Hardwood Species ◽

Fold Cross Validation

This technical note determines the feasibility of using an InceptionV4_ResNetV2 convolutional neural network (CNN) to correctly identify hardwood species from macroscopic images. The method uses a commodity smartphone fitted with a 14× macro lens for photography. The end-grains of ten different North American hardwood species were photographed to create a dataset of 1869 images. The stratified 5-fold cross-validation machine-learning method was used, in which the number of testing samples varied from 341 to 342. Data augmentation was performed on-the-fly for each training set by rotating, zooming, and flipping images. It was found that the CNN could correctly identify hardwood species based on macroscopic images of their end-grain with an adjusted accuracy of 92.60%. With the current growth of the machine-learning field, this model can be readily deployed in a mobile application for field wood identification.
