Predicting the interaction biomolecule types for lncRNA: an ensemble deep learning approach

Briefings in Bioinformatics ◽

10.1093/bib/bbaa228 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yu Zhang ◽

Cangzhi Jia ◽

Chee Keong Kwoh

Keyword(s):

Deep Learning ◽

Molecular Mechanisms ◽

Operating Characteristic ◽

Cross Validation ◽

Characteristic Curve ◽

Published Data ◽

Different Types ◽

Precision Recall Curve ◽

Deep Learning Model ◽

Fold Cross Validation

Abstract: Long noncoding RNAs (lncRNAs) play significant roles in various physiological and pathological processes via their interactions with biomolecules such as DNA, RNA and protein. Existing in silico methods for predicting the functions of lncRNAs mainly rely on calculating the similarity of lncRNAs or on investigating whether an lncRNA can interact with a specific biomolecule or disease. In this work, we explored the functions of lncRNAs from a different perspective: we presented a tool for predicting the interaction biomolecule type for a given lncRNA. For this purpose, we first investigated the main molecular mechanisms of lncRNA–RNA, lncRNA–protein and lncRNA–DNA interactions. Then, we developed an ensemble deep learning model, lncIBTP (lncRNA Interaction Biomolecule Type Prediction), which predicts the interactions between lncRNAs and different types of biomolecules. In 5-fold cross-validation, lncIBTP achieved average values of 0.7042 for accuracy, and 0.7903 and 0.6421 for the macro-average area under the receiver operating characteristic curve and the precision–recall curve, respectively, which illustrates the model’s effectiveness. In addition, based on an analysis of the collected published data and the prediction results, we hypothesized that the characteristics of lncRNAs that interact with DNA may differ from those of lncRNAs that interact only with RNA.
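
The macro-average area under the ROC curve reported above can be illustrated with a small sketch. The abstract does not say how the macro average was computed, so the one-vs-rest scheme, the class encoding (0 = DNA, 1 = RNA, 2 = protein) and the toy scores below are all assumptions made for illustration, not the authors' pipeline.

```python
def binary_auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a
    random negative; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auroc(y_true, y_score, n_classes):
    """One-vs-rest AUROC per class, averaged with equal class weight."""
    per_class = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[c] for row in y_score]
        per_class.append(binary_auroc(labels, scores))
    return sum(per_class) / n_classes


# Hypothetical toy data: 0 = DNA, 1 = RNA, 2 = protein.
y_true = [0, 1, 2, 0, 1, 2]
y_score = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
]
print(round(macro_auroc(y_true, y_score, 3), 4))
```

Because every class contributes equally to the average, a rare class with poor ranking lowers the macro score as much as a common one would.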


Automated Breast Cancer Detection in Digital Mammograms of Various Densities via Deep Learning

Journal of Personalized Medicine ◽

10.3390/jpm10040211 ◽

2020 ◽

Vol 10 (4) ◽

pp. 211 ◽

Cited By ~ 1

Author(s):

Yong Joon Suh ◽

Jaewon Jung ◽

Bum-Joo Cho

Keyword(s):

Breast Cancer ◽

Deep Learning ◽

Operating Characteristic ◽

Meta Analysis ◽

Characteristic Curve ◽

Malignant Lesion ◽

Model Performance ◽

Mean Values ◽

The Mean ◽

Deep Learning Model

Mammography plays an important role in screening for breast cancer among women, and artificial intelligence has enabled the automated detection of diseases on medical images. This study aimed to develop a deep learning model for detecting breast cancer in digital mammograms of various densities and to compare its performance with that of previous studies. From 1501 subjects who underwent digital mammography between February 2007 and May 2015, craniocaudal and mediolateral view mammograms were included and concatenated for each breast, ultimately producing 3002 merged images. Two convolutional neural networks were trained to detect any malignant lesion on the merged images. Performance was tested using 301 merged images from 284 subjects and compared to a meta-analysis including 12 previous deep learning studies. The mean area under the receiver-operating characteristic curve (AUC) for detecting breast cancer in each merged mammogram was 0.952 ± 0.005 for DenseNet-169 and 0.954 ± 0.020 for EfficientNet-B5. The performance for malignancy detection decreased as breast density increased (density A, mean AUC = 0.984 vs. density D, mean AUC = 0.902 by DenseNet-169). When patients’ age was used as a covariate for malignancy detection, the performance showed little change (mean AUC, 0.953 ± 0.005). The mean sensitivity and specificity of DenseNet-169 (87 and 88%, respectively) surpassed the mean values (81 and 82%, respectively) obtained in the meta-analysis. Deep learning could work efficiently in screening for breast cancer in digital mammograms of various densities, and its benefit could be greatest in breasts with lower parenchymal density.


Predicting Malignancy and Invasiveness of Pulmonary Subsolid Nodules on CT Images Using Deep Learning

Frontiers in Oncology ◽

10.3389/fonc.2021.700158 ◽

2021 ◽

Vol 11 ◽

Author(s):

Tianle Shen ◽

Runping Hou ◽

Xiaodan Ye ◽

Xiaoyang Li ◽

Junfeng Xiong ◽

...

Keyword(s):

Deep Learning ◽

Operating Characteristic ◽

Characteristic Curve ◽

Three Dimensional ◽

CT Images ◽

Optimal Decision ◽

Preoperative CT ◽

3D CNN ◽

Deep Learning Model

Background: To develop and validate a deep learning–based model on CT images for the malignancy and invasiveness prediction of pulmonary subsolid nodules (SSNs). Materials and Methods: This study retrospectively collected patients with pulmonary SSNs treated by surgery in our hospital from 2012 to 2018. Postoperative pathology was used as the diagnostic reference standard. Three-dimensional convolutional neural network (3D CNN) models were constructed using preoperative CT images to predict the malignancy and invasiveness of SSNs. Then, an observer reader study conducted by two thoracic radiologists was used to compare with the CNN model. The diagnostic power of the models was evaluated with receiver operating characteristic curve (ROC) analysis. Results: A total of 2,614 patients were finally included and randomly divided for training (60.9%), validation (19.1%), and testing (20%). For the benign and malignant classification, the best 3D CNN model achieved a satisfactory AUC of 0.913 (95% CI: 0.885–0.940), sensitivity of 86.1%, and specificity of 83.8% at the optimal decision point, which outperformed all observer readers’ performance (AUC: 0.846±0.031). For pre-invasive and invasive classification of malignant SSNs, the 3D CNN also achieved satisfactory AUC of 0.908 (95% CI: 0.877–0.939), sensitivity of 87.4%, and specificity of 80.8%. Conclusion: The deep-learning model showed its potential to accurately identify the malignancy and invasiveness of SSNs and thus can help surgeons make treatment decisions.
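
The abstract reports sensitivity and specificity "at the optimal decision point" without saying how that point was chosen. One common choice is the threshold that maximizes Youden's J (sensitivity + specificity - 1); the sketch below uses that criterion purely as an assumption, with made-up labels and scores.

```python
def sens_spec(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(labels, scores):
    """Return (J, threshold, sensitivity, specificity) for the threshold
    with the largest Youden's J; the first such threshold wins on ties."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(labels, scores, t)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best


# Hypothetical data: 1 = malignant, 0 = benign.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.4, 0.6, 0.35, 0.7, 0.8, 0.9]
j, t, sens, spec = youden_threshold(labels, scores)
print(f"threshold={t}, sensitivity={sens}, specificity={spec}, J={j}")
```

Other criteria, such as the point closest to the top-left corner of the ROC plot or a cost-weighted rule, can select a different threshold from the same curve.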


A Deep-Learning Model With the Attention Mechanism Could Rigorously Predict Survivals in Neuroblastoma

Frontiers in Oncology ◽

10.3389/fonc.2021.653863 ◽

2021 ◽

Vol 11 ◽

Author(s):

Chenzhao Feng ◽

Tianyu Xiang ◽

Zixuan Yi ◽

Xinyao Meng ◽

Xufeng Chu ◽

...

Keyword(s):

Deep Learning ◽

Molecular Mechanisms ◽

Operating Characteristic ◽

ROC Curves ◽

Attention Mechanism ◽

Survival Prediction ◽

Training Set ◽

Kaplan Meier ◽

Applied Artificial Intelligence ◽

Deep Learning Model

Background: Neuroblastoma is one of the most devastating forms of childhood cancer. Despite numerous attempts at precise survival prediction in neuroblastoma, prediction efficacy remains to be improved. Methods: Here, we applied a deep-learning (DL) model with the attention mechanism to predict survival in neuroblastoma. We used 2 groups of features separated from 172 genes to train 2 deep neural networks and combined them by the attention mechanism. Results: This classifier could accurately predict survival, with areas under the receiver operating characteristic (ROC) curve and the time-dependent ROC curve reaching 0.968 and 0.974 in the training set, respectively. The accuracy of the model was further confirmed in a validation cohort. Importantly, the two feature groups were mapped to two groups of patients, which were prognostic in Kaplan-Meier curves. Biological analyses showed that they exhibited diverse molecular backgrounds which could be linked to the prognosis of the patients. Conclusions: In this study, we applied artificial intelligence methods to improve the accuracy of neuroblastoma survival prediction based on gene expression and to provide explanations for a better understanding of the molecular mechanisms underlying neuroblastoma.


Ensembled deep learning model outperforms human experts in diagnosing biliary atresia from sonographic gallbladder images

Nature Communications ◽

10.1038/s41467-021-21466-z ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Wenying Zhou ◽

Yang Yang ◽

Cheng Yu ◽

Juxian Liu ◽

Xingxing Duan ◽

...

Keyword(s):

Deep Learning ◽

Biliary Atresia ◽

Operating Characteristic ◽

Characteristic Curve ◽

External Validation ◽

Learning Model ◽

Video Sequences ◽

Validation Dataset ◽

Patient Level ◽

Deep Learning Model

Abstract: It is still challenging to make an accurate diagnosis of biliary atresia (BA) with sonographic gallbladder images, particularly in rural areas without relevant expertise. To help diagnose BA based on sonographic gallbladder images, an ensembled deep learning model is developed. The model yields a patient-level sensitivity of 93.1% and specificity of 93.9% [with an area under the receiver operating characteristic curve of 0.956 (95% confidence interval: 0.928-0.977)] on the multi-center external validation dataset, superior to that of human experts. With the help of the model, the performance of human experts at various levels is improved. Moreover, diagnosis by the model based on smartphone photos of sonographic gallbladder images taken through a smartphone app, and based on video sequences, still yields expert-level performance. The ensembled deep learning model in this study provides a solution to help radiologists improve the diagnosis of BA in various clinical application scenarios, particularly in rural and undeveloped regions with limited expertise.


Stacked Generalization: An Introduction to Super Learning

10.1101/172395 ◽

2017 ◽

Author(s):

Ashley I. Naimi ◽

Laura B. Balzer

Keyword(s):

Operating Characteristic ◽

Cross Validation ◽

Mean Squared Error ◽

Characteristic Curve ◽

Squared Error ◽

Stacked Generalization ◽

Super Learner ◽

Technical Details ◽

Fold Cross Validation ◽

Weighted Combination

Abstract: Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into what is now known as “Super Learner”. Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the area under the receiver operating characteristic curve. Although relatively simple in nature, use of the Super Learner by epidemiologists has been hampered by limitations in understanding conceptual and technical details. We work step-by-step through two examples to illustrate concepts and address common concerns.
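
The idea of using V-fold cross-validation to weight candidate algorithms can be sketched in a few lines. This is a deliberately reduced illustration, not the paper's worked examples: the library holds only two hypothetical learners (a mean predictor and a one-feature least-squares line), the objective is mean squared error, and the weight is found by a simple grid search rather than the constrained optimization a real Super Learner implementation would typically use.

```python
def fit_mean(xs, ys):
    """Candidate 1: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2: ordinary least squares with a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cv_predictions(xs, ys, learners, v):
    """Out-of-fold predictions: each point is predicted by models
    that never saw it during fitting."""
    n = len(xs)
    preds = [[None] * n for _ in learners]
    for fold in range(v):
        test_idx = [i for i in range(n) if i % v == fold]
        train_idx = [i for i in range(n) if i % v != fold]
        tx = [xs[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        for k, fit in enumerate(learners):
            model = fit(tx, ty)
            for i in test_idx:
                preds[k][i] = model(xs[i])
    return preds


def mse(pred, ys):
    return sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)


def super_learner_weight(xs, ys, v=5, grid=101):
    """Grid-search the weight w on the mean predictor (1 - w on the line)
    that minimizes cross-validated mean squared error."""
    p_mean, p_line = cv_predictions(xs, ys, [fit_mean, fit_line], v)
    best_w, best_mse = None, float("inf")
    for g in range(grid):
        w = g / (grid - 1)
        blend = [w * a + (1 - w) * b for a, b in zip(p_mean, p_line)]
        err = mse(blend, ys)
        if err < best_mse:
            best_w, best_mse = w, err
    return best_w, best_mse


# Hypothetical data: a linear trend with a deterministic perturbation.
xs = list(range(20))
ys = [2 * x + 1 + (3 if x % 4 == 0 else -1) for x in xs]
w, err = super_learner_weight(xs, ys)
print(f"weight on mean predictor={w:.2f}, cross-validated MSE={err:.3f}")
```

Because the grid includes the weights 0 and 1, the blended predictor's cross-validated error is never worse than that of either candidate alone.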


LE-MDCAP: A Computational Model to Prioritize Causal miRNA–Disease Associations

International Journal of Molecular Sciences ◽

10.3390/ijms222413607 ◽

2021 ◽

Vol 22 (24) ◽

pp. 13607

Author(s):

Zhou Huang ◽

Yu Han ◽

Leibo Liu ◽

Qinghua Cui ◽

Yuan Zhou

Keyword(s):

Computational Models ◽

Operating Characteristic ◽

Cross Validation ◽

Characteristic Curve ◽

Levenshtein Distance ◽

Similarity Matrix ◽

Disease Treatment ◽

Independent Test ◽

Disease Associations ◽

Fold Cross Validation

MicroRNAs (miRNAs) are associated with various complex human diseases and some miRNAs can be directly involved in the mechanisms of disease. Identifying disease-causative miRNAs can provide novel insight into disease pathogenesis from a miRNA perspective and facilitate disease treatment. To date, various computational models have been developed to predict general miRNA–disease associations, but few models are available to further prioritize causal miRNA–disease associations from non-causal associations. Therefore, in this study, we constructed a Levenshtein-Distance-Enhanced miRNA–Disease Causal Association Predictor (LE-MDCAP) to predict potential causal miRNA–disease associations. Specifically, Levenshtein distance matrices covering the sequence, expression and functional miRNA similarities were introduced to enhance the previous Gaussian interaction profile kernel-based similarity matrix. LE-MDCAP integrated miRNA similarity matrices, a disease semantic similarity matrix and known causal miRNA–disease associations to make predictions. For the regular causal vs. non-disease association discrimination task, LE-MDCAP achieved an area under the receiver operating characteristic curve (AUROC) of 0.911 and 0.906 in 10-fold cross-validation and independent test, respectively. More importantly, LE-MDCAP prominently outperformed the previous MDCAP model in distinguishing causal versus non-causal miRNA–disease associations (AUROC 0.820 vs. 0.695). Case studies performed on diabetic retinopathy and hsa-mir-361 also validated the accuracy of our model. In summary, LE-MDCAP could be useful for screening causal miRNA–disease associations from general miRNA–disease associations.
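
For readers unfamiliar with the Levenshtein distance named above, a minimal sketch follows. It shows only the standard edit-distance recurrence on two strings; the abstract does not describe how LE-MDCAP turns distances into similarity matrices, so the length-normalized similarity and the short RNA-like strings below are assumptions for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_similarity(a, b):
    """Hypothetical similarity in [0, 1]: 1 minus distance over the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Made-up sequences that differ by one substitution.
seq_a = "UGAGGUAG"
seq_b = "UGAGGUUG"
print(levenshtein(seq_a, seq_b), normalized_similarity(seq_a, seq_b))
```

Applying such a pairwise function to every pair of items yields a symmetric matrix, which is the general shape of input a similarity-based predictor consumes.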


Development and Verification of a Deep Learning Algorithm to Evaluate Small-Bowel Preparation Quality

Diagnostics ◽

10.3390/diagnostics11061127 ◽

2021 ◽

Vol 11 (6) ◽

pp. 1127

Author(s):

Ji Hyung Nam ◽

Dong Jun Oh ◽

Sumin Lee ◽

Hyun Joo Song ◽

Yun Jeong Lim

Keyword(s):

Deep Learning ◽

Small Bowel ◽

Scoring System ◽

Operating Characteristic ◽

Clinical Evidence ◽

Learning Algorithm ◽

Characteristic Curve ◽

External Validation ◽

Test Results ◽

Deep Learning Algorithm

Capsule endoscopy (CE) quality control requires an objective scoring system to evaluate the preparation of the small bowel (SB). We propose a deep learning algorithm to calculate SB cleansing scores and verify the algorithm’s performance. A 5-point scoring system based on clarity of mucosal visualization was used to develop the deep learning algorithm (400,000 frames; 280,000 for training and 120,000 for testing). External validation was performed using additional CE cases (n = 50), and average cleansing scores (1.0 to 5.0) calculated using the algorithm were compared to clinical grades (A to C) assigned by clinicians. Test results obtained using 120,000 frames exhibited 93% accuracy. The separate CE case exhibited substantial agreement between the deep learning algorithm scores and clinicians’ assessments (Cohen’s kappa: 0.672). In the external validation, the cleansing score decreased with worsening clinical grade (scores of 3.9, 3.2, and 2.5 for grades A, B, and C, respectively, p < 0.001). Receiver operating characteristic curve analysis revealed that a cleansing score cut-off of 2.95 indicated clinically adequate preparation. This algorithm provides an objective and automated cleansing score for evaluating SB preparation for CE. The results of this study will serve as clinical evidence supporting the practical use of deep learning algorithms for evaluating SB preparation quality.
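
Cohen's kappa, quoted above as 0.672, measures agreement between two raters after correcting for the agreement expected by chance. The sketch below computes the unweighted form on made-up grades; the abstract does not say whether a weighted variant was used, so treat the choice as an assumption. Under a commonly cited convention, values between 0.61 and 0.80 are described as substantial agreement.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical grades from an algorithm and a clinician for ten cases.
algorithm = ["A", "A", "B", "B", "C", "C", "A", "B", "C", "A"]
clinician = ["A", "A", "B", "C", "C", "C", "A", "B", "B", "A"]
print(round(cohens_kappa(algorithm, clinician), 3))
```

In this toy case the raters agree on 8 of 10 items, but chance alone would produce agreement on 34% of them, so kappa is lower than the raw 80%.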


NIMG-08. PREDICTION OF LOWER-GRADE GLIOMA MOLECULAR SUBTYPES USING DEEP LEARNING

Neuro-Oncology ◽

10.1093/neuonc/noaa215.621 ◽

2020 ◽

Vol 22 (Supplement_2) ◽

pp. ii148-ii148

Author(s):

Yoshihiro Muragaki ◽

Yutaka Matsui ◽

Takashi Maruyama ◽

Masayuki Nitta ◽

Taiichi Saito ◽

...

Keyword(s):

Deep Learning ◽

Cross Validation ◽

Molecular Subtype ◽

Learning Model ◽

Group Classification ◽

Training Dataset ◽

Lower Grade ◽

Test Dataset ◽

CT Data ◽

Deep Learning Model

Abstract: INTRODUCTION: It is useful to know the molecular subtype of lower-grade gliomas (LGG) when deciding on a treatment strategy. This study aims to diagnose this preoperatively. METHODS: A deep learning model was developed to predict the 3-group molecular subtype using multimodal data including magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT). The performance was evaluated using leave-one-out cross validation with a dataset containing information from 217 LGG patients. RESULTS: The model performed best when the dataset contained MRI, PET, and CT data. The model could predict the molecular subtype with an accuracy of 96.6% for the training dataset and 68.7% for the test dataset. The model achieved test accuracies of 58.5%, 60.4%, and 59.4% when the dataset contained only MRI, MRI and PET, and MRI and CT data, respectively. The conventional method, which sequentially predicts mutations in the isocitrate dehydrogenase (IDH) gene and the codeletion of chromosome arms 1p and 19q (1p/19q), had an overall accuracy of 65.9%. This is 2.8 percentage points lower than the proposed method, which predicts the 3-group molecular subtype directly. CONCLUSIONS AND FUTURE PERSPECTIVE: A deep learning model was developed to diagnose the molecular subtype preoperatively based on multi-modality data in order to predict the 3-group classification directly. Cross-validation showed that the proposed model had an overall accuracy of 68.7% for the test dataset. This is the first model to double the expected value for a 3-group classification problem when predicting the LGG molecular subtype. We plan to apply heat map and/or segmentation techniques to increase prediction accuracy.
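
The accuracy comparisons in this abstract can be checked with simple arithmetic. The sketch assumes that "expected value" means the accuracy of uniform random guessing among three equally likely groups (one third); the abstract does not define the term, so that reading is an assumption.

```python
# Accuracies quoted in the abstract, in percent.
proposed_test = 68.7     # proposed model, test dataset
proposed_train = 96.6    # proposed model, training dataset
sequential = 65.9        # conventional sequential IDH then 1p/19q prediction

# Assumed chance level: uniform guessing among 3 equally likely groups.
chance = 100 / 3

gap_vs_sequential = round(proposed_test - sequential, 1)   # percentage points
train_test_gap = round(proposed_train - proposed_test, 1)  # percentage points
ratio_to_chance = proposed_test / chance

print(f"gain over sequential method: {gap_vs_sequential} points")
print(f"training minus test accuracy: {train_test_gap} points")
print(f"test accuracy relative to chance: {ratio_to_chance:.2f}x")
```

Under this reading, doubling the chance level requires an accuracy above about 66.7%, which the reported 68.7% exceeds by a narrow margin. The gap between training and test accuracy is far larger than the gain over the sequential method.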


Multi-Omics Analysis of Acute Lymphoblastic Leukemia Identified the Methylation and Expression Differences Between BCP-ALL and T-ALL

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2020.622393 ◽

2021 ◽

Vol 8 ◽

Author(s):

Jin-Fan Li ◽

Xiao-Jing Ma ◽

Lin-Lin Ying ◽

Ying-hui Tong ◽

Xue-ping Xiang

Keyword(s):

Acute Lymphoblastic Leukemia ◽

Molecular Mechanisms ◽

Cross Validation ◽

Lymphoblastic Leukemia ◽

Expression Data ◽

Expression Signature ◽

Signature Genes ◽

Common Cancer ◽

Monte Carlo Feature Selection ◽

Fold Cross Validation

Acute lymphoblastic leukemia (ALL) is a common and heterogeneous cancer that is mainly divided into BCP-ALL and T-ALL, which account for 80–85% and 15–20% of cases, respectively. There are many differences between BCP-ALL and T-ALL, including prognosis, treatment, drug screening, gene research and so on. In this study, starting with methylation and gene expression data, we analyzed the molecular differences between BCP-ALL and T-ALL and identified the multi-omics signatures using Boruta and Monte Carlo feature selection methods. There were 7 expression signature genes (CD3D, VPREB3, HLA-DRA, PAX5, BLNK, GALNT6, SLC4A8) and 168 methylation sites corresponding to 175 methylation signature genes. The overall accuracy, accuracy for BCP-ALL and accuracy for T-ALL of the RIPPER (Repeated Incremental Pruning to Produce Error Reduction) classifier using these signatures, evaluated with 10-fold cross validation repeated 3 times, were 0.973, 0.990, and 0.933, respectively. Two genes, CD3D and VPREB3, overlapped between the 175 methylation signature genes and the 7 expression signature genes. The network analysis of the methylation and expression signature genes suggested that their common gene, CD3D, not only differed at both the methylation and expression levels, but also played a key regulatory role as a hub in the network. Our results provided insights into the underlying molecular mechanisms of ALL and could facilitate more precise diagnosis and treatment of ALL.
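
The evaluation scheme mentioned above, 10-fold cross validation repeated 3 times, can be sketched as an index generator. This is a generic illustration under assumptions: the sample count, the seed and the plain (non-stratified) shuffling are hypothetical, and the abstract does not say whether folds were stratified by subtype.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=3, seed=0):
    """Yield (repeat, fold, train_indices, test_indices).
    Each repeat reshuffles the samples and splits them into k folds."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for f in range(k):
            test = order[f::k]  # every k-th shuffled sample goes to fold f
            test_set = set(test)
            train = [i for i in order if i not in test_set]
            yield r, f, train, test


# Hypothetical dataset of 50 samples: 3 repeats x 10 folds = 30 fits.
splits = list(repeated_kfold(50, k=10, repeats=3, seed=0))
print(len(splits), "train/test splits")
```

Averaging a metric over all 30 splits reduces the dependence of the estimate on any single random partition of the data.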


Prediction of Merchandise Sales on E-Commerce Platforms Based on Data Mining and Deep Learning

Scientific Programming ◽

10.1155/2021/2179692 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Xiaoting Yin ◽

Xiaosha Tao

Keyword(s):

Data Mining ◽

Deep Learning ◽

Learning Algorithm ◽

Research Process ◽

Deep Learning Algorithm ◽

Sales Prediction ◽

Different Types ◽

Online Business ◽

Product Sales ◽

Deep Learning Model

Online business has grown exponentially during the last decade, and industries are focusing on it more than before. However, simply setting up an online store and starting to sell might not work. Machine learning and data mining techniques are needed to learn users’ preferences and determine what would be best for the business. Based on the decision-making needs of online product sales, the factors that influence online product sales in various industries, and the advantages of deep learning algorithms, this paper constructs a sales prediction model suitable for online products and focuses on evaluating the adaptability of the model to different types of online products. In the research process, the training results of a fully connected model are compared with those of a CNN, which demonstrates the accuracy and generalization ability of the CNN model. With a non-deep learning model selected as the comparison baseline, the performance advantages of the CNN model across different product categories are demonstrated. In addition, the experiment concludes that the unsupervised pretrained CNN model is more effective and adaptable in sales forecasting.
