Predicting human intestinal absorption with modified random forest approach: a comprehensive evaluation of molecular representation, unbalanced data, and applicability domain issues

RSC Advances ◽  
2017 ◽  
Vol 7 (31) ◽  
pp. 19007-19018 ◽  
Author(s):  
Ning-Ning Wang ◽  
Chen Huang ◽  
Jie Dong ◽  
Zhi-Jiang Yao ◽  
Min-Feng Zhu ◽  
...  

A relatively large dataset consisting of 970 compounds was collected. Classification RF models were established based on different training sets and different descriptors, followed by model validation and evaluation.

2014 ◽  
Vol 15 (4) ◽  
pp. 380-388 ◽  
Author(s):  
J. Julian-Ortiz ◽  
Riccardo Zanni ◽  
Maria Galvez-Llompart ◽  
Ramon Garcia-Domenech

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 2601-2601
Author(s):  
Tao Zhou ◽  
Libin Chen ◽  
Jing Guo ◽  
Mengmeng Zhang ◽  
Huanhuan Liu ◽  
...  

2601 Background: Microsatellite instability (MSI) is a common genomic alteration in several tumors, such as colorectal cancer, endometrial carcinoma, and stomach cancer. Tumors are classified as microsatellite instability-high (MSI-H) or microsatellite stable (MSS) based on the degree of polymorphism in microsatellite lengths. MSI is a predictive biomarker for immunotherapy efficacy in advanced/metastatic solid tumors, especially in colorectal cancer (CRC) patients. Several computational approaches based on targeted panel sequencing data have been used to detect MSI; however, they are considerably affected by sequencing depth and panel size. Methods: We developed MSIFinder, a Python package for automatic MSI classification from genome sequencing data using a random forest classifier (RFC), a machine learning method. We included 19 MSI-H and 25 MSS samples as the training set. First, an RFC model was built from 54 feature markers in the training set. Second, the classifier was validated using a test set comprising 21 MSI-H and 379 MSS samples. Results: With this test set, MSIFinder achieved a sensitivity (recall) of 0.997, a specificity of 1, an accuracy of 0.998, a positive predictive value (PPV) of 0.954, an F1 score of 0.977, and an area under the curve (AUC) of 0.999. We found that MSIFinder is less affected by low sequencing depth and can achieve a concordance of 0.993 at a sequencing depth of 100×. Furthermore, MSIFinder is less affected by panel size and can achieve a concordance of 0.99 when the panel size is 0.5 Mb (million bases). Conclusions: These results indicate that MSIFinder is a robust MSI classification tool that is not affected by panel size or sequencing depth. Furthermore, MSIFinder can provide reliable MSI detection for scientific and clinical purposes. [Table: see text]


2003 ◽  
Vol 92 (3) ◽  
pp. 621-633 ◽  
Author(s):  
Donatas Zmuidinavicius ◽  
Remigijus Didziapetris ◽  
Pranas Japertas ◽  
Alex Avdeef ◽  
Alanas Petrauskas

2009 ◽  
Vol 98 (11) ◽  
pp. 4039-4054 ◽  
Author(s):  
Derek P. Reynolds ◽  
Kiril Lanevskij ◽  
Pranas Japertas ◽  
Remigijus Didziapetris ◽  
Alanas Petrauskas

Author(s):  
Saranya N. ◽  
Saravana Selvam

After an era of struggling with data collection, the issue today has become how to process the vast amounts of information collected. Scientists and researchers alike consider Big Data to be probably the most essential topic in computing science today. The term Big Data describes huge volumes of data that can exist in any structure. This volume makes it difficult for standard processing approaches to mine the best possible information from such large data sets. Classification in Big Data is a procedure for summarizing data sets based on various patterns. There are several distinct classification frameworks that help classify data collections. The methods discussed in the chapter include Multi-Layer Perceptron, Linear Regression, C4.5, CART, J48, SVM, ID3, Random Forest, and KNN. The aim of this chapter is to provide a comprehensive evaluation of the classification methods that are commonly used.

