A Combination Method of the Tanimoto Coefficient and Proximity Measure of Random Forest for Compound Activity Prediction

2008 ◽  
Vol 4 ◽  
pp. 238-249
Author(s):  
Gen Kawamura ◽  
Shigeto Seno ◽  
Yoichi Takenaka ◽  
Hideo Matsuda
2020 ◽  
Vol 3 (1) ◽  
pp. 39-46 ◽  
Author(s):  
Sarini Abdullah ◽  
GV Prasetyo

Imbalanced data can cause issues at the problem-definition level, the algorithm level, and the data level. Several methods have been developed to overcome these issues; one state-of-the-art method is Easy Ensemble. Easy Ensemble is claimed to improve model performance in classifying the minority class and to overcome the deficiencies of random under-sampling. In this paper we discuss the implementation of Easy Ensemble with Random Forest classifiers to handle the imbalance problem in a credit scoring case. The combined method is applied to two datasets taken from data science competition websites, finhacks.id and kaggle.com, with majority-to-minority class proportions of 70:30 and 94:6, respectively. The results show that resampling with Easy Ensemble can improve Random Forest classifier performance on the minority class. Recall on the minority class increased significantly after resampling: for the first dataset (finhacks.id) it rose from 0.49 to 0.82, and for the second dataset (kaggle.com) it rose from just 0.14 to 0.73.
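The resampling idea behind Easy Ensemble can be illustrated with a minimal sketch. It covers only the under-sampling step: each subset keeps every minority example and draws an equally sized random sample of majority examples, so that one classifier (a Random Forest in the abstract above) could be trained per subset and the predictions combined. The function names `balanced_subsets` and `recall`, the toy labels, and the choice of 10 subsets are assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import Counter


def balanced_subsets(labels, n_subsets, minority_label, seed=0):
    """Return n_subsets lists of indices (hypothetical helper).

    Each list contains every minority index plus an equally sized random
    sample, drawn without replacement, of majority indices.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if len(minority) > len(majority):
        raise ValueError("minority_label is not the smaller class")
    return [minority + rng.sample(majority, len(minority))
            for _ in range(n_subsets)]


def recall(y_true, y_pred, positive):
    """Recall for one class: true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels with a 94:6 majority-to-minority ratio (illustrative only).
labels = [0] * 94 + [1] * 6
subsets = balanced_subsets(labels, n_subsets=10, minority_label=1)

# Every subset is balanced: 6 majority and 6 minority examples.
class_counts = [Counter(labels[i] for i in s) for s in subsets]
```

Because each subset is balanced, a classifier trained on it is not dominated by the majority class, while drawing several subsets means that majority examples discarded in one subset can still appear in another. Minority-class recall, the metric reported above, can then be computed with `recall(y_true, y_pred, positive=1)`.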


2018 ◽  
Vol 5 (1) ◽  
pp. 47-55
Author(s):  
Florensia Unggul Damayanti

Data mining helps industries make intelligent decisions on complex problems. Data mining algorithms can be applied to data in order to forecast, identify patterns, derive rules and recommendations, analyze sequences in complex data sets, and retrieve fresh insights. Advances in technology and the growing variety of available data mining techniques give industries the opportunity to explore their data, gain valuable information from it, and use that information to support business decision making. This paper implements classification data mining to retrieve knowledge from customer databases in order to support the marketing department in planning a strategy for predicting plan premium. The dataset is decomposed through conceptual analysis to identify data characteristics that can be used as input parameters of the data mining model. Business decisions and applications are characterized by processing step, processing characteristic, and processing outcome (Seng, J.L., Chen, T.C. 2010). This paper sets up a data mining experiment based on the J48 and Random Forest classifiers and sheds light on the performance evaluation of J48 and Random Forest in the context of an insurance industry dataset. The experimental results concern the classification accuracy and efficiency of J48 and Random Forest, and also identify the attributes most useful for predicting plan premium in the context of strategic planning to support business strategy.


2019 ◽  
Vol 139 (8) ◽  
pp. 850-857
Author(s):  
Hiromu Imaji ◽  
Takuya Kinoshita ◽  
Toru Yamamoto ◽  
Keisuke Ito ◽  
Masahiro Yoshida ◽  
...  

2006 ◽  
Vol 68 (3) ◽  
pp. 274-279 ◽  
Author(s):  
Akira TAKAHASHI ◽  
Naoya YAMAZAKI ◽  
Akifumi YAMAMOTO ◽  
Kouji YOSHINO ◽  
Kenjiro NAMIKAWA ◽  
...  

Author(s):  
Eesha Goel ◽  
◽  
Er. Abhilasha ◽  
Keyword(s):  
