On the Optimal Size of Candidate Feature Set in Random forest

2019, Vol 9 (5), pp. 898
Author(s): Sunwoo Han, Hyunjoong Kim

Random forest is an ensemble method that combines many decision trees. Each split of a tree is determined by the optimal rule among a candidate feature set. The candidate feature set is a random subset of all features and differs at each split. In this article, we investigate whether the accuracy of Random forest is affected by the size of the candidate feature set. We find that the optimal size differs from dataset to dataset without any specific pattern. To estimate the optimal size of the feature set, we propose a novel algorithm that uses the out-of-bag error and the ‘SearchSize’ exploration. The proposed method is significantly faster than the standard grid search method while giving almost the same accuracy. Finally, we demonstrate that Random forest using the proposed algorithm is significantly more accurate than Random forest using a typical feature set size.
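To illustrate how the out-of-bag error can score a candidate feature set size, here is a minimal sketch using scikit-learn. It is not the authors' algorithm: the abstract does not describe the ‘SearchSize’ exploration, so this sketch simply scans every size, which is closer to the grid search baseline. The synthetic dataset, the helper name `oob_error_by_feature_set_size`, and the tree count are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def oob_error_by_feature_set_size(X, y, sizes, n_trees=100, seed=0):
    """Return {size: out-of-bag error} for each candidate feature set size.

    Hypothetical helper: an exhaustive scan, not the paper's 'SearchSize' search.
    """
    errors = {}
    for m in sizes:
        forest = RandomForestClassifier(
            n_estimators=n_trees,
            max_features=m,    # number of candidate features drawn at each split
            bootstrap=True,    # required so that out-of-bag samples exist
            oob_score=True,    # score each sample only on trees that did not see it
            random_state=seed,
        )
        forest.fit(X, y)
        errors[m] = 1.0 - forest.oob_score_
    return errors


# Synthetic data, used only to make the sketch runnable.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, random_state=0
)
sizes = range(1, X.shape[1] + 1)
errors = oob_error_by_feature_set_size(X, y, sizes)
best_size = min(errors, key=errors.get)
print("size with lowest OOB error:", best_size)
```

Because the out-of-bag error comes for free from the bootstrap samples, no separate validation split or cross-validation loop is needed to compare sizes.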

Author(s): Aruna M, M Anjana, Harshita Chauhan, Deepa R

The price of a car depreciates from the moment it is bought. The resale value of a car depends on many factors and matters to both buyers and sellers, which makes predicting it a prominent problem in the machine learning field. Diverse machine learning methodologies can help us use all of these factors and process a large amount of data to predict the cost. For our dataset, the Random Forest Regression algorithm shows a significant increase in the prediction rate. To optimise the Random Forest Regressor model, the best hyperparameters can be found using hyperparameter tuning strategies. On comparing Grid Search and Randomized Search, the former yields a better prediction rate. The selected parameters are then passed to the algorithm, since hyperparameter tuning can help find the best set of decision trees in the random forest for the most optimised prediction rate.
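The comparison of Grid Search and Randomized Search described above could be set up roughly as follows with scikit-learn. This is a sketch under stated assumptions: the data is synthetic (the authors' car dataset is not available here), and the parameter grid, fold count, and number of random draws are illustrative choices, not the values used in the article.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    train_test_split,
)

# Synthetic stand-in for a tabular price dataset (assumption).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative hyperparameter grid (assumption): 2 x 2 x 2 = 8 combinations.
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [None, 5],
    "max_features": [0.5, 1.0],
}

# Grid search evaluates every combination in the grid.
grid = GridSearchCV(RandomForestRegressor(random_state=0), param_grid, cv=3)
grid.fit(X_train, y_train)

# Randomized search evaluates only n_iter combinations sampled from the grid.
rand = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    n_iter=4,
    cv=3,
    random_state=0,
)
rand.fit(X_train, y_train)

# score() reports R^2 of the refitted best model on held-out data.
print("grid:      ", grid.best_params_, grid.score(X_test, y_test))
print("randomized:", rand.best_params_, rand.score(X_test, y_test))
```

In this setup the randomized search draws from the same finite grid, so its best cross-validated score cannot exceed that of the exhaustive search; its advantage is the smaller number of model fits.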


2020
Author(s): Dániel Kalmár, György Hetényi, István Bondár

We perform P-to-S receiver function analysis to determine a detailed map of the crust-mantle boundary in the Eastern Alps–Pannonian basin–Carpathian mountains junction. We use data from the AlpArray Seismic Network, the Carpathian Basin Project and the South Carpathian Project temporary seismic networks, the permanent stations of the Hungarian National Seismological Network, stations of a private network in Hungary, as well as selected permanent seismological stations in neighbouring countries, for the period between 1 January 2004 and 31 March 2019. Altogether, 221 seismological stations are used in the analysis. Owing to the dense station coverage, we achieve unprecedented resolution, thus extending our previous work on the region. We apply three-fold quality control: the first two steps on the observed waveforms and the third on the radial receiver functions, which are calculated by the iterative time-domain deconvolution approach. The Moho depth is determined by two independent approaches: common conversion point (CCP) migration with a local velocity model, and the H-K grid search. We show cross-sections beneath the entire investigated area and concentrate on major structural elements such as the AlCaPa and Tisza-Dacia blocks, the Mid-Hungarian Fault Zone and the Balaton Line. Finally, we present the Moho map obtained by the H-K grid search method and by pre-stack CCP migration and interpolation over the entire study area, and compare the results of the two independent methods to prior knowledge.

