On the Use of Reliable-Negatives Selection Strategies in the PU Learning Approach for Quality Flaws Prediction in Wikipedia

Author(s):  
Edgardo Ferretti ◽  
Marcelo L. Errecalde ◽  
Maik Anderka ◽  
Benno Stein
Author(s):  
Nesma Settouti ◽  
Mostafa El Habib Daho ◽  
Mohammed El Amine Bechar ◽  
Mohammed Amine Chikh

Semi-supervised learning is one of the most interesting research fields in machine learning beyond purely supervised learning. Medical diagnosis is mostly handled in supervised mode, but in practice we face a large amount of unlabeled samples and only a small set of labeled examples, each characterized by thousands of features. This problem is known as “the curse of dimensionality”. In this study, we propose as a solution a new semi-supervised learning approach that we call Optim Co-forest. The Optim Co-forest algorithm combines data re-sampling (bagging; Breiman, 1996) with two selection strategies. The first selects a random subset of parameters to construct the ensemble of classifiers, following the principle of Co-forest (Li & Zhou, 2007). The second is an extension of the importance measure of Random Forest (RF; Breiman, 2001). Experiments on high-dimensional datasets confirm the effectiveness of the adopted selection strategies for the scalability of our method.


2017 ◽  
Vol 25 (1) ◽  
pp. 201-213 ◽  
Author(s):  
Fuqiang Yao ◽  
Luliang Jia ◽  
Youming Sun ◽  
Yuhua Xu ◽  
Shuo Feng ◽  
...  

Author(s):  
Lindsey M. Kitchell ◽  
Francisco J. Parada ◽  
Brandi L. Emerick ◽  
Tom A. Busey
