Optimizing decision tree structures for spectral histopathology (SHP)

The Analyst ◽  
2018 ◽  
Vol 143 (24) ◽  
pp. 5935-5939 ◽  
Author(s):  
Xinying Mu ◽  
Stan Remiszewski ◽  
Mark Kon ◽  
Ayşegül Ergin ◽  
Max Diem

This paper reviews methods to arrive at optimum decision tree or label tree structures to analyze large SHP datasets.

Author(s):  
Ernest Muthomi Mugambi ◽  
Andrew Hunter ◽  
Giles Oatley ◽  
Lee Kennedy

1986 ◽  
Vol 17 (8) ◽  
pp. 1-10
Author(s):  
Yoshio Yanagihara ◽  
Shinichi Tamura ◽  
Minoru Tanaka

2004 ◽  
Vol 17 (2-4) ◽  
pp. 81-87 ◽  
Author(s):  
E.M Mugambi ◽  
Andrew Hunter ◽  
Giles Oatley ◽  
Lee Kennedy

Author(s):  
Yolanda Angelita S

Data mining is the discovery of information by extracting patterns and trends from very large amounts of data, and it supports future decision making based on stored data. To determine these patterns, classification techniques are applied to a collection of records (the training set). Forests play a very important role in national and state development because they can provide maximum benefits. However, the area of Protected Forest has been drastically reduced. For that reason, Protected Forest data can be used to produce information about which Protected Forest areas are a priority, and which are not, for being reforested or handled first, so that the forest functions as intended. The C4.5 algorithm, commonly known as a decision tree method, can provide rule information that describes the processes associated with processing protected forest data. The characteristics of the classified data can be obtained clearly, both in the form of decision tree structures and in the form of rules. In the testing phase, Tanagra 1.4 software can then assist in processing valid Protection Forest data.
Keywords: Data Mining, Protection Forest, C4.5 Algorithm, Tanagra Version 1.4


Author(s):  
Ming-Shu Chen ◽  
Shih-Hsin Chen

According to the modified Adult Treatment Panel III, five indices are used to define metabolic syndrome (MetS): waist circumference (WC), high blood pressure, fasting glucose, triglycerides (TG), and high-density lipoprotein cholesterol. Our work evaluates the importance of these indices. In addition, we attempted to identify whether trends and patterns existed among young, middle-aged, and older people. Following the analysis, a decision tree algorithm was used to analyze the importance of the five criteria for MetS, because the algorithm selects the attribute with the highest information gain as the split node. The most important indices are located at the top of the tree, which indicates that they distinguish the data most effectively in a binary tree. That is, the decision tree algorithm specifies the priority of the influence factors. The decision tree algorithm examined four of the five indices because one was excluded. Moreover, the tree structures differed among the three age groups. For example, the first key index for middle-aged and older people was TG, whereas for younger people it was WC. Furthermore, the order of the second to fourth indices differed among the groups. Because the key index was identified for each age group, researchers and practitioners could provide different health care strategies for individuals based on age. High-risk middle-aged and healthy older people maintained low values of TG, which might be the most crucial index. When a person can avoid the first and second indices provided by the decision tree, they are at lower risk of MetS. Therefore, this paper provides a data-driven guideline for MetS prevention.
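The split rule mentioned in this abstract (choose the attribute with the highest information gain) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy records, and the thresholds below are hypothetical and chosen only to show how the gain of two candidate binary splits is compared.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction from a binary split: value <= threshold vs. value > threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that sends every record to one side gains nothing
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def best_split_attribute(rows, labels, thresholds):
    """Return the attribute with the highest information gain, plus all gains."""
    gains = {
        name: information_gain([row[name] for row in rows], labels, t)
        for name, t in thresholds.items()
    }
    return max(gains, key=gains.get), gains


# Hypothetical toy records (illustrative values only, not data from the study).
rows = [
    {"TG": 100, "WC": 80},
    {"TG": 120, "WC": 95},
    {"TG": 180, "WC": 85},
    {"TG": 200, "WC": 100},
]
labels = [0, 0, 1, 1]                 # 1 = MetS, 0 = no MetS (made up)
thresholds = {"TG": 150, "WC": 90}    # assumed cut points for the sketch

best, gains = best_split_attribute(rows, labels, thresholds)
print(best, gains)
```

In this toy data the attribute that separates the two classes cleanly receives the larger gain and would be placed at the root, which is the sense in which position in the tree reflects importance.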


2010 ◽  
Vol 61 (2) ◽  
pp. 545-553 ◽  
Author(s):  
Hun-Kyun Bae ◽  
Betty H. Olson ◽  
Kuo-Lin Hsu ◽  
Soroosh Sorooshian

The study used existing indicator bacterial data and a number of physicochemical parameters that can be measured instantaneously to determine whether a decision tree approach, specifically classification and regression trees, could predict bacterial concentrations in a timely manner for beach closure management. Each indicator bacterium produced a different tree structure with its own significant variables: dissolved oxygen played an important role for both total coliform and fecal coliform, and turbidity was the most important factor for predicting concentrations of enterococci. The root mean squared error (RMSE) stayed between 5 and 6.5% of the average values of the observations; the RMSEs from each simulation were 0.25 for total coliform, 0.31 for fecal coliform, and 0.29 for enterococci. Estimations from the tree structures can therefore be regarded as a good representation of the actual data. In addition to the results for the objective function (RMSE), 77.5% of actual values fell within the 95% confidence interval of the estimations for total coliform concentrations, 60% for fecal coliform concentrations, and 62.5% for enterococci concentrations. The approach gave reliable estimations for the majority of the data processed, although it did not portray low concentrations of bacteria as well.
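The two evaluation measures reported in this abstract, RMSE relative to the mean observation and the share of observations falling inside an estimation interval, can be computed as in the sketch below. The function names and the sample numbers are hypothetical; how the study built its 95% confidence intervals is not stated here, so the interval bounds are simply taken as inputs.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse(observed, predicted):
    """RMSE expressed as a fraction of the mean observed value."""
    mean_observed = sum(observed) / len(observed)
    return rmse(observed, predicted) / mean_observed


def interval_coverage(observed, lower, upper):
    """Fraction of observations that fall inside [lower, upper] bounds."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


# Hypothetical values for illustration only (not data from the study).
observed = [4.0, 5.0, 6.0, 5.0]
predicted = [4.0, 5.0, 5.0, 6.0]
lower = [p - 0.5 for p in predicted]  # assumed interval half-width of 0.5
upper = [p + 0.5 for p in predicted]

print(rmse(observed, predicted))
print(relative_rmse(observed, predicted))
print(interval_coverage(observed, lower, upper))
```

Reporting RMSE as a fraction of the mean makes errors comparable across indicators with different concentration ranges, while the coverage figure shows how often the stated interval actually contains the observed value.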

