Classification tree methods for development of decision rules for botulism and cyanide poisoning

2008 ◽  
Vol 4 (2) ◽  
pp. 77-83 ◽  
Author(s):  
Howell Sasser ◽  
Marcy Nussbaum ◽  
Michael Beuhler ◽  
Marsha Ford
2020 ◽  
Vol 12 (3) ◽  
pp. 910 ◽  
Author(s):  
Francesca Pagliara ◽  
Filomena Mauriello ◽  
Lucia Russo

This paper contributes to the international literature by applying regression tree methods to the analysis of the expected effects of the High Speed Rail (HSR) project in Italy on the tourism market. This approach, as far as the authors know, has never been applied in this context. Tourism and transport data were gathered for 99 Italian provinces over the 2006–2016 period. Tree-structured methods were chosen as an application of regression models in which explanatory variables are used as covariates to predict the values of the dependent variable on the basis of decision rules. This approach establishes a causal effect between dependent and independent variables. The dependent variables chosen are the numbers of Italian and foreign tourists, and the numbers of overnight stays by Italians and foreigners. Among the independent variables are the presence of HSR, the presence of first-level airport hubs and the number of operating bases of low-cost airlines; among the attractiveness variables are the GDP, the number of attractions in a given province, the presence of the sea, the population and the unemployment rate. The main outcome of this study is that HSR affects the tourism market.
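As a minimal sketch of how a regression tree derives one decision rule, the following Python code searches a single covariate for the threshold that minimizes the summed squared error of the two resulting groups. It is not the authors' implementation; the function names `sse` and `best_split` are hypothetical, and the numbers in the usage note are invented toy values, not data from the study.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the group mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x: List[float], y: List[float]) -> Optional[Tuple[float, float]]:
    """Return (threshold, total_sse) for the rule `x <= threshold` that
    minimizes the combined squared error of the two child groups.
    Returns None when x has fewer than two distinct values."""
    best: Optional[Tuple[float, float]] = None
    distinct = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best
```

For example, with a hypothetical 0/1 indicator `x = [0, 0, 1, 1]` and toy outcomes `y = [10.0, 12.0, 30.0, 32.0]`, `best_split(x, y)` selects the threshold 0.5, separating the two groups of observations. A full regression tree would repeat this search over all covariates and recurse into each child group.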


2015 ◽  
Vol 14 (03) ◽  
pp. 521-533
Author(s):  
M. Sariyar ◽  
A. Borg

Deterministic record linkage (RL) is frequently regarded as a rival to more sophisticated strategies like probabilistic RL. We investigate the effect of combining deterministic linkage with other linkage techniques. For this task, we use a simple deterministic linkage strategy as a preceding filter: a data pair is classified as ‘match’ if all values of the attributes considered agree exactly, otherwise as ‘nonmatch’. This strategy is separately combined with two probabilistic RL methods based on the Fellegi–Sunter model and with two classification tree methods (CART and Bagging). An empirical comparison was conducted on two real data sets. We used four different partitions into training data and test data to increase the validity of the results. In almost all cases, applying deterministic linkage as a preceding filter leads to better results than omitting such a pre-filter, and overall, classification trees exhibited the best results. On all data sets, probabilistic RL only profited from deterministic linkage when the underlying probabilities were estimated before applying deterministic linkage. When using a pre-filter for subtracting definite cases, the underlying population of data pairs changes. It is crucial to take this into account for model-based probabilistic RL.
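The pre-filter idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: records are assumed to be dictionaries of string attributes, the names `exact_match` and `prefilter` are hypothetical, and pairs that fail the exact comparison are assumed to be handed on to a second classifier (probabilistic or tree-based), which is not shown.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, str]
Pair = Tuple[Record, Record]


def exact_match(a: Record, b: Record, attributes: Sequence[str]) -> bool:
    """True only if every compared attribute is present in both records
    and the two values agree exactly."""
    return all(
        attr in a and attr in b and a[attr] == b[attr]
        for attr in attributes
    )


def prefilter(
    pairs: Sequence[Pair], attributes: Sequence[str]
) -> Tuple[List[Pair], List[Pair]]:
    """Split candidate pairs into definite matches and the remaining pairs
    that still need to be classified by a second method."""
    matches: List[Pair] = []
    remaining: List[Pair] = []
    for a, b in pairs:
        if exact_match(a, b, attributes):
            matches.append((a, b))
        else:
            remaining.append((a, b))
    return matches, remaining
```

Because the definite matches are removed before the second method runs, the pairs in `remaining` no longer form the original population. This is the point the abstract raises about estimating probabilities for model-based probabilistic RL.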


2016 ◽  
Vol 70 ◽  
pp. 154-162
Author(s):  
Anna Spychała ◽  
Michał Skrzypek ◽  
Ewa Niewiadomska

Author(s):  
Fiorella Mete ◽  
David J. Corr ◽  
Michael P. Wilbur ◽  
Ying Chen

Collecting information on heavy trucks and monitoring the bridges which they regularly cross is important for many facets of infrastructure management. In this paper, a two-step algorithm is developed using bridge and truck data, by sequentially deploying unsupervised and supervised machine learning techniques. Longitudinal clustering of bridge data, concerning strain waveforms, is adopted to perform the first step of the algorithm, while visual image inspection and classification tree methods are applied to truck data concurrently in the second step. Both bridge and truck traffic must be monitored for a limited, yet significant, amount of time to calibrate the algorithm, which is then used to build a classification framework. The framework provides the same benefits as two data collection systems while only one needs to be operative. Depending on which monitoring system remains available, the framework enables the use of bridge data to identify the profile of the truck that generated it, or to estimate the bridge response given the truck’s information. As a result, the present study aims to provide decision-makers with an effective way to monitor the whole bridge-traffic system, bridge managers to plan effective maintenance, and policymakers to develop ad hoc regulations.


2020 ◽  
Author(s):  
İpek Deveci Kocakoç ◽  
İstem Keser

This study explores the most important socio-economic variables determining the voting decisions of the provinces in Municipality Elections by using classification trees. We collected data on many potential variables that may affect voting decisions in favor of a political party. Each province’s economic, geographic and demographic data are taken into consideration as independent variables. The dependent variable is the winning party in the 2014 Municipality Elections. The data set consists of 81 provinces’ data on 69 variables. The aim of the study is to find which variables affect the voting decision the most and to find a pattern that may guide political campaigns. Among many classification algorithms, we used the C5.0 algorithm coded in R. It helps us explore the structure of a set of data, while developing easy-to-visualize decision rules for predicting a categorical (classification tree) or continuous (regression tree) outcome. The C5.0 algorithm determines the separation criterion with the greatest information gain in each decision node and performs optimal separation. Since our data size is small, we used k=1000 trials (estimations) and then summarized them to provide more robust results. With the C5.0 algorithm’s sub-trial size set to 5, 5000 trees are formed, and the mean of the importance scores across all trees is calculated and interpreted. The most important independent variables discriminating the voting decision are found to be the result of the previous elections, mean household population, proportion of population between ages 15 and 19, electricity consumption per person, and proportion of population between ages 55 and 64. Keywords: classification trees, voting decision, C5.0 algorithm, decision trees
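The information gain criterion mentioned above can be illustrated with a short Python sketch. It computes plain entropy-based information gain for one candidate split and is not the C5.0 implementation used in the study (which was run in R); the function names `entropy` and `information_gain` and the party labels in the usage note are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a collection of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(
    labels: Sequence[Hashable], groups: Sequence[Sequence[Hashable]]
) -> float:
    """Reduction in entropy obtained by partitioning `labels` into `groups`.
    `groups` must together contain exactly the items of `labels`."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups if g)
    return entropy(labels) - weighted
```

For four provinces labelled `["P1", "P1", "P2", "P2"]`, a split into `[["P1", "P1"], ["P2", "P2"]]` separates the classes perfectly and yields a gain of 1 bit, while a split into `[["P1", "P2"], ["P1", "P2"]]` yields a gain of 0. At each node, a tree builder would evaluate every candidate split this way and keep the one with the largest gain.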


1996 ◽  
Vol 33 (6) ◽  
pp. 888-893 ◽  
Author(s):  
Stefano Merler ◽  
Cesare Furlanello ◽  
Claudio Chemini ◽  
Gianni Nicolini
