Comparing the utility of decision trees and support vector machines when planning inspections of linear sewer infrastructure

2014, Vol 16 (6), pp. 1265-1279
Author(s): Robert Richard Harvey, Edward Arthur McBean

Closed-circuit television inspection technology is traditionally used to identify aging sewer pipes requiring rehabilitation. While these inspections provide essential information on the condition of pipes hidden from day-to-day view, they are expensive and often limited to small portions of an entire sewer system. Municipalities may benefit from utilizing predictive analytics to leverage existing inspection datasets so that reliable predictions of condition are available for pipes that have not yet been inspected. The predictive capabilities of data mining systems, namely support vector machines (SVMs) and decision tree classifiers, are demonstrated using a case study of sanitary sewer pipe inspection data collected by the municipality of Guelph, Ontario, Canada. The modeling algorithms are implemented using open-source software and are tuned to counteract the negative impact on predictive performance resulting from class imbalance common within pipe inspection datasets. The decision tree classifier outperforms SVM for this classification task – achieving an acceptable area under the receiver operating characteristic curve of 0.77 and an overall accuracy of 76% on a stratified test set. Although predicting individual pipe condition is a notoriously difficult task, decision trees are found to be a useful screening tool for planning future inspection-related activities.
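The abstract reports both overall accuracy and area under the ROC curve on a class-imbalanced dataset. The following minimal sketch illustrates why the two are reported together. The toy labels and scores are hypothetical and are not taken from the Guelph data, and the pairwise AUC computation is a generic textbook formulation rather than the authors' implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = pipe in poor condition (minority class), 0 = good.
y_true = [0] * 8 + [1] * 2

# A trivial "model" that always predicts the majority class.
always_good = [0] * 10
constant_scores = [0.0] * 10

# Hypothetical scores from a model that ranks most poor pipes above good ones.
model_scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.6, 0.1, 0.2, 0.7, 0.4]

print(accuracy(y_true, always_good))      # high accuracy despite finding no poor pipes
print(roc_auc(y_true, constant_scores))   # chance-level ranking
print(roc_auc(y_true, model_scores))      # ranking quality of the scored model
```

On imbalanced data, a classifier that never flags a pipe can still score well on accuracy, whereas its AUC stays at chance level. Reporting both guards against that failure mode.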

2021, Vol 12 (11), pp. 1916-1924
Author(s): Tamanna Siddiqui et al.

Sarcasm is defined as a cutting, often ironic remark intended to express ridicule or dislike. Sarcasm detection is the task of correctly labeling a text as 'sarcasm' or 'non-sarcasm'. The task is challenging because text lacks facial expressions and intonation. Social media and micro-blogging websites are widely mined for opinions about a given target, because users post a huge volume of text publicly on platforms such as Twitter. Such large, openly available text data can be used for many kinds of research. Here we classify sarcasm using a text data set downloaded from Kaggle, consisting of 1984 labeled tweets collected from Twitter. We use these data to train classifiers with different algorithms and assess how well machine learning models distinguish sarcasm from non-sarcasm. The process starts with text pre-processing and feature extraction (TF-IDF), followed by different classification algorithms: a Decision Tree classifier, a Multinomial Naïve Bayes classifier, Support Vector Machines (SVM), and a Logistic Regression classifier. After tuning each model on the TF-IDF features, we achieve 0.94% with Multinomial NB, 0.93% with the Decision Tree classifier, 0.97% with Logistic Regression, and 0.42% with SVM. All of these models improved except the SVM model, which has the lowest accuracy. The evaluation shows good accuracy in identifying people's sarcastic expressions.
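The feature-extraction step named in the abstract (TF-IDF) can be sketched in a few lines. This is a minimal illustration under stated assumptions: whitespace tokenization, term frequency normalized by document length, and an unsmoothed idf of log(N / df). Library implementations often use smoothed or otherwise different variants, and the abstract does not say which one the authors used. The example tweets are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple assumption)."""
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical example tweets, not drawn from the Kaggle data set.
docs = [
    "oh great another monday",
    "great weather today",
    "another meeting oh joy",
]
vectors = tfidf(docs)

# Terms unique to one tweet get a larger weight than terms shared across tweets.
print(vectors[0])
```

The resulting sparse vectors are what a classifier such as Multinomial Naïve Bayes or Logistic Regression would consume as input features.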

