Classification of the factors for smoking cessation using logistic regression, decision tree & neural networks

Classification of jobs with risk of low back disorders by applying data mining techniques

Occupational Ergonomics ◽

10.3233/oer-2004-4406 ◽

2005 ◽

Vol 4 (4) ◽

pp. 291-305

Author(s):

Jozef Zurada ◽

Waldemar Karwowski ◽

William Marras

Keyword(s):

Data Mining ◽

Neural Networks ◽

Logistic Regression ◽

Decision Tree ◽

Low Back ◽

Data Mining Techniques ◽

Back Disorders ◽

Work Related ◽

Low Back Disorders

Work-related low back disorders (LBDs) continue to pose a significant occupational health problem that affects the quality of life of the industrial population. The main objective of this study was to explore the application of various data mining techniques, including neural networks, logistic regression, decision trees, memory-based reasoning, and the ensemble model, for classification of industrial jobs with respect to the risk of work-related LBDs. The results from extensive computer simulations using 10-fold cross-validation showed that memory-based reasoning and ensemble models were the best in overall classification accuracy. The decision tree and memory-based reasoning models were the most accurate in classifying jobs with high risk of LBDs, whereas neural networks and logistic regression were the best in classifying jobs with low risk of LBDs. The decision tree model delivered the most stable results across 10 generations of different data sets randomly chosen for training, validation, and testing. The classification results generated by the decision tree were the easiest to interpret because they were given in the form of simple 'if-then' rules. The results produced by the decision tree method showed that the peak moment had the highest predictive power for LBDs.
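The 10-fold cross-validation mentioned in the abstract above can be sketched in plain Python. This is only an illustration of the general procedure, not the study's code: the function names, the toy threshold classifier, and the synthetic data are all assumptions introduced here.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return per-fold accuracy for a classifier supplied as fit/predict callables."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies

# Hypothetical stand-in classifier: one threshold halfway between class means.
def fit_threshold(X, y):
    high = [x[0] for x, label in zip(X, y) if label == 1]
    low = [x[0] for x, label in zip(X, y) if label == 0]
    return (sum(high) / len(high) + sum(low) / len(low)) / 2

def predict_threshold(threshold, x):
    return 1 if x[0] > threshold else 0

# Synthetic data (an assumption): one feature, label 1 means "high risk".
X = [[float(v)] for v in range(100)]
y = [1 if v >= 50 else 0 for v in range(100)]
scores = cross_validate(X, y, fit_threshold, predict_threshold, k=10)
print(sum(scores) / len(scores))  # mean accuracy over the 10 folds
```

Each sample is held out exactly once, so the mean of the fold accuracies uses every sample for testing while never testing a model on data it was fitted to.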

Classification of online toxic comments using the logistic regression and neural networks models

10.1063/1.5082126 ◽

2018 ◽

Cited By ~ 4

Author(s):

Mujahed A. Saif ◽

Alexander N. Medvedev ◽

Maxim A. Medvedev ◽

Todorka Atanasova

Keyword(s):

Neural Networks ◽

Logistic Regression

Classification of Pima Indian Diabetes Dataset using Ensemble of Decision Tree, Logistic Regression and Neural Network

IJARCCE ◽

10.17148/ijarcce.2020.9701 ◽

2020 ◽

Vol 9 (7) ◽

pp. 1-4

Author(s):

Mani Abedini ◽

Anita Bijari ◽

Touraj Banirostam

Keyword(s):

Neural Network ◽

Logistic Regression ◽

Decision Tree

IDENTIFIKASI JENIS IKAN MENGGUNAKAN MODEL HYBRID DEEP LEARNING DAN ALGORITMA KLASIFIKASI (Identification of Fish Species Using a Hybrid Deep Learning Model and Classification Algorithms)

Sebatik ◽

10.46984/sebatik.v24i2.1057 ◽

2020 ◽

Vol 24 (2) ◽

Author(s):

Anifuddin Azis

Keyword(s):

Neural Networks ◽

Support Vector Machine ◽

Logistic Regression ◽

Deep Learning ◽

Random Forest ◽

Decision Tree ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Output

Indonesia is the country with the second-greatest biodiversity in the world after Brazil. Indonesia has about 25,000 plant species and 400,000 species of animals and fish. An estimated 8,500 fish species live in Indonesian waters, or 45% of the species in the world, of which about 7,000 are marine fish. Determining the number of these species requires expertise in taxonomy. In practice, identifying a fish species is not easy because it requires particular methods and equipment as well as taxonomic literature. Automated processing of video or images of aquatic ecosystem data has begun to be developed. In this development, detecting and identifying fish species is a challenge compared with detecting and identifying other objects. Deep learning methods, which have been successful in classifying objects in images, can analyze data directly without a separate feature extraction step. Such a system has parameters or weights that serve both as a feature extractor and as a classifier. The processed data produce an output that is expected to be as close as possible to the actual output data. A CNN is a deep learning architecture that can reduce the dimensionality of data without discarding the characteristics or features of the data. This study develops a hybrid model that uses a CNN (Convolutional Neural Network) to extract features and several classification algorithms to identify fish species. The classification algorithms used in this study are Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree, K-Nearest Neighbor (KNN), Random Forest, and Backpropagation.

Assessing Data Mining Approaches for Analyzing Actuarial Student Success Rate

Data Mining ◽

10.4018/978-1-4666-2455-9.ch094 ◽

2013 ◽

pp. 1819-1834

Author(s):

Alan Olinsky ◽

Phyllis A. Schumacher ◽

John Quinn

Keyword(s):

Data Mining ◽

Neural Networks ◽

Logistic Regression ◽

Decision Tree ◽

Student Success ◽

Predictive Models ◽

Drop Out ◽

Predicting Success ◽

Best Fitting ◽

Fitting Model

One way to enhance the likelihood that more students will graduate within the specific major that they begin with is to attract the type of students who have typically (historically) done well in that field of study. This chapter details a study that utilizes data mining techniques to analyze the characteristics of students who enroll as actuarial students and then either drop out of the major or graduate as actuarial students. Several predictive models including logistic regression, neural networks and decision trees are obtained. The models are then compared and the best fitting model is determined. The regression model turns out to be the best predictor. Since this is a very well understood method, it can easily be explained. The decision tree, although its underpinnings are somewhat difficult to explain, gives a clear and well understood output. Not only is the resulting model a good one for predicting success in the major, it also allows us the ability to better counsel students.

Predicting Bug Priority Using Topic Modelling in Imbalanced Learning Environments

International Journal of Systems and Service-Oriented Engineering ◽

10.4018/ijssoe.2021010103 ◽

2021 ◽

Vol 11 (1) ◽

pp. 31-42

Author(s):

Jayalath Bandara Ekanayake

Keyword(s):

Logistic Regression ◽

Decision Tree ◽

Prediction Models ◽

Naïve Bayes ◽

Potential Candidate ◽

Priority Level ◽

Bug Reports ◽

Manual Classification

Manual classification of bug reports is time-consuming because the reports are received in large quantities. As an alternative, this project proposed automatic bug prediction models to classify the bug reports. The topics, or candidate keywords, are mined from the developer description in bug reports using the RAKE algorithm and converted into attributes. These attributes, together with the target attribute (priority level), make up the training datasets. Naïve Bayes, logistic regression, and decision tree learner algorithms are trained, and the prediction quality is measured using the area under the receiver operating characteristic curve (AUC), as AUC is not affected by the bias (class imbalance) in the datasets. The logistic regression model outperforms the other two models with an AUC of 0.86, whereas the naïve Bayes and the decision tree learner recorded AUCs of 0.79 and 0.81, respectively. The bugs can be classified without developer involvement, and logistic regression, like naïve Bayes, is a potential candidate for bug classification.
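To make the AUC measure used in the abstract above concrete, here is a minimal sketch that computes it for a binary problem through the pairwise-ranking (Mann-Whitney) identity. The function name `roc_auc` and the example scores are assumptions for illustration. The study itself deals with several priority levels, so treating one level against the rest is also an assumption here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking (Mann-Whitney) identity.

    labels: 1 for positive, 0 for negative; scores: higher means "more positive".
    A tie between a positive and a negative counts as half a correct ordering.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores for four bug reports (1 = high priority).
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# Duplicating the negative class leaves the value unchanged, which is why
# AUC is a common choice when the classes are imbalanced.
print(roc_auc([0, 0, 0, 0, 1, 1], [0.1, 0.4, 0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The double loop is quadratic in the number of examples. It is kept for readability, and a sort-based ranking would be the usual choice for large datasets.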

SETTING UP A PROBABILISTIC NEURAL NETWORK FOR CLASSIFICATION OF HIGHWAY VEHICLES

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026805001702 ◽

2005 ◽

Vol 05 (04) ◽

pp. 411-423 ◽

Cited By ~ 5

Author(s):

MAJURA F. SELEKWA ◽

VALERIAN KWIGIZILE ◽

RENATUS N. MUSSA

Keyword(s):

Neural Network ◽

Neural Networks ◽

Decision Tree ◽

Probabilistic Neural Network ◽

Classification Problem ◽

Misclassification Rate ◽

Tree Methods ◽

Decision Tree Methods

Many neural network methods used for efficient classification of populations work only when the population is globally separable. In situ classification of highway vehicles is one of the problems with globally nonseparable populations. This paper presents a systematic procedure for setting up a probabilistic neural network that can classify the globally nonseparable population of highway vehicles. The method is based on a simple concept that any set of classifiable data can be broken down to subclasses of locally separable data. Hence, if these locally separable data can be identified, then the classification problem can be carried out in two hierarchical steps; step one classifies the data according to the local subclasses, and step two classifies the local subclasses into the global classes. The proposed approach was tested on the problem of classifying highway vehicles according to the US Federal Highway Administration standard, which is normally handled by decision tree methods that use vehicle axle information and a set of IF-THEN rules. By using a sample of 3326 vehicles, the proposed method showed improved classification results with an overall misclassification rate of only 2.9% compared to 9.7% of the decision tree methods. A similar setup can be used with different neural networks such as recurrent neural networks, but they were not tested in this study especially since the focus was for in situ applications where a high learning rate is desired.
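The two-step idea in the abstract above (classify into locally separable subclasses, then map subclasses to global classes) can be sketched with a small Parzen-window classifier of the kind a probabilistic neural network implements. This is not the authors' procedure: the subclass names, the one-dimensional toy features, the smoothing parameter `sigma`, and the hand-made subclass assignment are all assumptions, and the paper's method for identifying subclasses is not reproduced.

```python
import math

def pnn_subclass(x, training, sigma=1.0):
    """Step one: return the local subclass with the highest Gaussian-kernel density at x.

    training maps each subclass name to a list of feature vectors (its patterns).
    Averaging the kernels per subclass corresponds to assuming equal priors.
    """
    best_name, best_density = None, -1.0
    for name, patterns in training.items():
        density = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * sigma ** 2))
            for p in patterns
        ) / len(patterns)
        if density > best_density:
            best_name, best_density = name, density
    return best_name

def classify(x, training, subclass_to_class, sigma=1.0):
    """Step two: map the winning local subclass to its global class."""
    return subclass_to_class[pnn_subclass(x, training, sigma)]

# Toy data (hypothetical): global class "A" occupies two clusters with class "B"
# between them, so "A" is not globally separable from "B" as a single cluster.
training = {
    "A_left": [[0.0], [0.5]],
    "B_mid": [[3.0], [3.5]],
    "A_right": [[6.0], [6.5]],
}
subclass_to_class = {"A_left": "A", "A_right": "A", "B_mid": "B"}

print(classify([0.2], training, subclass_to_class))  # A
print(classify([3.2], training, subclass_to_class))  # B
print(classify([6.1], training, subclass_to_class))  # A
```

Because there are no weights to fit iteratively, adding a vehicle pattern only means appending it to the relevant subclass list.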

Traffic Congestion Prediction using Decision Tree, Logistic Regression and Neural Networks

IFAC-PapersOnLine ◽

10.1016/j.ifacol.2021.04.138 ◽

2020 ◽

Vol 53 (5) ◽

pp. 512-517

Author(s):

Tariku Sinshaw Tamir ◽

Gang Xiong ◽

Zhishuai Li ◽

Hao Tao ◽

Zhen Shen ◽

...

Keyword(s):

Neural Networks ◽

Logistic Regression ◽

Decision Tree ◽

Traffic Congestion ◽

Congestion Prediction

Neural Network Classification of Tan Spot and Stagonospora Blotch Infection Periods in a Wheat Field Environment

Phytopathology ◽

10.1094/phyto.2000.90.2.108 ◽

2000 ◽

Vol 90 (2) ◽

pp. 108-113 ◽

Cited By ~ 30

Author(s):

E. D. De Wolf ◽

L. J. Francl

Keyword(s):

Neural Network ◽

Neural Networks ◽

Logistic Regression ◽

Tan Spot ◽

Data Set ◽

Wheat Field ◽

Incidence Data ◽

Significant Difference ◽

General Regression Neural Networks

Tan spot and Stagonospora blotch of hard red spring wheat served as a model system for evaluating disease forecasts by artificial neural networks. Pathogen infection periods on susceptible wheat plants were measured in the field from 1993 to 1998, and incidence data were merged with 24-h summaries of accumulated growing degree days, temperature, relative humidity, precipitation, and leaf wetness duration. The resulting data set of 202 discrete periods was randomly assigned to 10 model-development or -validation (n = 50) data sets. Backpropagation neural networks, general regression neural networks, logistic regression, and parametric and nonparametric methods of discriminant analysis were chosen for comparison. Mean validation classification of tan spot incidence was between 71% for logistic regression and 76% for backpropagation models. No significant difference was found between methods of modeling tan spot infection periods. Mean validation prediction accuracy of Stagonospora blotch incidence was 86 and 81% for backpropagation and logistic regression, respectively. Prediction accuracies of other modeling methods were ≤78% and were significantly different (P = 0.01) from backpropagation, but not logistic regression, results. The best backpropagation models of tan spot and Stagonospora blotch incidences correctly classified 82 and 84% of validation cases, respectively. High classification accuracy and consistently good performance demonstrate the applicability of neural network technology to plant disease forecasting.
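The repeated random assignment described in the abstract above (202 periods, 10 splits, 50 validation cases each) might look like the following sketch. The function name, the seed, and the reading that each split is drawn independently from the full data set are assumptions. The abstract does not state how the draws were made.

```python
import random

def random_splits(n_cases=202, n_validation=50, n_repeats=10, seed=0):
    """Draw n_repeats independent development/validation splits of case indices."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        # Sample the validation cases without replacement; the rest are for development.
        validation = set(rng.sample(range(n_cases), n_validation))
        development = [i for i in range(n_cases) if i not in validation]
        splits.append((development, sorted(validation)))
    return splits

splits = random_splits()
for development, validation in splits[:2]:
    print(len(development), len(validation))  # 152 50
```

A model would be fitted on each development set and scored on the matching validation set, and the mean of the 10 validation scores would then be reported.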

Assessing Data Mining Approaches for Analyzing Actuarial Student Success Rate

Visual Analytics and Interactive Technologies ◽

10.4018/978-1-60960-102-7.ch010 ◽

2011 ◽

pp. 169-185

Author(s):

Alan Olinsky ◽

Phyllis A. Schumacher ◽

John Quinn

Keyword(s):

Data Mining ◽

Neural Networks ◽

Logistic Regression ◽

Decision Tree ◽

Student Success ◽

Predictive Models ◽

Drop Out ◽

Predicting Success ◽

Best Fitting ◽

Fitting Model

One way to enhance the likelihood that more students will graduate within the specific major that they begin with is to attract the type of students who have typically (historically) done well in that field of study. This chapter details a study that utilizes data mining techniques to analyze the characteristics of students who enroll as actuarial students and then either drop out of the major or graduate as actuarial students. Several predictive models including logistic regression, neural networks and decision trees are obtained. The models are then compared and the best fitting model is determined. The regression model turns out to be the best predictor. Since this is a very well understood method, it can easily be explained. The decision tree, although its underpinnings are somewhat difficult to explain, gives a clear and well understood output. Not only is the resulting model a good one for predicting success in the major, it also allows us the ability to better counsel students.
