Predicting Faculty Performance Using Regression Model in Data Mining

One way to enhance the likelihood that more university students will graduate within the specific major that they begin with is to attract the type of students who have typically (historically) done well in that field of study. This paper expands upon a study that utilizes data mining techniques to analyze the characteristics of students who enroll as actuarial students and then either drop out of the major or graduate as actuarial students. Several predictive models including logistic regression, neural networks and decision trees are obtained using input variables describing academic attributes of the students. The models are then compared and the best fitting model is determined. The regression model turns out to be the best predictor. Since this is a very well understood method, it can easily be explained. The decision tree, although its underpinnings are somewhat difficult to explain, gives a clear and well understood output. In addition, the non-predictive method of cluster analysis is applied in order to group these students into distinct classifications based on the values of the input variables. Finally, a new approach to modeling in SAS®, called Rapid Predictive Modeler (RPM), is described and utilized. The results of the RPM also select the regression model as the best predictor.

Predictors of Performance among Faculty in Dubai Higher Education Institutions

IAMURE International Journal of Education ◽ 10.7718/iamure.ije.v18i1.1142 ◽ 2016 ◽ Vol 18 (1)

Author(s): Ann Gloghienette Orais Perez ◽ Marilou D. Junsay

Keyword(s): Higher Education ◽ Job Satisfaction ◽ Regression Model ◽ Educational Level ◽ Higher Education Institutions ◽ Faculty Members ◽ Faculty Performance ◽ Perceived Fairness ◽ Professional Qualification ◽ Research Findings

The purpose of this sequential explanatory research study is to determine the psychographics and demographics that are associated with performance among faculty in Dubai Higher Education Institutions (HEIs) and thereafter to develop a regression model. Using purposive sampling, twenty faculty members from Dubai HEIs answered a validated and pilot-tested interview guide, and the results were coded, interpreted, and clustered into themes. The research findings reveal that professional qualification, commitment, job satisfaction, motivation, personal differences, and perceived fairness in management emerged as psychographics that influence faculty performance. The psychographics and the demographics were then tested to determine whether they predict faculty performance. Using stratified sampling, at least one hundred forty-nine (149) faculty members were selected to answer the validated and pilot-tested questionnaire. Using MANCOVA, the figures disclose that educational level, professional qualification, commitment, job satisfaction, motivation, and perceived fairness in management are predictors of faculty performance. The regression model of the study is Faculty Performance = 32.076 + 12.977 Educational Level + 2.070 Professional Qualification + 0.967 Commitment – 10.388 Job Satisfaction + 6.926 Motivation – 1.302 Perceived Fairness in Management. The findings of this study would contribute to the identification of criteria for hiring faculty in Dubai HEIs.
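As a rough illustration of how the reported equation would be applied, the sketch below evaluates it for one set of predictor scores. The coefficients are taken from the abstract; the variable names, the function `predict_faculty_performance`, and the example scores are hypothetical, since the abstract does not state how each predictor is coded or scaled.

```python
# Coefficients as reported in the abstract above.
INTERCEPT = 32.076
COEFFICIENTS = {
    "educational_level": 12.977,
    "professional_qualification": 2.070,
    "commitment": 0.967,
    "job_satisfaction": -10.388,
    "motivation": 6.926,
    "perceived_fairness": -1.302,
}


def predict_faculty_performance(scores):
    """Evaluate the linear model: intercept + sum(coefficient * score)."""
    missing = set(COEFFICIENTS) - set(scores)
    if missing:
        raise ValueError(f"missing predictor scores: {sorted(missing)}")
    return INTERCEPT + sum(
        COEFFICIENTS[name] * scores[name] for name in COEFFICIENTS
    )


# Hypothetical input: every predictor set to 1.0. The real coding of each
# predictor (for example, Likert scale or category index) is not given
# in the abstract, so these values are placeholders only.
example_scores = {name: 1.0 for name in COEFFICIENTS}
predicted = predict_faculty_performance(example_scores)
print(round(predicted, 3))
```

Setting every score to zero returns the intercept alone, which is a quick way to check that the coefficients were transcribed correctly.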

Data Mining CMMSs: How to Convert Data into Knowledge

Biomedical Instrumentation & Technology ◽ 10.2345/0899-8205-52.s2.28 ◽ 2018 ◽ Vol 52 (s2) ◽ pp. 28-33 ◽ Cited By ~ 1

Author(s): Larry Fennigkoh ◽ D. Courtney Nanney

Keyword(s): Data Mining ◽ Regression Analysis ◽ Regression Model ◽ Multiple Regression Analysis ◽ Preventive Maintenance ◽ Statistical Significance ◽ Research Question ◽ Inferential Statistics ◽ Proper Interpretation

Although the healthcare technology management (HTM) community has decades of accumulated medical device–related maintenance data, little knowledge has been gleaned from these data. Finding and extracting such knowledge requires the use of the well-established, but admittedly somewhat foreign to HTM, application of inferential statistics. This article sought to provide a basic background on inferential statistics and describe a case study of their application, limitations, and proper interpretation. The research question associated with this case study involved examining the effects of ventilator preventive maintenance (PM) labor hours, age, and manufacturer on needed unscheduled corrective maintenance (CM) labor hours. The study sample included more than 21,000 combined PM inspections and CM work orders on 2,045 ventilators from 26 manufacturers during a five-year period (2012–16). A multiple regression analysis revealed that device age, manufacturer, and accumulated PM inspection labor hours all influenced the amount of CM labor significantly (P < 0.001). In essence, CM labor hours increased with increasing PM labor. However, and despite the statistical significance of these predictors, the regression analysis also indicated that ventilator age, manufacturer, and PM labor hours only explained approximately 16% of all variability in CM labor, with the remainder (84%) caused by other factors that were not included in the study. As such, the regression model obtained here is not suitable for predicting ventilator CM labor hours.
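The "share of variability explained" quoted above is the coefficient of determination, R². The sketch below shows how that quantity is computed, under simplifying assumptions: it fits a single-predictor least-squares line, whereas the study used multiple regression with three predictors, and the numbers in `pm_hours` and `cm_hours` are invented for illustration, not taken from the study.

```python
def r_squared(x, y):
    """R^2 of a one-predictor ordinary least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variability the fitted line leaves unexplained.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total sum of squares: variability of y around its mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Invented illustrative data (not from the study).
pm_hours = [1, 2, 3, 4, 5, 6, 7, 8]
cm_hours = [5, 1, 6, 2, 7, 3, 6, 4]

r2 = r_squared(pm_hours, cm_hours)
print(f"explained: {r2:.1%}, unexplained: {1 - r2:.1%}")
```

A predictor can be statistically significant in a large sample while R² stays low, which is why the abstract treats significance and predictive usefulness as separate questions.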

A regression-based algorithm for frequent itemsets mining

Data Technologies and Applications ◽ 10.1108/dta-03-2019-0037 ◽ 2019 ◽ Vol 54 (3) ◽ pp. 259-273

Author(s): Zirui Jia ◽ Zengli Wang

Keyword(s): Data Mining ◽ Regression Model ◽ Multiple Linear Regression Model ◽ Mining Area ◽ Frequent Itemset ◽ Continuous Data ◽ Data Sets ◽ Content Type ◽ Existing Problems ◽ Frequent Itemsets Mining

Purpose: Frequent itemset mining (FIM) is a basic topic in data mining. Most FIM methods build an itemset database containing all possible itemsets and use predefined thresholds to determine whether an itemset is frequent. However, this approach has some deficiencies. It is better suited to discrete data than to ordinal/continuous data, which may result in computational redundancy, and some of the results are difficult to interpret. The purpose of this paper is to address this gap by proposing a new data mining method.

Design/methodology/approach: A regression pattern (RP) model is introduced, in which the regression model and the FIM method are combined to solve the existing problems. Using survey data from a computer technology and software professional qualification examination, the multiple linear regression model is selected to mine associations between items.

Findings: The proposed algorithm mined some interesting associations, and the results show that the proposed method can be applied to mining ordinal/continuous data. The experiment with the RP model shows that, compared to FIM, computational redundancy decreased and the results contain more information.

Research limitations/implications: The proposed algorithm is designed for ordinal/continuous data and is expected to provide inspiration for data stream mining and unstructured data mining.

Practical implications: Compared to FIM, which mines associations between discrete items, the RP model can mine associations between ordinal/continuous data sets. Importantly, the RP model performs well in saving computational resources and mining meaningful associations.

Originality/value: The proposed algorithm provides a novel way to define and mine associations.
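For context, the conventional threshold-based FIM that the abstract contrasts itself with can be sketched as follows. This is a naive brute-force baseline, not the proposed RP model; the function name `frequent_itemsets`, the toy transactions, and the 0.5 threshold are all illustrative assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every candidate itemset and keep those whose support
    (fraction of transactions containing it) meets the threshold."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Toy discrete transactions (hypothetical data).
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
frequent = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in frequent.items():
    print(itemset, support)
```

Enumerating all candidates grows exponentially with the number of distinct items, which is one source of the computational redundancy the abstract mentions.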

A Data Mining Approach on Lorry Drivers Overloading in Tehran Urban Roads

Journal of Advanced Transportation ◽ 10.1155/2020/6895407 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-10

Author(s): Ehsan Ayazi ◽ Abdolreza Sheikholeslami

Keyword(s): Data Mining ◽ Regression Model ◽ Truck Drivers ◽ Commercial Vehicles ◽ Binary Regression ◽ Pickup Truck ◽ Data Mining Approach ◽ Factors Influencing ◽ Other Information ◽ Construction Loads

The aim of this study is to identify the important factors influencing overloading of commercial vehicles on Tehran’s urban roads. Weight information for commercial freight vehicles was collected using a pair of portable scales, and other required information, including driver information, vehicle features, load, and travel details, was gathered through a questionnaire. The results showed that the highest probability of overloading is for construction loads. Further, the analysis by lorry type shows that overloading is least likely among pickup truck drivers, whose likelihood of overloading was one-third of that among Nissan and small truck drivers. Also, the results of modeling the type of route showed that the highest likelihood of overloading is for internal loads (origin and destination inside Tehran), and the lowest probability of overloading is for suburban trips (origin and destination outside of Tehran). Considering the type of load packing as a variable, the results of the binary regression model analysis showed that the highest probability of overloading occurs for packed (boxed) loads. Finally, it was concluded that drivers are 18 times more likely to commit overloading on weekends than on weekdays.

Faculty performance evaluation based on prediction in distributed data mining

2015 IEEE International Conference on Engineering and Technology (ICETECH) ◽ 10.1109/icetech.2015.7275019 ◽ 2015

Author(s): Priyanka R Shah ◽ Dinesh B Vaghela ◽ Priyanka Sharma

Keyword(s): Data Mining ◽ Performance Evaluation ◽ Distributed Data Mining ◽ Distributed Data ◽ Faculty Performance

Identification of hindered internal rotational mode for complex chemical species: A data mining approach with multivariate logistic regression model

Chemometrics and Intelligent Laboratory Systems ◽ 10.1016/j.chemolab.2017.11.006 ◽ 2018 ◽ Vol 172 ◽ pp. 10-16 ◽ Cited By ~ 32

Author(s): Triet H.M. Le ◽ Tung T. Tran ◽ Lam K. Huynh

Keyword(s): Data Mining ◽ Logistic Regression ◽ Regression Model ◽ Logistic Regression Model ◽ Chemical Species ◽ Multivariate Logistic Regression Model ◽ Multivariate Logistic Regression ◽ Rotational Mode ◽ Complex Chemical ◽ Data Mining Approach

Regression Model Method for Analyze the Association Rules using Major Parameters

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.g1024.0597s20 ◽ 2020 ◽ Vol 9 (7S) ◽ pp. 71-74

Keyword(s): Data Mining ◽ Regression Model ◽ Association Rules ◽ Association Rule ◽ Large Data ◽ Significance Test ◽ Fundamental Parameter ◽ Explanatory Variables ◽ R Language ◽ The Given

Data mining allows users to extract information from data. Frequent itemset mining is one of the popular tasks in data mining. Association rule analysis is the task of discovering association rules that occur frequently in a given large data set. The task is to find certain relationships among a set of itemsets in the database. There are two fundamental parameters (measurements): support and confidence. Traditional association rule mining techniques employ predefined support and confidence values. However, specifying the minimum support value of the mined rules in advance often leads to either too many or too few rules, which negatively impacts the performance of the overall system. This paper proposes a non-linear regression model using support, confidence, and association rules to predict the number of rules under the given explanatory variables (parameters). The R language is used for rule generation, and a significance test is used to verify the regression coefficients. The model is verified using the coefficient test and the F-test.
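The two parameters named above have standard definitions: the support of an itemset is the fraction of transactions that contain it, and the confidence of a rule A → B is support(A ∪ B) divided by support(A). The sketch below computes both. It is written in Python rather than the R used in the paper, and the function names and basket data are hypothetical; it does not reproduce the paper's regression model.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of antecedent -> consequent:
    support(antecedent and consequent together) / support(antecedent)."""
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0  # the rule never applies in this data set
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / antecedent_support


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, round(rule_confidence, 3))
```

Raising or lowering the minimum support and confidence thresholds changes how many rules survive, which is the relationship the paper models with regression.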

Improvement of Prediction Ability of Multicomponent Regression Model by a Method Based on Data Mining in Chemometrics

2009 Second International Workshop on Knowledge Discovery and Data Mining ◽ 10.1109/wkdd.2009.82 ◽ 2009

Author(s): Ling Gao ◽ Shouxin Ren

Keyword(s): Data Mining ◽ Regression Model ◽ Prediction Ability
