A Learning Analytics Approach to Identify Students at Risk of Dropout: A Case Study with a Technical Distance Education Course

2020 ◽  
Vol 10 (11) ◽  
pp. 3998 ◽  
Author(s):  
Emanuel Marques Queiroga ◽  
João Ladislau Lopes ◽  
Kristofer Kappel ◽  
Marilton Aguiar ◽  
Ricardo Matsumura Araújo ◽  
...  

Contemporary education is a vast field that is concerned with the performance of education systems. In a formal e-learning context, student dropout is considered one of the main problems and has received much attention from the learning analytics research community, which has reported several approaches to the development of models for the early prediction of at-risk students. However, maximizing the results obtained by predictions is a considerable challenge. In this work, we developed a solution that uses only students’ interactions with the virtual learning environment, and features derived from them, for the early prediction of at-risk students in a Brazilian distance technical high school course that is 103 weeks in duration. To maximize results, we developed an elitist genetic algorithm based on Darwin’s theory of natural selection for hyperparameter tuning. With the proposed technique, we predicted at-risk students with an Area Under the Receiver Operating Characteristic Curve (AUROC) above 0.75 in the initial weeks of the course. The results demonstrate the viability of applying interaction counts and derived features to generate prediction models in contexts where access to demographic data is restricted. Applying a genetic algorithm to the tuning of classifier hyperparameters can increase performance in comparison with other techniques.
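
As a rough illustration of the elitist genetic algorithm idea mentioned in this abstract, the following minimal sketch evolves a small population of hyperparameter settings and copies the best individuals unchanged into each new generation. It is not the authors' implementation: the search space, the names `SPACE`, `toy_fitness` and `elitist_ga`, and the operators are all assumptions made for illustration, and `toy_fitness` is only a stand-in for a real validation score such as a cross-validated AUROC.

```python
import random

# Hypothetical search space (illustrative names and ranges, not from the paper).
SPACE = {"max_depth": (1, 20), "n_estimators": (10, 300)}


def random_individual(rng):
    """Draw one random hyperparameter setting from SPACE."""
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SPACE.items()}


def toy_fitness(ind):
    """Stand-in for a validation score; peaks at max_depth=8, n_estimators=150."""
    return 1.0 - abs(ind["max_depth"] - 8) / 20 - abs(ind["n_estimators"] - 150) / 300


def crossover(a, b, rng):
    """Uniform crossover: each gene is taken from one of the two parents."""
    return {k: rng.choice((a[k], b[k])) for k in SPACE}


def mutate(ind, rng, rate=0.2):
    """Resample each gene with probability `rate`; returns a new dict."""
    out = dict(ind)
    for k, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            out[k] = rng.randint(lo, hi)
    return out


def elitist_ga(fitness, generations=30, pop_size=20, n_elite=2, seed=0):
    """Return the best individual and the best fitness seen in each generation."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[:n_elite]            # elitism: survive unchanged
        parents = pop[: pop_size // 2]   # truncation selection (an assumption)
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - n_elite)
        ]
        pop = elite + children
    return max(pop, key=fitness), history


if __name__ == "__main__":
    best, history = elitist_ga(toy_fitness)
    print(best, round(history[-1], 3))
```

Because the elite individuals are carried over untouched, the best fitness per generation can never decrease, which is the property that distinguishes an elitist scheme from a plain generational one.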

2021 ◽  
Vol 48 (6) ◽  
pp. 720-728
Author(s):  
Wenting Weng ◽  
Nicola L. Ritter ◽  
Karen Cornell ◽  
Molly Gonzales

Over the past decade, the field of education has seen stark changes in the way that data are collected and leveraged to support high-stakes decision-making. Utilizing big data as a meaningful lens to inform teaching and learning can increase academic success. Data-driven research has been conducted to understand student learning performance, such as predicting at-risk students at an early stage and recommending tailored interventions to support services. However, few studies in veterinary education have adopted Learning Analytics. This article examines the adoption of Learning Analytics by using retrospective data from the first-year professional Doctor of Veterinary Medicine program. The article gives detailed examples of predicting students’ performance in six courses from week 0 (i.e., before the classes started) to week 14 in the Spring 2018 semester. The weekly models for each course showed how the prediction results changed over time and how they compared with students’ actual performance. From the prediction models, at-risk students were successfully identified at an early stage, which would help instructors direct more attention to them at that point.


2022 ◽  
Vol 6 (1) ◽  
pp. 6
Author(s):  
Gomathy Ramaswami ◽  
Teo Susnjak ◽  
Anuradha Mathrani

Poor academic performance of students is a concern in the educational sector, especially if it leads to students being unable to meet minimum course requirements. However, with timely prediction of students’ performance, educators can detect at-risk students, thereby enabling early interventions that support these students in overcoming their learning difficulties. To date, the majority of studies have developed individual prediction models that each target a single course. These models are tailored to specific attributes of each course amongst a very diverse set of possibilities. While this approach can yield accurate models in some instances, it has limitations. In many cases, overfitting can take place when course data is small or when new courses are devised. Additionally, maintaining a large suite of models per course is a significant overhead. This issue can be tackled by developing a generic and course-agnostic predictive model that captures more abstract patterns and is able to operate across all courses, irrespective of their differences. This study demonstrates how a generic predictive model can be developed that identifies at-risk students across a wide variety of courses. Experiments were conducted using a range of algorithms, with the generic model producing an effective accuracy. The findings showed that the CatBoost algorithm performed the best on our dataset across the F-measure, ROC (receiver operating characteristic) curve and AUC scores; therefore, it is an excellent candidate algorithm for providing solutions in this domain given its capability to seamlessly handle categorical and missing data, which are frequently present in educational datasets.


2017 ◽  
Vol 7 (3) ◽  
pp. 42
Author(s):  
Vikash Rowtho

Undergraduate student dropout is gradually becoming a global problem and the 39 Small Island Developing States (SIDS) are no exception to this trend. The purpose of this research was to develop a method that can be used for early detection of students who are at risk of performing poorly in their undergraduate studies. A sample of 279 students participated in the study conducted in a Mauritian private tertiary academic institution. Results of regression analyses identified the variables having a significant influence on academic performance. These variables were used in a linear discriminant analysis where 74 percent of the students could be correctly classified into three categories: at-risk, pass or fail. In conclusion, this study has proposed a new technique that can be used by institutions to determine significant academic performance predictors and then identify at-risk students upon whom interventions can be implemented prior to exams to address the problem of dropouts.


2020 ◽  
Vol 10 (10) ◽  
pp. 3469 ◽  
Author(s):  
María Consuelo Sáiz-Manzanares ◽  
Raúl Marticorena-Sánchez ◽  
César Ignacio García-Osorio

Early detection of at-risk students is essential, especially in the university environment. Moreover, personalized learning has been shown to increase motivation and lower student dropout rates. At present, the average dropout rates among students following courses leading to the award of Spanish university degrees are around 18% and 42.8% for face-to-face teaching and online courses, respectively. The objectives of this study are: (1) to design and to implement a Modular Object-Oriented Dynamic Learning Environment (Moodle) plugin, “eOrientation”, for the early detection of at-risk students; (2) to test the effectiveness of the “eOrientation” plugin on university students. We worked with 279 third-year students following health sciences degrees. A process for extracting information records was also implemented. In addition, a learning analytics module was developed, through which both supervised and unsupervised Machine Learning techniques can be applied. All these measures facilitated the personalized monitoring of the students and the easier detection of students at academic risk. The use of this tool could be of great importance to teachers and university governing teams, as it can assist the early detection of students at academic risk. Future studies will be aimed at testing the plugin using the Moodle environment on degree courses at other universities.


2016 ◽  
Vol 23 (2) ◽  
pp. 124 ◽  
Author(s):  
Douglas Detoni ◽  
Cristian Cechinel ◽  
Ricardo Araujo Matsumura ◽  
Daniela Francisco Brauner

Student dropout is one of the main problems faced by distance learning courses. One of the major challenges for researchers is to develop methods to predict the behavior of students so that teachers and tutors are able to identify at-risk students as early as possible and provide assistance before they drop out or fail in their courses. Machine Learning models have been used to predict or classify students in these settings. However, while these models have shown promising results in several settings, they usually attain these results using attributes that are not immediately transferable to other courses or platforms. In this paper, we provide a methodology to classify students using only interaction counts from each student. We evaluate this methodology on a data set from two majors based on the Moodle platform. We run experiments consisting of training and evaluating three machine learning models (Support Vector Machines, Naive Bayes and Adaboost decision trees) under different scenarios. We provide evidence that patterns from interaction counts can provide useful information for classifying at-risk students. This classification allows the customization of the activities presented to at-risk students (automatically or through tutors) in an attempt to prevent student dropout.
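
To make the notion of "interaction counts" concrete, here is a minimal sketch of how raw log events might be turned into per-student weekly counts plus a few simple derived features. The event format (pairs of student id and week index), the function names, and the chosen derived features are assumptions for illustration only; the abstract does not specify how the counts were built.

```python
from collections import defaultdict


def weekly_interaction_counts(events, n_weeks):
    """Count interactions per student per week.

    `events` is an iterable of (student_id, week_index) pairs, with
    week_index starting at 0 (a hypothetical log format).
    """
    counts = defaultdict(lambda: [0] * n_weeks)
    for student_id, week in events:
        if 0 <= week < n_weeks:  # ignore events outside the observed window
            counts[student_id][week] += 1
    return dict(counts)


def derived_features(weekly):
    """Illustrative summary features computed from one student's weekly counts."""
    total = sum(weekly)
    return {
        "total": total,
        "mean": total / len(weekly),
        "zero_weeks": sum(1 for c in weekly if c == 0),
    }


if __name__ == "__main__":
    log = [("a", 0), ("a", 0), ("a", 2), ("b", 1)]
    counts = weekly_interaction_counts(log, n_weeks=3)
    for student, weekly in counts.items():
        print(student, weekly, derived_features(weekly))
```

Feature vectors of this kind depend only on the activity log, which is what makes them portable across courses and platforms; any classifier could then be trained on them.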


2019 ◽  
Author(s):  
Yanli Zhang-James ◽  
Qi Chen ◽  
Ralf Kuja-Halkola ◽  
Paul Lichtenstein ◽  
Henrik Larsson ◽  
...  

Background: Children with attention-deficit/hyperactivity disorder (ADHD) have a high risk for substance use disorders (SUDs). Early identification of at-risk youth would help allocate scarce resources for prevention programs. Methods: Psychiatric and somatic diagnoses, family history of these disorders, measures of socioeconomic distress and information about birth complications were obtained from the national registers in Sweden for 19,787 children with ADHD born between 1989-1993. We trained 1) cross-sectional machine learning models using data available by age 17 to predict SUD diagnosis between ages 18-19; and 2) a longitudinal model to predict new diagnoses at each age. Results: The area under the receiver operating characteristic curve (AUC) was 0.73 and 0.71 for the random forest and multilayer perceptron cross-sectional models. A prior diagnosis of SUD was the most important predictor, accounting for 25% of correct predictions. However, after excluding this predictor, our model still significantly predicted the first-time diagnosis of SUD during age 18-19 with an AUC of 0.67. The average of the AUCs from longitudinal models predicting new diagnoses one, two, five and ten years in the future was 0.63. Conclusions: Significant predictions of at-risk co-morbid SUDs in individuals with ADHD can be achieved using population registry data, even many years prior to the first diagnosis. Longitudinal models can potentially monitor their risks over time. More work is needed to create prediction models based on electronic health records or linked population-registers that are sufficiently accurate for use in the clinic.
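
For readers unfamiliar with the AUC metric reported in several of these abstracts, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the example data are illustrative and are not taken from any of the studies.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    `labels` holds 0/1 outcomes and `scores` the model's predicted risk.
    Runs in O(P * N) time, which is fine for small illustrative inputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading values such as 0.73 or 0.67.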


2019 ◽  
Vol 8 (3) ◽  
pp. 5916-5920

Timeliness was a missing factor in many studies on Academic Performance Prediction to identify at-risk students. This study embarked on a search to evaluate the feasibility of predicting students’ performance based on heart rate data collected during classes. This dimension of data was collected in the first four weeks after semester commencement to validate whether accurate early prediction is possible, which would enable educationists to introduce remedial intervention for at-risk students. Another aim of this study is to determine the best threshold values for the different types of heart rate fluctuations that can be used in predicting academic achievements. The threshold values were tested further to verify whether the prediction model for individual courses or for combined courses was more accurate. Results revealed that heart rate data alone can achieve a maximum prediction accuracy of 88% and recall of 100%. Threshold values calculated for derived heart rate fluctuation types produce the best results. Prediction models for individual courses outperform the model using average threshold values of all courses.
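
The abstract does not say how its threshold values were computed, so the following is only a generic sketch of threshold selection: each observed value of a single numeric measure is tried as a cut-off, and the one giving the highest accuracy (with recall as tie-breaker) is kept. The function names, the data, and the assumption that higher values indicate risk are all hypothetical.

```python
def evaluate_threshold(values, at_risk, threshold):
    """Return (accuracy, recall) when predicting at-risk for values >= threshold.

    `at_risk` is a list of booleans. The direction of the rule (higher value
    means at-risk) is an assumption for this sketch.
    """
    predicted = [v >= threshold for v in values]
    correct = sum(p == a for p, a in zip(predicted, at_risk))
    true_pos = sum(p and a for p, a in zip(predicted, at_risk))
    actual_pos = sum(at_risk)
    accuracy = correct / len(values)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


def best_threshold(values, at_risk):
    """Pick the observed value that maximises accuracy, then recall."""
    candidates = sorted(set(values))
    return max(candidates, key=lambda t: evaluate_threshold(values, at_risk, t))


if __name__ == "__main__":
    fluctuation = [2, 3, 5, 8, 9]                  # made-up measurements
    labels = [False, False, False, True, True]     # made-up outcomes
    t = best_threshold(fluctuation, labels)
    print(t, evaluate_threshold(fluctuation, labels, t))
```

Running the same search separately per course, or once on pooled data, would mirror the individual-versus-combined comparison described in the abstract.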


2015 ◽  
Vol 21 (2) ◽  
pp. 247-262 ◽  
Author(s):  
Jay Bainbridge ◽  
James Melitski ◽  
Anne Zahradnik ◽  
Eitel J. M. Lauría ◽  
Sandeep Jayaprakash ◽  
...  

Author(s):  
Irene O'Dowd

This paper describes the findings of a small research study, conducted in a third-level online college, using learning analytics to examine the implementation of formative quizzes in a blended-learning post-primary teaching programme. Using historic data captured in a virtual learning environment (VLE) for a single cohort (n=126), patterns of use of formative Knowledge Check quizzes were analysed with particular regard to completion and retakes. Three hypotheses were tested using appropriate data correlation methods. Completion levels for quizzes were correlated with completion levels for other online tasks to see whether an increase in task workload resulted in a decrease in quiz engagement. A second test compared levels of quiz re-attempts with completion levels for other online tasks, to see whether different patterns of quiz attempts were linked to different levels of online engagement. Finally, the data was analysed to ascertain the relationship, if any, between student gender and different patterns of quiz attempts, to see if gender might be a factor in quiz engagement. The findings of this study suggested that the decrease in engagement with quizzes was not significantly related to task workload increase, and that there is a relationship between quiz re-attempts and higher module engagement. The findings are presented and discussed in the context of student engagement with online formative strategies in humanities-based subjects. Options are considered for enhancing engagement and formative value in this teaching and learning context; the potential of learning analytics in informing evidence-based improvements in digital learning design is also assessed.

