Student Engagement Dataset (SED): An Online Learning Activity Dataset

Bibliographic Details
Main Authors: M. S. S. Kassim, Z. H. Azizul, A. A. H. Ahmad Fuaad
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10844083/
Description
Summary: Distance learning has become a popular educational medium, and the Internet has spread since the early 2000s. To leverage this phenomenon, learning analytics and data mining can provide insights into improving pedagogy and assessing student engagement. To this end, a student-centric dataset was constructed by extracting data from Universiti Malaya’s Moodle-based Virtual Learning Environment (VLE), which serves approximately 25,000 students annually. In this paper, we present the Student Engagement Dataset (SED). The dataset consists of 16,609 students and 2,407 courses. It contains information such as grades and daily logged online activities (approximately 12 million data points), including temporal data across four tables. The tables include student engagement features created by aggregating raw activity data. Here, we present the dataset’s properties and describe the data collection, selection, and processing steps. Correlation analysis of student engagement features showed a statistically significant but weak negative correlation between the number of courses, early morning logins, assignments, and top students’ performance. SED is expected to present new opportunities for researchers in the learning analytics domain.
ISSN: 2169-3536