Applied Machine Learning for Online Education
We consider the problem of developing innovative machine learning tools for online education and evaluate their ability to provide instructional resources. Predicting student behavior is a complex problem that spans a wide range of topics. We complement current research in student grade prediction and clickstream analysis by considering data from three areas of online learning: Social Learning Networks (SLN), Instructor Feedback, and Learning Management Systems (LMS). In each of these categories, we propose a novel method for modeling data and an associated tool that may be used to assist students and instructors. First, we develop a methodology for analyzing instructor-provided feedback and determining how it correlates with changes in student grades, using NLP- and NER-based feature extraction. We demonstrate that student grade improvement can be well approximated by a multivariate linear model, with average fits across course sections approaching 83%, and we identify several contributors to student success. Second, we develop a series of link prediction methodologies that utilize spatial and time-evolving network architectures to pass network state between space and time periods. Through evaluation on six real-world datasets, we find that our method obtains substantial improvements over Bayesian models, linear classifiers, and an unsupervised baseline, with AUCs typically above 0.75 and reaching 0.99. Third, motivated by Federated Learning, we extend our model of student discussion forums to model an entire classroom as an SLN. We develop a methodology to represent student actions across different course materials in a shared, low-dimensional space that allows characteristics from actions of different types to be passed jointly to a downstream task.
Performance comparisons against several baselines in centralized, federated, and personalized learning demonstrate that our model offers more distinctive representations of students in a low-dimensional space, which in turn improves accuracy on a common downstream prediction task. Results from these three research thrusts indicate that machine learning methods can accurately model student behavior across multiple data types, and suggest that they can benefit students and instructors alike through the future development of assistive tools.