paper

A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization

Lukas De Kerpel, Arthur Thuy, Dries F. Benoit

A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization

Name: A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization
Author: Lukas De Kerpel, Arthur Thuy, Dries F. Benoit

Lukas De Kerpel, Arthur Thuy, Dries F. Benoit

paper2026-02-04English

Start Reading

Description

In recent years, instructional practices in Operations Research (OR), Management Science (MS), and Analytics have increasingly shifted toward digital environments, where large and diverse groups of learners make it difficult to provide practice that adapts to individual needs. This paper introduces a method that generates personalized sequences of exercises by selecting, at each step, the exercise most likely to advance a learner's understanding of a targeted skill. The method uses information about the learner and their past performance to guide these choices, and learning progress is measured as the change in estimated skill level before and after each exercise. Using data from an online mathematics tutoring platform, we find that the approach recommends exercises associated with greater skill improvement and adapts effectively to differences across learners. From an instructional perspective, the framework enables personalized practice at scale, highlights exercises with consistently strong learning value, and helps instructors identify learners who may benefit from additional support.