
Fairer AI for Online Learning: Mitigating Bias in Student Engagement Assessment

TLDR: This research introduces a novel method that combines attribute-orthogonal regularization with a split-model architecture to automatically assess student engagement in online learning. It mitigates gender bias in the model's predictions, raising the Pearson correlation between the prediction distributions of the two gender groups from 0.897 to 0.999, and yields fairer outcomes despite modest trade-offs in traditional accuracy metrics on the biased dataset. The approach targets model design and training, rather than dataset manipulation, to reduce the model's reliance on sensitive features such as gender.

In the evolving landscape of education, particularly with the rise of online and virtual learning, understanding and improving student engagement has become a critical challenge. Traditional methods of gauging a student’s involvement often fall short in digital environments, leading to a demand for automated systems that can accurately detect engagement levels.

A recent study by James Thiering, Tarun Sethupat Radha Krishna, Dylan Zelkin, and Ashis Kumer Biswas from the University of Colorado Denver addresses this very issue. Their research introduces a novel approach to automatically assess student engagement during online learning, while also tackling a significant ethical concern: algorithmic bias. The core of their work lies in developing a training method that actively discourages a model from relying on sensitive features, such as gender, for its predictions. This not only upholds ethical standards but also makes the model’s predictions more understandable.

The researchers applied a technique called attribute-orthogonal regularization to a split-model classifier built with multiple transfer learning strategies, effectively reducing disparities in prediction distributions across sensitivity groups. The Pearson correlation coefficient between the prediction distributions of the two sensitivity groups rose from 0.897 for an unmitigated model to 0.999 for their mitigated model, indicating near-perfect alignment in how the model predicts engagement across groups, regardless of gender.

Understanding the Challenge of Bias

Bias in machine learning models is a pervasive problem, especially as these systems become integrated into daily life. Datasets, even seemingly balanced ones, often contain inherent biases that models can inadvertently leverage. A common issue is “spurious correlation,” where a model identifies a relationship between two variables that don’t have a direct causal link. In vision tasks, models might use spurious features—like certain hairstyles or background objects—instead of the core features relevant to the prediction.

For an engagement classifier, this bias can have serious social implications. A model biased towards certain demographic groups might either overestimate or underestimate engagement. For instance, if a model consistently predicts lower engagement for one gender, teachers might become frustrated or fail to intervene when necessary, leading to unequal educational outcomes over time. The researchers highlight that their work aims to address this bias through careful model design and training methods, rather than just manipulating the dataset.

The DAiSEE Dataset: A Case Study

The study utilized the DAiSEE dataset, a collection of video files labeled for various affective states, including engagement. However, the DAiSEE dataset presents several challenges. First, the distribution of engagement labels is skewed, with a disproportionately high number of samples labeled as "high" or "very high" engagement. More critically, there is a gender bias: female participants were, on average, annotated as more engaged than male participants. This skew in the ground-truth labels means that models trained directly on this data tend to exhibit gender bias in their predictions.

The researchers also noted temporal considerations, as a single label for a 10-second video clip might not capture the varying engagement levels throughout that short duration. Furthermore, the absence of a human validation study for DAiSEE makes it difficult to establish a baseline for acceptable model performance.

The Proposed Solution: Attribute-Orthogonal Regularization

To mitigate the identified bias, the researchers proposed a novel training methodology. They used an Xception model as a feature extractor. The core of their bias mitigation strategy is Attribute-Orthogonal Regularization (AOR). This technique involves penalizing the correlation between the weights of two branching classifiers that stem from a shared feature extractor. One branch predicts the engagement level, while the other is trained to predict gender.
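The article does not reproduce the exact loss formulation, but the core idea of penalizing correlation between the weights of the two branching heads can be sketched concisely. Below is a minimal, hypothetical PyTorch implementation; the head names, the cosine-similarity form of the penalty, and the `lambda_aor` weighting are illustrative assumptions rather than the authors' published code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def aor_penalty(engagement_head: nn.Linear, gender_head: nn.Linear) -> torch.Tensor:
    """Attribute-orthogonal penalty (illustrative sketch): penalize
    correlation between the weight vectors of two classifier heads
    that branch from a shared feature extractor."""
    w_eng = engagement_head.weight  # (n_engagement_classes, feat_dim)
    w_gen = gender_head.weight      # (n_gender_classes, feat_dim)

    # Normalize rows so the penalty measures direction, not magnitude.
    w_eng = F.normalize(w_eng, dim=1)
    w_gen = F.normalize(w_gen, dim=1)

    # Sum of squared cosine similarities between every pair of class
    # weight vectors; zero when the two heads use orthogonal directions
    # of the shared feature space.
    return (w_eng @ w_gen.T).pow(2).sum()
```

In this form, the penalty vanishes when the engagement head and the gender head project the shared features onto mutually orthogonal directions, which is what discourages the engagement branch from reusing gender-predictive features.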

A crucial step in their approach was to train the gender classifier using a separate, more diverse dataset (the OUI dataset) that was specifically curated for gender classification. This ensured that the gender classifier was robust and not subject to the biases present in the DAiSEE dataset itself. Once the gender classifier was trained, its layers were frozen, and new layers for engagement grading were added as a split-model branch. The AOR term was then incorporated into the engagement classifier’s loss function during training.
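A rough sketch of what this split-model setup could look like in PyTorch appears below. The backbone call assumes the `timm` library's Xception implementation, and the feature dimension, head structure, and training objective are illustrative assumptions, not the authors' exact architecture.

```python
import torch.nn as nn
import timm  # assumed dependency; the model name is "legacy_xception" in newer timm releases

class SplitEngagementModel(nn.Module):
    """Shared Xception feature extractor feeding two branches: a gender
    head (pre-trained separately, e.g. on OUI, then frozen) and a
    trainable engagement head. Dimensions are illustrative assumptions."""

    def __init__(self, feat_dim: int = 2048, n_engagement: int = 4, n_gender: int = 2):
        super().__init__()
        # num_classes=0 makes timm return pooled features instead of logits.
        self.backbone = timm.create_model("xception", pretrained=True, num_classes=0)
        self.gender_head = nn.Linear(feat_dim, n_gender)
        self.engagement_head = nn.Linear(feat_dim, n_engagement)
        # Freeze the gender branch after its separate training phase; the
        # article does not specify whether the shared backbone is also frozen.
        for p in self.gender_head.parameters():
            p.requires_grad = False

    def forward(self, x):
        feats = self.backbone(x)
        return self.engagement_head(feats), self.gender_head(feats)

# Hypothetical training objective combining the task loss with the AOR term:
#   loss = F.cross_entropy(eng_logits, labels) \
#          + lambda_aor * aor_penalty(model.engagement_head, model.gender_head)
```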

Results and Impact

The results demonstrated a significant reduction in prediction bias. The Pearson correlation coefficient between male and female prediction distributions improved from 0.897 for the unmitigated model to 0.999 for the AOR-mitigated model. This indicates that the mitigated model produced nearly identical distributions of engagement predictions for both genders, effectively reducing the disparity.
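One straightforward way to compute this kind of fairness metric is to histogram each group's predicted engagement labels and correlate the two histograms. The sketch below, using NumPy and SciPy, is an assumed reconstruction of such an evaluation, not the authors' published code.

```python
import numpy as np
from scipy.stats import pearsonr

def group_prediction_correlation(preds: np.ndarray, genders: np.ndarray,
                                 n_classes: int = 4) -> float:
    """Pearson correlation between the engagement-prediction histograms
    of two sensitivity groups (coded 0 and 1). A value near 1.0 means
    the model predicts each engagement level with near-identical
    frequency for both groups."""
    hist_a = np.bincount(preds[genders == 0], minlength=n_classes)
    hist_b = np.bincount(preds[genders == 1], minlength=n_classes)
    # Normalize counts to frequencies before correlating.
    r, _ = pearsonr(hist_a / hist_a.sum(), hist_b / hist_b.sum())
    return float(r)
```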

The F1-score and accuracy metrics showed some nuanced changes, including a slight decrease in the F1-score for the female "very high" engagement class and a small drop in overall accuracy. The researchers explain that this is an expected trade-off when correcting for inherent biases in the dataset: because the model no longer leverages spurious correlations tied to gender, its performance appears to drop on a biased validation set, but its predictions are ultimately fairer.

Further evaluation on a uniformly distributed subset of the data confirmed the success of the AOR technique. The unmitigated model still showed a clear bias, predicting males as less engaged than females, even on this balanced subset. In contrast, the AOR-mitigated model produced much more similar predictions between genders, reinforcing the effectiveness of the bias mitigation strategy.

Looking Ahead

This research offers a promising direction for creating more ethical and interpretable AI systems for educational assessment. Future work includes further refining the AOR technique, testing its generalizability across different datasets and tasks, and conducting a human validation study for the DAiSEE dataset to establish clearer performance benchmarks. The source code for this project is available on GitHub.

Rhea Bhattacharya
https://blogs.edgentiq.com
Rhea Bhattacharya is an AI correspondent with a keen eye for cultural, social, and ethical trends in Generative AI. With a background in sociology and digital ethics, she delivers high-context stories that explore the intersection of AI with everyday lives, governance, and global equity. Her news coverage is analytical, human-centric, and always ahead of the curve. You can reach her at: [email protected]
