
Fairer AI for Online Learning: Mitigating Bias in Student Engagement Assessment

TLDR: This research introduces a novel method that combines attribute-orthogonal regularization with a split-model architecture to automatically assess student engagement in online learning. It mitigates gender bias in the model's predictions, raising the Pearson correlation between the prediction distributions of the two gender groups from 0.897 to 0.999, and yields fairer outcomes despite modest trade-offs in traditional accuracy metrics on the biased dataset. The approach targets model design and training, rather than dataset manipulation, to reduce the model's reliance on sensitive features such as gender.

In the evolving landscape of education, particularly with the rise of online and virtual learning, understanding and improving student engagement has become a critical challenge. Traditional methods of gauging a student’s involvement often fall short in digital environments, leading to a demand for automated systems that can accurately detect engagement levels.

A recent study by James Thiering, Tarun Sethupat Radha Krishna, Dylan Zelkin, and Ashis Kumer Biswas from the University of Colorado Denver addresses this very issue. Their research introduces a novel approach to automatically assess student engagement during online learning, while also tackling a significant ethical concern: algorithmic bias. The core of their work lies in developing a training method that actively discourages a model from relying on sensitive features, such as gender, for its predictions. This not only upholds ethical standards but also makes the model’s predictions more understandable.

The researchers applied a technique called attribute-orthogonal regularization to a split-model classifier built with multiple transfer learning strategies, effectively reducing disparities in prediction distributions across sensitivity groups. The Pearson correlation coefficient between the prediction distributions of the two sensitivity groups rose from 0.897 for an unmitigated model to 0.999 for their mitigated model, indicating near-perfect alignment in how the model predicts engagement across groups, regardless of gender.

Understanding the Challenge of Bias

Bias in machine learning models is a pervasive problem, especially as these systems become integrated into daily life. Datasets, even seemingly balanced ones, often contain inherent biases that models can inadvertently leverage. A common issue is “spurious correlation,” where a model identifies a relationship between two variables that don’t have a direct causal link. In vision tasks, models might use spurious features—like certain hairstyles or background objects—instead of the core features relevant to the prediction.

For an engagement classifier, this bias can have serious social implications. A model biased towards certain demographic groups might either overestimate or underestimate engagement. For instance, if a model consistently predicts lower engagement for one gender, teachers might become frustrated or fail to intervene when necessary, leading to unequal educational outcomes over time. The researchers highlight that their work aims to address this bias through careful model design and training methods, rather than just manipulating the dataset.

The DAiSEE Dataset: A Case Study

The study utilized the DAiSEE dataset, a collection of video files labeled for various affective states, including engagement. However, the DAiSEE dataset presents several challenges. First, the distribution of engagement labels is skewed, with a disproportionately high number of samples labeled as "high" or "very high" engagement. More critically, there is a gender bias: female participants were, on average, annotated as more engaged than male participants. This skew in the ground-truth labels means that models trained directly on this data tend to exhibit gender bias in their predictions.

The researchers also noted temporal considerations, as a single label for a 10-second video clip might not capture the varying engagement levels throughout that short duration. Furthermore, the absence of a human validation study for DAiSEE makes it difficult to establish a baseline for acceptable model performance.

The Proposed Solution: Attribute-Orthogonal Regularization

To mitigate the identified bias, the researchers proposed a novel training methodology. They used an Xception model as a feature extractor. The core of their bias mitigation strategy is Attribute-Orthogonal Regularization (AOR). This technique involves penalizing the correlation between the weights of two branching classifiers that stem from a shared feature extractor. One branch predicts the engagement level, while the other is trained to predict gender.
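The article does not reproduce the exact loss formulation, but the core idea of penalizing correlation between the weights of the two branching heads can be sketched concisely. Below is a minimal, hypothetical PyTorch implementation; the head names, the cosine-similarity form of the penalty, and the `lambda_aor` weighting are illustrative assumptions rather than the authors' published code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def aor_penalty(engagement_head: nn.Linear, gender_head: nn.Linear) -> torch.Tensor:
    """Attribute-orthogonal penalty (illustrative sketch): penalize
    correlation between the weight vectors of two classifier heads
    that branch from a shared feature extractor."""
    w_eng = engagement_head.weight  # (n_engagement_classes, feat_dim)
    w_gen = gender_head.weight      # (n_gender_classes, feat_dim)

    # Normalize rows so the penalty measures direction, not magnitude.
    w_eng = F.normalize(w_eng, dim=1)
    w_gen = F.normalize(w_gen, dim=1)

    # Sum of squared cosine similarities between every pair of class
    # weight vectors; zero when the two heads use orthogonal directions
    # of the shared feature space.
    return (w_eng @ w_gen.T).pow(2).sum()
```

In this form, the penalty vanishes when the engagement head and the gender head project the shared features onto mutually orthogonal directions, which is what discourages the engagement branch from reusing gender-predictive features.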

A crucial step in their approach was to train the gender classifier using a separate, more diverse dataset (the OUI dataset) that was specifically curated for gender classification. This ensured that the gender classifier was robust and not subject to the biases present in the DAiSEE dataset itself. Once the gender classifier was trained, its layers were frozen, and new layers for engagement grading were added as a split-model branch. The AOR term was then incorporated into the engagement classifier’s loss function during training.
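A rough sketch of what this split-model setup could look like in PyTorch appears below. The backbone call assumes the `timm` library's Xception implementation, and the feature dimension, head structure, and training objective are illustrative assumptions, not the authors' exact architecture.

```python
import torch.nn as nn
import timm  # assumed dependency; the model name is "legacy_xception" in newer timm releases

class SplitEngagementModel(nn.Module):
    """Shared Xception feature extractor feeding two branches: a gender
    head (pre-trained separately, e.g. on OUI, then frozen) and a
    trainable engagement head. Dimensions are illustrative assumptions."""

    def __init__(self, feat_dim: int = 2048, n_engagement: int = 4, n_gender: int = 2):
        super().__init__()
        # num_classes=0 makes timm return pooled features instead of logits.
        self.backbone = timm.create_model("xception", pretrained=True, num_classes=0)
        self.gender_head = nn.Linear(feat_dim, n_gender)
        self.engagement_head = nn.Linear(feat_dim, n_engagement)
        # Freeze the gender branch after its separate training phase; the
        # article does not specify whether the shared backbone is also frozen.
        for p in self.gender_head.parameters():
            p.requires_grad = False

    def forward(self, x):
        feats = self.backbone(x)
        return self.engagement_head(feats), self.gender_head(feats)

# Hypothetical training objective combining the task loss with the AOR term:
#   loss = F.cross_entropy(eng_logits, labels) \
#          + lambda_aor * aor_penalty(model.engagement_head, model.gender_head)
```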

Results and Impact

The results demonstrated a significant reduction in prediction bias. The Pearson correlation coefficient between male and female prediction distributions improved from 0.897 for the unmitigated model to 0.999 for the AOR-mitigated model. This indicates that the mitigated model produced nearly identical distributions of engagement predictions for both genders, effectively reducing the disparity.
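One straightforward way to compute this kind of fairness metric is to histogram each group's predicted engagement labels and correlate the two histograms. The sketch below, using NumPy and SciPy, is an assumed reconstruction of such an evaluation, not the authors' published code.

```python
import numpy as np
from scipy.stats import pearsonr

def group_prediction_correlation(preds: np.ndarray, genders: np.ndarray,
                                 n_classes: int = 4) -> float:
    """Pearson correlation between the engagement-prediction histograms
    of two sensitivity groups (coded 0 and 1). A value near 1.0 means
    the model predicts each engagement level with near-identical
    frequency for both groups."""
    hist_a = np.bincount(preds[genders == 0], minlength=n_classes)
    hist_b = np.bincount(preds[genders == 1], minlength=n_classes)
    # Normalize counts to frequencies before correlating.
    r, _ = pearsonr(hist_a / hist_a.sum(), hist_b / hist_b.sum())
    return float(r)
```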

The F1-score and accuracy metrics showed some nuanced changes, including a slight decrease in the F1-score for the female "very high" engagement class and a small drop in overall accuracy. The researchers explain that this is an expected trade-off when correcting for inherent biases in the dataset: because the model no longer leverages spurious correlations tied to gender, its performance appears to drop on a biased validation set, but its predictions are ultimately fairer.

Further evaluation on a uniformly distributed subset of the data confirmed the success of the AOR technique. The unmitigated model still showed a clear bias, predicting males as less engaged than females, even on this balanced subset. In contrast, the AOR-mitigated model produced much more similar predictions between genders, reinforcing the effectiveness of the bias mitigation strategy.

Looking Ahead

This research offers a promising direction for creating more ethical and interpretable AI systems for educational assessment. Future work includes further refining the AOR technique, testing its generalizability across different datasets and tasks, and conducting a human validation study for the DAiSEE dataset to establish clearer performance benchmarks. The source code for this project is available on GitHub.

Rhea Bhattacharya
https://blogs.edgentiq.com
Rhea Bhattacharya is an AI correspondent with a keen eye for cultural, social, and ethical trends in Generative AI. With a background in sociology and digital ethics, she delivers high-context stories that explore the intersection of AI with everyday lives, governance, and global equity. Her news coverage is analytical, human-centric, and always ahead of the curve. You can reach her at: [email protected]
