TLDR: A cross-national study by Liu Liu and Dai Rui utilized Explainable AI (XAI) on PISA 2018 data from ten countries to predict and understand mathematics achievement. The research found that advanced machine learning models (Random Forest, CATBoost) significantly outperformed traditional linear regression, demonstrating the complex, non-linear nature of factors influencing student performance. Key predictors consistently identified were socio-economic status, student motivation and engagement, learning time, and school climate, with their relative importance varying by country. XAI techniques provided transparent insights into how these factors interact, offering actionable guidance for developing targeted educational policies and interventions to improve math outcomes.
Understanding what drives students’ success in mathematics is crucial for creating effective educational strategies. A recent study, titled “Explainable AI for Predicting and Understanding Mathematics Achievement: A Cross-National Analysis of PISA 2018” by Liu Liu and Dai Rui, delves into this complex issue by applying advanced artificial intelligence (AI) techniques to a vast international dataset. This research offers fresh perspectives on how various factors, from a student’s background to their classroom environment, influence math performance across different countries.
Unveiling Complex Patterns with AI
Traditional statistical methods often struggle to capture the intricate, non-linear relationships and interactions among the many factors affecting student achievement. This study addressed that limitation by applying explainable artificial intelligence (XAI) techniques to data from the Programme for International Student Assessment (PISA) 2018. PISA is a global assessment that evaluates 15-year-olds’ performance in reading, mathematics, and science; reading was the main focus of the 2018 cycle, with mathematics assessed as a minor domain.
The researchers analyzed data from 67,329 students across ten diverse countries: Argentina, Chile, Chinese Taipei, Finland, Hungary, Italy, Japan, South Korea, the Philippines, and the United States. They developed four predictive models: Multiple Linear Regression (a traditional approach) and three more advanced machine learning models, namely Random Forest, CATBoost, and an Artificial Neural Network. The goal was not just to predict math scores accurately but also to understand *why* certain predictions were made, making the models transparent and their insights actionable.
Key Findings: What Drives Math Achievement?
The study’s findings highlight several crucial insights:
First, the advanced machine learning models, particularly Random Forest and CATBoost, consistently outperformed the traditional linear regression model. This indicates that the factors influencing math achievement are indeed complex and interact in non-linear ways that simpler models cannot fully capture. Even the best models explained only about one-third of the variance in individual math scores, but that is still a meaningful share given the many unmeasured factors that influence test performance.
Second, and perhaps most consistently, socio-economic status (SES) emerged as the most powerful predictor of math performance across virtually all countries. This reinforces the long-standing understanding that a student’s family background, including parental education, occupation, and home resources, plays a fundamental role in their academic journey. Even in countries with highly equitable education systems, SES disparities translated into achievement gaps.
Third, beyond economic factors, student engagement and motivation proved important as well. Factors such as the effort a student reported putting into the PISA test, their personal learning goals, and their sense of belonging at school were highly predictive of math scores in many nations. In some cases, these ‘non-cognitive’ factors were as influential as, or even more influential than, traditional socio-economic indicators. This suggests that a student’s mindset and their connection to the school environment are critical for success.
Fourth, the amount of time students invested in learning, both specifically in mathematics and overall, was also a significant predictor. Its importance varied by cultural context; for instance, in East Asian countries like South Korea, Chinese Taipei, and Japan, dedicated math learning time was particularly influential, reflecting a cultural emphasis on intensive study.
Fifth, home resources, such as the number of books available at home, consistently appeared as an important feature. This classic indicator of a literate and resource-rich home environment was strongly associated with higher math achievement, even when accounting for overall SES.
Finally, school and teacher-related factors also played a meaningful role. Teacher clarity in instruction, the feedback provided to students, teacher enthusiasm, and a disciplined classroom climate were all found to be important predictors in various countries. This underscores that the quality of the classroom and school environment significantly impacts learning outcomes, independent of individual and family characteristics.
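The first finding above, that flexible models explain more variance than a straight-line model when relationships are non-linear, can be illustrated with a small sketch. It uses synthetic data, not PISA data, and a simple piecewise-constant (binned) model as a crude stand-in for tree-based methods; it is not the study's Random Forest or CATBoost setup. All names here (fit_line, fit_bins, make_sample) and the inverted-U relationship are assumptions made purely for illustration.

```python
import random
import statistics


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Least-squares straight line for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return lambda v: my + slope * (v - mx)


def fit_bins(x, y, n_bins=8):
    """Piecewise-constant fit: predict the mean outcome within each bin of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins

    def bin_of(v):
        # Clamp so values outside the training range fall in an edge bin.
        return max(0, min(int((v - lo) / width), n_bins - 1))

    groups = {}
    for a, b in zip(x, y):
        groups.setdefault(bin_of(a), []).append(b)
    overall = statistics.fmean(y)
    means = {k: statistics.fmean(v) for k, v in groups.items()}
    # Fall back to the overall mean if a bin had no training points.
    return lambda v: means.get(bin_of(v), overall)


def make_sample(rng, n):
    """Synthetic data (hypothetical, NOT PISA): inverted-U signal plus noise."""
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [500.0 - 4.0 * (v - 5.0) ** 2 + rng.gauss(0.0, 20.0) for v in x]
    return x, y


rng = random.Random(0)
x_train, y_train = make_sample(rng, 2000)
x_test, y_test = make_sample(rng, 2000)

linear = fit_line(x_train, y_train)
binned = fit_bins(x_train, y_train)

# Evaluate both models on held-out data.
r2_linear = r_squared(y_test, [linear(v) for v in x_test])
r2_binned = r_squared(y_test, [binned(v) for v in x_test])
print(f"linear R^2: {r2_linear:.3f}")
print(f"binned R^2: {r2_binned:.3f}")
```

Because the synthetic relationship rises and then falls, the straight line explains almost none of the variance while the binned model recovers most of the signal. Neither reaches 1.0, since the added noise plays the role of unmeasured factors.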
The Power of Explainable AI
A key strength of this study lies in its use of explainable AI techniques such as SHAP (SHapley Additive exPlanations) values and decision tree visualizations. These tools allowed the researchers to move beyond simply identifying important factors and examine *how* they interact. For example, the models could reveal that for students with very low study time, socio-economic status made an especially large difference. Conversely, for students who were already putting in sufficient effort, factors like teacher motivation and a sense of belonging became more critical in differentiating performance. This nuanced understanding provides a richer picture than traditional methods, highlighting specific scenarios where certain interventions might be most effective.
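To make the idea behind SHAP values more concrete, here is a minimal sketch that computes exact Shapley values for a tiny, made-up scoring function. It is a simplified "baseline" variant in which absent features are replaced by a reference student's values, which is not necessarily how the study or the SHAP library estimates them. The toy_model function, its coefficients, the feature names, and both example students are hypothetical and chosen only to mimic the interaction described above, where SES matters more when study time is low.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of each feature for one prediction.

    Features outside a coalition take the baseline's value (an assumption
    of this sketch; other ways of handling absent features exist).
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k])
                 for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(s):
    """Hypothetical score model: SES counts more when study time is low."""
    ses_weight = 40.0 if s["study_hours"] < 2 else 15.0
    return (450.0 + ses_weight * s["ses"]
            + 8.0 * s["study_hours"] + 10.0 * s["belonging"])


baseline = {"ses": 0.0, "study_hours": 4.0, "belonging": 0.0}
low_study = {"ses": -1.0, "study_hours": 1.0, "belonging": 0.5}
high_study = {"ses": -1.0, "study_hours": 5.0, "belonging": 0.5}

phi_low = shapley_values(toy_model, low_study, baseline)
phi_high = shapley_values(toy_model, high_study, baseline)

for label, phi in (("low study time", phi_low), ("high study time", phi_high)):
    print(label, {k: round(v, 2) for k, v in phi.items()})
```

The attributions for each student add up to the difference between that student's predicted score and the baseline prediction, which is the property that makes Shapley-based explanations additive. In this toy setup, the same low SES value receives a larger negative attribution for the student with little study time than for the one who studies more.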
Implications for Education Policy and Practice
The findings offer valuable guidance for educators and policymakers. Addressing structural inequalities through policies that support disadvantaged students, such as increased funding for schools in low-income areas and ensuring access to learning materials, remains paramount. However, the study also emphasizes that focusing solely on economic resources is not enough. Interventions should also target student motivation and engagement, foster positive school climates, and enhance teacher support. Programs that encourage a growth mindset, build intrinsic motivation, improve classroom management, and strengthen teacher-student relationships can lead to measurable gains in achievement.
In essence, improving educational outcomes requires a multifaceted approach: tackling socio-economic disparities while simultaneously cultivating supportive, engaging, and high-quality learning environments. This research demonstrates how explainable AI can be a powerful tool to inform these efforts, providing data-driven insights that are both accurate and understandable.
For more in-depth information, you can read the full research paper here.


