TLDR: A study using Sparse Autoencoders on Gemma-2-2B found that Large Language Models (LLMs) exhibit systematic activation disparities, with medium-to-low resource languages receiving significantly lower internal activations compared to high-resource languages. This disparity correlates with weaker performance on benchmarks, despite similar embedding representations. Activation-aware fine-tuning improved activations for underrepresented languages and led to modest benchmark gains, highlighting activation alignment as key for multilingual LLM performance.
Large Language Models (LLMs) have shown impressive abilities in understanding and generating text across many languages. However, a significant challenge remains: these powerful models often perform less effectively in languages with fewer digital resources, known as medium-to-low resource languages. This disparity is a concern, especially since English data heavily dominates the training datasets for most LLMs, with non-English data making up only a small fraction.
Richmond Sin Jing Xuan, Jalil Huseynov, and Yang Zhang, researchers at the National University of Singapore, investigated this performance gap in their paper, “Uncovering Cross-Linguistic Disparities in LLMs using Sparse Autoencoders”. They aimed to understand why LLMs struggle with certain languages even when their internal representations (embeddings) appear to treat all languages similarly.
Peeking Inside LLMs with Sparse Autoencoders
To delve into how LLMs process different languages, the researchers used a technique called Sparse Autoencoders (SAEs). Think of SAEs as a special magnifying glass that allows us to see the “activation patterns” or how much different parts of the LLM “light up” when processing text in various languages. Unlike simply looking at how similar language representations are, SAEs provide direct insights into the neural activity within the model.
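To make this concrete, here is a minimal SAE sketch in PyTorch. This is an illustration rather than the authors’ implementation: the plain ReLU encoder, the 16,384-feature dictionary size, and the L1 sparsity coefficient are assumptions for the sketch; 2304 is Gemma-2-2B’s actual hidden size.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE: reconstruct a hidden state through an overcomplete,
    sparsely activated feature layer."""
    def __init__(self, d_model=2304, d_features=16384):  # 2304 = Gemma-2-2B hidden size
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, h):
        f = torch.relu(self.encoder(h))  # sparse feature activations (how much "lights up")
        h_hat = self.decoder(f)          # reconstruction of the original hidden state
        return h_hat, f

def sae_loss(h, h_hat, f, l1_coeff=1e-3):
    # reconstruction error plus an L1 penalty that pushes most features to zero
    return ((h - h_hat) ** 2).mean() + l1_coeff * f.abs().mean()
```

Once an SAE like this is trained on a layer’s hidden states, the feature activations `f` are the quantity the disparity analysis measures.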
The study focused on the Gemma-2-2B model, analyzing its 26 internal layers across 10 languages. These included high-resource languages like Chinese, Russian, Spanish, and Italian, and medium-to-low resource languages such as Indonesian, Catalan, Marathi, Malayalam, and Hindi, with English serving as a reference.
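A rough sketch of how such a per-language, per-layer measurement could be run with the Hugging Face transformers library is shown below. The `parallel_texts` corpus, the per-layer `sae`, and the mean-absolute-activation statistic are hypothetical stand-ins for the paper’s actual pipeline, not its published code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, output_hidden_states=True)
model.eval()

def mean_feature_activation(texts, sae, layer):
    """Average SAE feature activation for one language at one layer."""
    scores = []
    for text in texts:
        ids = tok(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids)
            _, f = sae(out.hidden_states[layer])  # features from this layer's residual stream
        scores.append(f.abs().mean().item())
    return sum(scores) / len(scores)

# parallel_texts = {"en": [...], "hi": [...], "ml": [...], ...}  # hypothetical corpus
# activations = {lang: mean_feature_activation(txts, sae, layer=5)
#                for lang, txts in parallel_texts.items()}
```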
Key Findings: A Clear Disparity
The analysis revealed systematic differences in how LLMs activate for different language groups. The most striking findings were:
- Lower Activations for Less-Resourced Languages: Medium-to-low resource languages consistently received significantly lower activation levels than high-resource languages. The gap was most pronounced in the model’s early layers, where activations were up to 26.27% lower, and persisted into deeper layers at around 19.89%.
- Correlation with Performance: These lower activation levels correlated directly with weaker performance on common benchmarks like ARC-C, MMLU, and HellaSwag. In other words, when a language doesn’t “activate” the model’s internal features as strongly, its performance suffers.
- Embedding Similarity Isn’t Enough: Interestingly, even when “embedding similarity” (how similar the model’s overall representations of different languages appeared) was high, task performance for medium-to-low resource languages remained much lower. Surface-level similarity doesn’t guarantee equitable processing, as the sketch below illustrates.
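Here is a small sketch of how these two quantities can diverge, reusing the per-language activations measured above. The `act` and `emb` dictionaries and both helper functions are hypothetical illustrations, not the paper’s exact metric definitions.

```python
import torch.nn.functional as F

def activation_gap(act, high_resource, low_resource):
    """Percentage by which low-resource activations fall below high-resource ones."""
    hi = sum(act[l] for l in high_resource) / len(high_resource)
    lo = sum(act[l] for l in low_resource) / len(low_resource)
    return (hi - lo) / hi * 100  # ~26% in early layers, ~20% in deep layers per the paper

def embedding_similarity(emb, lang, ref="en"):
    """Cosine similarity of mean sentence embeddings; can stay high even when the gap is large."""
    return F.cosine_similarity(emb[lang], emb[ref], dim=-1).item()
```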
Addressing the Imbalance: Activation-Aware Fine-Tuning
To mitigate these disparities, the researchers applied a technique called activation-aware fine-tuning using LoRA (Low-Rank Adaptation). This method aimed to raise activation levels for the underperforming languages while keeping the model’s performance on English stable.
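The paper’s exact training objective isn’t reproduced here, but one plausible way to implement the idea with the peft library, building on the model loaded earlier, is to add a penalty that pulls a language’s SAE activation level toward a target (for example, the English level) on top of the usual language-modeling loss. The `activation_aware_loss` helper, the `alpha` weight, and `target_act` are all assumptions in this sketch.

```python
from peft import LoraConfig, get_peft_model

# Attach low-rank adapters to the attention projections (a common LoRA choice).
lora_cfg = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, lora_cfg)

def activation_aware_loss(batch, sae, layer, target_act, alpha=0.1):
    """Hypothetical combined objective: LM loss + activation-alignment penalty.
    `batch` is assumed to hold input_ids and attention_mask only."""
    out = peft_model(**batch, labels=batch["input_ids"], output_hidden_states=True)
    _, f = sae(out.hidden_states[layer])
    align = (f.abs().mean() - target_act) ** 2  # pull toward e.g. the English level
    return out.loss + alpha * align
```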
The fine-tuning led to substantial gains in activation for languages like Malayalam (87.69% increase) and Hindi (86.32% increase), while English retention remained high (around 91%). Post-fine-tuning, benchmark results showed modest but consistent improvements, particularly in tasks like ARC-Challenge for Malayalam, which saw a 5.47% improvement. However, the improvements were not uniform across all benchmarks, indicating that while activation alignment is crucial, it’s not a complete solution on its own.
Looking Ahead
While this study provides valuable insights, it also highlights areas for future work. The translation models used might introduce some errors, and the fine-tuning, while effective in aligning activations, only led to modest benchmark improvements. This suggests that LLMs might not fully converge to shared representations across all languages, and more refined fine-tuning strategies are needed. The findings are also specific to Gemma-2-2B, so further research is needed to see if these patterns hold true for other LLM architectures.
In conclusion, this research underscores that simply having similar language representations isn’t enough for equitable multilingual LLM performance. Understanding and addressing activation disparities through techniques like Sparse Autoencoders and targeted fine-tuning is a vital step towards building more fair and effective multilingual AI models.