
Simplifying Spanish: How Large Language Models Improve Text Readability

TLDR: The CardiffNLP team participated in the CLEARS-2025 shared task, focusing on simplifying Spanish texts into Plain Language (PL) and Easy-to-Read (E2R) formats using Large Language Models (LLMs). They experimented with LLaMA-3.2 and Gemma-3, finding Gemma-3 to be more effective, especially when prompted in Spanish. Key to their success were structured output (Python dictionary) and sentence-level processing. They secured third place in PL and second in E2R, highlighting LLMs’ potential while also noting the limitations of current automatic evaluation metrics for nuanced text simplification.

Ensuring information is clear and easy to understand is a fundamental right, as highlighted by the Universal Declaration of Human Rights. However, many public and official documents, especially in fields like law and medicine, remain inaccessible to a significant portion of the population due to their complex language. To address this challenge, the CLEARS shared task at IberLEF-2025 focused on automatically adapting Spanish texts into two accessible formats: Plain Language (PL) and Easy-to-Read (E2R).

Plain Language (PL) aims to make texts clear and concise for general audiences, including non-native speakers and individuals with reading limitations. It emphasizes active voice and common words, and avoids jargon. Easy-to-Read (E2R), on the other hand, is specifically designed for people with cognitive, intellectual, or learning disabilities. It focuses on structural and linguistic simplicity, using short sentences and clear language, and often involves the target audience in testing for clarity. Traditionally, creating these adapted texts is a manual and resource-intensive process, requiring experts and user validation.

CardiffNLP’s Approach to Text Simplification

The CardiffNLP team, from Cardiff University, contributed to the CLEARS shared task by exploring a novel approach: leveraging Large Language Models (LLMs) for automatic text adaptation. Their work, detailed in their paper “Prompting Large Language Models for Plain Language and Easy-to-Read Text Rewriting”, involved experimenting with different prompting methods, including zero-shot, one-shot, and few-shot strategies.

Initially, the team experimented with LLaMA-3.2, but for their final submission, they adopted Gemma-3. Their experiments involved numerous prompt variations, testing the effectiveness of different instructions and even the language in which these instructions were given (English or Spanish). A key finding was that instructing the model to return its output as a Python dictionary significantly improved results and made extraction easier. Furthermore, explicitly guiding the model to read and work on sentences individually also boosted similarity scores.
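The two prompting tricks mentioned above can be illustrated with a short sketch. The prompt wording, the dictionary key, and the helper names below are illustrative assumptions, not the team's actual prompt; what the sketch shows is the general pattern: Spanish instructions, an explicit request to work sentence by sentence, and a Python-dictionary output format that makes extraction a simple `literal_eval` rather than free-text scraping.

```python
import ast


def build_prompt(text: str) -> str:
    """Assemble a simplification prompt along the lines the paper describes:
    Spanish instructions, sentence-by-sentence processing, and a
    Python-dictionary output format. Wording here is hypothetical."""
    return (
        # "Read the following text sentence by sentence and rewrite each
        # sentence in plain language."
        "Lee el siguiente texto frase por frase y reescribe cada frase "
        "en lenguaje claro. "
        # "Return the result as a Python dictionary with the key 'simplificado'."
        "Devuelve el resultado como un diccionario de Python con la clave "
        "'simplificado'.\n\n"
        f"Texto: {text}"
    )


def extract_simplification(model_output: str) -> str:
    """Pull the requested dictionary out of the model's reply and parse it
    safely with ast.literal_eval (no arbitrary code execution)."""
    start = model_output.index("{")
    end = model_output.rindex("}") + 1
    result = ast.literal_eval(model_output[start:end])
    return result["simplificado"]


# Example: parsing a mock model response that wraps the dictionary in chatter.
mock_response = "Claro, aquí tienes: {'simplificado': 'El trámite es gratuito.'}"
print(extract_simplification(mock_response))  # El trámite es gratuito.
```

Asking for a structured container like this is a common way to curb inconsistent formatting: even if the model adds conversational filler around the dictionary, the payload can still be located and parsed deterministically.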

Performance and Insights

The CardiffNLP team achieved notable success in the CLEARS shared task, securing third place in Subtask 1 (Plain Language) and second place in Subtask 2 (Easy-to-Read). Their results highlighted the potential of LLMs in text simplification, particularly Gemma-3, which consistently performed as well as or better than LLaMA-3.2. Interestingly, Gemma-3 showed superior performance when prompted in Spanish, as English prompts sometimes led the model to simplify texts in English.

Despite the promising results, the research also shed light on ongoing challenges. The team noted that current automatic metrics do not fully capture the nuances of text simplification, especially for E2R, where visual formatting and sentence segmentation are crucial for readability. This suggests a need for more sophisticated evaluation methods that can account for these qualitative aspects.
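A toy example makes the metric limitation concrete. The bag-of-words F1 below is a deliberately simplified stand-in for surface-overlap metrics, not the shared task's actual scorer: because it ignores punctuation and sentence boundaries, it assigns identical scores to a run-on sentence and to the same words split into the short sentences E2R calls for.

```python
from collections import Counter


def token_f1(candidate: str, reference: str) -> float:
    """Toy bag-of-words F1 between candidate and reference tokens,
    ignoring punctuation and sentence boundaries."""
    cand = Counter(candidate.lower().replace(".", "").split())
    ref = Counter(reference.lower().replace(".", "").split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "Pida cita. Llame al teléfono."
run_on = "Pida cita llame al teléfono."      # one long sentence
segmented = "Pida cita. Llame al teléfono."  # short, E2R-style sentences

# Same tokens, so the metric cannot tell the two apart,
# even though segmentation matters greatly for E2R readers.
print(token_f1(run_on, reference), token_f1(segmented, reference))  # 1.0 1.0
```

Real evaluation metrics are more sophisticated than this, but to the extent they operate on token overlap, they share the same blind spot for the visual and structural properties that define Easy-to-Read text.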

Looking Ahead

The CardiffNLP team’s contribution to the CLEARS shared task has deepened the understanding of LLMs’ capabilities and limitations in text simplification. Their work underscores the importance of carefully crafted prompts, structured output formats, and sentence-level processing in mitigating common LLM errors like hallucinations and inconsistent formatting. Future work will benefit from incorporating human evaluation and developing metrics that better reflect the complex qualitative aspects of simplification, especially for specific target groups requiring intricate formatting.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach out to him at: [email protected]
