
AI-Powered Multi-Round Simplification Improves Text Readability

TLDR: Researchers from the OUNLP team developed two multi-round text simplification systems for the TSAR 2025 Shared Task, MRS-Rule and MRS-Joint, whose underlying code was generated by GPT-4o. They found that the larger the gap between a text’s original and target CEFR readability levels, the harder the simplification. Their MRS-Joint method, which combines an initial LLM simplification with subsequent rule-based refinements, significantly improved CEFR-level accuracy and meaning preservation compared to single-step approaches, demonstrating the effectiveness of iterative simplification and AI-generated code.

Making complex text easier to understand is a crucial task, especially for language learners and individuals with limited literacy. This challenge was at the heart of the TSAR 2025 Shared Task, where the OUNLP team from the University of Oklahoma presented an innovative approach to text simplification using multi-round methods and AI-generated code. Their research highlights a significant finding: the greater the difference between a text’s original difficulty and its target readability level, the harder it is to simplify effectively. This gap, referred to as the “CEFR-Gap,” became the driving force behind their novel multi-step simplification strategies.

The Challenge of Text Simplification

Traditional methods often attempt to simplify text in a single step. However, the OUNLP team observed that when the required simplification is substantial – for instance, transforming a highly advanced (C1) text into a basic (A2) one – single-step approaches frequently fall short. This difficulty arises because larger CEFR-Gaps demand more radical linguistic changes, which can compromise both the accuracy of reaching the target readability level and the preservation of the original meaning. This insight led them to propose an iterative, multi-round process, believing that breaking down the simplification into smaller, manageable steps would yield better results.
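
To make the “CEFR-Gap” concrete, here is a minimal sketch (our own illustration, not the paper’s code) that measures the gap as the number of CEFR steps between the source and target levels; the `cefr_gap` helper name is an assumption:

```python
# Sketch: quantifying the "CEFR-Gap" as the ordinal distance between the
# source text's readability level and the target level. The level ordering
# is the standard CEFR scale; the helper name is illustrative.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def cefr_gap(source_level: str, target_level: str) -> int:
    """Number of CEFR steps separating the source and target levels."""
    return CEFR_LEVELS.index(source_level) - CEFR_LEVELS.index(target_level)

# A C1 text targeted at A2 spans three levels -- a large gap that,
# per the team's finding, is hard to close in a single step.
print(cefr_gap("C1", "A2"))  # 3
```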

Introducing Multi-Round Simplification

The OUNLP team developed two primary multi-round simplification methods, with their underlying code generated by GPT-4o, a powerful large language model:

MRS-Rule: This method is purely rule-based. It doesn’t rely on external large language model APIs for simplification itself, but instead uses a set of predefined rules to progressively adjust sentence structures and vocabulary. These rules include replacing complex words with simpler synonyms, standardizing numerical expressions, removing non-essential clauses, and breaking down long sentences. The system iteratively applies these rules, checking the text’s readability and semantic similarity to the original after each round, until the target CEFR level is met or the best possible simplification is achieved.
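
The control loop described above can be sketched as follows. This is a toy, self-contained stand-in, not the team’s code: the rule set is reduced to a tiny synonym table, and the readability check is a word-length proxy where the paper uses ModernBERT classifiers and SBERT similarity.

```python
# Minimal, runnable sketch of an MRS-Rule-style iterative loop.
# All rules and scorers here are toy stand-ins for illustration only.
SIMPLER = {"utilize": "use", "commence": "start", "terminate": "end",
           "approximately": "about"}

def apply_rules(text):
    """One round of rule-based edits: replace complex words with synonyms."""
    return " ".join(SIMPLER.get(w, w) for w in text.split())

def readability_proxy(text):
    """Toy difficulty score: average word length (lower = simpler)."""
    words = text.split()
    return sum(len(w) for w in words) / len(words)

def mrs_rule(text, target_score, max_rounds=5):
    best = text
    for _ in range(max_rounds):
        candidate = apply_rules(best)
        if candidate == best:      # rules converged, nothing left to change
            break
        best = candidate
        if readability_proxy(best) <= target_score:
            break                  # target readability reached
    return best

print(mrs_rule("commence to utilize approximately ten units", 4.0))
# -> "start to use about ten units"
```

The real system additionally checks semantic similarity against the original after each round and stops if meaning drifts too far; that guard is omitted here for brevity.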

MRS-Joint: Building on MRS-Rule, this method combines the strengths of rule-based processing and large language model prompting. In the initial step, an LLM (specifically GPT-4o-mini) generates a first simplified version of the text. Subsequent rounds then apply the same rule-based iterative refinements as MRS-Rule. This hybrid approach aims to leverage the generative power of LLMs for the initial simplification while using structured rules for fine-grained, controlled adjustments in later stages.
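
The hybrid flow reduces to a short pipeline. In this sketch, `llm_simplify` is a placeholder for the GPT-4o-mini call and `refine` stands in for one rule-based round; both are fixed string rewrites here purely so the example runs, and all names are illustrative.

```python
# Sketch of the MRS-Joint control flow: LLM first pass, then rule rounds.
def llm_simplify(text):
    # In the real system this is an LLM API call; faked here.
    return text.replace("individuals", "people")

def refine(text):
    # One rule-based round, e.g. simpler-synonym substitution; faked here.
    return text.replace("purchase", "buy")

def mrs_joint(text, rounds=3):
    draft = llm_simplify(text)   # round 1: LLM-generated candidate
    for _ in range(rounds):      # rounds 2+: rule-based refinement
        new = refine(draft)
        if new == draft:         # converged
            break
        draft = new
    return draft

print(mrs_joint("individuals purchase goods"))  # -> "people buy goods"
```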

Measuring Success: CEFR Levels and Meaning Preservation

To evaluate their systems, the researchers focused on two key aspects: CEFR Compliance (how close the simplified text is to the target CEFR level, measured by RMSE – Root Mean Square Error, where lower is better) and Meaning Preservation (how well the simplified text retains the original meaning, measured by MeaningBERT scores, where higher is better). They used three ModernBERT classifiers to predict CEFR levels and SBERT for semantic similarity checks.
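
The CEFR-compliance metric can be sketched as RMSE over level labels treated as ordinals (A1=0 through C2=5). The numeric encoding is our assumption about how levels map to numbers; the paper’s exact scheme may differ.

```python
# Sketch of the CEFR-compliance metric: RMSE between predicted and target
# CEFR levels over a set of outputs (lower is better).
import math

LEVEL = {"A1": 0, "A2": 1, "B1": 2, "B2": 3, "C1": 4, "C2": 5}

def cefr_rmse(predicted, target):
    """Root mean square error over paired CEFR level labels."""
    sq = [(LEVEL[p] - LEVEL[t]) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(sq) / len(sq))

# Two outputs hit the target level; one misses by a level:
print(cefr_rmse(["B1", "B1", "B2"], ["B1", "B1", "B1"]))  # ~0.577
```

Meaning preservation is scored separately (MeaningBERT, higher is better), so a system must keep RMSE low without sacrificing the semantic-similarity score.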

Promising Results and Key Takeaways

The experiments showed that the MRS-Joint method significantly outperformed both the initial LLM-prompting baseline and the MRS-Rule method. MRS-Joint achieved the best CEFR accuracy (lowest RMSE) while still effectively preserving the original meaning. This demonstrates that multi-round simplification is indeed more effective at handling large CEFR-Gaps than conventional single-step approaches. Furthermore, starting the simplification process with an LLM-generated candidate, as done in MRS-Joint, further boosted the overall performance of the multi-round system.

While the systems excelled at simplifying complex sentences (C1-C2) to an intermediate B1 level, the team noted that simplifying to very low levels like A2 remained challenging, with output sometimes staying more complex than intended. This “overshooting,” along with issues like “lexical imitation” (retaining formal phrases) and “under-generation” (producing incomplete sentences), highlights areas for future refinement.

In conclusion, the OUNLP team’s work provides compelling evidence that an iterative, multi-round approach to text simplification, especially when augmented by AI-generated code and a combination of LLM prompting and rule-based refinements, can significantly improve readability and accessibility. This research paves the way for more effective tools to make information accessible to a wider audience. You can find more details about their work in the full research paper: OUNLP at TSAR 2025 Shared Task: Multi-Round Text Simplifier via Code Generation.

Karthik Mehta (https://blogs.edgentiq.com)
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
