Advancing Medical Translation: How MedCOD Enhances Language Models for English-to-Spanish Healthcare Communication

TLDR: MedCOD is a new framework that significantly improves English-to-Spanish medical translation by integrating structured medical knowledge from UMLS and an LLM-as-Knowledge-Base into large language models. Combined with fine-tuning, MedCOD enables open-source LLMs to outperform proprietary systems like GPT-4o in clinical translation accuracy, addressing critical language barriers in healthcare.

Language barriers in healthcare can significantly impact patient care, especially for the millions of individuals with limited English proficiency in the United States. Electronic Health Records (EHRs) are crucial for patient engagement and communication, but their full benefits are often not realized by non-English speakers. This challenge is particularly acute for the Hispanic population, where a substantial percentage faces difficulties understanding medical forms, communicating with healthcare professionals, and following prescription guidelines.

Machine translation has been explored as a way to bridge this gap, but general-purpose systems often fall short of the clinical accuracy that complex medical texts require. Large Language Models (LLMs) have shown promise in general translation, yet their use in specialized biomedical translation, particularly for EHRs, remains underexplored.

A new framework called MedCOD (Medical Chain-of-Dictionary) has been developed to address these critical issues. MedCOD is a hybrid approach designed to significantly improve English-to-Spanish medical translation by integrating structured, domain-specific knowledge into LLMs. It builds upon the Chain-of-Dictionary Prompting (COD) framework and incorporates knowledge from two key sources: the Unified Medical Language System (UMLS) and an LLM-as-Knowledge-Base (LLM-KB).

The MedCOD framework works by enriching the translation process with multi-layered domain knowledge. This includes translated medical terms, synonyms, and multilingual mappings obtained from both UMLS and the LLM-KB. This structured context helps LLMs better understand the nuances of medical language. The framework also combines this structured prompting with a lightweight fine-tuning technique called Low-Rank Adaptation (LoRA), allowing open-source models to adapt more effectively to specialized biomedical content.
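To make the idea of a knowledge-enriched prompt concrete, here is a minimal sketch of how dictionary-style hints might be placed ahead of a translation request. The `TermContext` class, the `build_prompt` function, the prompt wording, and the sample entry for "hypertension" are all illustrative assumptions; they are not the templates or UMLS records used in the study.

```python
from dataclasses import dataclass, field


@dataclass
class TermContext:
    """Hypothetical container for the knowledge attached to one medical term."""
    term: str
    translations: dict[str, str] = field(default_factory=dict)  # language -> translated term
    synonyms: list[str] = field(default_factory=list)
    definition: str = ""


def build_prompt(sentence: str, contexts: list[TermContext], target: str = "Spanish") -> str:
    """Prepend dictionary-style hints to a plain translation instruction."""
    lines = []
    for ctx in contexts:
        if ctx.translations:
            chain = ", ".join(f"{lang}: {word}" for lang, word in ctx.translations.items())
            lines.append(f'"{ctx.term}" translates as {chain}.')
        if ctx.synonyms:
            lines.append(f'Synonyms of "{ctx.term}": {", ".join(ctx.synonyms)}.')
        if ctx.definition:
            lines.append(f'Definition of "{ctx.term}": {ctx.definition}')
    hints = "\n".join(lines)
    instruction = f"Translate the following English text into {target}:\n{sentence}"
    # With no hints available, fall back to the bare instruction.
    return f"{hints}\n\n{instruction}" if hints else instruction


# Illustrative entry written by hand, not retrieved from UMLS or an LLM.
example = TermContext(
    term="hypertension",
    translations={"Spanish": "hipertensión", "French": "hypertension", "Portuguese": "hipertensão"},
    synonyms=["high blood pressure"],
    definition="Persistently elevated arterial blood pressure.",
)
prompt = build_prompt("Hypertension often has no symptoms.", [example])
print(prompt)
```

In a real pipeline the `TermContext` entries would be filled from a terminology lookup rather than typed in, and the resulting prompt would be sent to the model being evaluated.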

Researchers constructed a parallel corpus of nearly 3,000 English-Spanish MedlinePlus articles and a 100-sentence test set, meticulously annotated with structured medical contexts. They evaluated four open-source LLMs: Phi-4, Qwen2.5-14B, Qwen2.5-7B, and LLaMA-3.1-8B. These models were tested using structured prompts that included multilingual variants, medical synonyms, and UMLS-derived definitions, combined with LoRA-based fine-tuning.
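For readers unfamiliar with LoRA, the sketch below shows the core arithmetic in plain Python: the base weight matrix W stays frozen, and only two small matrices A and B are trained, giving an effective weight of W + (alpha / r) * B A. The matrix sizes, the `alpha` value, and the helper names are assumptions chosen for illustration and say nothing about the configuration used by the researchers.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, the effective weight after adaptation."""
    r = len(a)  # the rank is the number of rows of A
    scale = alpha / r
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Number of trainable values in the two low-rank factors."""
    return r * (d_in + d_out)


d_out, d_in, r = 4, 6, 2
w = [[0.1 * (i + j) for j in range(d_in)] for i in range(d_out)]   # frozen base weight
a = [[0.01 * (k + j + 1) for j in range(d_in)] for k in range(r)]  # trainable, r x d_in
b = [[0.0] * r for _ in range(d_out)]                              # trainable, d_out x r, zero-initialised

# With B at zero the update vanishes, so the adapted layer starts out identical to the base layer.
print(lora_weight(w, a, b, alpha=16) == w)

# Hypothetical 4096 x 4096 layer: full fine-tuning versus a rank-8 adapter.
print(4096 * 4096, trainable_params(4096, 4096, 8))
```

The parameter count is what makes the technique lightweight: the low-rank factors grow with the sum of the layer dimensions rather than their product.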

MedCOD significantly improved translation quality across all four evaluated models. For instance, Phi-4 with MedCOD and fine-tuning achieved a BLEU score of 44.23, a chrF++ score of 28.91, and a COMET score of 0.863. These scores surpassed strong baselines such as GPT-4o and GPT-4o-mini, demonstrating that MedCOD-enhanced open-source models can rival or even outperform proprietary systems in clinical translation accuracy.
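As a rough guide to what a character-level metric measures, the following sketch computes a simplified character n-gram F-score, the idea underlying chrF. It is an assumption-laden simplification: it omits the word n-grams that chrF++ adds, it will not reproduce the numbers above, and a standard evaluation toolkit should be used for any real comparison. The function names and example sentences are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams after removing spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Average n-gram precision and recall over n = 1..max_n, then combine as F-beta (0 to 100)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the texts is shorter than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)


reference = "La hipertensión a menudo no presenta síntomas."
candidate = "La hipertensión a menudo no tiene síntomas."
print(round(simple_chrf(candidate, reference), 2))
```

Because beta is 2, recall counts more heavily than precision, so a translation that drops reference content is penalised more than one that adds a few extra characters.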

Ablation studies confirmed that both the MedCOD prompting strategy and the model adaptation through fine-tuning independently contributed to these performance gains, with their combination yielding the highest improvements. This highlights the complementary nature of providing external knowledge and task-specific model adaptation.

Further analysis revealed that multilingual translation prompts generally yielded the highest scores, especially for fine-tuned models. However, the optimal prompting strategy could vary depending on the specific LLM architecture and whether it was fine-tuned, suggesting potential for adaptive prompt selection in future work.
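One simple way adaptive prompt selection could work is to score each prompt type on a held-out development set and keep the best one per model configuration. The sketch below assumes exactly that; the `select_prompts` function, the model label, the prompt-type names, and the scores are placeholders invented for illustration and are not results from the study.

```python
def select_prompts(dev_scores):
    """Map each (model, fine_tuned) configuration to its best-scoring prompt type."""
    best = {}
    for (model, fine_tuned, prompt_type), score in dev_scores.items():
        key = (model, fine_tuned)
        if key not in best or score > best[key][1]:
            best[key] = (prompt_type, score)
    return {key: prompt_type for key, (prompt_type, _) in best.items()}


# Placeholder development-set scores, not measurements from any experiment.
dev_scores = {
    ("model_a", True, "multilingual"): 0.3,
    ("model_a", True, "synonyms"): 0.2,
    ("model_a", False, "multilingual"): 0.1,
    ("model_a", False, "definitions"): 0.2,
}
choices = select_prompts(dev_scores)
print(choices)
```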

The study also explored MedCOD’s applicability beyond English-to-Spanish translation, extending it to paragraph-level medical translation across six language pairs in the WMT24 Biomedical test set, and to multilingual summarization tasks using the MultiClinSum dataset. Consistent benefits were observed, indicating MedCOD’s robustness and generalizability across different tasks, diverse languages, and long, high-stakes medical texts.

Despite these advancements, the researchers acknowledge limitations, such as the dataset’s origin in standardized MedlinePlus articles, which might not fully capture the linguistic complexity of all clinical domains. Future work will also explore adaptability to other language pairs and address persistent issues like grammatical inconsistencies and stylistic awkwardness. More detail is available in the full research paper.

In conclusion, MedCOD offers a practical and scalable framework for enhancing biomedical translation. By equipping open-source LLMs with rich medical context, it paves the way for improved cross-lingual health communication, ultimately benefiting underrepresented populations and advancing healthcare accessibility.

Karthik Mehta
https://blogs.edgentiq.com
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
