TLDR: A new research paper by Vera Pavlova and Mohammed Makhlouf introduces an efficient and versatile model for multilingual information retrieval of Islamic texts, particularly the Quran. The study develops a lightweight, domain-adapted model using a novel ‘mixed’ training approach that combines monolingual and cross-lingual techniques. This model demonstrates strong performance across diverse search scenarios (monolingual, cross-lingual, multilingual) in English, Arabic, Urdu, and Russian. Crucially, the research emphasizes significant cost reductions and improved real-world deployment performance, including substantial decreases in latency, making advanced multilingual search more accessible and affordable.
In Multilingual Information Retrieval (MLIR), a persistent challenge is bridging the gap between advanced research and practical, real-world deployment. Many studies demonstrate impressive performance in controlled environments, but real-world applications often demand a single system that efficiently handles diverse search scenarios (monolingual, cross-lingual, and multilingual).
A recent research paper, titled “Efficient and Versatile Model for Multilingual Information Retrieval of Islamic Text: Development and Deployment in Real-World Scenarios,” addresses this challenge head-on. Authored by Vera Pavlova and Mohammed Makhlouf from rttl labs, UAE, the work develops an ad-hoc information retrieval system for the Islamic domain, designed to meet user needs in multiple languages.
The researchers leveraged the unique characteristics of the Quranic multilingual corpus, which offers a rich parallel collection of high-quality human translations in over 100 languages. This unique resource simplifies the exploration of multilingual potential in retrieval models by eliminating the need for machine translation during evaluation.
Developing a Lightweight, Domain-Specific Model
The study used XLM-R Base, a multilingual model pre-trained on general-domain text, as its foundation. Because model performance often declines under domain shift, the researchers first carried out a brief domain adaptation of XLM-R Base on a small, multilingual, domain-specific corpus of approximately 100 million words. This short pre-training round significantly boosted performance in retrieval tasks.
To ensure cost-efficiency and practical deployment, the researchers also performed language reduction on XLM-R Base. This process removed languages not required for the current deployment, shrinking the model by more than 50%, from 1.1 GB to a lightweight 481 MB.
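This summary does not say how the language reduction was carried out. One common way to shrink a multilingual encoder is to keep only the vocabulary entries that occur in the target languages and drop the unused rows of the embedding matrix. The following is a minimal sketch of that idea on toy data. The function `trim_vocabulary`, the toy vocabulary, and the embedding values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def trim_vocabulary(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    corpus_tokens: Set[str],
    always_keep: Set[str],
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Keep only tokens seen in the target-language corpus, plus special tokens.

    Returns a new token-to-id mapping with contiguous ids and the matching
    embedding rows, preserving the original id order.
    """
    kept = [
        (token, old_id)
        for token, old_id in sorted(vocab.items(), key=lambda item: item[1])
        if token in corpus_tokens or token in always_keep
    ]
    new_vocab = {token: new_id for new_id, (token, _) in enumerate(kept)}
    new_embeddings = [embeddings[old_id] for _, old_id in kept]
    return new_vocab, new_embeddings


# Toy data (hypothetical): six tokens with two-dimensional embeddings.
vocab = {"<s>": 0, "</s>": 1, "book": 2, "كتاب": 3, "livre": 4, "книга": 5}
embeddings = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [0.8, 0.9], [1.0, 1.1]]

# Tokens observed in a corpus of the deployed languages; "livre" never occurs.
corpus_tokens = {"book", "كتاب", "книга"}

new_vocab, new_embeddings = trim_vocabulary(
    vocab, embeddings, corpus_tokens, always_keep={"<s>", "</s>"}
)
print(len(vocab), "->", len(new_vocab), "embedding rows")
```

In models with very large multilingual vocabularies, the embedding matrix holds a large share of the parameters, so dropping unused rows can shrink the checkpoint considerably. That would be consistent with the reported reduction from 1.1 GB to 481 MB, although the exact procedure used by the authors may differ.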
Exploring Training Approaches
The paper explored four distinct training approaches for eleven retrieval models built upon this lightweight, domain-specific multilingual large language model (MLLM):
- Monolingual training: Queries and passages are in the same language.
- Cross-lingual training: Queries are in one language, and passages are in another.
- Translate-train-all: Training with different translations of the dataset simultaneously.
- Mixed approach: A novel method combining monolingual and cross-lingual techniques. It allows for greater diversity in training examples, on the hypothesis that this improves cross-lingual interaction.
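The four approaches differ mainly in which query language is paired with which passage language. The sketch below builds training pairs from a small parallel corpus under that reading. The helper `build_pairs` and the toy texts are hypothetical. Treating the mixed approach as English and Arabic queries over an English collection is an assumption inferred from the name 'Bilingual Queries English Collection', not the authors' implementation.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # (query, relevant passage)


def build_pairs(
    queries: Dict[str, List[str]],
    passages: Dict[str, List[str]],
    language_pairs: Sequence[Tuple[str, str]],
) -> List[Pair]:
    """Pair the i-th query in one language with the i-th passage in another.

    The corpus is assumed to be parallel: index i refers to the same query
    and the same relevant passage in every language.
    """
    pairs: List[Pair] = []
    for query_lang, passage_lang in language_pairs:
        pairs.extend(zip(queries[query_lang], passages[passage_lang]))
    return pairs


# Toy parallel data (hypothetical placeholders, two examples per language).
queries = {"en": ["en-q1", "en-q2"], "ar": ["ar-q1", "ar-q2"]}
passages = {"en": ["en-p1", "en-p2"], "ar": ["ar-p1", "ar-p2"]}

# Monolingual: query and passage share one language.
monolingual = build_pairs(queries, passages, [("en", "en")])
# Cross-lingual: query language differs from passage language.
cross_lingual = build_pairs(queries, passages, [("ar", "en")])
# Translate-train-all: same-language pairs from every translation at once.
translate_train_all = build_pairs(queries, passages, [("en", "en"), ("ar", "ar")])
# Mixed (assumed reading): monolingual and cross-lingual pairs together.
mixed = build_pairs(queries, passages, [("en", "en"), ("ar", "en")])

for name, pairs in [
    ("monolingual", monolingual),
    ("cross-lingual", cross_lingual),
    ("translate-train-all", translate_train_all),
    ("mixed", mixed),
]:
    print(name, pairs)
```

Under this reading, the mixed set contains every passage of the English collection twice, once with an English query and once with an Arabic one, which is the extra diversity in training examples that the approach relies on.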
The evaluation was conducted across monolingual, cross-lingual, and multilingual retrieval scenarios, using English, Arabic, Urdu, and Russian. The results consistently showed that the proposed mixed training approach, particularly the ‘Bilingual Queries English Collection’ (Biq-ENc) model, yielded promising outcomes across all settings, often outperforming other methods.
Deployment and Performance Benefits
A key focus of the research was on deployment considerations. The study highlighted the cost-efficiency of deploying a single, versatile, lightweight model. Compared to deploying three separate, larger models, a single 400 MB model could reduce monthly recurring costs by about 70% on GPU-based servers. Further cost reductions were possible by deploying on CPU-based servers and using languages such as Rust to optimize memory consumption, potentially enabling deployment on compact serverless runtimes like AWS Lambda functions for as little as USD 10-20 per month.
Real-world performance metrics, gathered through real-user monitoring (RUM), demonstrated significant improvements in end-to-end latency after the new model’s deployment. Median latency decreased by 38.6% in MENA/EU, 26.8% in North America, and 47.4% in APAC regions. These improvements underscore the practical benefits of the lightweight and efficient model in enhancing user experience.
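The reported figures are relative decreases in median latency. As an illustration of the arithmetic only, the sketch below computes such a decrease from two sets of latency samples. The function name and the sample values are made up for this example and are not measurements from the paper.

```python
from statistics import median
from typing import Sequence


def median_latency_decrease(
    before_ms: Sequence[float], after_ms: Sequence[float]
) -> float:
    """Return the relative decrease in median latency, in percent."""
    before = median(before_ms)
    after = median(after_ms)
    return (before - after) / before * 100.0


# Hypothetical end-to-end latencies in milliseconds (not data from the paper).
before_ms = [480.0, 500.0, 520.0, 900.0, 450.0]
after_ms = [300.0, 310.0, 290.0, 700.0, 305.0]

decrease = median_latency_decrease(before_ms, after_ms)
print(f"median latency decreased by {decrease:.1f}%")
```

The median is a natural choice for real-user monitoring data because a few very slow requests, such as the 900 ms and 700 ms samples here, do not shift it the way they would shift a mean.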
This research successfully demonstrates that a carefully designed, domain-adapted, and lightweight multilingual retrieval model, trained with a mixed approach, can bridge the gap between academic research and practical deployment, offering an efficient and scalable solution for accessing rich cultural and religious heritage in multiple languages.


