TLDR: The WolBanking77 dataset is introduced to address the lack of digital resources for low-resource languages like Wolof, spoken by over 10 million people in West Africa. It provides 9,791 text sentences and over 4 hours of spoken sentences in the banking domain, enabling the development of voice assistants built on intent classification. The initiative aims to improve financial inclusion and access to digital services in Senegal, where the illiteracy rate is 42%, and to reduce fraud risks. Experiments with state-of-the-art NLP and ASR models show promising results, highlighting the dataset’s value for research in low-resource language AI.
In an increasingly digital world, access to essential services like banking often relies on written language. For communities where literacy rates are lower and oral traditions are strong, this creates a significant barrier. A new research paper introduces the WolBanking77 dataset, designed to support voice-activated digital banking services for Wolof speakers in West Africa.
Wolof is a language spoken by over 10 million people across Senegal, Gambia, and Mauritania, with approximately 90% of Senegal’s population speaking it. Despite its widespread use, digital resources for Wolof are scarce. This scarcity, coupled with a 42% illiteracy rate in Senegal, highlights a critical need for voice-based interfaces to ensure financial inclusion and access to public services, especially for those in the informal sector who are often vulnerable to fraud due to language barriers.
The WolBanking77 dataset is a significant step towards addressing this challenge. It is specifically created for academic research in intent classification, a core component of natural language understanding (NLU) that allows systems to determine a user’s goal from their spoken or typed request. The dataset comprises two main parts: a text dataset and an audio dataset.
The text dataset contains 9,791 sentences in the banking domain, manually translated from the English Banking77 dataset into French and Wolof by linguistic experts. These translations were carefully localized to reflect the Senegalese context, ensuring relevance and naturalness. For instance, common terms like “ATM” and “app” were translated into their Wolof equivalents, “GAB” and “aplikaasiyoN.”
The audio dataset, based on the MINDS-14 dataset, includes over 4 hours of spoken sentences. It features 263 utterances covering 10 intents across banking and transport domains. These audio recordings were collected from students at Cheikh Anta Diop University in Dakar, using the Lig-Aikuma software. Participants had diverse accents and ages, contributing to a robust and representative dataset. Ethical considerations were paramount during collection, with participant names anonymized and informed consent obtained.
The researchers conducted extensive experiments using WolBanking77 to evaluate various state-of-the-art models for both Automatic Speech Recognition (ASR) and Intent Detection. For intent detection, models like AfroXLMR, which was pre-trained on African languages, showed promising performance, achieving F1-scores up to 79% after fine-tuning. This demonstrates the dataset’s ability to challenge and improve existing models for low-resource languages.
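To make the F1 figure concrete, here is a minimal sketch of how an F1-score over intent labels can be computed. It assumes macro-averaging (the unweighted mean of per-intent F1), which may differ from the averaging used in the paper, and the intent names in the usage note are hypothetical placeholders rather than labels from WolBanking77.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Macro-averaged F1 over intent labels (assumed averaging scheme)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for intent g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true intent but was missed

    scores = []
    for label in set(gold) | set(predicted):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, calling `macro_f1(["check_balance", "check_balance", "block_card", "transfer"], ["check_balance", "block_card", "block_card", "transfer"])` averages per-intent scores of 2/3, 2/3, and 1, so a single misclassified utterance lowers the score for both the intent that was missed and the intent that was wrongly predicted.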
In the ASR task, which converts spoken language into text, the Canary-1b-flash model achieved a Word Error Rate (WER) of 0.59%, outperforming other leading models such as Phi-4-multimodal-instruct and Distil-whisper-large-v3.5. These results indicate that high-quality speech recognition is achievable for Wolof, even with a relatively modest amount of speech data (just over 4 hours).
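For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model's output, divided by the number of reference words. The sketch below is a plain-Python illustration under simple assumptions (whitespace tokenization, no text normalization); it is not the evaluation code used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

The function returns a fraction, so it must be multiplied by 100 to get a percentage. With the placeholder strings `"a b c d"` as reference and `"a x c"` as hypothesis, there is one substitution and one deletion over four reference words.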
The creation of WolBanking77 is a crucial contribution to the field of natural language processing for low-resource languages. It provides a valuable resource for researchers to develop and benchmark AI models that can understand and process Wolof speech and text. This, in turn, paves the way for practical applications like voice assistants that can help millions access digital financial services, manage transactions, and reduce the risk of fraud.
Looking ahead, the team plans to continuously maintain and update the dataset, add more audio recordings in diverse environments, and release open-source code to further stimulate research. They also intend to share text data for potential responses to each intent, facilitating the development of complete conversational AI systems. The WolBanking77 dataset and its associated code are freely available under a CC BY 4.0 license, encouraging widespread use and collaboration within the academic community. For more details, you can refer to the original research paper: WolBanking77: Wolof Banking Speech Intent Classification Dataset.


