TLDR: NVIDIA has launched ‘Granary,’ one of the largest open-source speech datasets for European languages, alongside two AI models, Canary-1b-v2 and Parakeet-tdt-0.6b-v3. The release targets speech recognition and translation across 25 European languages and is intended to help developers build more accurate and scalable AI applications.
NVIDIA Corporation has released ‘Granary,’ a large open-source speech AI dataset for European languages, together with two accompanying AI models. The announcement was made on Friday, August 15, 2025.
The ‘Granary’ dataset is touted as one of the largest speech corpora available for European languages, encompassing approximately 1 million hours of multilingual audio. This includes around 650,000 hours specifically for speech recognition and over 350,000 hours for speech translation. The dataset covers 25 European languages, including nearly all of the European Union’s 24 official languages, along with Russian and Ukrainian.
Accompanying ‘Granary’ are two powerful AI models: NVIDIA Canary-1b-v2 and NVIDIA Parakeet-tdt-0.6b-v3. Canary-1b-v2 is specifically optimized for transcribing European languages, leveraging the extensive ‘Granary’ dataset. Parakeet-tdt-0.6b-v3, on the other hand, is engineered for real-time transcription, supporting all languages included in ‘Granary.’
NVIDIA is aiming the release at developers worldwide. In a press release, the company said, ‘These tools will help developers scale AI applications globally, providing fast and accurate speech capabilities for real-world use cases like multilingual chatbots, voice-based customer service agents, and near-instant translation tools.’ The stated goal is high-quality speech recognition and translation AI that developers can take to production scale.
The ‘Granary’ dataset was developed by the NVIDIA speech AI team together with researchers from Carnegie Mellon University and Fondazione Bruno Kessler. They used a processing pipeline built on the NVIDIA NeMo Speech Data Processor toolkit to turn unlabelled audio into structured, high-quality data. As a result, ‘Granary’ provides clean, ready-to-use data, giving developers a head start in building transcription and translation models for Europe’s many languages.


