NVIDIA Unveils Expansive Open-Source Speech AI Dataset and Advanced Models for European Languages

TLDR: NVIDIA has launched ‘Granary,’ one of the largest open-source speech AI datasets for European languages, alongside two AI models, Canary-1b-v2 and Parakeet-tdt-0.6b-v3. The release targets speech recognition and translation across 25 European languages, with the goal of helping developers build more accurate and scalable AI applications.

NVIDIA Corporation has released ‘Granary,’ a massive open-source speech AI dataset for European languages, along with two accompanying AI models. The announcement was made on Friday, August 15, 2025.

The ‘Granary’ dataset is touted as one of the largest speech corpora available for European languages, encompassing approximately 1 million hours of multilingual audio. This includes around 650,000 hours specifically for speech recognition and over 350,000 hours for speech translation. The dataset covers 25 European languages, including nearly all of the European Union’s 24 official languages, along with Russian and Ukrainian.

Accompanying ‘Granary’ are two powerful AI models: NVIDIA Canary-1b-v2 and NVIDIA Parakeet-tdt-0.6b-v3. Canary-1b-v2 is specifically optimized for transcribing European languages, leveraging the extensive ‘Granary’ dataset. Parakeet-tdt-0.6b-v3, on the other hand, is engineered for real-time transcription, supporting all languages included in ‘Granary.’

The release is aimed at developers worldwide. As stated by NVIDIA in a press release, ‘These tools will help developers scale AI applications globally, providing fast and accurate speech capabilities for real-world use cases like multilingual chatbots, voice-based customer service agents, and near-instant translation tools.’ The stated goal is to make high-quality speech recognition and translation easier to build into production-scale applications.

The development of the ‘Granary’ dataset was a collaborative effort, with the NVIDIA speech AI team working alongside researchers from Carnegie Mellon University and Fondazione Bruno Kessler. They utilized an innovative processing pipeline powered by the NVIDIA NeMo Speech Data Processor toolkit, transforming unlabelled audio into structured, high-quality data. This meticulous process ensures that ‘Granary’ provides clean, ready-to-use data, giving developers a significant head start in building models for transcription and translation tasks across the diverse linguistic landscape of Europe.

Tanya Menon
Tanya Menon is a real-time news specialist focusing on fast updates and micro-analysis of the global AI market. Known for her agile and energetic reporting style, Tanya leverages automation tools to scan emerging news signals and deliver concise, actionable updates. Her coverage is essential for decision-makers who need the GenAI headlines before they go mainstream. You can reach her at: [email protected]
