
Advancing Quranic Recitation Assessment with Deep Learning

TLDR: This research introduces a deep learning system for automatically detecting and correcting pronunciation errors in Holy Quran recitation. It features a 98% automated data pipeline to create a large, high-quality dataset (850+ hours, 300K utterances) and a novel Quran Phonetic Script (QPS) that encodes intricate Tajweed rules. A multi-level CTC model, fine-tuned on Wav2Vec2-BERT, achieved a low 0.16% average Phoneme Error Rate, demonstrating the script’s learnability and the system’s potential for precise error detection in Quranic learners.

Assessing spoken language, especially the intricate recitation of the Holy Quran, has always been a significant challenge. The Quran’s recitation is governed by rigorous rules known as Tajweed, established by Muslim scholars centuries ago. While these rules simplify the assessment process by providing clear guidelines, the lack of high-quality, annotated data has been a major hurdle for developing automated learning tools.

A recent research paper titled “Automatic Pronunciation Error Detection and Correction of the Holy Quran’s Learners Using Deep Learning” by Abdullah Abdelfttah, Mahmoud I. Khalil, and Hazem Abbas addresses these challenges head-on. The researchers introduce a groundbreaking approach that leverages deep learning to provide highly effective assessment for Quranic learners.

A Novel Approach to Data and Phonetics

The core of this work lies in three key contributions. Firstly, the team developed a 98% automated pipeline to produce high-quality Quranic datasets. This pipeline involves collecting recitations from expert reciters, segmenting the audio at natural pause points using a fine-tuned Wav2Vec2-BERT model, transcribing these segments, and then verifying the transcripts with a novel algorithm called Tasmeea.
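To make the segmentation step concrete, here is a minimal, illustrative sketch of splitting audio at pause points. The paper uses a fine-tuned Wav2Vec2-BERT model to find natural pauses; this sketch instead assumes we already have per-frame voice-activity flags (a hypothetical simplification), and merely shows the boundary logic:

```python
def segment_at_pauses(frames, min_pause=3):
    """Split per-frame voice-activity flags into speech segments.

    frames: list of booleans, True = speech, False = silence.
    A segment ends once at least `min_pause` consecutive silent
    frames follow it. Returns (start, end) index pairs, end exclusive.
    """
    segments = []
    start = None        # start frame of the current open segment
    last_voiced = None  # most recent voiced frame index
    for i, voiced in enumerate(frames):
        if voiced:
            if start is None:
                start = i
            last_voiced = i
        elif start is not None and i - last_voiced >= min_pause:
            # Long enough pause: close the segment at the last voiced frame.
            segments.append((start, last_voiced + 1))
            start = None
    if start is not None:
        segments.append((start, last_voiced + 1))
    return segments
```

In the real pipeline each segment would then be cut from the waveform and passed on to transcription; here the function only returns the frame boundaries.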

Secondly, this automated process has resulted in an extensive dataset comprising over 850 hours of audio, which includes more than 300,000 annotated utterances. This is a significant step forward, as data scarcity has historically been a major barrier in this field.

Thirdly, the researchers introduced a novel ASR-based (Automatic Speech Recognition) approach for detecting pronunciation errors. This method utilizes a custom Quran Phonetic Script (QPS), which is specifically designed to encode Tajweed rules. Unlike the International Phonetic Alphabet (IPA) used for Modern Standard Arabic, QPS uses a two-level script: a Phoneme level for Arabic letters and vowels, and a Sifa level to encode the articulation characteristics of each phoneme. This detailed script allows for a comprehensive capture of all Tajweed pronunciation errors, with the exception of Ishmam, which is a visual mouth movement without audible output.
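The two-level structure of QPS can be pictured as each phoneme carrying its own bundle of articulation attributes. The sketch below is illustrative only: the attribute names and values are hypothetical placeholders, not the exact QPS inventory defined in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class QPSPhoneme:
    """One QPS unit: a Phoneme-level symbol plus its Sifa-level attributes."""
    phoneme: str                               # Phoneme level: letter or vowel
    sifat: dict = field(default_factory=dict)  # Sifa level: attribute -> value

def qps_transcribe(units):
    """Flatten a sequence of QPS units into parallel level transcripts:
    one phoneme sequence, and one sequence per sifa attribute."""
    phoneme_level = [u.phoneme for u in units]
    sifa_levels = {}
    for u in units:
        for attr, val in u.sifat.items():
            sifa_levels.setdefault(attr, []).append(val)
    return phoneme_level, sifa_levels
```

For example, a nasalized consonant followed by a plain vowel would yield a phoneme sequence of length two plus a parallel "ghunnah" sequence of the same length, which is exactly the shape the multi-level model described below consumes.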

Modeling and Impressive Results

To process this unique phonetic script, the researchers developed a novel multi-level CTC (Connectionist Temporal Classification) model. It pairs a speech encoder with 11 parallel transcription heads: one for phonemes and ten for the Sifat (articulation attributes). The model was fine-tuned from Facebook’s Wav2Vec2-BERT and trained on the newly created dataset.
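Each of the 11 CTC heads produces its own per-frame label sequence, which is decoded independently with the standard CTC collapse rule (merge consecutive repeats, then drop blanks). A minimal greedy-decoding sketch, with hypothetical label names:

```python
BLANK = "_"  # CTC blank symbol (placeholder choice)

def ctc_greedy_collapse(frame_labels, blank=BLANK):
    """Standard CTC collapse: merge consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out

def decode_multilevel(frame_labels_per_level):
    """Decode each parallel level (phoneme + ten sifat) independently."""
    return {level: ctc_greedy_collapse(labels)
            for level, labels in frame_labels_per_level.items()}
```

In the actual model the per-frame labels would come from an argmax over each head's logits; the collapse step shown here is the same regardless of how the frame labels are obtained.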

The results are highly promising. The model achieved an impressive 0.16% average Phoneme Error Rate (PER) on unseen test data, demonstrating that the Quran Phonetic Script is indeed learnable by deep learning models. Even more remarkably, the model was able to detect errors in Madd (elongation), Ghunnah (nasalization), Qalqala (echoing effect), and Tafkheem (emphasis) in actual samples, despite not being explicitly trained on recitations containing errors.
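For readers unfamiliar with the metric: Phoneme Error Rate is the edit (Levenshtein) distance between the predicted and reference phoneme sequences, divided by the reference length. A minimal sketch:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def phoneme_error_rate(ref, hyp):
    """PER = edit distance / reference length."""
    return edit_distance(ref, hyp) / len(ref)
```

A PER of 0.16% therefore means roughly 1.6 phoneme-level mistakes per 1,000 reference phonemes, averaged over the unseen test set.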

Future Directions

While the current dataset consists of ‘golden recitations’ (error-free), the researchers acknowledge this as a primary limitation. Future work will focus on annotating datasets that contain actual pronunciation errors to further enhance the model’s evaluation and real-world performance. They also plan to address limitations related to attribute-specific articulation patterns and less frequent Tajweed rules.

This research fundamentally transforms the methodology for assessing Holy Quran pronunciation. By providing an open-source pipeline, a vast annotated dataset, and a highly accurate multi-level deep learning model, it paves the way for advanced computer-aided learning systems for Quranic recitation. You can learn more about this innovative work by reading the full paper available at this link.

Ananya Rao
Ananya Rao is a tech journalist with a passion for dissecting the fast-moving world of Generative AI. With a background in computer science and a sharp editorial eye, she connects the dots between policy, innovation, and business. Ananya excels in real-time reporting and specializes in uncovering how startups and enterprises in India are navigating the GenAI boom. She brings urgency and clarity to every breaking news piece she writes. You can reach her at: [email protected]
