TLDR: The KG-DG framework introduces a neuro-symbolic learning approach for diabetic retinopathy (DR) classification, combining Vision Transformers with expert-guided symbolic reasoning. It leverages clinical lesion ontologies and retinal vessel segmentation, fusing them with deep visual representations using a confidence-weighted strategy. Evaluated across four public datasets, KG-DG significantly improves accuracy in both single-domain and multi-domain generalization settings, demonstrating enhanced robustness and interpretability by embedding structured clinical knowledge into AI models.
Diabetic Retinopathy (DR) is a serious eye condition caused by diabetes, affecting the blood vessels in the retina and potentially leading to irreversible vision loss. Detecting and classifying DR early is crucial for effective treatment. Traditionally, expert ophthalmologists manually grade fundus photographs, but this process is time-consuming and can vary between observers.
While advanced deep learning models, particularly Vision Transformers (ViTs), have shown great promise in analyzing medical images, they often struggle when faced with real-world variations. These variations, known as ‘domain shifts,’ can be caused by differences in imaging devices, resolution settings, or patient demographics. This means a model trained in one clinic might not perform well in another, limiting its practical use.
Introducing KG-DG: A Neuro-Symbolic Approach
To address this challenge, researchers have developed KG-DG, a novel neuro-symbolic framework for diabetic retinopathy classification. This framework combines the power of deep learning with structured, expert-guided symbolic reasoning. The goal is to create AI systems that are more robust and can generalize effectively across different, unseen clinical environments.
Neuro-symbolic learning is a hybrid approach where deep learning models extract complex patterns from raw data, while symbolic components incorporate high-level domain knowledge and constraints. This integration helps the model avoid overfitting to specific data characteristics and instead focuses on clinically meaningful features that remain consistent across various settings.
How KG-DG Works
The KG-DG framework operates with a dual-branch architecture. One branch utilizes deep learning, specifically Vision Transformers, to process retinal images and identify visual patterns. The other branch is knowledge-driven, incorporating structured clinical knowledge. This knowledge is formalized as a set of diagnostic rules, reflecting expert-validated correlations between observable clinical features and disease states.
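As a rough illustration of what a rule-based symbolic branch can look like, the sketch below maps lesion counts to a coarse severity grade. The rule names, thresholds, and grade numbers are hypothetical placeholders chosen for the example. They are not the rules used in KG-DG and they are not clinical grading criteria.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class LesionCounts:
    """Counts of detected lesions in one fundus image (hypothetical schema)."""
    hemorrhages: int = 0
    hard_exudates: int = 0
    cotton_wool_spots: int = 0


def symbolic_grade(counts: LesionCounts) -> Tuple[int, str]:
    """Return (grade, name of the rule that fired).

    Rules are checked from most to least severe; the first match wins.
    All thresholds are illustrative assumptions, not clinical criteria.
    """
    rules = [
        ("cotton wool spots with many hemorrhages",
         lambda c: c.cotton_wool_spots > 0 and c.hemorrhages >= 5, 3),
        ("hemorrhages with hard exudates",
         lambda c: c.hemorrhages > 0 and c.hard_exudates > 0, 2),
        ("any lesion present",
         lambda c: c.hemorrhages + c.hard_exudates + c.cotton_wool_spots > 0, 1),
    ]
    for name, condition, grade in rules:
        if condition(counts):
            return grade, name
    return 0, "no lesions detected"
```

Because each prediction comes with the name of the rule that produced it, a branch of this kind can report why it reached a grade, which is the interpretability benefit the article describes.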
To extract these clinical features, KG-DG employs specialized tools. For instance, it uses the YOLOv11 object detection model to pinpoint and quantify clinically relevant lesions like hemorrhages, hard exudates, and cotton wool spots. Additionally, a retinal vessel segmentation module is integrated to extract morphological vessel features, such as vessel tortuosity and branching angles, which are also associated with DR progression.
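The article does not say how KG-DG computes vessel morphology. One common way to quantify the two features named above, shown here as a sketch under that assumption, is the arc-to-chord ratio for tortuosity and the angle between two daughter branches at a bifurcation. The function names and the point-list input format are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def arc_chord_tortuosity(centerline: Sequence[Point]) -> float:
    """Path length along the vessel centerline divided by the straight-line
    distance between its endpoints. A perfectly straight vessel gives 1.0."""
    if len(centerline) < 2:
        raise ValueError("centerline needs at least two points")
    arc = sum(math.dist(p, q) for p, q in zip(centerline, centerline[1:]))
    chord = math.dist(centerline[0], centerline[-1])
    if chord == 0:
        raise ValueError("endpoints coincide; the ratio is undefined")
    return arc / chord


def branching_angle_deg(branch_point: Point, child_a: Point, child_b: Point) -> float:
    """Angle in degrees (0 to 180) between the two daughter vessels,
    each represented by one point sampled along that daughter."""
    ax, ay = child_a[0] - branch_point[0], child_a[1] - branch_point[1]
    bx, by = child_b[0] - branch_point[0], child_b[1] - branch_point[1]
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return math.degrees(math.atan2(abs(cross), dot))
```

In practice the centerline points would come from skeletonizing the segmentation mask, a step this sketch leaves out.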
The information from both the deep learning model and the knowledge-driven symbolic classifier is then integrated using various fusion strategies. These strategies include selecting the prediction with the highest confidence, comparing class-specific confidences, or applying empirically tuned weights to balance the neural and symbolic predictions. This confidence-weighted integration allows the system to leverage symbolic reasoning, especially when the deep learning model’s predictions are uncertain, enhancing robustness in challenging, out-of-distribution scenarios.
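Two of these strategies can be sketched in a few lines, assuming each branch outputs one confidence score per class. The function names and the default weight of 0.6 are hypothetical. The article says only that the weights were tuned empirically and does not give their values.

```python
from typing import List, Sequence


def _argmax(scores: Sequence[float]) -> int:
    """Index of the largest score (first one on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def _check(neural: Sequence[float], symbolic: Sequence[float]) -> None:
    if len(neural) != len(symbolic) or not neural:
        raise ValueError("both branches must score the same non-empty class set")


def fuse_max_confidence(neural: Sequence[float], symbolic: Sequence[float]) -> int:
    """Take the prediction of whichever branch is more confident overall."""
    _check(neural, symbolic)
    if max(neural) >= max(symbolic):
        return _argmax(neural)
    return _argmax(symbolic)


def fuse_weighted(neural: Sequence[float], symbolic: Sequence[float],
                  w_neural: float = 0.6) -> int:
    """Blend per-class scores with a fixed weight, then pick the best class.
    w_neural is an assumed value; it would be tuned on validation data."""
    _check(neural, symbolic)
    blended: List[float] = [
        w_neural * n + (1.0 - w_neural) * s for n, s in zip(neural, symbolic)
    ]
    return _argmax(blended)
```

The two rules can disagree. With neural scores `[0.6, 0.4]` and symbolic scores `[0.2, 0.8]`, maximum-confidence fusion follows the symbolic branch and picks class 1, while weighted fusion with `w_neural=0.9` stays with the neural branch and picks class 0.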
Demonstrated Performance
The KG-DG framework was evaluated on four public diabetic retinopathy datasets: APTOS, EyePACS, Messidor-1, and Messidor-2, each representing a distinct clinical domain. Experiments covered two settings: single-domain generalization (training on one dataset and testing on the other three) and multi-domain generalization (training on three datasets and testing on the held-out fourth).
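The two protocols differ only in which side of the split holds a single dataset. The sketch below enumerates the train/test combinations for the four datasets named above. Rotating through every dataset is an assumption about the protocol, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

DOMAINS = ["APTOS", "EyePACS", "Messidor-1", "Messidor-2"]

Split = Tuple[List[str], List[str]]  # (training domains, test domains)


def single_domain_splits(domains: Sequence[str]) -> List[Split]:
    """Train on one domain, test on all the others."""
    return [([src], [d for d in domains if d != src]) for src in domains]


def multi_domain_splits(domains: Sequence[str]) -> List[Split]:
    """Train on all domains but one, test on the held-out domain."""
    return [([d for d in domains if d != tgt], [tgt]) for tgt in domains]


if __name__ == "__main__":
    for train, test in single_domain_splits(DOMAINS):
        print("single-domain: train", train, "test", test)
    for train, test in multi_domain_splits(DOMAINS):
        print("multi-domain:  train", train, "test", test)
```

Single-domain generalization is the harder setting because the model sees only one source of imaging conditions during training.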
The results showed significant improvements. KG-DG achieved up to a 5.2% accuracy gain in cross-domain settings and a 6% improvement over baseline ViT models. Notably, the symbolic-only model achieved a 63.67% average accuracy in multi-domain generalization, demonstrating the strong generalization power of encoded clinical knowledge. The full neuro-symbolic model also outperformed existing baselines in the more challenging single-domain generalization scenarios.
Ablation studies further confirmed the value of the symbolic components, revealing that lesion-based features alone achieved 84.65% accuracy, substantially outperforming purely neural approaches. This highlights that symbolic components act as effective regularizers, not just enhancing interpretability but also improving model performance and generalization.
Future Outlook
The findings from this research establish neuro-symbolic integration as a promising paradigm for building clinically robust and domain-invariant medical AI systems. While the framework currently relies on accurate lesion-level annotations and pre-trained modules, future work aims to explore dynamic neuro-symbolic reasoning, integrate temporal clinical data, and extend KG-DG to other medical imaging modalities like optical coherence tomography (OCT) or histopathology.
For more detailed information, you can read the full research paper here.


