TLDR: GeoSapiens is a new few-shot learning framework that uses a human-centric foundation model (Sapiens) and a novel geometric loss function to accurately detect dental landmarks on cone-beam computed tomography (CBCT) images of anterior teeth, even with limited training data. It significantly outperforms existing methods, making automated dental diagnostics more practical.
Accurate identification of anatomical landmarks on teeth is crucial for various dental procedures, including orthodontics, periodontics, and implant dentistry. Traditionally, dentists manually mark these landmarks on cone-beam computed tomography (CBCT) scans. This manual process is not only time-consuming and labor-intensive but also prone to inconsistencies between different observers.
Deep learning, a powerful branch of artificial intelligence, offers a promising solution to automate this process, making it more efficient. However, a major hurdle for deep learning methods in this field is the scarcity of high-quality training data, as expert annotations are very costly and difficult to obtain. This limitation often prevents conventional deep learning techniques from being widely adopted.
To address these challenges, researchers have introduced a new framework called GeoSapiens. This innovative few-shot learning framework is specifically designed for robust dental landmark detection, even when only a limited number of annotated CBCT images of anterior (front) teeth are available. GeoSapiens is built upon two core components.
The first component is a strong baseline adapted from Sapiens, a foundation model that has achieved state-of-the-art performance on a range of human-centric vision tasks. The researchers chose Sapiens because human-centric images and anterior teeth CBCT scans share an inherent symmetry, which helps stabilize fine-tuning and improves the model's generalization.
The second key component is a novel geometric loss function. This function is designed to improve the model’s ability to understand and capture the critical geometric relationships among anatomical structures within the dental images. For instance, it helps the model recognize perpendicular and parallel relationships between specific lines defined by the landmarks, such as the line connecting the crown and apex of a tooth and the lines indicating root levels.
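To make the idea of a geometric constraint concrete, here is a minimal sketch in plain Python. It is not the paper's loss: the cosine-based penalties, the function names, and the assumption that the crown-apex line should be perpendicular to the root-level lines while the root-level lines stay parallel to each other are all illustrative choices. In real training such a term would be written in a differentiable framework and added to the main landmark loss.

```python
import math
from itertools import combinations


def unit_direction(line):
    """Unit direction vector of a 2D line given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = line
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("a line needs two distinct landmarks")
    return dx / norm, dy / norm


def cosine(line_a, line_b):
    """Cosine of the angle between the directions of two lines."""
    ax, ay = unit_direction(line_a)
    bx, by = unit_direction(line_b)
    return ax * bx + ay * by


def perpendicular_penalty(line_a, line_b):
    """0 when the lines are perpendicular, 1 when they are parallel."""
    return cosine(line_a, line_b) ** 2


def parallel_penalty(line_a, line_b):
    """0 when the lines are parallel, 1 when they are perpendicular."""
    return 1.0 - cosine(line_a, line_b) ** 2


def geometric_penalty(axis, level_lines):
    """Mean penalty over all constraints (hypothetical formulation).

    Assumes the crown-apex axis should be perpendicular to every
    root-level line and the root-level lines parallel to each other.
    """
    terms = [perpendicular_penalty(axis, line) for line in level_lines]
    terms += [parallel_penalty(a, b) for a, b in combinations(level_lines, 2)]
    return sum(terms) / len(terms) if terms else 0.0


# Hypothetical landmark coordinates, for illustration only.
axis = ((0, 0), (0, 10))
ideal = geometric_penalty(axis, [((-1, 3), (1, 3)), ((-1, 6), (1, 6))])
tilted = geometric_penalty(axis, [((-1, 3), (1, 4)), ((-1, 6), (1, 6))])
```

With these made-up coordinates, the well-aligned prediction incurs no penalty, while tilting one root-level line produces a positive value that a training procedure could push back toward zero.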
The team collected their own dataset, named LDTeeth, specifically for anterior teeth landmarks. This dataset includes images from patients who underwent orthodontic treatment, with detailed annotations of 16 landmarks per image. Experiments conducted on this dataset showed that GeoSapiens significantly outperformed existing landmark detection methods. At a strict 0.5 mm threshold, which is a widely recognized standard in dental diagnostics, GeoSapiens achieved an 8.18% higher success detection rate compared to the leading existing approach.
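For readers unfamiliar with the metric, the success detection rate at a given threshold is commonly computed as the share of predicted landmarks that fall within that distance of the ground truth. The sketch below shows that calculation under stated assumptions: the function name, the 2D coordinates, the `spacing_mm` conversion factor, and the use of "less than or equal to" at the threshold are illustrative and not taken from the paper.

```python
import math


def success_detection_rate(predicted, ground_truth,
                           threshold_mm=0.5, spacing_mm=1.0):
    """Fraction of landmarks whose radial error is within threshold_mm.

    predicted / ground_truth: equal-length lists of (x, y) points.
    spacing_mm: assumed size of one coordinate unit in millimetres.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    hits = 0
    for (px, py), (tx, ty) in zip(predicted, ground_truth):
        error_mm = math.hypot(px - tx, py - ty) * spacing_mm
        if error_mm <= threshold_mm:
            hits += 1
    return hits / len(predicted)


# Made-up points: radial errors are 0 mm, 1 mm, about 0.4 mm, about 0.4 mm.
pred = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.4), (3.0, 3.0)]
truth = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (3.0, 3.4)]
sdr = success_detection_rate(pred, truth)  # 3 of 4 landmarks within 0.5 mm
```

A tighter threshold makes the metric stricter, which is why results at 0.5 mm are a demanding test of a landmark detector.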
To make the training process more practical and efficient, GeoSapiens also incorporates a technique called LoRA (Low-Rank Adaptation). This method drastically reduces the number of trainable parameters in the model, from 330 million to just 24 million, making it more feasible for real-world clinical settings without significantly compromising performance.
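The saving comes from how LoRA works in general: instead of updating a full weight matrix, it trains two small low-rank matrices whose product is added to the frozen weights. The sketch below counts trainable parameters for a single layer. The layer size and rank are hypothetical and are not the settings used in GeoSapiens; only the 330 million and 24 million totals come from the article.

```python
def full_finetune_params(d_in, d_out):
    """Trainable weights when a d_out x d_in matrix is updated directly."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable weights for LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical layer: 1024 x 1024 weights adapted with rank 16.
full = full_finetune_params(1024, 1024)   # 1,048,576
low_rank = lora_params(1024, 1024, 16)    # 32,768
layer_ratio = low_rank / full             # 0.03125

# Totals reported in the article: 24M trainable out of 330M.
reported_fraction = 24 / 330              # roughly 7% of the parameters
```

The reported fraction is larger than the single-layer ratio here, which is expected: the trainable total in a fine-tuned model typically also includes parts that are trained in full, such as a task-specific head.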
In summary, GeoSapiens represents a significant advance in automated dental landmark detection. According to the authors, the work is the first to establish a dedicated anterior teeth CBCT dataset for this task; it also devises a strong baseline using a human-centric foundation model and introduces a novel geometric loss function to improve accuracy and robustness. Together, these contributions deliver superior performance, particularly when training data is limited, which makes the framework promising for practical clinical applications.


