Adaptive Label Correction Enhances Ordinal Image Classification with Noisy Data

TLDR: ORDAC (ORDinal Adaptive Correction) is a new data-centric method for handling noisy labels in ordinal image classification. Unlike traditional methods that discard mislabeled samples, ORDAC corrects their labels by representing each label as a dynamic Gaussian distribution whose mean and standard deviation are adjusted based on model predictions. Evaluated on age estimation and disease severity detection datasets, ORDAC significantly outperforms existing methods and even corrects noise inherent in the original datasets, leading to more robust and accurate AI models.

In the rapidly evolving world of artificial intelligence, the quality of data used to train models is paramount. Especially in computer vision, where deep learning models excel, large, accurately labeled datasets are the backbone of success. However, the process of labeling data, particularly for tasks like ordinal image classification (e.g., estimating age or grading disease severity), is often prone to errors and inherent ambiguity. These ‘noisy labels’ can severely hinder a model’s performance and reliability.

Traditionally, approaches to combat noisy labels fall into two main categories: model-centric methods, which make the learning algorithm more robust, and data-centric methods, which focus on improving the dataset itself. Within data-centric approaches, the dominant strategy has been ‘sample selection,’ where mislabeled samples are identified and simply discarded. While effective, this method has a significant drawback: it throws away potentially valuable information contained within the features of those discarded instances.

Introducing ORDAC: A New Approach to Label Correction

To address this critical gap, researchers Alireza Sedighi Moghaddam and Mohammad Reza Mohammadi have introduced a novel data-centric framework called ORDinal Adaptive Correction (ORDAC). Instead of discarding samples with noisy labels, ORDAC is designed to *correct* them, making optimal use of the entire training dataset. This innovative method leverages the power of Label Distribution Learning (LDL), a paradigm where each label is not a single value but a probability distribution over all possible labels.
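
To make the LDL idea concrete, here is a minimal sketch of how a distribution-valued label can be compared with a model's output. It assumes a KL-divergence loss and an expected-value readout, which are common choices in LDL but are not confirmed by this article as ORDAC's exact formulation; the function names and the toy numbers are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over classes."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); zero when the two distributions match."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )

def expected_rank(dist):
    """Read a single ordinal estimate out of a distribution."""
    return sum(k * p for k, p in enumerate(dist))

# Toy example with 5 ordinal classes (values are illustrative only).
target = [0.0, 0.1, 0.8, 0.1, 0.0]   # label expressed as a distribution
logits = [0.1, 1.0, 2.5, 1.2, 0.0]   # hypothetical model output
pred = softmax(logits)
loss = kl_divergence(target, pred)
rank = expected_rank(pred)
print(f"loss={loss:.4f}, expected rank={rank:.2f}")
```

The point of the sketch is only that the training target is a full distribution rather than a single class index, so neighbouring classes can receive partial credit.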

The core idea behind ORDAC is to represent each ordinal label as a Gaussian distribution. In this representation, the ‘mean’ of the distribution corresponds to the label’s value, while the ‘standard deviation’ quantifies the model’s uncertainty about that label. During training, ORDAC dynamically adjusts both the mean and standard deviation for each sample. This adaptive correction mechanism allows the model to gradually learn from progressively cleaner and more reliable labels, significantly enhancing its robustness and ability to generalize.
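
As a rough illustration of this representation, the sketch below discretizes a Gaussian over the class indices and normalizes it. The function name, the class count of 8, and the specific mean and standard deviation values are assumptions chosen for the example, not values from the paper.

```python
import math

def gaussian_label(mean, std, num_classes):
    """Discretized Gaussian over class indices 0..num_classes-1, normalized to sum to 1."""
    weights = [math.exp(-0.5 * ((k - mean) / std) ** 2) for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]

# Same label value (class 3), two different uncertainty levels.
confident = gaussian_label(mean=3.0, std=0.5, num_classes=8)
uncertain = gaussian_label(mean=3.0, std=2.0, num_classes=8)

print([round(p, 3) for p in confident])
print([round(p, 3) for p in uncertain])
```

A small standard deviation concentrates almost all of the mass on one class, while a large one spreads it over the neighbouring ranks, which is how a suspected noisy label can be made to carry less weight.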

How ORDAC Works

ORDAC employs a cross-training strategy so that label corrections are guided by predictions from models that have not trained on the samples being corrected. The training dataset is split into multiple folds, and several models are trained concurrently. Each model makes predictions on a ‘validation’ fold it hasn’t been trained on, and these predictions are used to correct the labels in that fold. The corrected labels are then propagated back to the training sets for subsequent epochs.
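
The fold logic described here can be sketched as follows. This is a structural illustration only: `make_folds`, `cross_correct`, and the toy `fit`, `predict`, and `correct` stand-ins are hypothetical names, and the "model" is simply the mean training label rather than a neural network.

```python
def make_folds(num_samples, num_folds):
    """Assign sample indices to folds round-robin."""
    return [list(range(f, num_samples, num_folds)) for f in range(num_folds)]

def cross_correct(labels, folds, fit, predict, correct):
    """One correction round: each model is fit on all folds except one,
    then its predictions correct the labels of the fold it never saw."""
    new_labels = list(labels)
    for held_out, fold in enumerate(folds):
        train_idx = [i for f, other in enumerate(folds) if f != held_out for i in other]
        model = fit(train_idx, labels)
        for i in fold:
            new_labels[i] = correct(labels[i], predict(model, i))
    return new_labels

# Toy stand-ins (assumptions for illustration, not the paper's components).
def fit(train_idx, labels):
    return sum(labels[i] for i in train_idx) / len(train_idx)  # "model" = mean training label

def predict(model, sample_idx):
    return model

def correct(old_label, prediction):
    return old_label + 0.5 * (prediction - old_label)  # move halfway toward the prediction

labels = [2.0, 2.0, 2.0, 7.0, 2.0, 2.0]  # sample 3 carries an outlier label
folds = make_folds(len(labels), num_folds=3)
updated = cross_correct(labels, folds, fit, predict, correct)
print(updated)
```

In this toy run, the model that corrects sample 3 was fit only on the other folds, so the outlier label cannot influence its own correction.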

The adaptive label correction mechanism within ORDAC involves two key stages:

  • Class-wise Prediction Debiasing: Ordinal regression models often develop a bias towards middle-rank classes due to loss functions and data imbalance. ORDAC first debiases the model’s predictions by re-centering class-wise predictions around the true class label, preventing the correction process from skewing towards the majority.
  • Sample-wise Distribution Update: After debiasing, the mean and standard deviation for each sample are updated. The magnitude of this update is controlled by a ‘correction coefficient’ that balances the model’s confidence with the class frequency. If the model’s prediction error is large compared to the current uncertainty, the uncertainty (standard deviation) increases, suggesting a noisy label. Conversely, if the error is small, uncertainty decreases. The label’s mean is then shifted towards the model’s debiased prediction.
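
The two stages above can be sketched in a few lines. The exact update equations are not given in this article, so everything below is an assumption made for illustration: the correction coefficient is a fixed `alpha` (in ORDAC it reportedly depends on model confidence and class frequency), and the clamping bounds `std_min` and `std_max` are hypothetical.

```python
from collections import defaultdict

def debias(preds, labels):
    """Stage 1: shift predictions so each class's mean prediction equals its class label."""
    groups = defaultdict(list)
    for p, y in zip(preds, labels):
        groups[y].append(p)
    offset = {y: sum(ps) / len(ps) - y for y, ps in groups.items()}
    return [p - offset[y] for p, y in zip(preds, labels)]

def update_sample(mean, std, pred, alpha=0.3, std_min=0.3, std_max=3.0):
    """Stage 2: update one sample's label distribution (assumed rule, fixed alpha)."""
    error = abs(pred - mean)
    # Uncertainty grows when the error exceeds the current std, shrinks otherwise.
    new_std = min(max(std + alpha * (error - std), std_min), std_max)
    # The label mean moves a fraction alpha of the way toward the debiased prediction.
    new_mean = mean + alpha * (pred - mean)
    return new_mean, new_std

# Toy predictions that are pulled toward the middle ranks.
labels = [0, 0, 4, 4]
preds = [1.0, 1.4, 3.0, 1.0]
debiased = debias(preds, labels)

print(debiased)
print(update_sample(mean=4.0, std=1.0, pred=1.0))  # large error: std increases
print(update_sample(mean=4.0, std=1.0, pred=4.2))  # small error: std decreases
```

Without the first stage, the raw predictions for classes 0 and 4 would both drag labels toward the centre of the scale, which is the collapse the debiasing step is meant to prevent.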

Demonstrated Effectiveness

The effectiveness of ORDAC was rigorously evaluated on benchmark datasets for age estimation (Adience) and disease severity detection (Diabetic Retinopathy), under various scenarios of asymmetric Gaussian noise. The results were compelling:

  • ORDAC and its extended versions (ORDAC C and ORDAC R) consistently and significantly outperformed standard ordinal regression methods (like CORAL) and fixed-form LDL methods (like DLDL-v2) across both datasets and all noise levels.
  • For instance, on the Adience dataset with 40% noise, ORDAC R dramatically reduced the mean absolute error from 0.86 to 0.62 and increased the recall metric from 0.37 to 0.49.
  • Crucially, ORDAC remained effective on the original datasets with no injected noise, suggesting it can identify and correct intrinsic noise already present in real-world data.
  • When compared to state-of-the-art sample selection methods like CASSOR, ORDAC’s correction-based approach generally yielded better performance, highlighting the benefits of preserving data rather than discarding it.
  • Ablation studies confirmed the importance of the class-wise prediction debiasing step, which prevents the correction process from collapsing into majority classes.
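
To put the Adience figures quoted above in relative terms, the short calculation below converts them to percentage changes. It uses only the four numbers reported in this article; the helper name is hypothetical.

```python
def relative_change(before, after):
    """Signed change relative to the starting value."""
    return (after - before) / before

mae_change = relative_change(0.86, 0.62)      # lower is better for MAE
recall_change = relative_change(0.37, 0.49)   # higher is better for recall

print(f"MAE: {mae_change:+.1%}, recall: {recall_change:+.1%}")
# prints "MAE: -27.9%, recall: +32.4%"
```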

Paving the Way for More Reliable AI

This research marks a significant step forward in handling label noise in ordinal classification. By shifting the paradigm from sample removal to adaptive label correction, ORDAC offers a more data-efficient and robust strategy. It not only moves noisy labels closer to their true values but also intelligently manages label uncertainty throughout the training process, leading to more accurate and generalizable models. This work paves the way for more reliable artificial intelligence systems, particularly in critical domains where clean data is scarce and label ambiguity is high.

For a deeper dive into the methodology and experimental details, you can read the full research paper: Ordinal Adaptive Correction: A Data-Centric Approach to Ordinal Image Classification with Noisy Labels.

Nikhil Patel (https://blogs.edgentiq.com)
Nikhil Patel is a tech analyst and AI news reporter who brings a practitioner's perspective to every article. With prior experience working at an AI startup, he decodes the business mechanics behind product innovations, funding trends, and partnerships in the GenAI space. Nikhil's insights are sharp, forward-looking, and trusted by insiders and newcomers alike. You can reach him at: [email protected]
