Adaptive Label Correction Enhances Ordinal Image Classification with Noisy Data

TLDR: ORDAC (ORDinal Adaptive Correction) is a data-centric method for handling noisy labels in ordinal image classification. Instead of discarding mislabeled samples, as sample selection methods do, ORDAC corrects their labels: each label is represented as a Gaussian distribution whose mean and standard deviation are adjusted during training based on model predictions. Evaluated on age estimation and disease severity detection datasets, ORDAC outperforms the methods it was compared against. It also improves results on the original datasets, which suggests it corrects noise already present in them.

In the rapidly evolving world of artificial intelligence, the quality of data used to train models is paramount. Especially in computer vision, where deep learning models excel, large, accurately labeled datasets are the backbone of success. However, the process of labeling data, particularly for tasks like ordinal image classification (e.g., estimating age or grading disease severity), is often prone to errors and inherent ambiguity. These ‘noisy labels’ can severely hinder a model’s performance and reliability.

Traditionally, approaches to combat noisy labels fall into two main categories: model-centric methods, which make the learning algorithm more robust, and data-centric methods, which focus on improving the dataset itself. Within data-centric approaches, the dominant strategy has been ‘sample selection,’ where mislabeled samples are identified and simply discarded. While effective, this method has a significant drawback: it throws away potentially valuable information contained within the features of those discarded instances.

Introducing ORDAC: A New Approach to Label Correction

To address this critical gap, researchers Alireza Sedighi Moghaddam and Mohammad Reza Mohammadi have introduced a novel data-centric framework called ORDinal Adaptive Correction (ORDAC). Instead of discarding samples with noisy labels, ORDAC is designed to *correct* them, making optimal use of the entire training dataset. This innovative method leverages the power of Label Distribution Learning (LDL), a paradigm where each label is not a single value but a probability distribution over all possible labels.

The core idea behind ORDAC is to represent each ordinal label as a Gaussian distribution. In this representation, the ‘mean’ of the distribution corresponds to the label’s value, while the ‘standard deviation’ quantifies the model’s uncertainty about that label. During training, ORDAC dynamically adjusts both the mean and standard deviation for each sample. This adaptive correction mechanism allows the model to gradually learn from progressively cleaner and more reliable labels, significantly enhancing its robustness and ability to generalize.
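To make the Gaussian label representation concrete, here is a minimal Python sketch. It assumes ranks are numbered 0 to K-1 and that the Gaussian is discretized by evaluating it at each rank and normalizing to sum to 1. That discretization, and the helper names gaussian_label_distribution and expected_rank, are illustrative assumptions rather than details taken from the paper.

```python
import math

def gaussian_label_distribution(mean, std, num_classes):
    """Discretize a Gaussian centred at `mean` over ranks 0..num_classes-1.

    Assumption: the density is evaluated at each integer rank and the
    resulting weights are normalized so they sum to 1.
    """
    weights = [math.exp(-0.5 * ((k - mean) / std) ** 2) for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]

def expected_rank(distribution):
    """Expected rank under a label distribution."""
    return sum(k * p for k, p in enumerate(distribution))

# Same mean, different uncertainty: a small std concentrates the mass on
# rank 3, a large std spreads it over neighbouring ranks.
confident = gaussian_label_distribution(mean=3.0, std=0.5, num_classes=8)
uncertain = gaussian_label_distribution(mean=3.0, std=2.0, num_classes=8)
```

Both distributions peak at rank 3, but the one with the larger standard deviation assigns more probability to neighbouring ranks, which is how a less trusted label can be expressed in this representation.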

How ORDAC Works

ORDAC uses a cross-training strategy so that each label correction is guided by a model that has not been trained on the sample being corrected. The training dataset is split into multiple folds, and several models are trained concurrently. Each model makes predictions on a ‘validation’ fold it hasn’t been trained on, and these predictions are then used to correct the labels in that fold. The corrected labels are propagated back to the training sets for subsequent epochs.
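The fold logic can be sketched in a few lines of Python. This is a simplified illustration, not the authors' training loop: it runs one round sequentially rather than training models concurrently, and the stand-in model simply predicts the mean training label. The names make_folds, cross_training_round, fit_mean and predict_mean are hypothetical.

```python
import random

def make_folds(num_samples, num_folds, seed=0):
    """Shuffle sample indices and deal them into disjoint folds."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::num_folds] for i in range(num_folds)]

def cross_training_round(features, labels, folds, fit, predict):
    """Give every sample a prediction from a model that never trained on it."""
    held_out_predictions = {}
    for fold_id, held_out in enumerate(folds):
        # Train on every fold except the one being held out.
        train_idx = [i for f, fold in enumerate(folds) if f != fold_id for i in fold]
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        for i in held_out:
            held_out_predictions[i] = predict(model, features[i])
    return held_out_predictions

# Stand-in "model" (an assumption for illustration): predicts the mean training label.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    return model

features = list(range(12))
labels = [i % 4 for i in range(12)]
folds = make_folds(len(features), num_folds=3)
predictions = cross_training_round(features, labels, folds, fit_mean, predict_mean)
```

In a full pipeline, the held-out predictions from each round would feed the label correction step, and the corrected labels would replace the old ones in the next round's training sets.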

The adaptive label correction mechanism within ORDAC involves two key stages:

Class-wise Prediction Debiasing: Ordinal regression models often develop a bias towards middle-rank classes due to loss functions and data imbalance. ORDAC first debiases the model’s predictions by re-centering class-wise predictions around the true class label, preventing the correction process from skewing towards the majority.
Sample-wise Distribution Update: After debiasing, the mean and standard deviation for each sample are updated. The magnitude of this update is controlled by a ‘correction coefficient’ that balances the model’s confidence with the class frequency. If the model’s prediction error is large compared to the current uncertainty, the uncertainty (standard deviation) increases, suggesting a noisy label. Conversely, if the error is small, uncertainty decreases. The label’s mean is then shifted towards the model’s debiased prediction.
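The two stages can be illustrated with a small Python sketch. The article does not give the exact formulas, so everything here is an assumption chosen to match the described behaviour: debiasing subtracts each class's average prediction offset, and the update blends the old mean and standard deviation with the debiased prediction and its error using a single coefficient alpha. In ORDAC the correction coefficient depends on model confidence and class frequency. Here alpha is just a hypothetical parameter, as are the function names and the min_std floor.

```python
from collections import defaultdict

def debias_by_class(predictions, labels):
    """Re-centre predictions so each class's average prediction equals its label.

    Assumption: the bias of a class is the mean prediction for samples
    currently carrying that label, minus the label itself.
    """
    grouped = defaultdict(list)
    for pred, label in zip(predictions, labels):
        grouped[label].append(pred)
    offsets = {label: sum(p) / len(p) - label for label, p in grouped.items()}
    return [pred - offsets[label] for pred, label in zip(predictions, labels)]

def update_label(mean, std, debiased_pred, alpha, min_std=0.1):
    """Assumed update rule: blend the old values with the new evidence.

    If the error exceeds the current std, the std grows; if the error is
    smaller, the std shrinks. The mean moves towards the prediction.
    """
    error = abs(debiased_pred - mean)
    new_std = max(min_std, (1 - alpha) * std + alpha * error)
    new_mean = (1 - alpha) * mean + alpha * debiased_pred
    return new_mean, new_std

# Predictions pulled towards the middle ranks: class 2 is predicted too
# high on average, class 5 too low.
predictions = [2.5, 2.7, 2.9, 4.0, 4.2]
labels = [2, 2, 2, 5, 5]
debiased = debias_by_class(predictions, labels)

# A large disagreement raises uncertainty; a small one lowers it.
suspect_mean, suspect_std = update_label(mean=2.0, std=1.0, debiased_pred=5.0, alpha=0.5)
trusted_mean, trusted_std = update_label(mean=2.0, std=1.0, debiased_pred=2.2, alpha=0.5)
```

With these inputs, the sample whose debiased prediction sits far from its label ends up with a larger standard deviation and a mean shifted halfway towards the prediction, while the sample whose prediction agrees with its label ends up with a smaller standard deviation and a nearly unchanged mean.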

Demonstrated Effectiveness

The effectiveness of ORDAC was rigorously evaluated on benchmark datasets for age estimation (Adience) and disease severity detection (Diabetic Retinopathy), under various scenarios of asymmetric Gaussian noise. The results were compelling:

ORDAC and its extended versions (ORDAC C and ORDAC R) consistently and significantly outperformed standard ordinal regression methods (like CORAL) and fixed-form LDL methods (like DLDL-v2) across both datasets and all noise levels.
For instance, on the Adience dataset with 40% noise, ORDAC R dramatically reduced the mean absolute error from 0.86 to 0.62 and increased the recall metric from 0.37 to 0.49.
Crucially, ORDAC demonstrated its effectiveness even on the original, uncorrupted datasets, suggesting it can identify and correct intrinsic noise already present in real-world data.
When compared to state-of-the-art sample selection methods like CASSOR, ORDAC’s correction-based approach generally yielded better performance, highlighting the benefits of preserving data rather than discarding it.
Ablation studies confirmed the importance of the class-wise prediction debiasing step, which prevents the correction process from collapsing into majority classes.

Paving the Way for More Reliable AI

This research marks a significant step forward in handling label noise in ordinal classification. By shifting the paradigm from sample removal to adaptive label correction, ORDAC offers a more data-efficient and robust strategy. It not only moves noisy labels closer to their true values but also intelligently manages label uncertainty throughout the training process, leading to more accurate and generalizable models. This work paves the way for more reliable artificial intelligence systems, particularly in critical domains where clean data is scarce and label ambiguity is high.

For a deeper dive into the methodology and experimental details, you can read the full research paper: Ordinal Adaptive Correction: A Data-Centric Approach to Ordinal Image Classification with Noisy Labels.
