A New Approach to Classification: Introducing the Linearly Adaptive Cross Entropy Loss Function

TLDR: A new ‘Linearly Adaptive Cross Entropy Loss function’ has been proposed to improve classification performance in machine learning. Unlike standard cross-entropy, it includes an additional term based on the true class’s predicted probability, which is intended to enhance the optimization process. Tested on a ResNet model with the CIFAR-100 dataset, it consistently achieved higher accuracy and lower error rates at minimal additional computational cost, making it a promising alternative for classification tasks.

In machine learning, and in classification tasks in particular, the choice of loss function is critical: it is the signal that guides the model during training toward accurate predictions. One of the most widely used loss functions is Cross Entropy, which has its roots in information theory and measures the dissimilarity between a model’s predicted probability distribution and the true distribution.
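To make the standard loss concrete, here is a minimal sketch of cross entropy for a single sample with a one-hot target, written in plain Python. The function names and the 4-class example are illustrative assumptions, not taken from the paper. With a one-hot target the sum over classes collapses to a single term, the negative log of the probability assigned to the true class.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_one_hot(logits, true_class):
    """Standard cross entropy for a one-hot target: -log(p_true)."""
    probs = softmax(logits)
    return -math.log(probs[true_class])

# Hypothetical 4-class example (values chosen for illustration only).
# Equal logits give p_true = 0.25, so the loss equals log(4).
uniform_loss = cross_entropy_one_hot([0.0, 0.0, 0.0, 0.0], true_class=2)

# Raising the true class's logit raises p_true and lowers the loss.
confident_loss = cross_entropy_one_hot([0.0, 0.0, 5.0, 0.0], true_class=2)
```

Only the true class’s probability appears in the loss value; the other classes influence it indirectly through the softmax normalization.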

While effective, the standard Cross Entropy loss primarily focuses on increasing the predicted probability of the correct class. This implicitly reduces the probabilities of incorrect classes, but it doesn’t directly leverage information from these “false” classes during the learning process. This is where a new approach, the Linearly Adaptive Cross Entropy Loss function, steps in.

Proposed by Jae Wan Shim, this novel loss function introduces an additional term that specifically depends on the predicted probability of the true class. This unique feature is designed to enhance the optimization process, especially when dealing with one-hot encoded class labels, a common representation where only the true class is marked with a ‘1’ and all others with a ‘0’. The theoretical foundation for this new function is derived from fundamental concepts in information theory, specifically building upon the symmetric Kullback-Leibler divergence, also known as Jeffreys divergence.
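For readers unfamiliar with Jeffreys divergence, the sketch below computes it for two discrete distributions as the sum of the two directed Kullback-Leibler divergences. This illustrates the general definition only; it is not the paper’s derivation, and the helper names and example distributions are assumptions. Both inputs must be strictly positive, since a zero entry in either distribution makes one of the two directed terms undefined, which is exactly the situation a one-hot label creates.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i); terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jeffreys_divergence(p, q):
    """Symmetric KL divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

# Hypothetical strictly positive distributions over 3 classes.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

j_pq = jeffreys_divergence(p, q)
j_qp = jeffreys_divergence(q, p)  # same value: the measure is symmetric
```

Unlike a single directed KL divergence, the result does not depend on the order of the arguments.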

To evaluate its effectiveness, the Linearly Adaptive Cross Entropy Loss function was put to the test against the conventional Cross Entropy loss. The experiments were conducted using a deep learning model based on the ResNet (Residual Network) architecture, a popular choice for image classification tasks known for its ability to handle very deep networks and mitigate issues like vanishing gradients. The model, consisting of 18 layers, was trained on the CIFAR-100 dataset, which comprises 100 classes of 32×32 color images.

The training process involved several key hyperparameters and techniques to ensure robust evaluation. Stochastic Gradient Descent (SGD) was used as the optimizer with a learning rate of 0.1, momentum of 0.9, and weight decay of 5e-4. A StepLR scheduler adjusted the learning rate, decaying it by a factor of 0.1 every 50 epochs. The models were trained for 200 epochs with a batch size of 100. Data augmentation, including random horizontal flips and random cropping, was applied to the training images to improve generalization, and per-pixel mean subtraction was used for preprocessing.
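The learning-rate schedule described above can be sketched as plain arithmetic. This is a framework-free illustration: the article does not say which library was used, although StepLR is the name of PyTorch’s step-decay scheduler. The function name `step_lr` and the 0-indexed epoch convention are assumptions, and the training-set size of 50,000 images is a property of CIFAR-100 rather than something stated in the article.

```python
def step_lr(epoch, base_lr=0.1, step_size=50, gamma=0.1):
    """Learning rate at a 0-indexed epoch: multiply by gamma once per completed step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)

EPOCHS = 200
BATCH_SIZE = 100
TRAIN_IMAGES = 50_000  # CIFAR-100 training set size (dataset fact, not from the article)

# One learning rate per epoch: roughly 0.1, 0.01, 0.001, 0.0001 in blocks of 50 epochs.
schedule = [step_lr(epoch) for epoch in range(EPOCHS)]

# Number of parameter updates implied by the stated batch size.
steps_per_epoch = TRAIN_IMAGES // BATCH_SIZE
total_updates = steps_per_epoch * EPOCHS
```

Under these settings the rate drops three times during training, so the final 50 epochs run at one thousandth of the initial rate.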

The results favored the new function. Across multiple training iterations, the Linearly Adaptive Loss function consistently outperformed the standard Cross Entropy Loss function in classification accuracy. For instance, on top-5 error (the percentage of times the true class was not among the 5 highest-ranked predictions), the Linearly Adaptive Loss achieved a mean error rate of 6.2% compared to Cross Entropy’s 6.7%, a difference of 0.5 percentage points.
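The top-5 error metric mentioned above can be computed as in the following sketch. The function name and the toy scores are hypothetical and serve only to show the calculation; they are not the paper’s data.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)

# Hypothetical toy batch: 3 samples, 6 classes.
scores = [
    [0.9, 0.1, 0.2, 0.3, 0.4, 0.5],  # true class 1 has the lowest score: top-5 miss
    [0.1, 0.9, 0.2, 0.3, 0.4, 0.5],  # true class 5 ranks second: top-5 hit, top-1 miss
    [0.3, 0.2, 0.1, 0.6, 0.5, 0.4],  # true class 3 ranks first: hit for both
]
labels = [1, 5, 3]

top5_error = top_k_error(scores, labels, k=5)
top1_error = top_k_error(scores, labels, k=1)
```

Top-5 error is never higher than top-1 error on the same predictions, because every top-1 hit is also a top-5 hit.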

Crucially, this improvement comes at minimal additional computational cost. Compared to the standard Cross Entropy loss, the Linearly Adaptive Loss function requires only two extra operations (one subtraction and one multiplication). Its efficiency is therefore practically unchanged, which makes it a practical alternative for real-world applications.

The findings suggest that this linearly adaptive approach could significantly broaden the scope for future research into loss function design. Future work could delve deeper into the theoretical analysis of its convergence properties, explore its impact on model robustness against adversarial attacks (small, intentional perturbations to input data that can trick models), and investigate its potential for multi-label classification tasks, where an input can belong to multiple classes simultaneously.

For more detailed information, you can refer to the full research paper available at arXiv:2507.10574.
