TLDR: A new online learning policy, H2T2, is proposed for edge intelligence systems to optimize binary classification by using two confidence thresholds to decide whether to classify locally or offload to a more accurate remote model. This approach effectively manages asymmetric costs (where false negatives are more costly than false positives) and demonstrates superior performance compared to existing single-threshold methods, especially in scenarios with distribution shifts, without requiring model retraining.
Edge intelligence systems are becoming increasingly common, operating in environments where decisions need to be made quickly and efficiently, often with limited resources. Think of applications like medical diagnostics, surveillance, or smart cameras detecting theft. In these scenarios, balancing the accuracy of predictions with the cost of making those predictions is crucial. A particularly challenging aspect arises in binary classification problems (e.g., detecting a disease or a threat) where the consequences of a ‘false negative’ (missing something important) are far more severe and costly than a ‘false positive’ (a false alarm).
A new research paper titled “Inference Offloading for Cost-Sensitive Binary Classification at the Edge” by Vishnu Narayanan Moothedath, Umang Agarwal, Umeshraja N, James Richard Gross, Jaya Prakash Champati, and Sharayu Moharir addresses this very challenge. The paper introduces an innovative approach called H2T2 (HI-Hedge with Two Thresholds) to optimize decision-making in such systems.
The Challenge of Edge AI
Typically, an edge intelligence system uses a compact, local model for initial inference. This local model is fast and resource-efficient but might not be as accurate as a larger, more powerful model located remotely (e.g., in the cloud). Offloading a sample to the remote model can significantly improve accuracy, but it incurs costs like network latency and data transfer. The core problem is deciding, for each incoming data sample, whether to trust the local model’s prediction or to offload it to the remote model, especially when false negatives and false positives have different costs.
Previous methods, often referred to as Hierarchical Inference (HI), have explored using a single threshold on the local model’s confidence score to make this offloading decision. If the confidence falls below this threshold, the sample is offloaded. However, the authors argue that for situations with asymmetric costs (where missing a threat is much worse than a false alarm), a single threshold is not optimal.
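To make the baseline concrete, here is a minimal sketch of the single-threshold HI rule described above. The function name and the idea of a scalar confidence score are illustrative, not taken from the paper:

```python
def single_threshold_decide(confidence, theta):
    """Baseline single-threshold HI rule (illustrative sketch).

    Trust the local model's prediction only when its confidence
    clears one fixed bar; otherwise offload to the remote model.
    """
    return "local" if confidence >= theta else "offload"
```

With one threshold, confident-negative and confident-positive samples are treated identically, which is exactly the limitation H2T2 targets under asymmetric costs.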
Introducing H2T2: A Two-Threshold Solution
H2T2 is an online learning framework that continuously adapts a pair of thresholds based on the local model’s confidence scores. Instead of just one boundary, H2T2 uses two thresholds, creating three distinct regions for the local model’s output:
- Region 1: The local model is very confident about one class (e.g., ‘no threat’), and the sample is classified locally as such.
- Region 2: The local model is very confident about the other class (e.g., ‘threat detected’), and the sample is classified locally as such.
- Region 3 (Ambiguity Region): The local model’s confidence falls between the two thresholds, indicating an ambiguous case. In this scenario, the sample is offloaded to the more accurate remote model for a definitive decision.
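The three regions above can be sketched as a simple decision rule. This is a minimal illustration assuming the local model outputs a positive-class probability and that the two thresholds satisfy `theta_low < theta_high`; the function and variable names are hypothetical, not the paper's notation:

```python
def h2t2_decide(p_positive, theta_low, theta_high):
    """Route a sample using two confidence thresholds (illustrative sketch).

    p_positive: local model's estimated probability of the positive class
                (e.g., 'threat detected').
    theta_low, theta_high: the two learned thresholds, theta_low < theta_high.
    """
    if p_positive <= theta_low:
        return "local: negative"   # Region 1: confident 'no threat'
    if p_positive >= theta_high:
        return "local: positive"   # Region 2: confident 'threat detected'
    return "offload"               # Region 3: ambiguous, defer to remote model
```

Widening the gap between the two thresholds enlarges the ambiguity region, trading higher offloading cost for fewer local mistakes.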
This two-threshold approach allows for more nuanced decision-making, directly addressing the problem of asymmetric costs. For instance, if missing a critical event is very expensive, the system can be tuned to offload more ambiguous cases to minimize false negatives, even if it means incurring higher offloading costs.
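One way to see why asymmetric costs naturally produce two thresholds is a simplified expected-cost comparison. The sketch below assumes, for illustration only, that the remote model is perfect and offloading has a fixed cost; the paper's actual cost model is richer, and all names here are hypothetical:

```python
def expected_costs(p, c_fn, c_fp, c_off):
    """Expected cost of each action given p = P(true label is positive).

    Simplifying assumptions (not from the paper): the remote model is
    always correct, and offloading incurs a fixed cost c_off.
    """
    return {
        "local_negative": p * c_fn,        # wrong only if the label is positive
        "local_positive": (1 - p) * c_fp,  # wrong only if the label is negative
        "offload": c_off,                  # fixed cost, assumed error-free
    }

def best_action(p, c_fn, c_fp, c_off):
    costs = expected_costs(p, c_fn, c_fp, c_off)
    return min(costs, key=costs.get)
```

With a false negative ten times costlier than a false positive (say `c_fn=10`, `c_fp=1`, `c_off=0.5`), the local-negative region shrinks to very low confidences while the local-positive region stays wide: the minimum-cost action switches at two different points, giving an asymmetric ambiguity region.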
Key Advantages and Performance
One of the significant strengths of H2T2 is its online learning capability. It adapts continuously during the inference phase, meaning it doesn’t require retraining the local model or extensive offline data. It’s also ‘model-agnostic,’ which means it can work with various local and remote models without needing specific architectural changes. The policy learns from limited feedback, making it practical for real-world deployments.
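Since H2T2's name references Hedge, the online adaptation can be pictured as an exponential-weights update over candidate threshold pairs. This is a generic Hedge sketch in that spirit; the paper's actual update rule, feedback model, and expert set may differ:

```python
import math
import random

def make_experts(grid):
    # Each expert is a candidate (theta_low, theta_high) pair with theta_low < theta_high.
    return [(lo, hi) for lo in grid for hi in grid if lo < hi]

class HedgeThresholds:
    """Generic Hedge (exponential weights) over threshold pairs.

    An illustrative sketch of online threshold adaptation; not the
    paper's exact algorithm.
    """
    def __init__(self, experts, eta=0.5):
        self.experts = experts
        self.eta = eta
        self.weights = [1.0] * len(experts)

    def pick(self):
        # Sample a threshold pair proportionally to its current weight.
        total = sum(self.weights)
        r = random.random() * total
        acc = 0.0
        for expert, w in zip(self.experts, self.weights):
            acc += w
            if r <= acc:
                return expert
        return self.experts[-1]

    def update(self, losses):
        # losses[i]: cost expert i would have incurred on the observed sample.
        self.weights = [w * math.exp(-self.eta * l)
                        for w, l in zip(self.weights, losses)]
```

Because the update only needs per-sample losses, such a scheme can run during inference without retraining the local model, matching the model-agnostic, limited-feedback setting described above.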
The research demonstrates that H2T2 consistently outperforms simpler ‘naive’ policies (like always classifying locally or always offloading) and even existing single-threshold HI policies. In some cases, it even surpasses the performance of offline optimal single-threshold policies, which have the advantage of knowing all data beforehand. The policy also shows remarkable robustness to ‘distribution shifts’ – situations where the incoming data changes over time, which is a common challenge in edge environments.
For example, in experiments with out-of-distribution (OOD) data, H2T2 significantly reduced false negative rates, demonstrating its ability to handle unexpected data patterns effectively. This makes H2T2 a flexible and reliable solution for critical applications where accuracy and cost management are paramount.
Looking Ahead
While the paper primarily focuses on binary classification, it also briefly discusses the potential extension to multiclass classification, where the decision regions would become more complex. The H2T2 policy represents a significant step forward in optimizing cost-sensitive inference at the edge, providing a practical and robust framework for balancing accuracy and operational costs in dynamic environments. You can read the full paper here: Inference Offloading for Cost-Sensitive Binary Classification at the Edge.