TLDR: A new post-detection framework significantly improves fire and smoke detection in compact AI models like YOLOv5n and YOLOv8n. It refines detection confidence by combining statistical uncertainty (from single-pass dropout) with domain-relevant visual features (color, edge, texture) using a lightweight Confidence Refinement Network (CRN). This approach boosts precision, recall, and mAP, reducing false alarms and missed detections with only a modest increase in computational overhead, making it ideal for real-world deployment on resource-constrained devices.
In the critical domain of safety and disaster response, accurate and timely fire and smoke detection is paramount. However, current vision-based systems, especially those relying on compact deep learning models like YOLOv5n and YOLOv8n, often struggle to balance efficiency with reliability. These smaller models, ideal for deployment on drones, CCTV, and IoT devices, can suffer from false alarms or missed detections due to their reduced processing capacity. Traditional methods for refining detections, such as Non-Maximum Suppression (NMS), only consider how much bounding boxes overlap, which can lead to errors in complex or crowded scenes involving fire and smoke.
Addressing these challenges, independent researchers Aniruddha Srinivas Joshi, Godwyn James William, and Shreyas Srinivas Joshi have proposed an innovative uncertainty-aware post-detection framework. This framework aims to significantly enhance fire and smoke detection in compact deep learning models without altering their core architecture. The core idea is to refine the confidence scores of detected objects by considering both the model’s statistical uncertainty and relevant visual characteristics of fire and smoke.
A Smarter Approach to Confidence
The proposed framework introduces a lightweight Confidence Refinement Network (CRN) that acts as a crucial post-processing step. Instead of relying on simple overlap rules, the CRN integrates several key pieces of information to adjust detection scores:
- Uncertainty Estimation: The framework uses a clever technique called single-pass dropout during inference. While dropout is typically used during training to prevent overfitting, here it’s repurposed to estimate how confident (or uncertain) the model is about its predictions. This provides a statistical measure of reliability for each detected bounding box.
- Feature-Aware Confidence Normalization: To ensure detections align with the physical appearance of fire and smoke, the framework analyzes specific visual cues within each detected region (a rough code sketch follows this list). These include:
- Color: Using HSV histograms, it assesses color intensity and saturation, recognizing that fire exhibits strong red-orange saturation, while smoke appears more diffuse.
- Edge: Canny edge detection is employed to identify smooth, gradient-like transitions typical of fire and smoke, helping to filter out false positives that often have sharp, unnatural edges.
- Texture: Haralick texture features, such as contrast and homogeneity, are used to differentiate the high-frequency patterns of fire from the smoother textures of smoke.
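To make these cues concrete, here is a minimal sketch of how such color, edge, and texture descriptors could be computed for a cropped detection using OpenCV and scikit-image. The function name, histogram size, Canny thresholds, and GLCM settings are illustrative assumptions, not the authors' implementation.

```python
# Illustrative per-detection feature extraction (not the authors' code).
import cv2
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def region_features(bgr_crop):
    """Compute simple color, edge, and texture cues for one detected region."""
    # Color: HSV hue histogram plus saturation/brightness statistics
    # (fire tends toward strongly saturated red-orange hues).
    hsv = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV)
    hue_hist = cv2.calcHist([hsv], [0], None, [32], [0, 180]).flatten()
    hue_hist /= hue_hist.sum() + 1e-6
    mean_sat = hsv[..., 1].mean() / 255.0
    mean_val = hsv[..., 2].mean() / 255.0

    # Edge: Canny edge density (fire and smoke boundaries are soft and
    # gradient-like, unlike many sharp-edged false positives).
    gray = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    edge_density = edges.mean() / 255.0

    # Texture: Haralick-style GLCM contrast and homogeneity to separate
    # high-frequency fire patterns from smoother smoke.
    glcm = graycomatrix((gray // 4).astype(np.uint8), distances=[1],
                        angles=[0], levels=64, symmetric=True, normed=True)
    contrast = graycoprops(glcm, "contrast")[0, 0]
    homogeneity = graycoprops(glcm, "homogeneity")[0, 0]

    return np.concatenate([hue_hist,
                           [mean_sat, mean_val, edge_density, contrast, homogeneity]])
```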
The detector's raw confidence scores, the uncertainty estimates, and the visual features are then fed into the CRN. The CRN, a compact neural network, learns to combine these inputs into a more accurate, refined confidence score for each detection. This learned approach replaces heuristic-based adjustments, making the detection pipeline more adaptive and robust.
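The PyTorch sketch below illustrates one way such a refinement head could look: a dropout layer kept stochastic at inference provides a cheap per-detection uncertainty signal in a single pass, and a small MLP fuses the raw confidence, that uncertainty, and the visual feature vector into a refined score. The layer sizes, the particular uncertainty proxy, and the feature dimension (matching the sketch above) are assumptions for illustration only.

```python
# Hedged sketch of a Confidence Refinement Network; details are assumptions.
import torch
import torch.nn as nn

class ConfidenceRefinementNetwork(nn.Module):
    def __init__(self, feature_dim, hidden_dim=64, p_drop=0.2):
        super().__init__()
        self.dropout = nn.Dropout(p_drop)  # deliberately kept active at inference
        self.mlp = nn.Sequential(
            nn.Linear(feature_dim + 2, hidden_dim),  # +2: raw confidence, uncertainty
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),                            # refined confidence in [0, 1]
        )

    def forward(self, raw_conf, features):
        # Single-pass dropout proxy: perturb the features once and treat the
        # deviation from the unperturbed input as a per-detection uncertainty.
        self.dropout.train()                         # keep dropout stochastic
        perturbed = self.dropout(features)
        uncertainty = (perturbed - features).abs().mean(dim=1, keepdim=True)

        x = torch.cat([raw_conf.unsqueeze(1), uncertainty, features], dim=1)
        return self.mlp(x).squeeze(1)

# Usage: refine a batch of detections whose handcrafted feature vectors
# have 37 entries (32-bin hue histogram + 5 scalar cues, as sketched above).
crn = ConfidenceRefinementNetwork(feature_dim=37)
raw_conf = torch.tensor([0.62, 0.35])   # detector scores for two boxes
features = torch.rand(2, 37)            # per-box visual features
refined = crn(raw_conf, features)
```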
Experimental Validation and Promising Results
The researchers rigorously evaluated their framework using the D-Fire dataset, a benchmark specifically designed for fire and smoke detection, containing over 21,000 images. They applied their method to two popular compact models, YOLOv5n and YOLOv8n, and compared its performance against several existing post-detection techniques, including NMS, Soft-NMS, and various feature-based filters.
The results were compelling. For YOLOv8n, the framework significantly improved precision from 0.712 to 0.845 and recall from 0.674 to 0.820. The mean Average Precision (mAP), a comprehensive measure of detection accuracy, also saw a boost from 0.625 to 0.651. Similar improvements were observed with YOLOv5n, where precision rose from 0.703 to 0.840 and recall from 0.659 to 0.818, with mAP increasing from 0.609 to 0.641. Notably, the framework showed strong gains in detecting both fire and smoke categories.
While the framework introduces a modest increase in processing time (from approximately 12-14 ms to 20-23 ms per image), this overhead is considered well within acceptable limits for many real-time applications, especially fixed surveillance systems where sub-second responses are sufficient. This demonstrates a justifiable trade-off between a slight increase in latency and substantial gains in accuracy, which is crucial for safety-critical applications.
Impact and Future Directions
This research offers a practical and effective solution for enhancing the reliability of compact deep learning models in fire and smoke detection. Its model-agnostic nature means it can be integrated with existing detectors without requiring extensive retraining or architectural changes, making it highly suitable for deployment on resource-constrained edge devices. The framework’s ability to reduce false positives and recover true positives, which heuristic methods might miss, is a significant step forward in building more robust vision-based fire safety systems.
The authors acknowledge certain limitations, such as the use of a single-pass dropout approximation for efficiency and the handcrafted nature of visual features optimized for fire and smoke. Future work aims to explore alternative uncertainty estimation strategies, extend the framework to video-based detection to leverage temporal information, and validate its adaptability in other object detection domains by designing new domain-specific visual features. For more in-depth technical details, you can refer to the full research paper here.


