TLDR: AQFusionNet is a deep learning framework that improves Air Quality Index (AQI) prediction by jointly modeling atmospheric imagery and environmental sensor data. Designed for robustness, it maintains high accuracy even when some sensors are unavailable, and it is efficient enough for edge deployment. Evaluated on data from India and Nepal, the EfficientNet-B0 variant achieved 92.02% accuracy, an 18.5% improvement over unimodal baselines, offering a scalable solution for air quality monitoring in resource-constrained regions.
Air pollution is a critical global health issue, responsible for millions of premature deaths annually, particularly severe in rapidly industrializing regions like South Asia. Accurate, real-time monitoring of the Air Quality Index (AQI) is essential for public health, but traditional methods face significant challenges. Ground-based sensors offer high temporal resolution but are costly and sparsely distributed, especially in developing areas. Satellite observations provide broad coverage but suffer from limitations like cloud interference and lower sensitivity to ground-level pollutants.
Addressing these challenges, researchers Koushik Ahmed Kushal and Abdullah Al Mamun from Clarkson University have introduced AQFusionNet, a multimodal deep learning framework designed to predict AQI robustly by combining atmospheric imagery with environmental sensor data. Unlike many existing approaches that rely on a single data source, AQFusionNet leverages the strengths of both visual and sensor information to provide a more comprehensive and accurate picture of air quality.
The core of AQFusionNet is its dual-objective learning architecture. Lightweight Convolutional Neural Network (CNN) backbones, such as MobileNetV2, ResNet18, and EfficientNet-B0, extract visual features from ground-level atmospheric images. These visual features are then integrated with pollutant concentration measurements (PM2.5, PM10, NO2, SO2, CO, O3) through semantically aligned embedding spaces. A key innovation is the ability to estimate sensor values directly from visual features, which keeps the system robust even when some sensor data is unavailable, a common scenario in resource-constrained environments.
The framework’s architecture consists of an image encoder, a sensor encoder, a multimodal fusion module, and dual prediction heads. The image encoder processes atmospheric images, while the sensor encoder handles environmental measurements. The fusion module combines these two data streams, and the dual prediction heads simultaneously predict the AQI and estimate sensor values. This design ensures that the model can maintain predictive capability across varying data availability scenarios.
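The data flow described above can be sketched in a few lines. The snippet below is a minimal, illustrative mock-up, not the authors' implementation: the layer dimensions, the affine stand-ins for the learned encoders, and the function name `aqfusion_forward` are all assumptions, and the real model uses trained CNN backbones rather than random weights. It only shows how an image encoder, a sensor encoder, a fusion step, and dual heads fit together, and how the sensor-estimation head lets prediction proceed when sensor readings are missing.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, w, b):
    # Simple affine layer standing in for a learned encoder (assumption).
    return x @ w + b

# Hypothetical dimensions: 128-d image features, 6 pollutant readings,
# 64-d embeddings, 6 AQI classes.
IMG_FEAT, N_SENSORS, EMB, N_CLASSES = 128, 6, 64, 6

# Randomly initialized weights stand in for trained parameters.
W_img, b_img = rng.normal(size=(IMG_FEAT, EMB)), np.zeros(EMB)
W_sen, b_sen = rng.normal(size=(N_SENSORS, EMB)), np.zeros(EMB)
W_aqi, b_aqi = rng.normal(size=(2 * EMB, N_CLASSES)), np.zeros(N_CLASSES)
W_est, b_est = rng.normal(size=(EMB, N_SENSORS)), np.zeros(N_SENSORS)

def aqfusion_forward(img_feat, sensor_vals=None):
    z_img = np.tanh(linear(img_feat, W_img, b_img))     # image encoder
    # Second head: estimate sensor values from visual features alone --
    # this is what keeps the model usable when sensors drop out.
    est_sensors = linear(z_img, W_est, b_est)
    if sensor_vals is None:                             # sensors unavailable
        sensor_vals = est_sensors
    z_sen = np.tanh(linear(sensor_vals, W_sen, b_sen))  # sensor encoder
    fused = np.concatenate([z_img, z_sen], axis=-1)     # fusion module
    aqi_logits = linear(fused, W_aqi, b_aqi)            # AQI prediction head
    return aqi_logits, est_sensors

img = rng.normal(size=IMG_FEAT)
logits_full, est = aqfusion_forward(img, sensor_vals=rng.normal(size=N_SENSORS))
logits_img_only, _ = aqfusion_forward(img)              # image-only fallback
print(logits_full.shape, logits_img_only.shape, est.shape)
```

Note how the image-only call returns AQI logits of the same shape as the full multimodal call: the estimated sensor values are substituted for the missing readings, so the downstream fusion and prediction stages are unchanged.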
Extensive evaluation was conducted on over 8,000 samples from 15 cities across India and Nepal, collected between 2019 and 2022. The results demonstrated AQFusionNet’s superior performance across all backbone configurations. The EfficientNet-B0 variant achieved the best results, with a Root Mean Square Error (RMSE) of 7.70 and 92.02% classification accuracy on test data. This represents an 18.5% improvement over unimodal baselines and significant gains over other multimodal approaches, including a 23.7% RMSE reduction compared to a framework using CCTV traffic imagery and environmental sensors.
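For readers unfamiliar with the two metrics reported above, here is how they are conventionally computed: RMSE measures the average regression error in AQI units, while classification accuracy is the fraction of samples assigned the correct AQI category. The values below are toy numbers for illustration, not the paper's data.

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root Mean Square Error: penalizes large AQI prediction errors.
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def accuracy(labels_true, labels_pred):
    # Fraction of samples placed in the correct AQI category.
    t, p = np.asarray(labels_true), np.asarray(labels_pred)
    return float(np.mean(t == p))

# Toy example: three AQI regression targets, four category labels.
print(rmse([100, 150, 200], [92, 158, 205]))   # AQI-scale error
print(accuracy([0, 1, 2, 1], [0, 1, 2, 2]))    # 3 of 4 correct -> 0.75
```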
Beyond its accuracy, AQFusionNet is also computationally efficient, with the EfficientNet-B0 variant having only 6.2 million parameters and the MobileNetV2 configuration even less at 2.41 million. This lightweight design makes it highly suitable for deployment on edge devices and mobile platforms, which is crucial for real-time AQI monitoring in areas with limited computational infrastructure. The model’s ability to maintain performance under partial sensor unavailability further enhances its practical deployability in real-world settings.
To understand how the model makes its decisions, Grad-CAM visualization was used. This technique showed that for good air quality, the model focused on clear sky regions, while for high pollution levels, it prioritized hazy or smoggy areas, directly correlating visual cues with pollutant concentrations. This interpretability builds trust and can guide air quality interventions.
The researchers acknowledge that future work will focus on integrating temporal attention mechanisms for long-term forecasting, incorporating satellite imagery for broader spatial coverage and all-weather performance, developing unsupervised domain adaptation for seamless cross-regional deployment, and extending the framework to real-time streaming architectures. These advancements aim to further strengthen AQFusionNet’s robustness and scalability, ultimately helping to democratize air quality monitoring in developing nations facing severe pollution challenges.
For more detailed information, you can read the full research paper: AQFusionNet: Robust Multimodal Deep Learning for Air Quality Index Prediction through Atmospheric Imagery and Environmental Sensor Integration.