Clearing the Horizon: Hierarchical Fusion for Long-Range Haze Removal

TLDR: The Hierarchical Semantic-Visual Fusion (HSVF) framework is proposed to address challenging long-range haze removal by jointly leveraging high-level semantic consistency and low-level visual features from visible and near-infrared images. It uses a semantic stream for clear scene reconstruction and a visual stream for structural detail recovery. A new pixel-aligned visible-near-infrared haze dataset (VNHD) with semantic labels is also introduced to benchmark performance. Experiments show HSVF outperforms existing methods in restoring both contrast and texture in hazy long-range scenes.

Haze is a common atmospheric phenomenon that degrades image quality, especially over long distances. Besides making scenes look blurry and low in contrast, it impairs the vision systems behind applications such as surveillance and autonomous navigation. Although haze removal has advanced considerably, most existing methods target short-range scenes, leaving the harder problem of long-range haze removal largely unaddressed.

As distance increases, light is scattered along a longer path through the atmosphere, so haze becomes more severe and more of the scene signal is lost. This makes it very difficult to recover clear details from visible-light images alone. Near-infrared (NIR) light, however, penetrates fog and haze better than visible light and so offers complementary information. Current methods that combine visible and near-infrared images tend to focus on integrating visual content and often overlook the residual haze still present in the visible images, so their outputs are not fully clear.
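The distance dependence described above is commonly written with the standard atmospheric scattering model from the dehazing literature, I = J·t + A·(1 - t) with transmission t = exp(-β·d). The article itself does not state this model or any parameter values, so the sketch below is only an illustration: the function names and both scattering coefficients are assumptions, with a smaller coefficient standing in for near-infrared light.

```python
import math

# Illustrative scattering coefficients (per km). These are assumed values,
# not taken from the paper; NIR is given a smaller one to mimic weaker scattering.
BETA_VISIBLE = 0.4
BETA_NIR = 0.2


def transmission(distance_km: float, beta_per_km: float) -> float:
    """Fraction of scene light that survives the path: t = exp(-beta * d)."""
    return math.exp(-beta_per_km * distance_km)


def hazy_intensity(scene: float, airlight: float,
                   distance_km: float, beta_per_km: float) -> float:
    """Observed intensity I = J * t + A * (1 - t) for one pixel."""
    t = transmission(distance_km, beta_per_km)
    return scene * t + airlight * (1.0 - t)


if __name__ == "__main__":
    for d in (1.0, 5.0, 10.0):
        t_vis = transmission(d, BETA_VISIBLE)
        t_nir = transmission(d, BETA_NIR)
        print(f"{d:>4.0f} km  visible t={t_vis:.3f}  NIR t={t_nir:.3f}")
```

Under these assumed coefficients, the visible transmission falls toward zero much faster than the NIR transmission as distance grows, which is the signal-loss effect the paragraph describes.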

Introducing Hierarchical Semantic-Visual Fusion (HSVF)

The research paper introduces a framework called Hierarchical Semantic-Visual Fusion (HSVF) for long-range haze removal. Its core idea is that visible and near-infrared images not only provide complementary low-level visual features (such as textures and edges) but also share consistent high-level semantics (which objects are in the scene, such as sky, buildings, or vegetation). By leveraging both, HSVF aims to produce images that are both clear and rich in detail.

The HSVF framework operates through two complementary components: a semantic stream and a visual stream. The semantic stream is designed to reconstruct haze-free scenes by first identifying and aligning modality-invariant intrinsic representations. This means it learns to understand the underlying meaning of the scene (e.g., “this is a building,” “this is the sky”) regardless of whether the input is a visible or near-infrared image, or how hazy it is. This shared semantic understanding then acts as a powerful guide to restore clear, high-contrast distant scenes, even under severe haze.
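One simple way to picture "aligning" representations across modalities is a loss that is small when the visible and near-infrared embeddings of the same scene point in the same direction. The article does not say which objective HSVF uses, so the cosine-based loss below is a stand-in for illustration only; `alignment_loss` and the example embeddings are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def alignment_loss(visible_embedding: list[float],
                   nir_embedding: list[float]) -> float:
    """Hypothetical alignment objective: 0 when the embeddings agree in direction."""
    return 1.0 - cosine_similarity(visible_embedding, nir_embedding)


if __name__ == "__main__":
    # Made-up embeddings of the same "building" region in each modality.
    visible = [0.9, 0.1, 0.3]
    nir = [0.8, 0.2, 0.35]
    print(f"alignment loss: {alignment_loss(visible, nir):.4f}")
```

Minimizing a loss of this kind would push an encoder to describe a region the same way regardless of which modality it came from, which is the intuition behind a modality-invariant representation.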

In parallel, the visual stream focuses on recovering the fine structural details that are often lost in hazy visible images. It does this by fusing complementary cues from the visible and near-infrared images, using a mechanism that combines self-attention (relating details within one modality) with cross-attention (relating details in one modality to those in the other). Together, the two streams let HSVF produce results with both high scene contrast and rich texture detail.
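The difference between self-attention and cross-attention comes down to where the keys and values come from. The sketch below uses plain scaled dot-product attention with no learned projections, and sums the two results as a simple fusion rule; both choices are simplifying assumptions for illustration, not the layer design used in HSVF, and the function names are hypothetical.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
    return outputs


def fuse(visible_tokens, nir_tokens):
    """Hypothetical fusion: self-attention on visible plus cross-attention to NIR."""
    # Self-attention: queries, keys, and values all come from the visible image.
    self_out = attention(visible_tokens, visible_tokens, visible_tokens)
    # Cross-attention: visible queries look up keys and values from the NIR image.
    cross_out = attention(visible_tokens, nir_tokens, nir_tokens)
    return [[s + c for s, c in zip(s_row, c_row)]
            for s_row, c_row in zip(self_out, cross_out)]


if __name__ == "__main__":
    visible = [[1.0, 0.0], [0.0, 1.0]]           # two visible feature tokens
    nir = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]   # three NIR feature tokens
    for row in fuse(visible, nir):
        print([round(x, 3) for x in row])
```

The output has one fused vector per visible token, so NIR detail is pulled into the visible feature map without changing its size.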

A New Dataset for Real-World Haze

To facilitate further research and provide a reliable benchmark for long-range haze removal, the researchers also introduce a new real-world dataset called VNHD (Visible-Near-infrared Haze Dataset). This dataset consists of 1519 hazy and 1442 clear pixel-aligned visible and near-infrared image pairs, along with semantic labels. Unlike previous datasets, VNHD was specifically captured with a dual-band camera to ensure precise pixel alignment between visible and near-infrared channels, making it ideal for multimodal fusion tasks. It includes scenes with varying depths, from less than 1 km to over 10 km, covering diverse real-world atmospheric conditions.

Extensive experiments conducted on the VNHD dataset, as well as other existing datasets like RGB-NIRScene and RANUS, demonstrate the superior performance of HSVF compared to state-of-the-art approaches in real-world long-range haze removal. The method excels at simultaneously removing haze and enhancing fine details, leading to more visually appealing and informative images. This advancement holds significant potential for various long-range imaging applications, including video surveillance and autonomous driving, where clear vision through haze is critical.

