
Clearing the Horizon: Hierarchical Fusion for Long-Range Haze Removal

TLDR: The Hierarchical Semantic-Visual Fusion (HSVF) framework is proposed to address challenging long-range haze removal by jointly leveraging high-level semantic consistency and low-level visual features from visible and near-infrared images. It uses a semantic stream for clear scene reconstruction and a visual stream for structural detail recovery. A new pixel-aligned visible-near-infrared haze dataset (VNHD) with semantic labels is also introduced to benchmark performance. Experiments show HSVF outperforms existing methods in restoring both contrast and texture in hazy long-range scenes.

Haze is a common atmospheric phenomenon that degrades image quality, especially over long distances. The degradation makes scenes look blurry and low-contrast, and it also impairs high-level vision tasks such as surveillance and autonomous navigation. Although haze removal has advanced considerably, most existing methods target short-range scenarios, leaving long-range haze removal largely unaddressed.

As the distance to the scene increases, light is scattered more along the path, leading to severe haze and substantial signal loss. This makes it very difficult to recover clear details from visible-light images alone. Near-infrared (NIR) light, however, penetrates fog and haze better and so offers complementary information. Current methods that combine visible and near-infrared images tend to focus on integrating visual content while overlooking the residual haze still present in the visible images, so their outputs are not fully clear.
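The distance dependence described above is commonly expressed with the standard atmospheric scattering model, in which the observed intensity is I = J*t + A*(1 - t) and the transmission is t = exp(-beta*d). This model is textbook background, not something the article attributes to the paper. The sketch below is a minimal illustration in Python; the scattering coefficients, and in particular the smaller value used for NIR, are assumed numbers chosen only to show the trend.

```python
import math


def transmission(distance_km: float, beta_per_km: float) -> float:
    """Fraction of scene radiance that survives scattering over the path."""
    return math.exp(-beta_per_km * distance_km)


def hazy_intensity(scene: float, airlight: float,
                   distance_km: float, beta_per_km: float) -> float:
    """Observed intensity: attenuated scene signal plus scattered airlight."""
    t = transmission(distance_km, beta_per_km)
    return scene * t + airlight * (1.0 - t)


# Assumed, illustrative coefficients (not values from the paper).
BETA_VISIBLE = 0.30  # per km
BETA_NIR = 0.15      # per km, assumed smaller to mimic better haze penetration

for d in (1.0, 5.0, 10.0):
    t_vis = transmission(d, BETA_VISIBLE)
    t_nir = transmission(d, BETA_NIR)
    print(f"{d:4.0f} km  visible t = {t_vis:.3f}  NIR t = {t_nir:.3f}")
```

Because transmission decays exponentially with distance, the scene signal in a long-range visible image is dominated by airlight, which is why a second modality with higher transmission is useful.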

Introducing Hierarchical Semantic-Visual Fusion (HSVF)

A new research paper introduces a framework called Hierarchical Semantic-Visual Fusion (HSVF) for long-range haze removal. The core idea behind HSVF is that visible and near-infrared images not only provide complementary low-level visual features (such as textures and edges) but also share a consistent high-level semantic understanding of the scene (which regions are sky, buildings, or vegetation). By leveraging both, HSVF aims to produce images that are clear and rich in detail.

The HSVF framework operates through two complementary components: a semantic stream and a visual stream. The semantic stream is designed to reconstruct haze-free scenes by first identifying and aligning modality-invariant intrinsic representations. This means it learns to understand the underlying meaning of the scene (e.g., “this is a building,” “this is the sky”) regardless of whether the input is a visible or near-infrared image, or how hazy it is. This shared semantic understanding then acts as a powerful guide to restore clear, high-contrast distant scenes, even under severe haze.
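One way to picture "aligning modality-invariant representations" is as a loss that pulls the feature vectors of the visible and NIR views of the same scene toward each other. The sketch below uses one minus cosine similarity for that purpose. This is an assumption made for illustration: the article does not say which alignment objective HSVF uses, and the function names and toy vectors are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def alignment_loss(visible_feat: Sequence[float],
                   nir_feat: Sequence[float]) -> float:
    """Hypothetical alignment objective: 0 when the features point the same way."""
    return 1.0 - cosine_similarity(visible_feat, nir_feat)


# Toy semantic features for the same region seen in each modality.
visible_building = [0.9, 0.1, 0.0]
nir_building = [0.8, 0.2, 0.1]
nir_sky = [0.0, 0.1, 0.9]

print(alignment_loss(visible_building, nir_building))  # small: same semantics
print(alignment_loss(visible_building, nir_sky))       # large: different semantics
```

Minimizing such a loss over many pixel-aligned pairs would encourage an encoder to map "building" to similar features regardless of modality or haze level, which is the property the semantic stream relies on.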

In parallel, the visual stream focuses on recovering the fine structural details that are often lost in hazy visible images. It does this by fusing complementary cues from the visible and near-infrared images, using a mechanism that combines self-attention (relating details within one modality) with cross-attention (relating details in one modality to those in the other). Together, the two streams let HSVF produce results with both high scene contrast and rich texture detail.
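The difference between self-attention and cross-attention can be shown with plain scaled dot-product attention: the two differ only in where the keys and values come from. The sketch below is a generic illustration in pure Python, not the paper's module. It omits the learned query, key, and value projections a real network would have, and the additive fusion at the end is an assumed, simplistic way of combining the two outputs.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """Scaled dot-product attention: each query takes a weighted mix of values."""
    scale = math.sqrt(len(keys[0]))
    output = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        output.append(mixed)
    return output


# Toy token features (three visible tokens, two NIR tokens, dimension 2).
visible = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nir = [[0.5, 0.5], [1.0, 0.0]]

self_out = attention(visible, visible, visible)  # queries, keys, values: visible
cross_out = attention(visible, nir, nir)         # queries: visible; keys, values: NIR

# Assumed fusion for illustration: add the two outputs per token.
fused = [[s + c for s, c in zip(s_row, c_row)]
         for s_row, c_row in zip(self_out, cross_out)]
print(fused)
```

In the cross-attention call, each visible token queries the NIR tokens, so structure that survives in the NIR image can be pulled into the visible representation.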

A New Dataset for Real-World Haze

To facilitate further research and provide a reliable benchmark for long-range haze removal, the researchers also introduce a new real-world dataset called VNHD (Visible-Near-infrared Haze Dataset). This dataset consists of 1519 hazy and 1442 clear pixel-aligned visible and near-infrared image pairs, along with semantic labels. Unlike previous datasets, VNHD was specifically captured with a dual-band camera to ensure precise pixel alignment between visible and near-infrared channels, making it ideal for multimodal fusion tasks. It includes scenes with varying depths, from less than 1 km to over 10 km, covering diverse real-world atmospheric conditions.
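To make the dataset description concrete, the sketch below models one sample as a visible image, a NIR image, and a semantic label map that must share the same pixel grid. The class name, field names, and the single-channel image representation are hypothetical simplifications, and the article does not describe VNHD's actual file layout. Only the pair counts come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

Grid = List[List[float]]      # simplified single-channel image
LabelMap = List[List[int]]    # one semantic class id per pixel


def grid_shape(grid: list) -> Tuple[int, int]:
    """Return (rows, cols) of a rectangular nested list."""
    return (len(grid), len(grid[0]) if grid else 0)


@dataclass
class HazeSample:
    """Hypothetical container for one visible/NIR pair with labels."""
    visible: Grid
    nir: Grid
    labels: LabelMap
    hazy: bool

    def is_pixel_aligned(self) -> bool:
        """True when both images and the label map cover the same pixel grid."""
        return (grid_shape(self.visible)
                == grid_shape(self.nir)
                == grid_shape(self.labels))


# Pair counts stated in the article.
HAZY_PAIRS = 1519
CLEAR_PAIRS = 1442
TOTAL_PAIRS = HAZY_PAIRS + CLEAR_PAIRS

sample = HazeSample(
    visible=[[0.8, 0.7], [0.6, 0.5]],
    nir=[[0.4, 0.3], [0.2, 0.1]],
    labels=[[0, 0], [1, 2]],  # assumed class ids, e.g. 0 sky, 1 building, 2 vegetation
    hazy=True,
)
print(sample.is_pixel_aligned(), TOTAL_PAIRS)
```

Pixel alignment matters because per-pixel fusion and per-pixel semantic supervision both assume that the same array index refers to the same scene point in every modality.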

Extensive experiments conducted on the VNHD dataset, as well as other existing datasets like RGB-NIRScene and RANUS, demonstrate the superior performance of HSVF compared to state-of-the-art approaches in real-world long-range haze removal. The method excels at simultaneously removing haze and enhancing fine details, leading to more visually appealing and informative images. This advancement holds significant potential for various long-range imaging applications, including video surveillance and autonomous driving, where clear vision through haze is critical.

For more in-depth information, you can read the full research paper here.

Karthik Mehta (https://blogs.edgentiq.com)
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
