
FloodVision: AI and Knowledge Graphs Combine for Precise Urban Flood Depth Estimation

TLDR: FloodVision is a zero-shot AI framework that accurately estimates urban flood depth. It integrates GPT-4o’s semantic reasoning with a domain knowledge graph (FloodKG) containing verified object dimensions. This approach identifies reference objects in images, retrieves their heights from FloodKG to prevent AI hallucination, and calculates submergence ratios. Evaluated on crowdsourced images, FloodVision achieved an 8.17 cm mean absolute error, a 20.5% improvement over a GPT-4o-only baseline, demonstrating enhanced accuracy and generalization for real-time flood response.

Urban flooding is a growing concern, causing significant damage and disrupting daily life. Accurate and timely information about floodwater depth is crucial for emergency services, road accessibility, and overall urban resilience. Traditional methods for estimating flood depth often fall short, being either too slow, spatially limited, or computationally intensive.

Recent advancements in computer vision have offered new ways to detect floods, but estimating precise water depth remains a challenge. Many existing computer vision methods struggle with accuracy and generalization because they rely on fixed object detectors and require extensive, task-specific training data. This often means they can’t adapt well to diverse flood scenarios or when specific reference objects aren’t clearly visible.

A significant hurdle for advanced AI models, particularly vision-language models (VLMs), in this domain is their tendency for “quantitative hallucination.” This means they might generate plausible but incorrect estimations for real-world object dimensions, undermining their reliability in critical applications like flood depth measurement.

To address these limitations, researchers have developed a novel framework called FloodVision. This innovative system combines the powerful semantic reasoning capabilities of a foundation vision-language model, specifically GPT-4o, with a carefully structured domain knowledge graph. The core idea is to ground the AI’s reasoning in physical reality by providing it with verified real-world dimensions of common urban objects.

FloodVision works by dynamically identifying visible reference objects in standard RGB images, such as vehicles, people, or infrastructure elements. Once identified, it retrieves their canonical heights from a specialized “FloodKG” (Flood Knowledge Graph). This knowledge graph acts as a reliable source of truth, preventing the VLM from hallucinating object dimensions. The system then estimates how much of each object is submerged and applies statistical filtering to ensure the final depth values are accurate and reliable, removing any anomalous readings.
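The per-object arithmetic described above is straightforward: each reference object implies a depth equal to its known height times its submerged fraction, and anomalous readings are then filtered out. The sketch below illustrates this under assumptions of ours, not the paper's: the function names, the example heights and fractions, and the use of a median-absolute-deviation filter as a stand-in for the paper's unspecified statistical filtering are all hypothetical.

```python
import statistics

def estimate_depth(canonical_height_cm: float, submerged_fraction: float) -> float:
    # Depth implied by one reference object: its verified height
    # times the fraction of it judged to be under water.
    return canonical_height_cm * submerged_fraction

def robust_depth(readings: list[float], k: float = 3.0) -> float:
    # Stand-in for the paper's statistical filtering step: drop
    # readings far from the median (using the scaled median absolute
    # deviation) and average the survivors.
    med = statistics.median(readings)
    mad = statistics.median(abs(r - med) for r in readings)
    if mad == 0:
        return med
    kept = [r for r in readings if abs(r - med) <= k * 1.4826 * mad]
    return statistics.mean(kept)

# Hypothetical scene: a sedan roof (~146 cm) judged 30% submerged,
# an adult (~170 cm) judged 25% submerged, two more consistent
# readings, and one anomalous one that the filter should reject.
readings = [estimate_depth(146, 0.30), estimate_depth(170, 0.25), 45.0, 44.2, 120.0]
print(round(robust_depth(readings), 1))  # ≈ 43.9 cm; the 120.0 reading is dropped
```

The filtering step matters because a single misjudged submergence fraction (here, 120.0 cm) would otherwise drag the scene-level average far from the consensus of the other objects.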

The FloodKG is a meticulously constructed repository of physical dimensions. It includes a hierarchical ontology covering vehicles (like sedans and SUVs), humans (adults, children), and infrastructure (curbs, fire hydrants). Each entry in the graph provides a mean height and standard deviation, sourced from authoritative data like vehicle specifications, anthropometric surveys, and design manuals. This ensures that the AI has access to accurate physical context.
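A minimal sketch of what such a knowledge-graph lookup could look like is below. The class name, the dictionary keys, and all the numbers are illustrative assumptions; the real FloodKG draws its values from vehicle specifications, anthropometric surveys, and design manuals, and is a richer structure than a flat dictionary.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReferenceObject:
    # One FloodKG-style entry: a canonical height (cm) with its
    # spread, placed under a coarse ontology category.
    category: str          # e.g. "vehicle", "human", "infrastructure"
    mean_height_cm: float
    std_cm: float

# Illustrative entries only (heights are plausible round numbers,
# not the verified values used by the actual FloodKG).
FLOOD_KG = {
    "sedan":        ReferenceObject("vehicle", 146.0, 5.0),
    "suv":          ReferenceObject("vehicle", 178.0, 8.0),
    "adult":        ReferenceObject("human", 170.0, 10.0),
    "curb":         ReferenceObject("infrastructure", 15.0, 2.5),
    "fire_hydrant": ReferenceObject("infrastructure", 75.0, 7.0),
}

def lookup_height(label: str) -> float:
    # The grounding step: a label detected by the VLM is resolved
    # against the graph instead of letting the model guess a dimension.
    return FLOOD_KG[label].mean_height_cm

print(lookup_height("sedan"))  # 146.0
```

The design point is that the VLM never supplies a number itself; it only supplies a label, and the dimension comes from the curated graph.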

In experiments, FloodVision was evaluated using 110 crowdsourced images from the MyCoast New York platform, where residents submit geotagged photos and flood depth estimates. The results were impressive: FloodVision achieved a mean absolute error (MAE) of 8.17 cm. This represents a significant 20.5% reduction in error compared to a GPT-4o-only baseline, which scored 10.28 cm MAE. It also outperformed earlier methods based on Convolutional Neural Networks (CNNs).
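The reported 20.5% figure follows directly from the two MAE values. A quick check, using a generic MAE helper of our own naming:

```python
def mean_absolute_error(predictions: list[float], truths: list[float]) -> float:
    # MAE over paired depth estimates and ground-truth depths (cm).
    return sum(abs(p - t) for p, t in zip(predictions, truths)) / len(predictions)

# Relative error reduction from the paper's reported numbers.
baseline_mae = 10.28      # GPT-4o-only baseline (cm)
floodvision_mae = 8.17    # FloodVision (cm)
reduction = (baseline_mae - floodvision_mae) / baseline_mae
print(f"{reduction:.1%}")  # → 20.5%
```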

The framework’s ability to generalize across varying scenes and operate in near real-time makes it highly suitable for practical applications. It could be integrated into digital twin platforms for dynamic visualization of flood conditions or citizen-reporting apps, significantly enhancing smart city flood resilience efforts. This research marks an important step towards more accurate, generalizable, and real-time urban flood depth estimation for emergency response and urban planning.


While FloodVision offers substantial improvements, the researchers acknowledge areas for future development. These include incorporating additional visual cues beyond just reference objects, such as water surface texture or reflections, and exploring few-shot or reinforcement learning to further enhance accuracy and adaptability. The full research paper can be found here.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
