
Distortion-Aware Video Inpainting for Immersive Omnidirectional Content

TLDR: DAOVI is a novel deep learning model for omnidirectional video inpainting that effectively removes unwanted objects while preserving spatial and temporal consistency. It addresses the unique geometric distortion of 360-degree videos through two key modules: Geodesic Flow-Consistent Image Propagation (GFCIP), which evaluates optical flow validity using geodesic distance, and Omnidirectional Depth-Assisted Feature Propagation (ODAFP), which propagates features using distortion-guided modulation, specialized convolutions, and depth maps. Experimental results demonstrate that DAOVI outperforms existing state-of-the-art methods in both quantitative and qualitative evaluations.

Omnidirectional videos, which capture a complete 360-degree view of surroundings, are becoming increasingly popular in applications like virtual reality (VR), augmented reality (AR), and remote sensing. While these videos offer an immersive experience, their wide field of view often leads to the capture of unwanted objects or regions. The process of removing these undesired elements and seamlessly filling the gaps is known as video inpainting.

However, a significant challenge arises because most existing video inpainting methods are designed for conventional, narrow field-of-view videos. They struggle to handle the unique geometric distortions inherent in omnidirectional videos, particularly those projected using the equirectangular projection (ERP) format. Applying these standard methods to 360-degree content often results in noticeable artifacts and visually unconvincing reconstructions, as they fail to account for the varying distortion across the spherical view.

To address this critical limitation, researchers Ryosuke Seshimo and Mariko Isogawa from Keio University, Japan, have introduced a novel deep learning model called Distortion-Aware Omnidirectional Video Inpainting (DAOVI). This innovative framework is specifically engineered to tackle the geometric distortion in omnidirectional videos, enabling the natural removal of objects while preserving both spatial and temporal consistency.

How DAOVI Works: Two Core Modules

DAOVI’s effectiveness stems from two primary modules, each designed to handle distortion in different aspects of the video inpainting process:

1. Geodesic Flow-Consistent Image Propagation (GFCIP): This module operates in the image space, focusing on propagating pixel values from adjacent frames. Traditional flow-based methods often use Euclidean distance to evaluate the reliability of optical flow (motion information). However, in omnidirectional videos, Euclidean distance in ERP pixel coordinates does not accurately represent true distances, especially near the poles where distortion is most severe. GFCIP overcomes this by evaluating flow validity using geodesic distance on a unit sphere. This ensures that only truly reliable motion vectors are used for initial pixel propagation, leading to more accurate and distortion-aware inpainting.
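To make the geodesic-distance idea concrete, here is a minimal NumPy sketch (not the authors' code; the ERP-to-sphere convention and function names are assumptions) that maps ERP pixel coordinates onto the unit sphere and measures great-circle distance:

```python
import numpy as np

def erp_to_sphere(u, v, width, height):
    """Map an ERP pixel coordinate to a unit-sphere direction vector.
    Longitude spans [-pi, pi) across the width; latitude runs from
    +pi/2 (top row) to -pi/2 (bottom row)."""
    lon = (u / width) * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (v / height) * np.pi
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def geodesic_distance(p, q, width, height):
    """Great-circle distance (radians) between two ERP pixel locations."""
    a = erp_to_sphere(*p, width, height)
    b = erp_to_sphere(*q, width, height)
    # Clip guards against floating-point drift outside arccos's domain.
    return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))

W, H = 512, 256
# Two pixels half the image width apart, near the pole vs. on the equator:
d_pole = geodesic_distance((0, 1), (256, 1), W, H)
d_equator = geodesic_distance((0, 128), (256, 128), W, H)
# d_pole is tiny while d_equator is large, even though the Euclidean
# pixel distance in ERP coordinates is identical in both cases.
```

This illustrates exactly why Euclidean distance misjudges flow validity near the poles: pixels that look far apart in the ERP image can be nearly coincident on the sphere.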

2. Omnidirectional Depth-Assisted Feature Propagation (ODAFP): Working in the feature space (a more abstract representation of video content), ODAFP propagates information from adjacent frames using deformable convolutional networks (DCN). To specifically address ERP distortion, this module incorporates several key innovations. It utilizes convolutions and padding schemes, such as circular padding, that are tailored for 360-degree images, maintaining continuity across the video’s edges and poles. Furthermore, ODAFP employs a distortion map, which quantifies the amount of ERP distortion at each pixel, to weight the DCN offsets and modulation masks. This allows the propagation to adapt dynamically to the spatially varying distortion. Crucially, ODAFP also integrates depth maps as an additional input. This depth guidance provides a more stable and reliable source of information compared to relying solely on optical flow, which can be prone to errors in masked or highly dynamic regions.
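Two of these ingredients can be sketched in a few lines. The following is a hypothetical illustration, not the paper's implementation: a cos-latitude map is one common way to quantify ERP distortion per pixel, and wrap-around padding preserves continuity across the left/right seam of the panorama.

```python
import numpy as np

def erp_distortion_map(height, width):
    """Per-pixel ERP distortion weight, cos(latitude): ~1 at the equator,
    approaching 0 at the poles where ERP stretches content the most."""
    v = np.arange(height) + 0.5                  # pixel-center rows
    lat = np.pi / 2.0 - (v / height) * np.pi
    return np.repeat(np.cos(lat)[:, None], width, axis=1)

def circular_pad_lr(feat, pad):
    """Wrap-around (circular) padding along the horizontal axis, so a
    convolution sees the 360-degree continuity across the ERP seam."""
    return np.concatenate([feat[:, -pad:], feat, feat[:, :pad]], axis=1)

# A map like this could modulate DCN offsets/masks by local distortion:
dmap = erp_distortion_map(256, 512)
feat = np.random.rand(256, 512)
padded = circular_pad_lr(feat, pad=1)            # shape (256, 514)
```

The key design point is that both operations are cheap, deterministic functions of pixel position alone, so they can guide learned propagation without adding trainable parameters.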


Performance and Impact

The DAOVI model was rigorously evaluated on the ODV360 omnidirectional video dataset and compared against several state-of-the-art video inpainting methods, including FuseFormer, STTN, and ProPainter. The experimental results demonstrated that DAOVI consistently outperformed these existing methods across all quantitative metrics, including PSNR, SSIM, and specialized omnidirectional metrics like WS-PSNR and WS-SSIM, as well as perceptual quality (VFID).
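WS-PSNR differs from plain PSNR by weighting each pixel's squared error by cos(latitude), compensating for the oversampling of polar regions in ERP. A minimal sketch follows (the function name is assumed, and this implements the standard WS-PSNR weighting rather than code from the paper):

```python
import numpy as np

def ws_psnr(ref, dist, max_val=255.0):
    """Weighted-to-spherically-uniform PSNR for ERP images: squared error
    is weighted by cos(latitude), so oversampled pole rows count less."""
    h, w = ref.shape[:2]
    v = np.arange(h) + 0.5
    row_w = np.sin(np.pi * v / h)                # equals cos(latitude)
    wmap = np.broadcast_to(row_w[:, None], (h, w))
    err = (ref.astype(np.float64) - dist.astype(np.float64)) ** 2
    if err.ndim == 3:
        err = err.mean(axis=2)                   # average over channels
    wmse = (wmap * err).sum() / wmap.sum()
    return 10.0 * np.log10(max_val ** 2 / wmse)
```

Because the weights vanish toward the poles, an inpainting error near a pole, which occupies little actual spherical area, is penalized far less than the same error at the equator.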

Qualitative comparisons further highlighted DAOVI’s superiority, producing visually plausible results with improved structural consistency and significantly fewer artifacts compared to methods not designed for omnidirectional content. This indicates that by explicitly accounting for geometric distortion, DAOVI can generate much more realistic and seamless video completions.

In conclusion, DAOVI represents a significant advancement in the field of video inpainting for immersive media. By explicitly addressing the unique challenges posed by omnidirectional video distortion, it offers a robust and effective solution for content creators and researchers alike. For more technical details, refer to the full research paper.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
