Filling Gaps: 2D Gaussian Splatting for Coherent Image Inpainting

TLDR: This paper introduces a new method for image inpainting using 2D Gaussian Splatting (2DGS) and semantic alignment. Unlike traditional methods that struggle with pixel-level coherence, 2DGS encodes incomplete images into a continuous field of Gaussian coefficients, enabling smoother and more consistent image reconstruction. The framework incorporates a patch-wise rasterization strategy for efficiency and leverages DINO features for global semantic consistency, ensuring that inpainted regions blend naturally with the surrounding scene. Experiments show competitive performance in both quantitative metrics and visual quality on standard benchmarks.

Image inpainting, the process of filling in missing or corrupted parts of an image, has long been a challenging task in computer vision. Traditional methods often struggle to produce results that are both locally coherent at the pixel level and globally consistent in meaning and context. This is largely due to the inherently discrete nature of digital images and the pixel-based operations of many neural networks.

A new research paper titled “2D Gaussian Splatting with Semantic Alignment for Image Inpainting” by Hongyu Li, Chaofeng Chen, Xiaoming Li, and Guangming Lu introduces an approach that leverages 2D Gaussian Splatting (2DGS) to overcome these limitations. Gaussian Splatting represents a scene or image as a set of continuous Gaussian primitives rather than as discrete samples, and it has previously shown promise in 3D scene modeling and 2D image super-resolution.

A Continuous Approach to Image Inpainting

The core idea behind this new framework is to encode incomplete images into a continuous field of 2D Gaussian splat coefficients. Instead of directly synthesizing missing pixels, the method learns these Gaussian parameters from the available image data. The final image is then reconstructed through a differentiable rasterization process. This continuous rendering paradigm naturally promotes pixel-level coherence, leading to smoother and more realistic inpainted results.
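To make the idea of rendering from Gaussian parameters concrete, here is a minimal NumPy sketch that splats a handful of anisotropic 2D Gaussians onto a pixel grid. It is an illustration under stated assumptions, not the paper's rasterizer: the function name `render_gaussians`, the parameterization (center, per-axis scale, rotation angle, color, opacity), and the plain additive accumulation are all choices made for this example.

```python
import numpy as np

def render_gaussians(means, scales, thetas, colors, opacities, height, width):
    """Render N anisotropic 2D Gaussians onto a (height, width, C) image.

    Hypothetical parameterization chosen for this sketch:
      means     (N, 2)  centers in (x, y) pixel coordinates
      scales    (N, 2)  standard deviations along each Gaussian's own axes
      thetas    (N,)    rotation angles in radians
      colors    (N, C)  per-Gaussian color
      opacities (N,)    per-Gaussian peak weight
    """
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    pixels = np.stack([xs, ys], axis=-1)  # (H, W, 2) in (x, y) order
    image = np.zeros((height, width, colors.shape[1]))

    for mu, s, th, c, a in zip(means, scales, thetas, colors, opacities):
        # Covariance = R diag(s^2) R^T, built from rotation and scale.
        rot = np.array([[np.cos(th), -np.sin(th)],
                        [np.sin(th),  np.cos(th)]])
        cov = rot @ np.diag(s ** 2) @ rot.T
        inv_cov = np.linalg.inv(cov)

        # Squared Mahalanobis distance of every pixel to the center.
        d = pixels - mu
        maha = np.einsum("hwi,ij,hwj->hw", d, inv_cov, d)

        # Additive accumulation: each pixel is a smooth function of the
        # Gaussian parameters, so the output is differentiable in them.
        weight = a * np.exp(-0.5 * maha)
        image += weight[..., None] * c
    return image

# One isotropic Gaussian centered on a 9x9 grid.
img = render_gaussians(
    means=np.array([[4.0, 4.0]]),
    scales=np.array([[2.0, 2.0]]),
    thetas=np.array([0.0]),
    colors=np.array([[1.0]]),
    opacities=np.array([1.0]),
    height=9,
    width=9,
)
```

Because every pixel value is a sum of smooth functions, neighboring pixels vary continuously, which is the intuition behind the pixel-level coherence described above. In a learned system the loop would be replaced by a batched, autograd-enabled implementation.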

One of the significant challenges with high-resolution image processing is the computational overhead and memory consumption. To address this, the researchers introduced a patch-wise rasterization strategy. This approach divides the image into smaller, manageable segments, each with its own set of Gaussians. This significantly reduces GPU memory demands and accelerates inference by allowing parallel processing of patches. To prevent visible seams at patch boundaries, an overlap strategy with blending techniques is employed, ensuring spatial continuity across the entire image.
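The overlap-and-blend idea can be sketched as follows. This is a generic feathered tiling routine, assuming a hypothetical per-patch callable `process` that stands in for whatever renders a single patch; the helper names, the triangular blending window, and the way the last patch is aligned to the image border are assumptions of this example, and the paper's exact blending scheme may differ.

```python
import numpy as np

def patch_starts(length, patch, stride):
    """Start offsets so that patches of size `patch` cover [0, length)."""
    if length < patch:
        raise ValueError("image side must be at least the patch size")
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)  # final patch flush with the border
    return starts

def blend_patches(image, patch, overlap, process):
    """Apply `process` to overlapping square patches and feather the seams."""
    if not 0 <= overlap < patch:
        raise ValueError("overlap must satisfy 0 <= overlap < patch")
    height, width = image.shape[:2]
    stride = patch - overlap
    extra = (1,) * (image.ndim - 2)  # broadcast over channels, if any

    # Triangular window: largest in the patch center, smallest at its
    # edges, and strictly positive so every pixel gets nonzero weight.
    ramp = np.minimum(np.arange(1, patch + 1), np.arange(patch, 0, -1))
    window = np.outer(ramp, ramp).astype(np.float64).reshape((patch, patch) + extra)

    out = np.zeros(image.shape, dtype=np.float64)
    weight_sum = np.zeros((height, width) + extra)
    for y in patch_starts(height, patch, stride):
        for x in patch_starts(width, patch, stride):
            tile = image[y:y + patch, x:x + patch]
            out[y:y + patch, x:x + patch] += window * process(tile)
            weight_sum[y:y + patch, x:x + patch] += window
    # Weighted average of all patches that cover each pixel.
    return out / weight_sum

# With an identity `process`, blending reproduces the input exactly.
demo = np.arange(11 * 13 * 3, dtype=np.float64).reshape(11, 13, 3)
restored = blend_patches(demo, patch=4, overlap=1, process=lambda t: t)
```

Each patch is independent of the others inside the loop, which is what allows patches to be processed in parallel and keeps peak memory proportional to the patch size rather than the full image.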

Semantic Alignment for Global Consistency

Maintaining global semantic consistency is crucial for believable inpainting, especially when dealing with large missing regions. The paper tackles this by incorporating features from a pretrained DINO model. DINO is a vision transformer trained with self-supervised self-distillation (no labels), and its features are known for their robustness and ability to capture high-level semantic information. The researchers observed that DINO’s global features are naturally resilient to small missing areas and can be effectively adapted to guide semantic alignment even in scenarios with large masks.

To make these DINO features more effective for masked inputs, a lightweight feature adaptation module is proposed. This module transforms potentially noisy features from masked images into semantically coherent representations, which then serve as conditional inputs to the inpainting network. The integration is achieved using Adaptive Layer Normalization (AdaLN), a parameter-efficient mechanism that modulates network activations globally, ensuring the inpainted content remains contextually consistent with the surrounding scene.
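As a rough illustration of how AdaLN injects a global condition, the sketch below normalizes each token and then applies a scale and shift predicted from a conditioning vector. It follows the form commonly used in conditional transformers, `LN(x) * (1 + scale) + shift`; the paper's exact formulation may differ, and the names `adaln`, `w_scale`, `b_scale`, `w_shift`, and `b_shift` are hypothetical. The random vector `cond` stands in for an adapted global DINO feature.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each token (last axis) to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def adaln(x, cond, w_scale, b_scale, w_shift, b_shift):
    """Adaptive layer norm: scale and shift are functions of `cond`.

    x    (tokens, dim)  network activations
    cond (cond_dim,)    global conditioning vector
    """
    scale = cond @ w_scale + b_scale  # (dim,)
    shift = cond @ w_shift + b_shift  # (dim,)
    # The same scale and shift are applied to every token, so the
    # condition modulates the activations globally.
    return layer_norm(x) * (1.0 + scale) + shift

rng = np.random.default_rng(0)
tokens, dim, cond_dim = 5, 8, 4
x = rng.normal(size=(tokens, dim))
cond = rng.normal(size=cond_dim)  # placeholder for a semantic feature

# Zero-initialized projections make AdaLN start out as a plain LayerNorm.
zeros_w = np.zeros((cond_dim, dim))
zeros_b = np.zeros(dim)
y = adaln(x, cond, zeros_w, zeros_b, zeros_w, zeros_b)
```

The mechanism is parameter-efficient because only the two small projections depend on the condition, while the rest of the network is left unchanged.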

Experimental Validation and Future Directions

Extensive experiments were conducted on the standard CelebA-HQ and Places2 benchmarks. The results demonstrate that the proposed method achieves competitive performance in both quantitative metrics (such as FID and LPIPS) and perceptual quality. Qualitative comparisons show that the method produces visually coherent and semantically plausible completions, often outperforming existing techniques that may exhibit artifacts or inconsistencies.

Ablation studies further validated the effectiveness of each key component, including the DINO-based semantic guidance, the rasterization-based decoder, and the AdaLN module. The research establishes a new direction for applying Gaussian Splatting to 2D image processing, highlighting its strong potential for realistic image restoration and broader visual synthesis tasks. For more technical details, refer to the full research paper.

While the current approach delivers strong results, the authors note that it currently lacks explicit controllability, which is often found in methods benefiting from multimodal inputs like textual prompts. Enhancing the framework with cross-modal conditioning mechanisms is identified as a compelling direction for future exploration.
