TLDR: ReLumix is a novel framework that enables any image relighting technique to be applied to videos. It works by allowing an artist to relight a single reference frame, and then uses a fine-tuned Stable Video Diffusion (SVD) model to consistently propagate this new illumination across the entire video sequence. Trained efficiently on synthetic data, ReLumix demonstrates strong generalization to real-world videos, offering a flexible, fast, and high-quality solution for dynamic lighting control in video post-production.
Controlling the lighting in videos during post-production has long been a difficult problem in computational photography. Existing methods often tie users to a specific relighting model, making it hard to achieve a desired artistic effect or to adopt newer techniques. ReLumix is designed to remove that restriction.
ReLumix separates the relighting algorithm from the task of enforcing temporal consistency across frames. As a result, any image relighting technique, whether it is based on diffusion models or on physics-based renderers, can be applied to video sequences.
How ReLumix Works: A Two-Stage Process
The core of ReLumix lies in its simple yet highly effective two-stage process:
- An artist first selects a single reference frame from a video and relights it using their preferred image-based technique. This could involve changing the color, intensity, or direction of light to achieve a specific mood or effect.
- Once the reference frame is relit, a fine-tuned Stable Video Diffusion (SVD) model takes over. This model propagates the new target illumination across all the remaining frames of the video, keeping the lighting changes consistent and free from flickering or other artifacts.
To achieve this seamless propagation and maintain temporal coherence, ReLumix incorporates several key innovations. These include a gated cross-attention mechanism for smooth blending of features, and a temporal bootstrapping strategy that leverages the powerful motion understanding of SVD models. This allows the system to learn how light and shadow interact with objects, rather than just memorizing specific lighting conditions.
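To make the gating idea concrete, here is a minimal sketch of one common form of gated cross-attention: a residual branch scaled by a tanh-squashed scalar gate. The article does not give ReLumix's exact formulation, so the function names, the scalar gate, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: List[Vector], keys: List[Vector],
                    values: List[Vector]) -> List[Vector]:
    """Scaled dot-product attention of each query over all keys.

    Learned projection matrices are omitted (assumption) to keep the
    sketch short; a real layer would project queries, keys, and values.
    """
    dim = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def gated_cross_attention(frame_tokens: List[Vector],
                          ref_tokens: List[Vector],
                          gate: float) -> List[Vector]:
    """Blend reference-frame features into frame features.

    output = frame + tanh(gate) * CrossAttention(frame, reference)
    `gate` stands in for a learnable scalar (hypothetical name).
    """
    attended = cross_attention(frame_tokens, ref_tokens, ref_tokens)
    g = math.tanh(gate)
    return [[x + g * a for x, a in zip(tok, att)]
            for tok, att in zip(frame_tokens, attended)]


# Toy usage: two frame tokens attend to one relit-reference token.
frame = [[1.0, 0.0], [0.0, 1.0]]
reference = [[0.5, 0.5]]
unchanged = gated_cross_attention(frame, reference, gate=0.0)
blended = gated_cross_attention(frame, reference, gate=1.0)
```

With the gate at zero the branch contributes nothing, so the pretrained features pass through unchanged. That property is one reason this kind of gate is often used when adding new conditioning to a pretrained model.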
Training and Generalization
One of the remarkable aspects of ReLumix is its ability to generalize effectively to real-world videos, even though it is primarily trained on synthetic data. The researchers used the CARLA simulator to generate a vast dataset called CARLA Relight. This dataset allowed them to record the same scenes under various lighting and weather conditions, isolating illumination as the only variable. This efficient training approach means the model can be fine-tuned in a relatively short time (around 12 hours on a single H100 GPU) and still perform robustly on diverse real-world footage without needing further fine-tuning.
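One way to picture how such a dataset isolates illumination is to pair clips that share a scene but differ in lighting. The sketch below is only an illustration: the record fields (`scene`, `lighting`, `path`), the file names, and the `build_pairs` helper are hypothetical and are not taken from the CARLA Relight dataset or the CARLA API.

```python
from collections import defaultdict
from itertools import permutations
from typing import Dict, List, Tuple

# Hypothetical clip records: same scene ID means identical geometry and motion.
clips: List[Dict[str, str]] = [
    {"scene": "scene_a", "lighting": "noon", "path": "scene_a_noon.mp4"},
    {"scene": "scene_a", "lighting": "sunset", "path": "scene_a_sunset.mp4"},
    {"scene": "scene_a", "lighting": "rain", "path": "scene_a_rain.mp4"},
    {"scene": "scene_b", "lighting": "noon", "path": "scene_b_noon.mp4"},
]


def build_pairs(records: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (source, target) clip paths that differ only in lighting."""
    by_scene = defaultdict(list)
    for record in records:
        by_scene[record["scene"]].append(record)

    pairs = []
    for scene_clips in by_scene.values():
        # Ordered pairs: each lighting condition can act as source or target.
        for src, tgt in permutations(scene_clips, 2):
            if src["lighting"] != tgt["lighting"]:
                pairs.append((src["path"], tgt["path"]))
    return pairs


training_pairs = build_pairs(clips)
```

A scene recorded under only one condition (here `scene_b`) yields no pairs, because there is no second lighting condition to relight towards.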
Performance and Efficiency
In experiments, ReLumix achieves state-of-the-art results in video relighting, outperforming existing methods in consistency, visual quality, and adaptability. It reports a 9x speed-up over I2VEdit and a 6x speed-up over Light-A-Video while maintaining temporal consistency and visual fidelity. In human evaluations, users showed a strong preference for ReLumix’s results, supporting the claims about visual quality and temporal stability.
The framework’s modular design is a major strength, allowing it to integrate with various relighting techniques. This means it can enhance workflows that are driven by text prompts, physics-based rendering, or image-based edits, extending their capabilities from static images to dynamic video.
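The modular design can be sketched as a small interface in which the image relighter and the propagation step are interchangeable callables. Everything here is an assumption for illustration: frames are toy lists of intensities, and `gain_propagate` is a deliberately simple stand-in (a global brightness gain), not the fine-tuned SVD model that ReLumix actually uses.

```python
from typing import Callable, List

Frame = List[float]  # toy frame: flat list of pixel intensities in [0, 1]
Relighter = Callable[[Frame], Frame]
Propagator = Callable[[List[Frame], int, Frame], List[Frame]]


def dim_relighter(frame: Frame) -> Frame:
    """Hypothetical stage-1 relighter: halve every intensity."""
    return [0.5 * p for p in frame]


def bright_relighter(frame: Frame) -> Frame:
    """A second hypothetical relighter, to show the stage is swappable."""
    return [min(1.0, 1.5 * p) for p in frame]


def gain_propagate(frames: List[Frame], ref_index: int,
                   relit_ref: Frame) -> List[Frame]:
    """Toy stage-2 stand-in: apply the reference frame's mean gain everywhere."""
    ref = frames[ref_index]
    mean_ref = sum(ref) / len(ref)
    mean_relit = sum(relit_ref) / len(relit_ref)
    gain = mean_relit / mean_ref if mean_ref > 0 else 1.0
    return [[min(1.0, gain * p) for p in frame] for frame in frames]


def relight_video(frames: List[Frame], ref_index: int,
                  relight_image: Relighter,
                  propagate: Propagator) -> List[Frame]:
    """Stage 1: relight one reference frame. Stage 2: propagate to all frames."""
    relit_ref = relight_image(frames[ref_index])
    return propagate(frames, ref_index, relit_ref)


video = [[0.2, 0.4], [0.4, 0.8], [0.6, 1.0]]
dimmed = relight_video(video, 0, dim_relighter, gain_propagate)
brightened = relight_video(video, 0, bright_relighter, gain_propagate)
```

Swapping `dim_relighter` for `bright_relighter` changes the look without touching the propagation code, which is the kind of separation the framework is described as providing.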
Future Directions
While ReLumix represents a significant leap forward, the researchers acknowledge certain limitations. The current approach relies on a single, static reference frame. This can create challenges when dealing with videos that have significant camera motion or parallax, as new parts of the scene are revealed that weren’t present in the initial frame. Additionally, the framework is not yet equipped to handle dynamic light sources within the scene, such as moving spotlights or vehicle headlights.
Despite these limitations, ReLumix is an important step towards practical and controllable video relighting. By combining large-scale generative models with classic graphics principles and artist-in-the-loop workflows, it promises to change how lighting is handled in digital content creation. The full research paper has further details.


