TLDR: Google has launched Veo 3.1 and Veo 3.1 Fast, significant updates to its AI video model, now available in paid preview via the Gemini API and other platforms. The new versions feature richer native audio, longer and more coherent clip generation (up to one minute), improved prompt adherence, and advanced cinematic controls like scene extension and first-and-last frame transitions. These enhancements aim to make AI-driven video creation more practical and powerful for developers and creators.
Google has rolled out a major upgrade to its generative AI video model, introducing Veo 3.1 and a faster variant, Veo 3.1 Fast. These updated models are now accessible in paid preview for developers and creators through the Gemini API, Google AI Studio, and Vertex AI, and are integrated into Google’s Flow editing application. The release, dated October 16, 2025, marks an evolutionary step from the earlier Veo 3 model, focusing on three core improvements: richer native audio, advanced scene and shot control, and significant quality and length enhancements.
One of the most notable advancements in Veo 3.1 is its expanded native audio support. Veo 3 already offered synchronized sound; the new iteration generates contextual audio, including natural conversations, ambient sounds, and effects, as a built-in output across more of its features. This reduces the need for separate sound design in post-production, streamlining the creative workflow. Features such as ‘Ingredients to Video,’ ‘Frames to Video,’ and ‘Scene Extension,’ which previously produced silent video, now include generated sound, making the final product more complete.
Veo 3.1 also boasts tighter prompt adherence, meaning the generated videos more accurately reflect the written and visual inputs provided by users. Google states that the AI now ‘thinks’ more like a real director and editor, having been trained to understand cinematic principles such as camera motion, lighting, and pacing. This results in videos with film-like rhythm, consistent timing between shots, smoother transitions, and logical framing, contributing to a more natural and cinematic feel.
For narrative control and longer content creation, Veo 3.1 introduces several key features. It supports substantially longer single clips, with Google and its partners demonstrating outputs up to one minute for certain generation modes, and targets 1080p output as a baseline. The ‘Scene Extension’ feature allows users to continue generating a clip from a previously made video, maintaining continuity. Additionally, the ‘Frames to Video’ function enables creators to define a starting and ending still image, with Veo seamlessly filling in the motion and transitions between them. Users can also guide the visual style using up to three ‘reference images’ with the ‘Ingredients to Video’ feature.
The pricing for Veo 3.1 remains consistent with its predecessor, Veo 3, at $0.40 per second for the standard model and $0.15 per second for Veo 3.1 Fast. This strategic update positions Veo 3.1 as a versatile tool for various applications, from rapid prototyping to higher-fidelity production workflows, and is expected to empower storytellers, brands, and developers in the evolving landscape of AI-driven video content creation.
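To make the per-second pricing concrete, here is a minimal Python sketch that estimates the cost of a generation job from the two rates quoted above. The dictionary keys and the `estimate_cost` helper are hypothetical names chosen for illustration, not official model identifiers or part of any Google SDK, and the sketch assumes billing is simply price times generated seconds.

```python
from decimal import Decimal

# Per-second prices in USD, as quoted in the article.
# The keys are illustrative labels, not official API model IDs.
PRICE_PER_SECOND = {
    "veo-3.1": Decimal("0.40"),
    "veo-3.1-fast": Decimal("0.15"),
}


def estimate_cost(model: str, seconds: int, clips: int = 1) -> Decimal:
    """Estimate the cost in USD of generating `clips` clips of `seconds` each.

    Assumes cost = price per second * seconds * number of clips.
    """
    if model not in PRICE_PER_SECOND:
        raise ValueError(f"unknown model label: {model}")
    return PRICE_PER_SECOND[model] * seconds * clips


print(estimate_cost("veo-3.1", 8))        # 3.20
print(estimate_cost("veo-3.1-fast", 60))  # 9.00
```

Under these assumptions, a one-minute clip would cost $24.00 on the standard model and $9.00 on Veo 3.1 Fast, which is why the faster variant suits rapid prototyping.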


