Key Points
- Veo 3.1 improves “prompt adherence” and supports native audio generation across tools.
- New Flow features like “Frame to Video” and “Extend” offer more control and longer clips.
- The model supports multiple inputs—text, image, and video—and outputs up to 1080p video.
- Pricing remains unchanged from Veo 3, but there is no free tier for Veo 3.1.
- Early reactions praise new controls but note Sora 2 still leads in realism.
Veo 3.1 brings better prompt control and audio generation
Google unveiled Veo 3.1 on October 16, 2025, marking a major update to its AI-powered video generation model. Available now through the Gemini API and Flow editor, the release improves how Veo interprets prompts, handles image inputs, and produces synchronized audio.

According to Google, Veo 3.1 builds on the foundation of Veo 3 (first announced at Google I/O 2025), offering stronger “prompt adherence” and the ability to turn uploaded images into smooth, consistent videos. The new model also generates matching audio alongside the video, reducing the need for manual sound design.
In Flow, Google’s AI-assisted filmmaking app, Veo 3.1 powers features such as “Frame to Video,” which interpolates scenes between a first and last frame, and “Ingredients to Video,” which lets users combine visual elements from multiple sources. This new level of narrative control extends to dialogue and ambient sound, creating more cohesive and expressive clips.
Expanded editing and enterprise integration
Beyond creative upgrades, Veo 3.1 introduces workflow features for developers and enterprise teams. The model now accepts text, still images, and video clips as inputs and supports up to three reference images to guide visual style or product appearance.
Enterprise-focused tools like “Scene Extension,” “Insert,” and “Remove” allow users to lengthen footage, add or delete objects, and ensure continuity across clips. While some functions are exclusive to Flow, others are accessible via the Gemini API or will roll out to Vertex AI soon.
Output quality has also improved. Veo 3.1 delivers 720p or 1080p resolution video at 24 frames per second, with default clip durations of 4 to 8 seconds. Using the “Extend” feature, videos can now stretch up to 148 seconds, offering longer-form storytelling without external tools.
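To make the stated limits concrete, here is a minimal sketch of a pre-flight check a team might run before submitting a generation request. It encodes only the figures quoted above (720p or 1080p output, 4 to 8 seconds per non-extended clip, up to three reference images). The `VideoRequest` class, its field names, and its default values are hypothetical and are not part of the Gemini API or any Google SDK.

```python
from dataclasses import dataclass, field

# Limits quoted in the article; every name below is a hypothetical illustration.
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
MIN_SECONDS, MAX_SECONDS = 4, 8
MAX_REFERENCE_IMAGES = 3


@dataclass
class VideoRequest:
    prompt: str
    resolution: str = "720p"        # arbitrary default chosen for the sketch
    duration_seconds: int = 8       # arbitrary default chosen for the sketch
    reference_images: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable descriptions of any violated limits."""
        found = []
        if not self.prompt.strip():
            found.append("prompt is empty")
        if self.resolution not in ALLOWED_RESOLUTIONS:
            found.append(f"unsupported resolution: {self.resolution}")
        if not MIN_SECONDS <= self.duration_seconds <= MAX_SECONDS:
            found.append(f"duration must be {MIN_SECONDS}-{MAX_SECONDS} seconds")
        if len(self.reference_images) > MAX_REFERENCE_IMAGES:
            found.append(f"at most {MAX_REFERENCE_IMAGES} reference images allowed")
        return found


request = VideoRequest(
    prompt="A product shot of a ceramic mug rotating on a turntable",
    resolution="1080p",
    duration_seconds=6,
    reference_images=["mug_front.png", "mug_side.png"],
)
print(request.problems() or "request is within the stated limits")
```

Checking locally like this avoids spending a paid API call on a request that the stated limits would rule out anyway.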
Pricing and access
Despite its expanded features, Veo 3.1 maintains the same pricing as its predecessor. Through the Gemini API, users can choose between:
- Standard model: $0.40 per second of video
- Fast model: $0.15 per second
There is no free tier for Veo 3.1 in the API, though charges apply only to successfully generated clips. Flow users, however, gain these capabilities through their existing subscriptions, making the platform accessible to hobbyists and professionals alike.
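As a rough illustration of what the per-second rates mean in practice, the sketch below multiplies clip length by the two quoted prices. The tier labels and the `estimate_cost` helper are hypothetical. The 148-second figure also assumes extended footage is billed at the same per-second rate, which the pricing above does not state.

```python
from decimal import Decimal

# Per-second rates quoted in the article (USD); the tier keys are hypothetical labels.
RATES_USD_PER_SECOND = {
    "standard": Decimal("0.40"),
    "fast": Decimal("0.15"),
}


def estimate_cost(seconds: int, tier: str = "standard") -> Decimal:
    """Estimate the charge for one successfully generated clip."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return RATES_USD_PER_SECOND[tier] * seconds


# Compare a single 8-second clip with a clip extended to the 148-second maximum.
for seconds in (8, 148):
    for tier in RATES_USD_PER_SECOND:
        print(f"{seconds:>3}s {tier:<8} ${estimate_cost(seconds, tier):.2f}")
```

Under these assumptions, an 8-second clip comes to $3.20 on the standard model and $1.20 on the fast model, while a 148-second clip would come to $59.20 and $22.20 respectively.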
Performance and early reactions
Reviewers and early adopters describe Veo 3.1 as a technically impressive but still imperfect step forward. Tests show more cinematic, polished results compared to previous versions, though some users note that its visuals remain more “artificial” than those of OpenAI’s Sora 2.
AI creator Matt Shumer called Veo 3.1 “noticeably worse than Sora 2” in realism but praised its strong editing features. Others, like 3D artist Travis Davids, highlighted improved audio but criticized the lack of custom voice options and the 8-second generation limit for non-extended clips. Meanwhile, AI newsletter writer @kimmonismus described it as “amazing,” albeit still behind OpenAI’s offering.
Despite mixed reviews, adoption remains strong. Google reports that over 275 million videos have been generated across Veo models since Flow’s launch in May 2025.
Safety and responsible AI design
All Veo 3.1 videos are automatically watermarked using SynthID, Google’s invisible marker that confirms AI origin. The company applies moderation filters across its APIs and stores generated content temporarily, deleting it after two days unless downloaded. These safeguards aim to support ethical use and transparency, particularly for enterprise clients in regulated industries.
What’s next
Google says Veo’s roadmap focuses on deeper multimodal integration and expanded enterprise deployment through Vertex AI. Future updates are expected to refine realism, extend duration limits, and introduce custom voice generation—addressing some of the most frequent user requests.
As the AI video landscape accelerates, Veo 3.1 positions Google not as the most realistic player, but as one prioritizing creative control and scalability—key advantages for developers and content teams building the next wave of AI-driven storytelling.