AI MODEL

VEO 3.1 T2V

Veo 3.1 is a text-to-video model from Google DeepMind that turns natural-language prompts into cinematic video clips with synchronized audio. It supports horizontal (16:9) and vertical (9:16) aspect ratios and output resolutions up to 1080p. Users can target durations of up to around 60 seconds; individual generations are short clips, and longer sequences are built with scene extension. Additional controls include multiple reference images and first-and-last-frame guidance. The audio track covers ambient sound, music, and lip-synced dialogue, and the model emphasizes prompt fidelity, visual realism, and temporal coherence across frames.

Release Date

October 15, 2025

Developer

Google

Model Type

Text to Video (T2V)

VEO 3.1 T2V Prompts & Outputs