Veo
Veo is Google DeepMind's text-to-video model, producing high-quality 1080p video clips from text descriptions with cinematic quality and improved physics understanding.
June 18, 2026

Veo is Google DeepMind's text-to-video AI model, capable of generating high-quality 1080p video clips from text prompts with strong cinematic coherence and physics understanding.
How it works
Veo uses a latent diffusion architecture trained on large amounts of video data. Given a text prompt, it generates a video sequence by iteratively denoising a compressed (latent) representation of the clip rather than drawing frames one at a time, with attention to temporal consistency, physically plausible motion, and cinematic composition. Google's expertise in multimodal AI, which combines language understanding with visual generation, gives Veo strong prompt interpretation and scene construction.
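To make the idea of latent diffusion concrete, here is a toy sketch of a sampling loop. It is not Veo's implementation, which is not public in this form. The latent shape, the `target_latent` stand-in for a trained text-conditioned network, and the simple update rule are all assumptions chosen for illustration. It only shows the general pattern: start from random noise and refine a compressed representation of the whole clip over several steps.

```python
import random

# Tiny, hypothetical latent "video" shape: frames x height x width.
FRAMES, HEIGHT, WIDTH = 4, 2, 3


def target_latent(prompt):
    """Stand-in for a trained, text-conditioned denoising network.

    A real model would predict the clean latent from the noisy latent and
    a text embedding. Here we just derive a fixed pattern from the prompt.
    """
    rng = random.Random(prompt)  # deterministic for a given prompt string
    return [rng.uniform(-1.0, 1.0) for _ in range(FRAMES * HEIGHT * WIDTH)]


def sample_latent(prompt, steps=10, seed=0):
    """Start from pure noise and denoise all frames of the clip together."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(FRAMES * HEIGHT * WIDTH)]
    predicted_clean = target_latent(prompt)
    for step in range(steps):
        # Move a fraction of the remaining distance toward the prediction.
        # The fraction grows each step and reaches 1.0 on the final step.
        fraction = 1.0 / (steps - step)
        latent = [x + fraction * (c - x) for x, c in zip(latent, predicted_clean)]
    # A real system would now pass the latent through a decoder to get pixels.
    return latent


latent = sample_latent("a paper boat drifting down a rainy street")
print(len(latent))  # FRAMES * HEIGHT * WIDTH latent values
```

The loop updates every frame's latent values in the same pass, which is one reason diffusion-based video models can keep motion consistent across frames.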
Why it matters
Veo represents Google's entry into the AI video generation race alongside OpenAI's Sora and other models. Its emphasis on physics understanding — how objects should move, interact, and respond to forces — addresses one of the key weaknesses in earlier AI video, where impossible physics broke immersion.
Key features
- 1080p output — full HD resolution suitable for web and social media.
- Cinematic styles — supports various film styles, camera movements, and lighting setups via prompt.
- Physics-aware motion — better understanding of how physical objects move and interact.
- Extended durations — longer clips than early AI video models.
In eaxy
Eaxy monitors the AI video landscape and selects best-fit models for each generation task. While Veo and Sora are notable models, eaxy's pipeline uses available, proven models like Kling 3 that deliver excellent results for the clip durations and styles that creators need today.
Frequently asked questions
What is Google Veo?
Veo is Google DeepMind's text-to-video AI model. It generates 1080p video clips from text prompts and is known for cinematic quality, improved physics understanding, and support for various visual and cinematic styles.
How does Veo compare to Sora?
Both Veo and Sora are high-quality text-to-video models. Veo emphasizes physics understanding and 1080p output, while Sora is known for longer clip durations. Eaxy uses best-fit models to deliver quality video regardless of which specific model leads at any time.
What resolution does Veo generate?
Veo generates video at 1080p (1920×1080) resolution, which is suitable for most web and social media content. Higher resolutions may require upscaling.
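As a quick illustration of what upscaling involves, the sketch below computes how much a 1080p frame would need to be enlarged to reach 4K UHD (3840×2160). The target resolution and the helper name `upscale_factor` are assumptions for this example, not part of any Veo or eaxy API.

```python
def upscale_factor(source, target):
    """Return the (horizontal, vertical) scale needed to go from source to target."""
    (src_w, src_h), (dst_w, dst_h) = source, target
    return dst_w / src_w, dst_h / src_h


veo_1080p = (1920, 1080)
uhd_4k = (3840, 2160)  # assumed target resolution for this example

print(upscale_factor(veo_1080p, uhd_4k))  # (2.0, 2.0)

# Ratio of total pixels per frame between the two resolutions.
pixel_ratio = (uhd_4k[0] * uhd_4k[1]) // (veo_1080p[0] * veo_1080p[1])
print(pixel_ratio)  # 4
```

Doubling each dimension means an upscaler has to produce four times as many pixels per frame as the original 1080p output contains.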