
Veo 3 & Veo 3.1 by Google

Feature Release: Veo 3 & Veo 3.1, the new state-of-the-art video models, now on Leonardo.Ai

Written by Ayumi Umehara
Updated today

Introduction

Leonardo.Ai brings best-in-class video generation with Veo 3 and Veo 3.1 by Google.

Veo 3 delivers enhanced realism, improved physics, and native audio generation (including dialogue), so you can create fully realised videos without post-production. Veo 3.1 builds on this foundation with sharper realism, smarter motion, enhanced expression, and greater creative control over how your story unfolds.

Read on to find out why they're great for video creation and storytelling, how the models differ, and how to get the best results out of them.


What’s the Difference Between Veo 3 and Veo 3.1?

Veo 3.1 builds on Veo 3 with additional creative control and improved realism. Faster versions of each model are available with Veo 3 Fast and Veo 3.1 Fast, for more affordable and rapid ideation.

Veo 3 Includes:

  • Text-to-Video generation

  • Image-to-Video (Start Frame support)

  • Native audio generation (including dialogue)

  • Strong prompt adherence

  • 4, 6, or 8 second duration options

  • 720p and 1080p resolution (same token cost)

  • Veo 3 Fast mode (lower cost, faster generations)

Veo 3.1 Includes Everything in Veo 3, Plus:

  • End Frames: Upload an image to define how your video ends. This allows precise control over the final shot and improves scene continuity. Note that when using an End Frame, a Start Frame is required.

  • Enhanced Image-to-Video Fidelity: Improved realism, better motion quality, and stronger depth and consistency.

  • Smarter Physics & Expression: More natural human movement, improved emotional detail, more believable gestures and reactions.
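The feature lists above imply a few rules that are easy to check before you generate. The sketch below is only an illustration of those rules in Python. The class, field, and function names are hypothetical and are not part of any Leonardo.Ai API. It also assumes that Veo 3.1 Fast supports End Frames like Veo 3.1 does, which this article does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical constants for illustration; not Leonardo.Ai API identifiers.
MODELS = {"Veo 3", "Veo 3 Fast", "Veo 3.1", "Veo 3.1 Fast"}
# Assumption: the Fast variant shares the End Frame feature of Veo 3.1.
END_FRAME_MODELS = {"Veo 3.1", "Veo 3.1 Fast"}
DURATIONS = {4, 6, 8}            # seconds
RESOLUTIONS = {"720p", "1080p"}


@dataclass
class VideoSettings:
    model: str
    duration_s: int = 8
    resolution: str = "720p"
    start_frame: Optional[str] = None  # path to an image file
    end_frame: Optional[str] = None    # path to an image file


def validate(settings: VideoSettings) -> list[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    if settings.model not in MODELS:
        problems.append(f"unknown model: {settings.model}")
    if settings.duration_s not in DURATIONS:
        problems.append("duration must be 4, 6, or 8 seconds")
    if settings.resolution not in RESOLUTIONS:
        problems.append("resolution must be 720p or 1080p")
    if settings.end_frame is not None:
        if settings.model not in END_FRAME_MODELS:
            problems.append("End Frames need Veo 3.1")
        if settings.start_frame is None:
            problems.append("an End Frame requires a Start Frame")
    return problems
```

For example, `validate(VideoSettings(model="Veo 3.1", end_frame="end.png"))` reports that a Start Frame is missing, while the same call with `start_frame="start.png"` added returns an empty list.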

Core capabilities of the Veo 3 suite of models

  • Native Audio Generation: Audio is generated automatically with Veo videos. You can add audio cues directly in your prompt, for example:

    • “The sound of an ice cream truck plays in the background.”

    • “The captain turns and says, ‘We set sail at daybreak.’”

    • “audio: sound of keyboard typing and soft ambient AC hum”

    • Note that audio cannot currently be turned off.

  • Multi-Modal input support: You can combine text prompts and image prompts (start and end frames where applicable).


How to generate videos with Veo 3

1. From the home page, navigate to the AI Video Creation tool by clicking Video beneath the prompt bar or in the left side bar.

2. Click the Models menu in the side bar.

3. Select Veo 3 Fast, Veo 3, Veo 3.1 Fast or Veo 3.1 from the Models menu.

4. Enter your text prompt, and click Generate. To get the most out of your videos, check out this blog for tips on mastering prompts for Veo models.

Optional: Add a start and end frame (where applicable). Learn more.


Veo 3 Image to Video - Using a Start Frame

You can now have even more control over your Veo 3 outputs by using an image as a start frame. Combined with your prompt, Veo 3 will use your image as the starting point of the video, letting you achieve the exact aesthetic you want and guiding your scene in the right direction.

Tips for creating seamless videos with Start and End frames:

  • Create your Start Frame, then explore fresh perspectives and angles with Nano Banana via our Inline Editor. Nano Banana also helps maintain character consistency across scenes, which makes it easier to create matching End Frames. Alternatively, use the End Frame as your jumping-off point to create a suitable Start Frame.

  • Craft moody, cinematic shots with Lucid Realism, then bring them to life with Start–End Frame video transitions.

  • Longer videos may offer smoother but slower transitions. Shorter videos may offer much faster but less smooth transitions, so consider subtler transitions for shorter videos.

  • Avoid having extremely different Start and End frames (such as different settings or extreme changes and transformations). Veo 3.1 is best used for clean actions within a specific scene or context. If fancier morphing or scene transitions are required, consider using Kling 2.1 Pro instead.


Frequently Asked Questions

Is Veo 3 available to free users?

Unfortunately, due to the token cost, Veo 3 models are only accessible to paid plan holders.

What are the token costs for using Veo 3?

  • Veo 3 generations on Leonardo.Ai have a fixed token cost of 2,500.

  • Veo 3 Fast generations have a fixed token cost of 2,000.

  • Veo 3.1 generations cost 2,500 tokens, regardless of resolution.

  • Veo 3.1 Fast generations cost 1,250 tokens, regardless of resolution.
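If you are planning a batch of generations, the arithmetic can be sketched as below. The per-generation costs are copied from the list above; the function names are hypothetical helpers for illustration and are not part of any Leonardo.Ai API.

```python
# Token cost per generation, copied from the list above.
TOKEN_COST = {
    "Veo 3": 2500,
    "Veo 3 Fast": 2000,
    "Veo 3.1": 2500,
    "Veo 3.1 Fast": 1250,
}


def total_tokens(model: str, generations: int) -> int:
    """Tokens needed for a number of generations with one model."""
    if generations < 0:
        raise ValueError("generations must not be negative")
    return TOKEN_COST[model] * generations


def max_generations(model: str, token_budget: int) -> int:
    """How many full generations fit into a token budget."""
    return token_budget // TOKEN_COST[model]


print(total_tokens("Veo 3.1 Fast", 4))   # 5000
print(max_generations("Veo 3", 10000))   # 4
```

Because the cost is fixed per generation, duration and resolution do not appear in the calculation.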

Can I control the length of my Veo 3 generations?

Yes, you can choose from 4, 6, and 8 seconds with any model.

How do I add audio to Veo 3 generations?

Audio cues can be added at any point in the prompt, e.g. “the sound of an ice cream truck can be heard in the background”. You can also add general audio cues at the end of the prompt, e.g. “audio: sound of keyboard typing and ambient sound of air conditioner unit”.
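If you assemble prompts in a script before pasting them into the prompt bar, the end-of-prompt pattern can be sketched as follows. This is a minimal illustration that only builds a text string. The function name is hypothetical, and the “audio:” prefix simply mirrors the example above rather than a documented syntax.

```python
def with_audio_cue(prompt: str, audio: str) -> str:
    """Append a general audio cue to the end of a prompt in the 'audio: ...' form."""
    prompt = prompt.strip()
    audio = audio.strip()
    if not audio:
        return prompt  # nothing to add
    # Close the visual description with a full stop before the audio cue.
    if prompt and prompt[-1] not in ".!?":
        prompt += "."
    return f"{prompt} audio: {audio}".strip()


print(with_audio_cue(
    "A developer works late in a quiet office",
    "sound of keyboard typing and ambient sound of air conditioner unit",
))
# A developer works late in a quiet office. audio: sound of keyboard typing and ambient sound of air conditioner unit
```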

What are start and end frames, and how do I use them?

Check out this detailed guide that covers how to create videos using start and end frames.
