Overview
Arcads aggregates the most performant AI video generation models in one place.
Each model comes with its own strengths and limitations, but they all share one common characteristic:
Every AI video model has a maximum video length.
This article explains:
The maximum video duration supported by each model on Arcads
Why these limits exist
Proven methods to create videos longer than 30 seconds despite those constraints
Why Do AI Models Have Video Length Limits?
AI video models generate content frame by frame using heavy compute resources. To maintain:
visual consistency
audio sync
facial realism
rendering speed
each model enforces a maximum clip duration.
Video Length Limits by Model (Recap)
Exact limits may evolve over time — always refer to the model selector inside Arcads for the most up-to-date values.
| Model | Max length (per clip) |
| --- | --- |
| Sora 2 Pro | 12 sec |
| Veo 3.1 | 8 sec |
| Kling 2.6 | 10 sec |
| Arcads 1.0 | 1,500-character limit (> 1 min) |
| Audio Driven | 600-character limit (> 45 sec) |
| Omnihuman 1.5 | 400-character limit (> 30 sec) |
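If you want a rough idea of how long a script will run before generating, a simple character-count estimate helps. Below is a minimal sketch in Python; the ~13 characters-per-second speaking rate is an assumption (roughly 150 words per minute), not an official Arcads figure, and real durations vary with the voice and pacing.

```python
# Rough helper to estimate how long a script will run, and whether it
# fits a model's per-clip character limit. The speaking rate is an
# assumption (~13 characters/second), not an Arcads-documented figure.

CHAR_LIMITS = {          # per-clip script limits from the table above
    "Arcads 1.0": 1500,
    "Audio Driven": 600,
    "Omnihuman 1.5": 400,
}

def estimate_duration_seconds(script: str, chars_per_second: float = 13.0) -> float:
    """Estimate the spoken duration of a script from its character count."""
    return len(script) / chars_per_second

def fits_model(script: str, model: str) -> bool:
    """Check whether a script stays within a model's character limit."""
    return len(script) <= CHAR_LIMITS[model]

if __name__ == "__main__":
    script = "Your ad script goes here... " * 20   # ~560 characters
    print(f"~{estimate_duration_seconds(script):.0f} s of speech")
    for model in CHAR_LIMITS:
        print(model, "OK" if fits_model(script, model) else "too long")
```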
When to Use Each Model?
Visual Video Models
These models are designed to generate pure visual clips (motion, scenes, product shots). They work best for short, high-impact sequences and can be combined for longer videos.
Sora 2 Pro
Use for cinematic scenes, storytelling shots, or high-quality visuals where realism matters. Best for premium-looking clips up to 12 seconds.
Veo 3.1
Ideal for fast-paced hooks and dynamic motion. Use it when you need to grab attention in the first seconds of an ad or video.
Kling 2.6
Best suited for product visuals and smooth transitions. A good balance between motion and visual stability.
Face Cam Models
These models are optimized to generate talking human avatars (UGC-style, testimonials, explanations). They rely on script length rather than seconds.
Arcads 1.0
Use for long-form talking head videos: product explanations, tutorials, structured messaging. Best choice when you need 30–45s+ of continuous speech.
Audio Driven
Best for short, voice-led clips with a natural speaking rhythm. Ideal for concise messages, intros, or mid-video segments.
Omnihuman 1.5
Designed for very short UGC-style hooks and punchlines. Perfect for social ads, openings, or quick reactions.
How to extend your videos?
Option 1: Face Cam Models (Arcads 1.0, Audio Driven, Omnihuman 1.5)
If your initial video was created using one of the models above, you can easily extend it by generating additional clips.
To do so:
Select the same model
Choose the same actor
Keep the same voice
Write a new script for the next segment
Each generated clip will maintain 100% actor and voice consistency with the original video.
Once all clips are generated, simply combine them using any third-party video editing tool outside of Arcads to create a longer, seamless video.
This approach is the recommended way to create videos longer than the model’s per-clip limit while preserving visual and audio continuity.
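If you don't already have a preferred editor, one common way to handle the combining step is with ffmpeg's concat demuxer. The sketch below assumes ffmpeg is installed locally; the clip filenames are placeholders for the clips you downloaded from Arcads. Since the clips come from the same model and settings, stream copy usually works without re-encoding.

```python
# Minimal sketch: concatenate the generated clips into one video with ffmpeg.
# Assumes ffmpeg is installed and the clips share the same codec/resolution
# (which they should, coming from the same model and settings).
# The filenames are placeholders for your downloaded Arcads clips.

import subprocess
from pathlib import Path

clips = ["clip_01.mp4", "clip_02.mp4", "clip_03.mp4"]   # hypothetical filenames

# ffmpeg's concat demuxer reads a text file listing the inputs in order.
list_file = Path("clips.txt")
list_file.write_text("".join(f"file '{c}'\n" for c in clips))

subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0", "-i", str(list_file),
     "-c", "copy", "combined.mp4"],
    check=True,
)
```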
Option 2: Visual Video Models (Sora 2 Pro, Veo 3.1, Kling 2.6)
How to extend a video from a specific frame?
If your initial video was created using one of the Visual Video Models (Sora 2 Pro, Veo 3.1, or Kling 2.6), follow these steps:
1. Click on your generated video.
2. Select Take Snapshot.
3. Choose the frame where you want the next video to start, then click Pick Frame.
4. The selected frame will be extracted and saved as an image.
5. Click Transform to Video.
You can now generate another clip starting from the selected frame. This ensures scene, context, and actor consistency across clips.
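For reference, the same snapshot idea can be reproduced outside Arcads on a clip you have already downloaded, which can be handy if you do part of your editing elsewhere. A minimal sketch, assuming ffmpeg is installed; the filename and timestamp are placeholders.

```python
# Minimal sketch: extract a single frame from a downloaded clip with ffmpeg,
# mirroring the "Take Snapshot" step locally. Assumes ffmpeg is installed;
# the filename and timestamp are placeholders.

import subprocess

subprocess.run(
    ["ffmpeg",
     "-ss", "00:00:07.5",         # timestamp of the frame you want to start from
     "-i", "generated_clip.mp4",  # clip previously downloaded from Arcads
     "-frames:v", "1",            # export exactly one frame
     "snapshot.png"],
    check=True,
)
```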
Additional Tips
Since this method uses models such as Veo 3.1, Kling 2.6, and Sora 2 Pro—which do not allow direct voice control—you may experience voice inconsistencies across clips.
In such cases, we recommend using ElevenLabs to standardize the voice. You can upload your video to ElevenLabs, select Voice Changer, and apply the same voice across the entire video for consistent audio output.
How to continue a video without selecting a specific frame?
To continue a video without picking a specific frame, simply choose the extend option and do not select any keyframe. In this case, the system will automatically use the last moments of the existing video as context and continue the scene naturally from there.
This approach works best when you want a smooth, natural continuation of the same action, setting, and dialogue. The model will preserve visual consistency such as camera angle, lighting, and character behavior, and extend the video forward in time.
1. Select your video and click “Extend Video”.
2. Prompt the next video.

The video will be extended by 7 seconds. This process can take a video of up to 30 seconds as input, for a maximum total length of 37 seconds.
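As a quick sanity check of that arithmetic, here is a minimal sketch; the 7-second extension and 30-second input cap are the figures quoted above.

```python
# Minimal sketch of the extend-video arithmetic described above:
# each extension adds 7 seconds, and the input clip may be at most 30 seconds,
# so a single extension tops out at 37 seconds.

EXTENSION_SECONDS = 7
MAX_INPUT_SECONDS = 30

def extended_length(input_seconds: float) -> float:
    """Return the total length after one extension, or raise if the input is too long."""
    if input_seconds > MAX_INPUT_SECONDS:
        raise ValueError(f"Input must be {MAX_INPUT_SECONDS}s or shorter")
    return input_seconds + EXTENSION_SECONDS

print(extended_length(30))  # 37
```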