Understanding Omnihuman 1.5, Audio-Driven, and Arcads 1.0

Arcads offers several models to animate digital actors, each designed for different inputs, levels of realism, and video lengths. Depending on whether you’re working from audio, an image, or a full script, some models will be better suited than others.

👉 You can also compare the differences between these models in more detail here: https://f.io/HbBCqK8r

Below is a quick overview to help you understand when to use Omnihuman 1.5, Audio-Driven, or Arcads 1.0. Veo 3.1 is another advanced option: it lets actors perform actions such as object interactions, location changes, and appearance changes, using features like customizable start frames.

Omnihuman 1.5

Omnihuman 1.5 animates a static image.

Result: a talking, animated face
Style: Expressive, with stronger facial movements and emotions
Perceived as: slightly more AI-styled (this is subjective — some users prefer this look)
Length: no character limit, but best kept below 400 characters
Cost: 1 credit per minute
The actor remains static (no body or camera movement)

Audio-Driven

Audio-Driven works the same way as Omnihuman 1.5: it animates an image, but the animation is based on an audio track. This model excels with shorter clips where natural speaking rhythm matters, making it ideal for quick introductions or concise updates.

Result: realistic talking face animation
Style: Lipsync is generally perceived as more natural / realistic than Omnihuman 1.5 (subjective)
Length: no character limit, but best kept below 600 characters
Cost: 1 credit per minute
The actor remains static (no body or camera movement)

Arcads 1.0

Arcads 1.0 generates a video from a text script, using a digital actor that speaks and moves. This makes it particularly effective for long-form talking-head videos, such as tutorials, structured messaging, or narratives with extended explanatory content.

Result: a stable, animated actor with consistent movement
Best for: longer videos that need to stay stable (education, demos, narratives)
Length: no character limit, but best kept below 1500 characters
Cost: 1 credit per minute
The actor follows a predefined movement, making the video more dynamic while remaining visually stable.

Which one should I use?

If your project involves advanced actor actions or dynamic interactions, consider using the Veo 3.1 model for better control and customizable scenes.

Use Omnihuman 1.5 if you want expressive videos.
Use Audio-Driven if you want better lipsync.
Use Arcads 1.0 if you want longer videos with scripted content and stable actor movement.

Each model serves a different purpose; choosing the right one depends on your content, duration, and desired level of realism.
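If you prepare scripts in bulk, the guidance above can be written down as a small helper. This is only an illustrative sketch in Python, not part of Arcads: the function names, the priority labels ("expressive", "lipsync", "long_form"), and the needs_actions flag are all hypothetical, and the character thresholds are simply the recommended lengths quoted in this article, not hard limits.

```python
# Recommended script lengths quoted in this article (soft guidance, not hard limits).
RECOMMENDED_MAX_CHARS = {
    "Omnihuman 1.5": 400,
    "Audio-Driven": 600,
    "Arcads 1.0": 1500,
}

# Hypothetical priority labels mapped to the model this article suggests for each.
MODEL_BY_PRIORITY = {
    "expressive": "Omnihuman 1.5",
    "lipsync": "Audio-Driven",
    "long_form": "Arcads 1.0",
}


def suggest_model(priority: str, needs_actions: bool = False) -> str:
    """Return the model this article suggests for a given priority.

    needs_actions is an assumed flag meaning the video calls for object
    interactions, location changes, or similar actor actions.
    """
    if needs_actions:
        return "Veo 3.1"
    if priority not in MODEL_BY_PRIORITY:
        raise ValueError(f"unknown priority: {priority!r}")
    return MODEL_BY_PRIORITY[priority]


def within_recommended_length(model: str, script: str) -> bool:
    """Check whether a script stays below the recommended length for a model."""
    return len(script) < RECOMMENDED_MAX_CHARS[model]


if __name__ == "__main__":
    script = "Hi! Here is a quick update on what we shipped this week."
    model = suggest_model("lipsync")
    print(model, within_recommended_length(model, script))
```

A script that exceeds the recommended length is still accepted by the models, since none of them has a character limit; the check only flags scripts that go past the lengths suggested above.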
