Understanding Omnihuman 1.5, Audio-Driven, and Arcads 1.0

Main differences between our Talking Actors models: Omnihuman 1.5, Audio-Driven, and Arcads 1.0

Written by Guilhem carriere
Updated over 2 weeks ago

Arcads offers several models to animate digital actors, each designed for different inputs, levels of realism, and video lengths. Depending on whether you’re working from audio, an image, or a full script, some models will be better suited than others.

👉 You can also compare the differences between these models in more detail here: https://f.io/HbBCqK8r

Below is a quick overview to help you understand when to use Omnihuman 1.5, Audio-Driven, or Arcads 1.0.

Omnihuman 1.5

Omnihuman 1.5 animates a static image.

  • Result: a talking, animated face

  • Style: Expressive, with stronger facial movements and emotions

  • Perceived as: slightly more AI-styled (this is subjective; some users prefer this look)

  • 400 characters max

  • The actor remains static (no body or camera movement)


Audio-Driven

Audio-Driven works the same way as Omnihuman 1.5: it also animates a static image, but the animation is based on an audio input.

  • Result: realistic talking face animation

  • Style: lipsync is generally perceived as more natural and realistic than with Omnihuman 1.5 (subjective)

  • 600 characters max

  • The actor remains static (no body or camera movement)


Arcads 1.0

Arcads 1.0 generates a video from a text script, using a digital actor that speaks and moves.

  • Result: a stable, animated actor with consistent movement

  • Best for: longer videos that need to stay stable (education, demos, narratives)

  • 1500 characters max

  • The actor follows a predefined movement, making the video more dynamic while remaining visually stable.


Which one should I use?

  • Use Omnihuman 1.5 if you want expressive videos.

  • Use Audio-Driven if you want better lipsync.

  • Use Arcads 1.0 if you want longer videos with scripted content and stable actor movement.
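
As a rough illustration, the guidance above can be written as a small Python helper. This is only a sketch: `suggest_model`, the priority labels, and the fallback rule (pick the smallest model whose limit still fits the script) are assumptions made for this example and are not part of any Arcads API. Only the character limits come from this article.

```python
# Character limits as stated in this article. The keys are plain labels,
# not identifiers from any Arcads API.
CHAR_LIMITS = {
    "Omnihuman 1.5": 400,
    "Audio-Driven": 600,
    "Arcads 1.0": 1500,
}

# Hypothetical priority labels mapped to the model this article recommends.
PREFERRED = {
    "expressive": "Omnihuman 1.5",
    "lipsync": "Audio-Driven",
    "movement": "Arcads 1.0",
}


def suggest_model(script: str, priority: str) -> str:
    """Suggest a model name from the script length and a priority label."""
    if priority not in PREFERRED:
        raise ValueError(f"unknown priority: {priority!r}")

    length = len(script)
    if length > max(CHAR_LIMITS.values()):
        raise ValueError("script exceeds the largest limit (1500 characters)")

    choice = PREFERRED[priority]
    if length <= CHAR_LIMITS[choice]:
        return choice

    # Assumed fallback: the preferred model cannot take this script,
    # so pick the model with the smallest limit that still fits.
    for name, limit in sorted(CHAR_LIMITS.items(), key=lambda item: item[1]):
        if length <= limit:
            return name
    raise AssertionError("unreachable: length was checked against the max limit")


if __name__ == "__main__":
    print(suggest_model("Hello! " * 20, "expressive"))
```

For example, a 500-character script with the "expressive" priority no longer fits Omnihuman 1.5, so under the assumed fallback rule the helper suggests Audio-Driven instead.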

Each model serves a different purpose; choosing the right one depends on your content, duration, and desired level of realism.
