
How to Create a Custom "Talking Actor"?

Create and animate any custom actor in minutes

Written by Damien Goubin

Here are 3 processes, depending on the model you choose:

Process 1: For audio-driven models and the OmniHuman model

Process 2: For the Arcads 1.0 model

Process 3: Clone option from a video


PROCESS 1: Using audio-driven or OmniHuman models

Step 1 — Generate the actor image (choose a model)

Before you can make a “talking actor,” you need a strong single portrait image (clean face, good lighting, sharp details). In the image generator, pick one of these:

Nano Banana Pro

Best when you want very accurate, “smart” images (good real-world understanding), strong editing, and reliable text rendering in images.

Seedream 4.5

Best when you need cinematic aesthetics, strong spatial reasoning, and especially consistent characters across multiple generations (great if you’re iterating a character look and want it to stay stable).

GPT Image 1.5

Best when you want very strong instruction-following and a tight prompt-to-image match, plus a solid edit workflow (generate + transform + edit). It’s a good “default” choice when you want the model to do exactly what you described.

Quick pick

  • Want “most controllable prompt fidelity”? → GPT Image 1.5

  • Want “best cinematic look + consistency across variations”? → Seedream 4.5

  • Want “smart editing + strong text/precision + all-around reliability”? → Nano Banana Pro


How to prompt the perfect actor image

Use a prompt that locks down identity + camera framing.

Prompt template

  • Subject: age range, ethnicity (optional), hairstyle, wardrobe

  • Framing: “front-facing medium close-up” / “head-and-shoulders”

  • Lighting: “soft key light, natural skin texture”

  • Background: “plain studio background” (keeps attention on the face)

  • Style: “photorealistic” (recommended for talking actors)

Example prompt

"Photorealistic head-and-shoulders portrait of a confident presenter, front-facing, neutral background, soft studio lighting, sharp focus on eyes, natural skin texture, 35mm lens look, minimal shadows, high detail"
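If you write many actor prompts, the template above can be assembled from its parts. The snippet below is only an illustrative sketch: `build_actor_prompt` and its parameter names are hypothetical helpers, not part of Arcads, and the default values are taken from the example phrases in this article.

```python
def build_actor_prompt(
    subject,
    framing="head-and-shoulders",
    lighting="soft studio lighting, natural skin texture",
    background="plain studio background",
    style="Photorealistic",
):
    """Join the template fields (hypothetical helper) into one prompt string."""
    parts = [
        f"{style} {framing} portrait of {subject}",  # style + framing + subject
        "front-facing",                              # keeps the face fully visible
        background,
        lighting,
        "sharp focus on eyes",
    ]
    # Drop empty fields so an omitted option does not leave a stray comma.
    return ", ".join(p.strip() for p in parts if p and p.strip())


if __name__ == "__main__":
    print(build_actor_prompt("a confident presenter in a navy blazer"))
```

Paste the resulting string into the image generator as you would any hand-written prompt.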

[Images: example results from GPT Image, Nano Banana, and Seedream]

Tip: avoid heavy motion blur, extreme angles, hands covering the face, and busy backgrounds. A clearly visible face makes the talking result look more believable.


Step 2 — Turn the image into a talking actor

  1. Click the image you just generated.

  2. Click Transform → Talking actor.

  3. Write your script (what the actor will say).

  4. Pick a voice.

  5. Click Generate.



PROCESS 2: Using the Arcads 1.0 model

  • Open the Talking Actor section, then click "Add actors".

  • Click Create Actor.

  • Upload an image, either one you generated with Arcads or one of your own. You can also generate the image from a prompt directly on the platform.

You can add guidance on how the actor should speak and behave directly in the prompt.

For example:

“Make the actor talk with excitement and energy, looking directly at the camera, with a friendly and engaging tone.”

Once your instructions are ready, simply click Turn into Talking Actor to generate the video.

Next, choose a voice and click Pick Voice.

The platform then generates a preview for you.

The actor must complete the training process before it appears in the 'My Actors' folder. This usually takes 2-4 hours. Once training is done, the actor is ready to be used 🎉


You can find it under Talking Actor → My Actors. Custom actors are private to your workspace and cannot be accessed by other users, ensuring privacy and security. If you encounter issues locating your actor, ensure the training process is complete or check the Custom Actors section of your workspace.



PROCESS 3: Using the Clone option

  • The Clone option uses a video as the reference for creating a custom actor. It is recommended if you want the custom actor to move more fluidly and with higher fidelity.


Strict requirements for the reference video (needed for best results):

Accepted file formats: .mp4 / .mov
Max file size: 100 MB
Minimum reference video length: 2 minutes
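
To catch problems before uploading, the three limits above can be checked locally. This is a rough sketch, not an Arcads tool: `check_reference_video` is a hypothetical helper, the size limit is assumed to mean 100 × 1024 × 1024 bytes, and the duration must be supplied by you (for example, read from your video player), because the Python standard library cannot read it from the file.

```python
import os

ALLOWED_EXTENSIONS = {".mp4", ".mov"}
MAX_SIZE_BYTES = 100 * 1024 * 1024   # assumption: 1 MB = 1024 * 1024 bytes
MIN_DURATION_SECONDS = 2 * 60        # 2 minutes


def check_reference_video(filename, size_bytes, duration_seconds):
    """Return a list of problems; an empty list means all three limits are met."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format '{extension}' (use .mp4 or .mov)")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 100 MB")
    if duration_seconds < MIN_DURATION_SECONDS:
        problems.append("video is shorter than 2 minutes")
    return problems


if __name__ == "__main__":
    # Example values only: a 50 MB .mp4 that runs 2 minutes 30 seconds.
    print(check_reference_video("intro.mp4", 50 * 1024 * 1024, 150))
```

The check covers only the file-level limits. The speaking, framing, and tone guidelines below still need a manual review of the footage.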

Speaking:

  • Speak continuously throughout the video

  • Say anything you like (for example, talk about your company or your product)

  • Full, uninterrupted speech is required for accurate voice and lip-sync training

Framing:

  • Face must be fully visible at all times

  • The mouth must never be covered

  • Keep a natural, stable framing

  • Avoid abrupt head turns or jerky movements

Expression and Tone:

  • Speak naturally

  • Use the tone you want your avatar to replicate (professional, casual, enthusiastic, etc.)
