When creating an Interactive Avatar, the quality of your footage is key. It's the foundation for the entire avatar, so be sure to follow these tips closely for the best possible outcome.
How to Film the Best Footage for Your Interactive Avatar
There are two ways to provide your footage:
Upload a Google Drive or local video file
Record with your computer's webcam
Recording Structure
Your video must be one continuous take with no cuts or edits, and at least 2 minutes long in total. It must be divided into three distinct sections, recorded in this order: Listening, Talking, and Idling.
1. Listening (15 seconds – Silent)
This opening portion is crucial for capturing natural engagement behaviors. Use facial expressions like smiling, nodding, or raising your eyebrows to show attentiveness—without speaking. This helps your avatar look actively engaged when it's not delivering lines.
2. Talking (90 seconds)
This is where you deliver your scripted message. Speak clearly, confidently, and in a natural tone. Aim for clean audio and avoid background noise. Keep your message direct and well-paced, because your delivery here shapes the clarity and tone of your avatar’s responses.
3. Idling (15 seconds – Silent)
In this final portion, your goal is to appear present but passive. Maintain a neutral facial expression, with occasional light nods to show continued presence. Avoid dramatic expressions, because this segment is used to create the avatar’s idle stance between speech responses.
Key Difference Between Listening and Idling:
During the Listening portion, you’ll show more animated engagement. In contrast, the Idling section is more static and calm, focusing on subtle presence.
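As a rough planning aid, the three sections can be laid out as a timeline with start and end times. The Python sketch below is illustrative only and is not part of the product. The names SEGMENTS, build_timeline, and fmt are hypothetical, and it assumes the sections are recorded back to back at exactly the lengths listed above.

```python
# Hypothetical planning helper. Durations (in seconds) are taken from the
# section lengths listed above; the names here are illustrative only.
SEGMENTS = [("Listening", 15), ("Talking", 90), ("Idling", 15)]
MIN_TOTAL_SECONDS = 120  # the 2-minute minimum stated above


def build_timeline(segments):
    """Return (name, start_s, end_s) tuples for back-to-back segments."""
    timeline = []
    start = 0
    for name, length in segments:
        timeline.append((name, start, start + length))
        start += length
    return timeline


def fmt(seconds):
    """Format a whole number of seconds as M:SS."""
    return f"{seconds // 60}:{seconds % 60:02d}"


if __name__ == "__main__":
    timeline = build_timeline(SEGMENTS)
    for name, start, end in timeline:
        print(f"{fmt(start)}-{fmt(end)}  {name}")
    total = timeline[-1][2]
    print("Meets 2-minute minimum:", total >= MIN_TOTAL_SECONDS)
```

With these lengths, Listening runs from 0:00 to 0:15, Talking from 0:15 to 1:45, and Idling from 1:45 to 2:00, which adds up to exactly the 2-minute minimum. A printed cue sheet like this can help talent hit each transition without stopping the recording.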
Recording Environment
Background & Environment
Record against a clean, static background or a professional-grade green screen. Avoid dynamic lighting, moving people or objects, and any reflections in the background. If using a green screen, ensure it’s elevated off the floor to reduce color spill, and keep the subject at least 5 feet away from it to prevent shadows.
Camera Setup & Framing
Use a camera that records at 1920x1080 (Full HD) or higher. Note that 4K is not supported at this time for Interactive Avatars due to real-time streaming latency. Frame your subject from the chest up. Avoid including hands unless absolutely necessary, and never record full-body footage (this is not supported). If hands are visible, they should remain still and relaxed at the sides or in a resting position.
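If you want to sanity-check a file's frame size before uploading, the resolution guidance can be expressed as a small check. This is a minimal Python sketch under stated assumptions, not an official validator: check_resolution is a hypothetical helper, "4K" is assumed to mean 3840x2160 or larger, and the check ignores orientation. The guidance does not say whether sizes between 1080p and 4K are accepted, so the sketch simply lets them through.

```python
# Thresholds are assumptions for illustration, not published product limits.
MIN_LONG, MIN_SHORT = 1920, 1080  # 1920x1080 minimum from the guidance above
UHD_LONG, UHD_SHORT = 3840, 2160  # assumed definition of "4K" (UHD)


def check_resolution(width, height):
    """Return a list of problems; an empty list means the size looks OK."""
    # Compare long and short sides so the check ignores orientation
    # (an assumption; the guidance does not mention portrait footage).
    long_side, short_side = max(width, height), min(width, height)
    problems = []
    if long_side < MIN_LONG or short_side < MIN_SHORT:
        problems.append("below 1920x1080")
    if long_side >= UHD_LONG or short_side >= UHD_SHORT:
        problems.append("4K or larger")
    return problems


if __name__ == "__main__":
    for size in [(1280, 720), (1920, 1080), (3840, 2160)]:
        issues = check_resolution(*size)
        print(size, "OK" if not issues else ", ".join(issues))
```

The width and height would come from whatever tool you already use to inspect video files. The sketch deliberately takes plain numbers so that it does not depend on any particular library.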
Lighting Tips
Good lighting is critical for creating a clean, professional avatar. Use soft, even lighting to avoid shadows on the face and background. Two diffused lights placed on either side of the subject will provide balanced coverage. If needed, add a backlight behind and above the subject to separate them from the background, and consider warming up the light color temperature (around 4800K) to avoid pale or washed-out tones. If shadows appear beneath the hands or chin, place a soft light beneath the subject as well.
Performance Best Practices
Maintain steady eye contact with the camera throughout the entire recording.
Avoid swaying, loud breathing, or unnecessary movements.
Remove glasses, shiny clothing, or jewelry, which can interfere with background removal.
Do two takes if possible:
One with minimal movement.
One with 4–5 slow, subtle gestures that stay below the shoulders.
Talent Reminder
Before recording, coach your talent to:
Begin with a listening pose—engaged and expressive but silent.
Transition into a confident delivery for the speaking portion.
End with a still, neutral idle pose to complete the recording.
For a visual explainer on how to create the best possible interactive avatar, please watch this short video.
Once you've created the best possible footage, learn how to set up your Interactive Avatar in our Interactive Avatar 101: Creation and Use Guide article.