Video Guidelines for Avatar Creation
Overview
To create a Digital Human, you need to upload a video of a person - referred to as the Idle Video. This video provides the facial reference our Unith Video Synthesis AI Engine uses to generate the Digital Human.
Ensure the first and last frames of the video are identical, or as close to identical as possible, so that the clip loops without visible artifacts.
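One way to sanity-check a loop before uploading is to compare the first and last frames numerically. The sketch below is illustrative only: it assumes each frame has already been decoded into a flat sequence of 0-255 pixel values (decoding is out of scope here), and both the function names and the threshold are hypothetical, not part of the Unith platform.

```python
from typing import Sequence

# Hypothetical tolerance on the 0-255 scale; tune it by eye on real footage.
LOOP_THRESHOLD = 2.0


def mean_abs_diff(frame_a: Sequence[int], frame_b: Sequence[int]) -> float:
    """Mean absolute difference between two frames given as flat pixel sequences."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must contain the same number of pixel values")
    if len(frame_a) == 0:
        raise ValueError("frames must not be empty")
    total = sum(abs(a - b) for a, b in zip(frame_a, frame_b))
    return total / len(frame_a)


def loops_cleanly(first: Sequence[int], last: Sequence[int],
                  threshold: float = LOOP_THRESHOLD) -> bool:
    """True when the first and last frames differ by no more than the threshold."""
    return mean_abs_diff(first, last) <= threshold
```

A result of 0.0 from `mean_abs_diff` means the two frames are pixel-identical; larger values suggest a visible jump at the loop point.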
Technical Specifications for Idle Video
- Container and Codec: MP4 with H.265 or H.264
- Max Resolution: 1280×720 (720p)
- Frame Rate: 25 fps, progressive
- Optimal File Size: 1 MB
- Optimal Length: 4-7 seconds, max 10 seconds
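The specifications above can be turned into a simple pre-upload check. The following is a minimal sketch, not an official validator: `VideoInfo` and `check_idle_video` are hypothetical names, the metadata is assumed to come from whatever probing tool you already use, and treating the 25 fps frame rate as a hard requirement is an assumption. The 1 MB file size and the 4-7 second length are optimal targets rather than limits, so they are not enforced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VideoInfo:
    """Hypothetical metadata record; fill it from your own probing tool."""
    container: str
    codec: str
    width: int
    height: int
    fps: float
    duration_s: float


def check_idle_video(info: VideoInfo) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    if info.container.lower() != "mp4":
        problems.append(f"container is {info.container}, expected MP4")
    # "hevc" is another common name for H.265.
    if info.codec.lower() not in {"h264", "h265", "hevc"}:
        problems.append(f"codec is {info.codec}, expected H.264 or H.265")
    if info.width > 1280 or info.height > 720:
        problems.append(f"resolution {info.width}x{info.height} exceeds 1280x720")
    # Assumption: 25 fps is treated as required, with a small tolerance.
    if abs(info.fps - 25.0) > 0.01:
        problems.append(f"frame rate is {info.fps}, expected 25")
    if info.duration_s > 10:
        problems.append(f"duration {info.duration_s}s exceeds the 10 second maximum")
    return problems
```

For example, `check_idle_video(VideoInfo("mp4", "h264", 1280, 720, 25.0, 6.0))` returns an empty list.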
Please check the following document for full technical details:
Good Examples


Video Synthesis Output Parameters
For optimal real-time streaming performance, the Digital Human output resolution is fixed at 1280×720p.
Usage Guidelines:
- Phone and Desktop: Ideal performance.
- 4K Screens: Supported and performs acceptably even in fullscreen, but optimal viewing is on a tablet or smaller screen.
- Kiosk Installations: Not recommended at life-size. For larger displays, adapt the interface design so the Digital Human is not stretched far beyond its output resolution.
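To get a feel for why larger displays are harder on a fixed 1280×720 output, it can help to compute how much the image must be stretched to fill a given screen. This is plain arithmetic under the assumption that the video is scaled to fit the display while keeping its aspect ratio; `upscale_factor` is a hypothetical helper, and how the actual player scales the stream is not specified here.

```python
# Fixed Digital Human output resolution stated in this guide.
OUTPUT_WIDTH = 1280
OUTPUT_HEIGHT = 720


def upscale_factor(display_width: int, display_height: int) -> float:
    """Stretch factor needed to fit the 1280x720 output inside the display.

    Assumes aspect-ratio-preserving "fit" scaling, so the smaller of the
    horizontal and vertical ratios applies.
    """
    return min(display_width / OUTPUT_WIDTH, display_height / OUTPUT_HEIGHT)
```

A fullscreen 3840×2160 display gives a factor of 3.0, while a 1920×1080 display gives 1.5. Showing the Digital Human in a smaller region of the interface lowers the factor accordingly.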

Video Capture Recommendations
- Framing: Maintain a slightly larger than medium shot for consistency across our library.
- Focal Length: 50–80mm to prevent distortion.
- Background: Editable or removable (green/blue screen preferred).
- Wardrobe: Solid, simple colors; editable shirt color; avoid busy patterns.
- Lighting: Soft, even lighting to avoid harsh shadows.
- Premium Model Training: Record at least 2 minutes of the subject speaking directly to the camera.

Shooting Deliverables
- Format: Apple ProRes 422
- Resolution: 3840×2160p (Ultra HD, 16:9)
- Frame Rate: 50fps ideal (25/30fps acceptable)
- Clip Length: Minimum 5 seconds
- Audio: Required – PCM, WAV, or MP3
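If an Idle Video is derived from a master clip delivered in this format (an assumption; this guide does not describe that workflow), one possible approach is to downconvert it with ffmpeg. The sketch below only builds the command as a list of arguments and does not run it. `idle_video_command` is a hypothetical helper, the bitrate is simply the 1 MB target spread over the clip length, and audio handling is not addressed.

```python
from typing import List


def idle_video_command(src: str, dst: str, duration_s: float = 5.0) -> List[str]:
    """Build an ffmpeg command targeting MP4/H.264, 1280x720, 25 fps."""
    if not 0 < duration_s <= 10:
        raise ValueError("duration must be above 0 and at most 10 seconds")
    # 1 MB is about 8000 kilobits, so this is the video bitrate in kbit/s
    # that roughly yields a 1 MB file for the chosen duration.
    kbps = int(8000 / duration_s)
    return [
        "ffmpeg",
        "-i", src,                  # source master clip
        "-t", str(duration_s),      # keep only the first duration_s seconds
        "-vf", "scale=1280:720",    # downscale to the maximum resolution
        "-r", "25",                 # output frame rate
        "-c:v", "libx264",          # H.264 encoder
        "-pix_fmt", "yuv420p",      # widely compatible pixel format
        "-b:v", f"{kbps}k",         # target video bitrate
        dst,
    ]
```

The resulting list can be passed to `subprocess.run` on a machine where ffmpeg is installed. Trimming alone does not guarantee matching first and last frames, so the loop still needs to be checked separately.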
Lighting Guidelines
- Maintain soft, diffuse lighting for natural skin tones.
- Avoid mixed color temperatures.
- Ensure even illumination across the subject’s face.

Additional considerations:
Video guidelines
We recommend producing the clip so that its first and last frames are identical; internally, this is known as a Loop.
In general, our AI technology works better with women's faces than with men's.
Ideal Facial Characteristics:
- Avoid freckles, facial skin conditions, beards, and other facial hair, as these hinder AI integration.
- Prefer brown eyes over green or blue ones.
- Choose a hairstyle that is easy to key out against a green screen, with no stray hair falling over the face.
Talent acting requirements
- At least one blink, ideally 2-3
- Half smile
- Half nod
- Consistent torso movement: include some slight movement.