This document describes how to create head visuals—the personalized visual representations of Digital Humans.

A head visual is essential for deploying a Digital Human. It consists of a video asset that is preprocessed by the UNITH synthesis engine. This processed video becomes the foundation for the head visual, which serves as the face of your Digital Human.

Note: Once a head visual is created, multiple Digital Humans can share a single head visual ID.

Important Considerations:

- Content Policy: UNITH reserves the right to remove any head visual that is deemed offensive, harmful, or inappropriate.
- Video Best Practices: Before you begin, refer to our separate documentation on best practices for creating idle videos for Digital Humans to ensure optimal results.
- Maximum Video Length: The maximum supported video length for head visual creation is currently 20 seconds.

For more details on video best practices, please refer to our video guidelines.

Process Overview

The process of creating a custom head visual involves the following steps:

Uploading the source video.
Creating the head visual resource.
Saving the head visual.
Assigning the head visual to your organization.

API Endpoints

1. Upload Video

Endpoint: /video/upload
Method: POST
Description: Uploads the video source for the head visual.
Request Body:
- file: The video file to upload (e.g., video.mp4). The form field must be named file.

curl -X 'POST' \
  'https://platform-api.unith.ai/video/upload' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer yourAuthBearerToken' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@/path/to/your/video.mp4'  # Replace with the path to your video file

Response:
- Status Code: 200 (OK)
- Response Body:

{
  "token": "temporary_video_token"
}

Response Parameters:
- token (string): A temporary, unique token representing the uploaded video. This token is required for the next step.
Error Handling:
- The endpoint will return standard HTTP error codes for invalid requests, upload failures, or server errors. Ensure your request is correctly formatted and the video file is valid.
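
The upload request above can also be sketched in Python using only the standard library. This is a minimal sketch rather than an official client: the helper names build_multipart and upload_video are hypothetical, and the video/mp4 part type is an assumption. The URL, headers, file field name, and token response field come from the curl example and response above.

```python
import json
import os
import urllib.request
import uuid

UPLOAD_URL = "https://platform-api.unith.ai/video/upload"


def build_multipart(filename: str, data: bytes, field: str = "file"):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: video/mp4\r\n\r\n"  # assumption: the part is sent as video/mp4
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_video(path: str, bearer_token: str) -> str:
    """POST the video and return the temporary token from the response."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(os.path.basename(path), fh.read())
    request = urllib.request.Request(
        UPLOAD_URL,
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": content_type,
        },
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)["token"]
```

Calling upload_video("/path/to/your/video.mp4", "yourAuthBearerToken") would return the temporary token needed for the next step.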

2. Create Head Visual

Endpoint: /head_visual/create
Method: POST
Description: Creates a new head visual resource from the uploaded video.
Request Body:

{
  "update": false,
  "detector_version": "v2",
  "detector_threshold": -0.2,
  "mode": "default",
  "cut_timestamp": 0.1,
  "debug": false
}

Request Parameters:
- update (boolean): Indicates whether to update an existing head visual. Keep false when creating a new one.
- detector_version (string): The version of the face detection algorithm to use. Use "v2" for best results.
- detector_threshold (number): The threshold for face detection. Keep the value shown above.
- mode (string): The processing mode. "default" is the standard mode; two_loops mode is covered in the Two Loops section below.
- cut_timestamp (number): The timestamp at which the video is cut. Keep the value shown above unless you use two_loops mode.
- debug (boolean, optional): If set to true, the response will include a task_id. If video processing fails, a ZIP file containing frames and face detection results will be provided for debugging.
Curl Example:

curl -X 'POST' \
  'https://platform-api.unith.ai/head_visual/create' \
  -H 'accept: application/json' \
  -H 'x-head-video-token-id: yourTemporaryVideoToken' \
  -H 'Authorization: Bearer yourAuthBearerToken' \
  -H 'Content-Type: application/json' \
  -d '{
  "update": false,
  "detector_version": "v2",
  "detector_threshold": -0.2,
  "mode": "default",
  "cut_timestamp": 0.1,
  "debug": false
}'

Response:
- Status Code: 200 (OK)
- Response Body:

{
  "data": {
    "id": "yourNewHeadVisualId",
    "task_id": "yourTaskId"
  }
}

Response Parameters:
- id (string): The unique identifier for the newly created head visual. This ID is used in subsequent steps.
- task_id (string, optional): The ID of the video processing task. This is only included if the debug parameter was set to true in the request.
Error Handling:
- The endpoint will return standard HTTP error codes for invalid requests, missing headers, or server errors.
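
As a rough Python equivalent of the curl example above, the sketch below builds the same headers and body and reads id and task_id from the documented response shape. The function names are hypothetical and error handling is left to the caller. The header names and payload values are copied from the example.

```python
import json
import urllib.request

CREATE_URL = "https://platform-api.unith.ai/head_visual/create"


def build_create_request(video_token: str, bearer_token: str, debug: bool = False):
    """Return (headers, body) for POST /head_visual/create."""
    headers = {
        "accept": "application/json",
        "x-head-video-token-id": video_token,  # token returned by /video/upload
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
    }
    payload = {
        "update": False,
        "detector_version": "v2",
        "detector_threshold": -0.2,
        "mode": "default",
        "cut_timestamp": 0.1,
        "debug": debug,
    }
    return headers, json.dumps(payload).encode("utf-8")


def parse_create_response(raw: bytes):
    """Extract (id, task_id) from the response; task_id is None without debug."""
    data = json.loads(raw)["data"]
    return data["id"], data.get("task_id")


def create_head_visual(video_token: str, bearer_token: str, debug: bool = False):
    """Send the request and return (head_visual_id, task_id)."""
    headers, body = build_create_request(video_token, bearer_token, debug)
    request = urllib.request.Request(CREATE_URL, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        return parse_create_response(response.read())
```

Keeping request building separate from sending makes it possible to inspect the headers and body before any network call is made.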

3. Save Head Visual

Endpoint: /head_visual/save
Method: POST
Description: Saves the head visual resource with the specified metadata.
Request Body:

{
  "id": "yourNewHeadVisualId",
  "name": "yourUniqueHeadVisualName",
  "gender": "FEMALE",
  "type": "TALK"
}

Request Parameters:
- id (string, required): The ID of the head visual to save (obtained from the /head_visual/create response).
- name (string, required): A unique name for the head visual. This name must be unique within your organization.
- gender (string, required): The gender of the Digital Human. Use either "MALE" or "FEMALE".
- type (string, required): The type of head visual. Typically, this is "TALK".
- categoryId (string, optional): The category ID to assign to the head visual. Defaults to Unset if omitted. See the Categories section below.
Curl Example:

curl -X 'POST' \
  'https://platform-api.unith.ai/head_visual/save' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer yourAuthBearerToken' \
  -H 'Content-Type: application/json' \
  -d '{
  "id": "yourNewHeadVisualId",
  "name": "yourUniqueHeadVisualName",
  "gender": "FEMALE",
  "type": "TALK"
}'

Response:
- Status Code: 200 (OK)
- Response Body: An empty string.

Important Notes:
- The name parameter must be unique. Choose a descriptive and unique name for your head visual.
- This endpoint may take some time to process, depending on the length of the uploaded video.
- The head visual status will initially be "pending" until video processing is complete. To confirm that processing is done, check the status of the head visual separately.
  - The pending state may last a few minutes, depending on the length of the video being processed.
- The url in the response body will be empty unless debug was set to true, in which case it contains the URL of the debug ZIP file.

Error Handling:
- The endpoint will return standard HTTP error codes for invalid requests, missing parameters, or if the head visual ID is invalid. It will also return an error if the chosen name is not unique.
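
Because a malformed body is rejected by the endpoint, it can help to validate the save payload locally before sending it. The following is a minimal sketch: build_save_payload is a hypothetical helper, and the strict, case-sensitive gender check is an assumption based on the two values listed above. Name uniqueness can only be verified by the server.

```python
import json

ALLOWED_GENDERS = {"MALE", "FEMALE"}


def build_save_payload(head_visual_id, name, gender, type_="TALK", category_id=None):
    """Validate the documented fields and return the JSON body as a string."""
    if not head_visual_id or not name:
        raise ValueError("id and name are required")
    if gender not in ALLOWED_GENDERS:  # assumption: values are case-sensitive
        raise ValueError(f"gender must be one of {sorted(ALLOWED_GENDERS)}")
    payload = {"id": head_visual_id, "name": name, "gender": gender, "type": type_}
    if category_id is not None:
        payload["categoryId"] = category_id  # optional; left out when not provided
    return json.dumps(payload)


print(build_save_payload("yourNewHeadVisualId", "yourUniqueHeadVisualName", "FEMALE"))
```

The returned string can be passed as the -d argument of the curl example or as the body of any HTTP client request.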

Simple Head Visual Post-Processing Guide

This document outlines an optional, step-by-step procedure for manually post-processing your source videos to produce a custom idle video.

This documentation assumes you have recorded your model according to the best practices described in the "Creating Head Visuals" documentation and have a video with a green screen background.

General Post-Processing Procedure (Default Idle Loop)

This procedure focuses on creating a short, seamless idle loop for a default head visual.

1. Creating the Seamless Idle Loop

The goal is to create a short, natural-looking idle segment (under 5 seconds) that can be seamlessly looped. The total length of your final video must be shorter than 10 seconds.

Select Software: Open your captured video in editing software (e.g., DaVinci Resolve or Adobe Premiere).
Identify Loop Points: Find a brief segment of the video (ideally less than 5 seconds) where the model's movement (e.g., head movement, eye blinking) is natural and smooth. Avoid abrupt or sudden movements, as these are highly visible when looping.
Reverse and Duplicate: Cut the selected segment, duplicate it, and reverse the duplicate. Appending the reversed clip to the original creates a loop whose first and last frames match, so the transition looks natural.
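
The reverse-and-duplicate technique can be illustrated on a list of frame labels. This is a toy sketch of the idea, not the editing software's behavior: pingpong and loop_duration are hypothetical names, and real editors operate on video clips rather than Python lists.

```python
def pingpong(frames):
    """Append the reversed segment so the clip ends on its first frame."""
    frames = list(frames)
    return frames + frames[::-1]


def loop_duration(segment_seconds: float) -> float:
    """Forward plus reversed playback doubles the segment length."""
    return 2 * segment_seconds


segment = ["f0", "f1", "f2", "f3"]
loop = pingpong(segment)
print(loop)                 # ['f0', 'f1', 'f2', 'f3', 'f3', 'f2', 'f1', 'f0']
print(loop[0] == loop[-1])  # True: the seam joins identical frames
print(loop_duration(4.5))   # 9.0
```

The doubling is why the selected segment should stay under 5 seconds: a 4.5-second segment yields a 9-second loop, which remains below the 10-second limit.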

Try to include one blink in the video, preferably around the middle of the recording rather than at the beginning or end. This helps create a more relaxed and natural appearance. Avoid body or head movement in the first or last frame: movement there makes the loop transition more noticeable.

2. Keying and Background Removal

Key the Green Screen: Use keying tools (such as those available in After Effects or DaVinci Resolve) to accurately remove the green screen background from the subject.

3. Adding a Custom Background

Select Background: Add a custom background of your choice behind the keyed subject.
Avoid Distraction: If you use a video background, keep its movement or activity minimal. This prevents a noticeable change when transitioning from the static idle state to the active speaking state. The background video can also be turned into a loop so that it blends better.

4. Color Correction and Final Adjustments

Color Match: Perform color correction on the foreground (the model) to ensure the lighting and color tone seamlessly match the new background layer. This can be done in any professional editing software.

5. Exporting the Final Video

The final video must adhere to the following specifications for processing by our synthesis pipeline as mentioned in the “head visual creation” documentation:

Resolution: 1280 x 720 (720p, 16:9 HD)
Frame Rate: 25 frames per second (25 fps)
Format: .mp4
Duration: Less than 10 seconds total
Size: 3 MB maximum
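
The specifications above can be checked before upload with a small helper. This is a sketch under stated assumptions: check_export and max_average_bitrate_kbps are hypothetical names, the media properties are expected to come from your own tooling, and the 3 MB limit is read as decimal megabytes, which the source does not specify.

```python
MAX_SIZE_BYTES = 3 * 1000 * 1000  # assumption: "3 MB" means decimal megabytes


def check_export(filename, width, height, fps, duration_s, size_bytes):
    """Return a list of problems; an empty list means the file fits the spec."""
    problems = []
    if not filename.lower().endswith(".mp4"):
        problems.append("format must be .mp4")
    if (width, height) != (1280, 720):
        problems.append("resolution must be 1280 x 720")
    if fps != 25:
        problems.append("frame rate must be 25 fps")
    if duration_s >= 10:
        problems.append("duration must be under 10 seconds")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file must be 3 MB or smaller")
    return problems


def max_average_bitrate_kbps(duration_s):
    """Highest average bitrate (kbit/s) that still fits the size limit."""
    return MAX_SIZE_BYTES * 8 / duration_s / 1000


print(check_export("idle.mp4", 1280, 720, 25, 8.0, 2_500_000))  # []
print(max_average_bitrate_kbps(8.0))                            # 3000.0
```

The bitrate figure is a useful export target: an 8-second video has to average about 3000 kbit/s or less, audio included, to stay within the size limit.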

Key Difference: Two Loops Video Input

When creating a video for the Two Loops (more expressive) head visual, the creation process is simpler because the format consists of a single video.

Single Continuous Video: You do not need to manually cut and reverse the video (Step 1 is skipped).
Video Structure: The input video is a single, continuous recording where the first half is the still idle state (its end is defined by cut_timestamp) and the second half is the expressive state (e.g., subject moving hands, changing facial expression).
System Handles Looping: When creating the head visual in two_loops mode, you specify the cut_timestamp where the transition between the idle and speaking states occurs. Our system automatically handles the necessary looping and inversion for both the idle and expressive segments to ensure seamless, non-jarring transitions.

This distinction is crucial: for Two Loops, your editing work is focused purely on keying, background, and color correction, as the platform manages the looping mechanism.
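
For Two Loops, the /head_visual/create request body differs from the default one only in mode and cut_timestamp. The sketch below builds such a body. It assumes that the mode value is the literal string "two_loops" and that cut_timestamp is expressed in seconds; neither is stated explicitly here. The helper name two_loops_payload is hypothetical.

```python
import json


def two_loops_payload(cut_timestamp, video_duration, debug=False):
    """Build the /head_visual/create body for a Two Loops video."""
    if not 0 < cut_timestamp < video_duration:
        raise ValueError("cut_timestamp must fall inside the video")
    return json.dumps({
        "update": False,
        "detector_version": "v2",
        "detector_threshold": -0.2,
        "mode": "two_loops",             # assumed literal value for this mode
        "cut_timestamp": cut_timestamp,  # assumed to be in seconds
        "debug": debug,
    })


# Example: an 8-second recording whose idle part ends at 4 seconds.
print(two_loops_payload(4.0, 8.0))
```

The range check only catches a cut point outside the recording. Whether the chosen point actually separates the idle and expressive parts still has to be verified by eye.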

In the first part of the video, follow the same recommendations as for the Default Idle Loop format to ensure a seamless infinite loop.
Include one blink in the first 4 seconds, and another between second 4 and the end. This enhances realism and avoids a robotic look.
As before, avoid noticeable body or head movement in the first or last frame to generate a smooth loop.

The two_loops mode works slightly differently for legacy Digital Humans and Digital Humans in streaming mode.

Please check this page for legacy and streaming digital humans.

Using AI Generation Tools

You have the freedom to use a variety of tools, including AI-based solutions for video creation, source image generation, or face swapping. However, please be aware:

Training Data: Our model was trained on real human video footage, and real video may deliver better performance than visibly AI-generated content.
Face Detection: Our synthesis model relies on accurate face detection in every frame of the video. Ensure that any post-processing or AI generation does not interfere with the clarity or consistency of the subject's face.

Happy creating!

Last updated Apr 20, 2026
