Converting Legacy Digital Humans to Streaming Mode
Overview
Digital Humans created in legacy mode can be upgraded to streaming mode for improved performance and reduced latency. Streaming mode delivers audio and video in real time as content is generated, providing a more natural and responsive user experience.
Streaming mode requires a TTS provider that supports audio streaming. Currently supported providers are Microsoft Azure and ElevenLabs.
Prerequisites
Before converting your Digital Human to streaming mode, ensure you have:
- An active UNITH account with API access
- A valid Bearer token for authentication
- Your Digital Human's head_id (available via interFace or the API)
- A voice from a streaming-compatible TTS provider (Microsoft Azure or ElevenLabs)
Legacy vs. Streaming Mode
Legacy Mode Characteristics
- Uses text splitting to deliver responses in chunks
- Pre-generates complete audio before video synthesis
- Accessible via https://chat.unith.ai/{org_id}/{head_id}
Streaming Mode Characteristics
- Delivers audio and video in real time as content is generated
- Lower latency for first response
- Requires streaming-compatible TTS provider
- Accessible via https://stream.unith.ai/{org_id}/{head_id}
- Text splitting must be disabled
Conversion Process
Converting from legacy to streaming mode requires two configuration changes and a URL update. Follow these steps in order.
Step 1: Configure Streaming-Compatible Voice
Select a voice from a TTS provider that supports streaming. Currently supported providers are Microsoft Azure and ElevenLabs.
Endpoint: PUT https://platform-api.unith.ai/head/update
Request Body
{
"id": "yourHeadId",
"ttsProvider": "elevenlabs",
"ttsVoice": "voiceId"
}
CURL Example
curl -X 'PUT' \
'https://platform-api.unith.ai/head/update' \
-H 'accept: application/json' \
-H 'Authorization: Bearer yourBearerToken' \
-H 'Content-Type: application/json' \
-d '{
"id": "yourHeadId",
"ttsProvider": "elevenlabs",
"ttsVoice": "rachel"
}'
For optimal streaming performance, refer to the Voice Selection Guide for recommended voices. ElevenLabs voices using flash_v2, flash_v2_5, turbo_v2, or turbo_v2_5 models are recommended for streaming.
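The recommended ElevenLabs models listed above can be checked before a voice is configured. The following is a minimal sketch: the function name is made up for this example, and how you look up the model behind a given voice is outside its scope.

```shell
# Hypothetical helper: succeeds only for the ElevenLabs models that this
# guide lists as recommended for streaming.
is_recommended_elevenlabs_model() {
  case "$1" in
    flash_v2|flash_v2_5|turbo_v2|turbo_v2_5) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: report on two model names (other_model is a placeholder).
for model in flash_v2_5 other_model; do
  if is_recommended_elevenlabs_model "$model"; then
    echo "$model: recommended for streaming"
  else
    echo "$model: not on the recommended list"
  fi
done
```

A model that is not on the list is not necessarily incompatible; the list only reflects the recommendation in this guide.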
Step 2: Enable videoStreaming
Enabling videoStreaming triggers a series of checks and processes, including adjustments to text splitting behaviour. Legacy mode uses text splitting to deliver messages in chunks, whereas streaming mode delivers content as it is generated.
Endpoint: PUT https://platform-api.unith.ai/head/yourHeadId/video-streaming?videoStreaming=true
Query Parameter
videoStreaming: set to true to enable streaming mode
CURL Example
curl -X 'PUT' \
'https://platform-api.unith.ai/head/yourHeadId/video-streaming?videoStreaming=true' \
-H 'accept: */*' \
-H 'Authorization: Bearer yourBearerToken'
Converting to a streaming digital human is only possible if your head_visual is in default mode. If you created your digital human in expressive mode, its head_visual uses the two_loops mode and cannot be converted to streaming. In this case, you'll need to create a new digital human.
Note: You can only convert streaming digital humans back to legacy digital humans if the head_visual is using default mode.
To retrieve the head_visual mode, follow these steps:
# 1. Get the head_visual id from the head resource
curl -X 'GET' \
'https://platform-api.unith.ai/head/yourHeadId' \
-H 'accept: application/json' \
-H 'Authorization: Bearer yourBearerToken'
# 2. Get the head_visual mode using the head_visual id
curl -X 'GET' \
'https://platform-api.unith.ai/head_visual/yourHeadVisualId' \
-H 'accept: application/json' \
-H 'Authorization: Bearer yourBearerToken'
Text splitting (splitter=true) is incompatible with streaming mode. Setting videoStreaming to true automatically disables the text splitter. Enabling the text splitter forces a streaming digital human back to legacy mode.
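Once the head_visual mode has been retrieved, the convertibility rule described above can be expressed as a small check. This is a sketch only: the function name is hypothetical, and it takes the mode string as an argument because the exact field name in the API response is not documented here.

```shell
# Hypothetical helper: succeeds only when the head_visual mode allows
# conversion. Per this guide, "default" is convertible and "two_loops"
# (expressive mode) is not.
can_convert_to_streaming() {
  case "$1" in
    default) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: pass in the mode value read from the head_visual response.
mode='two_loops'
if can_convert_to_streaming "$mode"; then
  echo "mode $mode: conversion is possible"
else
  echo "mode $mode: create a new digital human instead"
fi
```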
Step 3: Update Access URL
- The API returns the new streaming URL.
- The legacy chat.unith.ai URL will no longer be accessible.
URL Format Change
| Mode | URL Format | Example |
|---|---|---|
| Legacy Mode | https://chat.unith.ai/{org_id}/{head_id} | https://chat.unith.ai/example-corp/jane-15326 |
| Streaming Mode | https://stream.unith.ai/{org_id}/{head_id} | https://stream.unith.ai/example-corp/jane-15326 |
Simply replace chat with stream in your Digital Human's URL to access streaming mode after completing the configuration steps.
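If you store legacy URLs in configuration or scripts, the substitution can be automated. The sketch below uses a made-up function name and rewrites only URLs that start with the legacy host, so unrelated URLs are rejected rather than silently changed.

```shell
# Hypothetical helper: convert a legacy chat URL to its streaming equivalent.
# Only URLs beginning with https://chat.unith.ai/ are rewritten.
to_streaming_url() {
  case "$1" in
    https://chat.unith.ai/*)
      # Strip the legacy prefix and prepend the streaming host.
      printf 'https://stream.unith.ai/%s\n' "${1#https://chat.unith.ai/}"
      ;;
    *)
      printf 'not a legacy chat URL: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

to_streaming_url 'https://chat.unith.ai/example-corp/jane-15326'
# prints https://stream.unith.ai/example-corp/jane-15326
```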
Complete Conversion Example
This example demonstrates the full conversion process from legacy to streaming mode:
# Step 1: Configure streaming-compatible voice
curl -X 'PUT' \
'https://platform-api.unith.ai/head/update' \
-H 'accept: application/json' \
-H 'Authorization: Bearer yourBearerToken' \
-H 'Content-Type: application/json' \
-d '{
"id": "yourHeadId",
"ttsProvider": "ttsProvider",
"ttsVoice": "ttsVoice"
}'
# Step 2: Enable videoStreaming parameter
curl -X 'PUT' \
'https://platform-api.unith.ai/head/yourHeadId/video-streaming?videoStreaming=true' \
-H 'accept: application/json' \
-H 'Authorization: Bearer yourBearerToken'
# Step 3: Access your Digital Human at the streaming URL
# https://stream.unith.ai/yourOrgId/yourHeadId
Embed Integration Considerations
When converting to streaming mode, update your embed configurations accordingly. To learn more, see our embedding guidelines.
Important Notes
TTS Provider Compatibility: Only Microsoft Azure and ElevenLabs support streaming mode. Attempting to enable streaming with any other provider returns an error.
Text Splitter Incompatibility: The text splitter (splitter=true) cannot be used in streaming mode. These features are mutually exclusive.
Voice Selection: Not all voices from streaming-compatible providers support streaming. Refer to the Voice Selection Guide for recommended streaming voices.
Configuration Order: Follow the conversion steps in the exact order provided to avoid configuration errors. The platform validates state transitions to prevent invalid configurations.
URL Access: Your Digital Human must be accessed via the correct URL for its mode. Legacy mode uses chat.unith.ai while streaming mode uses stream.unith.ai.
API Validation: The platform enforces configuration rules at the API level. Invalid state transitions will return descriptive error messages to guide proper configuration.
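Because the steps must run in order and the API rejects invalid transitions, a conversion script should stop at the first failed request. The following is a minimal sketch of that idea, not an official tool: the function names and the UNITH_TOKEN and DRY_RUN variables are assumptions made for this example, while the endpoints and request fields are the ones shown in the steps above.

```shell
API='https://platform-api.unith.ai'

# Build the JSON body for the voice update.
# Values are inserted as-is, so they must not contain double quotes.
voice_payload() {
  printf '{"id":"%s","ttsProvider":"%s","ttsVoice":"%s"}' "$1" "$2" "$3"
}

# Build the endpoint that sets videoStreaming for a head.
video_streaming_url() {
  printf '%s/head/%s/video-streaming?videoStreaming=%s' "$API" "$1" "$2"
}

# Send a PUT request: put URL [JSON_BODY].
# With DRY_RUN=1 the request is printed instead of sent.
put() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf 'PUT %s\n' "$*"
    return 0
  fi
  if [ "$#" -eq 2 ]; then
    curl --fail -sS -X PUT "$1" \
      -H 'accept: application/json' \
      -H "Authorization: Bearer $UNITH_TOKEN" \
      -H 'Content-Type: application/json' \
      -d "$2"
  else
    curl --fail -sS -X PUT "$1" \
      -H 'accept: application/json' \
      -H "Authorization: Bearer $UNITH_TOKEN"
  fi
}

# Run Step 1 and Step 2 in order: convert_to_streaming HEAD_ID PROVIDER VOICE.
# Returns non-zero as soon as one request fails.
convert_to_streaming() {
  put "$API/head/update" "$(voice_payload "$1" "$2" "$3")" || return 1
  put "$(video_streaming_url "$1" true)" || return 1
}

# Dry run: prints the two requests without contacting the API.
DRY_RUN=1
convert_to_streaming yourHeadId elevenlabs rachel
```

To send real requests, export UNITH_TOKEN and set DRY_RUN=0. Note that curl's --fail option hides the response body on HTTP errors; on curl 7.76 or later, --fail-with-body keeps the error message visible while still returning a non-zero exit code.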