
Overview

The Two Loops Streaming Mode enables more expressive and natural Digital Human presentations by using separate video segments for idle and talking states. This advanced mode is specifically designed for streaming Digital Humans where audio duration is unknown in advance.

Two Loops mode creates more engaging Digital Humans by allowing dynamic transitions between idle gestures and expressive talking animations.

How Two Loops Streaming Works

Traditional vs. Two Loops Architecture

Traditional Streaming Mode:

  • Single idle loop plays continuously
  • Talking state uses the same loop with lip-sync overlay
  • Limited expressiveness during responses

Two Loops Streaming Mode:

  • Separate idle loop (0 to cut timestamp)
  • Separate talking loop (cut timestamp to end)
  • Smooth transitions between states
  • More natural and expressive responses
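
As an illustration of how one continuous recording maps onto the two loops, the sketch below computes the idle and talking ranges from a duration and a cut timestamp. The `Segment` type and `split_segments` helper are hypothetical names used only for this example; they are not part of the UNITH API.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str
    start: float  # seconds from video start
    end: float    # seconds from video start


def split_segments(duration: float, cut_timestamp: float) -> list[Segment]:
    """Return the idle and talking ranges for one continuous recording."""
    if not 0 < cut_timestamp < duration:
        raise ValueError("cut_timestamp must fall strictly inside the video")
    return [
        Segment("idle", 0.0, cut_timestamp),          # 0 to cut timestamp
        Segment("talking", cut_timestamp, duration),  # cut timestamp to end
    ]


idle, talking = split_segments(duration=20, cut_timestamp=10)
print(idle)
print(talking)
```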

Video Requirements

Duration

  • Maximum video length: 120 seconds

Structure

  • Single continuous recording (no manual cutting required)
  • First segment (0 to cut_timestamp): Idle state with minimal movement
  • Second segment (cut_timestamp to end): Expressive talking state
  • Natural transition at cut_timestamp
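
A quick pre-upload check of these requirements might look like the sketch below. It assumes you already know the video duration in seconds (for example from your editing tool); `check_video` is a hypothetical helper, and the platform may apply further checks of its own.

```python
MAX_VIDEO_SECONDS = 120  # maximum video length stated in the requirements


def check_video(duration: float, cut_timestamp: float) -> list[str]:
    """Collect human-readable problems; an empty list means the checks pass."""
    problems = []
    if duration <= 0:
        problems.append("duration must be positive")
    if duration > MAX_VIDEO_SECONDS:
        problems.append(f"video is {duration}s, limit is {MAX_VIDEO_SECONDS}s")
    if not 0 < cut_timestamp < duration:
        problems.append("cut_timestamp must lie between 0 and the video duration")
    return problems


print(check_video(duration=20, cut_timestamp=10))   # no problems
print(check_video(duration=130, cut_timestamp=60))  # exceeds the length limit
```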

Creating a Two Loops Head Visual

Info: To learn how to create a head visual via the API, see the head visual creation documentation.

Step 1: Prepare Your Video

Your video should follow these specifications:

Idle State (First Half)

  • Subject in neutral pose
  • Minimal body and head movement
  • Subtle, natural gestures only
  • Include one blink in the first 4 seconds
  • Include another blink between second 4 and cut_timestamp
  • Avoid noticeable movement in the first and last frames of this segment

Talking State (Second Half)

  • More expressive facial expressions
  • Natural hand gestures and movements
  • Animated, engaged body language
  • Subject appears actively communicating
  • Avoid abrupt movements at segment boundaries

Info: The platform automatically handles looping and inversion for both segments to ensure seamless, non-jarring transitions.
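
The platform's internal processing is not described here. One common way to loop a clip without a visible jump is to play it forward and then in reverse (a "ping-pong" loop), and the sketch below assumes that "inversion" refers to something along those lines. It only illustrates why quiet first and last frames matter; `ping_pong_order` is a hypothetical function, not the platform's implementation.

```python
def ping_pong_order(frame_count: int) -> list[int]:
    """Frame indices for one forward-then-reversed playback cycle.

    The endpoint frames are not duplicated, so for clips with two or more
    frames, repeating the cycle never shows the same frame twice in a row.
    """
    if frame_count < 1:
        raise ValueError("frame_count must be at least 1")
    forward = list(range(frame_count))
    backward = list(range(frame_count - 2, 0, -1))  # skip both endpoints
    return forward + backward


print(ping_pong_order(5))  # [0, 1, 2, 3, 4, 3, 2, 1]
```

In a loop like this, playback changes direction at the first and last frames of a segment, so any strong motion there would appear to reverse abruptly.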

Step 2: Determine Cut Timestamp

The cut_timestamp defines where your video transitions from idle to talking state.

Example:

  • Video duration: 20 seconds
  • Idle state: 0-10 seconds
  • Talking state: 10-20 seconds
  • cut_timestamp: 10

Guidelines:

  • Cut timestamp should occur at a natural transition point
  • Ensure smooth motion at the cut point
  • Typically set at the midpoint of your video for balanced loops
  • Measured in seconds from video start
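
If your editing tool reports the transition as a frame number, it has to be converted to seconds before being used as cut_timestamp. The sketch below shows that conversion alongside the midpoint default; both helpers are hypothetical. The parameter type is documented as number, but whether fractional values are accepted is not stated, so treat non-integer results with care.

```python
def midpoint_cut(duration: float) -> float:
    """Default cut point for balanced loops: the middle of the video."""
    return duration / 2


def frame_to_seconds(frame_index: int, fps: float) -> float:
    """Convert a frame index to seconds from video start."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_index / fps


print(midpoint_cut(20))           # 10.0
print(frame_to_seconds(300, 30))  # 10.0
```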

Step 3: Create Head Visual via API

Endpoint: POST https://platform-api.unith.ai/head_visual/create

Request Body

```json
{
  "mode": "two_loops_streaming",
  "cut_timestamp": 10
}
```

CURL Example

```shell
curl -X 'POST' \
  'https://platform-api.unith.ai/head_visual/create' \
  -H 'accept: application/json' \
  -H 'x-head-video-token-id: yourVideoTokenId' \
  -H 'Authorization: Bearer yourBearerToken' \
  -H 'Content-Type: application/json' \
  -d '{
  "mode": "two_loops_streaming",
  "cut_timestamp": 10
}'
```

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| mode | string | Yes | Must be "two_loops_streaming" |
| cut_timestamp | number | Yes | Timestamp in seconds where idle transitions to talking state |
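
For callers who prefer Python to curl, the same request can be assembled with the standard library. This is a minimal sketch that mirrors the endpoint, headers, and body shown above; `build_create_request` is a hypothetical helper, and the response format is not documented in this section, so the example only builds the request and does not send it.

```python
import json
import urllib.request

API_URL = "https://platform-api.unith.ai/head_visual/create"


def build_create_request(
    video_token_id: str, bearer_token: str, cut_timestamp: float
) -> urllib.request.Request:
    """Build (but do not send) the POST request for a Two Loops head visual."""
    body = {"mode": "two_loops_streaming", "cut_timestamp": cut_timestamp}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "accept": "application/json",
            "x-head-video-token-id": video_token_id,
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_request("yourVideoTokenId", "yourBearerToken", 10)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it with real credentials: urllib.request.urlopen(request)
```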

Video Production Best Practices

Idle State Guidelines

Movement:

  • Keep body and head movements minimal
  • Subtle weight shifts are acceptable
  • Natural breathing motion is encouraged
  • No dramatic gestures or expressions

Blinking:

  • Include exactly one blink in the first 4 seconds
  • You can include one additional blink between second 4 and cut_timestamp
  • Natural blink timing prevents robotic appearance
  • Avoid blinking in the first or last 0.5 seconds of the segment
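
The blink rules above can be checked mechanically once blink times are known. The sketch below assumes each blink is annotated by hand as a single timestamp in seconds and treats a blink at exactly 4 seconds as belonging to the later window; `check_idle_blinks` is a hypothetical helper, not a platform feature.

```python
def check_idle_blinks(blinks: list[float], cut_timestamp: float) -> list[str]:
    """Check blink timestamps (seconds) that fall inside the idle segment."""
    problems = []
    idle = [t for t in blinks if 0 <= t < cut_timestamp]
    early = [t for t in idle if t < 4]   # first 4 seconds
    late = [t for t in idle if t >= 4]   # second 4 up to cut_timestamp
    if len(early) != 1:
        problems.append(f"expected exactly one blink before 4s, found {len(early)}")
    if len(late) > 1:
        problems.append(f"expected at most one blink after 4s, found {len(late)}")
    for t in idle:
        if t < 0.5 or t > cut_timestamp - 0.5:
            problems.append(f"blink at {t}s is within 0.5s of a segment edge")
    return problems


print(check_idle_blinks([2.0, 7.0], cut_timestamp=10))  # []
print(check_idle_blinks([0.2, 3.0], cut_timestamp=10))  # two problems reported
```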

Talking State Guidelines

Expressiveness:

  • Engaged facial expressions
  • Dynamic body language
  • Subject appears actively communicating

Movement Range:

  • More animated than idle state
  • Natural conversational gestures
  • Avoid extreme or distracting movements
  • Maintain professionalism appropriate to use case

Transitions:

  • Smooth motion at cut_timestamp boundary
  • Avoid abrupt changes at segment start/end
  • Natural flow between states

Complete Workflow Example

Step 1: Record Video

  • Record a 10-second video of the subject
  • 0-5 seconds: Subject in neutral waiting pose (idle)
  • 5-10 seconds: Subject with engaged, helpful expressions (talking)
  • Include natural blinks at 2 seconds and 7 seconds

Step 2: Post-Production

  • Follow our best practices for video recording
  • Export as high-quality video file

Step 3: Upload Video

  • Upload the video to the UNITH platform (see the head visual creation documentation for details)
  • Receive video token ID

Step 4: Create Head Visual

```shell
curl -X 'POST' \
  'https://platform-api.unith.ai/head_visual/create' \
  -H 'accept: application/json' \
  -H 'x-head-video-token-id: videoToken' \
  -H 'Authorization: Bearer yourBearerToken' \
  -H 'Content-Type: application/json' \
  -d '{
  "mode": "two_loops_streaming",
  "cut_timestamp": 5
}'
```

Step 5: Configure Digital Human

  • Associate head visual with Digital Human
  • Configure for streaming mode
  • Test idle and talking state transitions

Important Notes

Automatic Loop Handling: The platform automatically manages looping and transitions. You do not need to manually reverse, blend, or stitch video segments.

Cut Timestamp Precision: Set the cut_timestamp at the exact second where your subject transitions from idle to expressive state. Precision is important for smooth state changes.

Video Quality: High-quality source video is essential. Ensure proper lighting, clear edges after keying, and consistent framing throughout the recording.

Blink Timing: Strategic blink placement enhances realism. Include blinks as specified to avoid a static, robotic appearance.

Streaming Mode Requirement: Two Loops Streaming mode only works with streaming Digital Humans. Ensure your Digital Human is configured with streaming: true.

Testing: Always test your Two Loops head visual with actual conversations to verify smooth transitions and natural appearance.

Performance: Two Loops mode provides better expressiveness without significant performance impact, as loops are preprocessed during video processing.

Last updated Apr 9, 2026