This documentation provides an overview of our Text-to-Speech (TTS) service and how you can manage TTS voices for your Digital Humans using our Voice Connectors.

See our documentation for the Voice Connectors that we currently support.

Need a different voice provider? You have full flexibility to create custom voice connectors. Please check out the following repository.

Text-to-Speech (TTS) Overview

Text-to-Speech (TTS) is a service that converts written text into synthesized human-like speech. Your Digital Human must have a TTS voice configured to function correctly. Our platform's Voice Connectors are designed to integrate with a variety of external TTS providers, offering you a wide selection of voice options. We are continuously adding support for new providers.

If you're interested in creating a custom Voice Connector, please use the following template as a starting point. Once you've completed your connector, please get in touch with us for final review and approval.

Supported TTS Providers

The platform relies on a number of third-party TTS providers.

Currently supported providers include:

  • ElevenLabs
  • Microsoft Azure

Each provider has its own set of voices, and a voice ID from one provider cannot be used with another provider. When creating or updating a Digital Human head, you need to configure both the ttsProvider and ttsVoice parameters. These values must correspond to a valid provider name and a specific voice ID from that provider. Use the /voice/all endpoint to obtain the corresponding provider and voiceId values.
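As an illustration of the rule above, the following Python sketch checks that a provider and a voice ID belong together before they are written into a head configuration. It assumes the voice list has already been fetched from /voice/all and parsed into a list of dictionaries keyed by ttsProvider and ttsVoice. The function name and the sample voice IDs are hypothetical.

```python
from typing import Dict, List


def head_voice_settings(
    voices: List[Dict[str, str]], provider: str, voice_id: str
) -> Dict[str, str]:
    """Return the ttsProvider/ttsVoice pair for a head, or raise if the pair is invalid."""
    for voice in voices:
        # A voice ID is only meaningful together with the provider that owns it.
        if voice.get("ttsProvider") == provider and voice.get("ttsVoice") == voice_id:
            return {"ttsProvider": provider, "ttsVoice": voice_id}
    raise ValueError(f"voice {voice_id!r} is not offered by provider {provider!r}")


# Hypothetical sample records; real IDs come from /voice/all.
SAMPLE_VOICES = [
    {"ttsProvider": "azure", "ttsVoice": "example-azure-voice"},
    {"ttsProvider": "elevenlabs", "ttsVoice": "example-elevenlabs-voice"},
]
```

Calling head_voice_settings(SAMPLE_VOICES, "elevenlabs", "example-azure-voice") raises ValueError, because that voice ID belongs to a different provider.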

Regional Voice Compatibility

Voices available in one region may not be present in others. As a result, a Digital Human created in one region (e.g., the US) may not be fully functional in another (e.g., the EU or Australia).

To ensure that the same Digital Human is fully operational in any region, you can use our common voices file. The provided JSON file contains a list of voices that are guaranteed to be available and fully supported across all regions. We recommend using these voices for maximum compatibility and reliability.
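A minimal sketch of checking a voice against the common voices file follows. The layout of that file is not described on this page, so the example assumes a JSON array of objects carrying ttsProvider and ttsVoice fields; adjust the keys to match the actual file. The function names and the sample ID are hypothetical.

```python
import json
from typing import Set, Tuple


def load_common_voices(json_text: str) -> Set[Tuple[str, str]]:
    """Parse the common voices JSON into a set of (provider, voice ID) pairs."""
    entries = json.loads(json_text)
    # Assumed layout: [{"ttsProvider": ..., "ttsVoice": ...}, ...]
    return {(entry["ttsProvider"], entry["ttsVoice"]) for entry in entries}


def is_available_everywhere(
    provider: str, voice_id: str, common: Set[Tuple[str, str]]
) -> bool:
    """Return True if the voice is listed in the common voices file."""
    return (provider, voice_id) in common


# Hypothetical file content, for illustration only.
SAMPLE_COMMON_JSON = '[{"ttsProvider": "azure", "ttsVoice": "example-common-voice"}]'
```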

Voice Management Endpoints

You can retrieve the list of voices and their corresponding parameters using the following API endpoints.

1. Retrieving the Full List of Voices

Use the /voice/all endpoint to obtain the complete list of supported voices, including their respective providers and IDs.

Endpoint: /voice/all
Method: GET

curl -X 'GET' \
  'https://platform-api.unith.ai/voice/all?provider=azure' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer yourBearerToken'

The response data contains the full configuration for each voice, which includes the following controllable parameters:

| Parameter | Description | Value |
| --- | --- | --- |
| ttsProvider | The specific TTS provider name. | elevenlabs, azure, audiostack |
| ttsVoice | The unique voice identifier (Voice ID). | N/A |
| language | The human language name. | N/A |
| languageCode | The standard language code (e.g., en-US). | N/A |
| accent | The voice's regional accent. | N/A |
| gender | The intended gender characteristic of the voice. | MALE, FEMALE, NEUTRAL, UNKNOWN, CHARACTER |
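The same request can be issued from Python's standard library, and the returned records narrowed down using the parameters listed above. This is only a sketch: it assumes the response body is a JSON array of voice objects using those field names, which this page does not spell out. build_voice_list_request and filter_voices are illustrative names.

```python
import urllib.parse
import urllib.request
from typing import Dict, List, Optional

BASE_URL = "https://platform-api.unith.ai"


def build_voice_list_request(
    token: str, provider: Optional[str] = None
) -> urllib.request.Request:
    """Build the GET /voice/all request, optionally filtered by provider."""
    url = f"{BASE_URL}/voice/all"
    if provider:
        url += "?" + urllib.parse.urlencode({"provider": provider})
    return urllib.request.Request(
        url,
        headers={"accept": "application/json", "Authorization": f"Bearer {token}"},
        method="GET",
    )


def filter_voices(
    voices: List[Dict[str, str]], language_code: str, gender: str
) -> List[Dict[str, str]]:
    """Keep voices matching a language code and gender (field names are assumed)."""
    return [
        voice
        for voice in voices
        if voice.get("languageCode") == language_code and voice.get("gender") == gender
    ]
```

To send the request, pass it to urllib.request.urlopen and decode the body with json.load before handing the records to filter_voices.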

2. Previewing a Voice

You can instantly preview the audio output for any given voice using the preview endpoint.

Endpoint: /voice/preview
Method: GET

curl -X 'GET' \
  'https://platform-api.unith.ai/voice/preview?voiceId=yourVoiceId&provider=azure&text=hello%20world' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer yourBearerToken'

| Parameter | Description |
| --- | --- |
| voiceId | The unique ID of the voice to preview. |
| provider | The corresponding TTS provider name. |
| text | The text to be converted to speech. |
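Because the text parameter travels in the query string, it must be URL-encoded (the curl example encodes the space in "hello world" as %20). The following sketch builds the preview URL with Python's standard library. build_preview_url is an illustrative name, and the format of the returned audio is not covered here.

```python
import urllib.parse

BASE_URL = "https://platform-api.unith.ai"


def build_preview_url(voice_id: str, provider: str, text: str) -> str:
    """Return the /voice/preview URL with all query parameters percent-encoded."""
    query = urllib.parse.urlencode(
        {"voiceId": voice_id, "provider": provider, "text": text},
        # quote (rather than the default quote_plus) encodes spaces as %20.
        quote_via=urllib.parse.quote,
    )
    return f"{BASE_URL}/voice/preview?{query}"


print(build_preview_url("yourVoiceId", "azure", "hello world"))
```

The printed URL matches the one used in the curl command above.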

For convenience, you can also select and preview voices directly within the platform's user interface (frontend) when creating or editing a Digital Human head.

Last updated Mar 6, 2026