Override Legacy Digital Human's Microphone + Language Detection Template Guide
This example template lets users speak with a Unith Digital Human using a microphone button. It is intended for developers who want to build a bespoke microphone experience. It includes:
- Voice Activity Detection (VAD)
- Azure Speech SDK
- Automatic language detection
- Transcript preview
- Message delivery via postMessage
This guide assumes the Digital Human is configured to accept external events as defined here.
This documentation is relevant for legacy digital humans. If your digital human uses streaming mode, please refer to our SDK documentation instead.
To check whether your digital human is in legacy or streaming mode, see this page.
If the digital human configuration states that videoStreaming=true then your digital human is in streaming mode.
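As a rough illustration of that check, the helper below inspects a configuration object for the videoStreaming flag. The function name isStreamingMode and the shape of the object are assumptions made for this sketch; only the videoStreaming field name comes from this guide.

```javascript
// Hypothetical helper: decide whether a digital human is in streaming mode.
// Assumes `config` is the digital human configuration as a plain object.
function isStreamingMode(config) {
  if (!config) return false;
  // Accept a boolean or the string "true", since the stored form is not specified here.
  return config.videoStreaming === true || config.videoStreaming === "true";
}
```

If the function returns false, the legacy instructions in this guide apply.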
Features
| Feature | Description |
|---|---|
| Click-to-activate mic | Manual start/stop mic via button |
| Voice Activity Detection | Only triggers when real speech is detected |
| Language Detection | Auto-detects up to 4 supported languages |
| Live Transcript | Displays recognized speech as text |
| Sends Message to DH | Final transcript sent to Unith iframe |
How It Works
1. Embed UNITH Iframe
<iframe
  id="my-iframe"
  src="https://chat.unith.ai/ORG-ID/HEAD-ID?api_key=YOUR_API_KEY&mode=video"
  allow="microphone">
</iframe>

- Use mode=video if you would like to hide the UNITH chat widget and use only the video component
- Insert the appropriate api_key for your org
- Must include allow="microphone"
For more information on video-only mode, see this page.
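If the iframe is created from script rather than written by hand, the src can be assembled from its parts. The sketch below is one possible way to do that; the function name buildUnithSrc and its parameters are hypothetical, and the URL layout is taken from the embed snippet above.

```javascript
// Hypothetical helper: assemble the iframe src from its parts.
// `videoOnly` adds mode=video, which hides the chat widget.
function buildUnithSrc(orgId, headId, apiKey, videoOnly = true) {
  const url = new URL(
    `https://chat.unith.ai/${encodeURIComponent(orgId)}/${encodeURIComponent(headId)}`
  );
  url.searchParams.set("api_key", apiKey);
  if (videoOnly) {
    url.searchParams.set("mode", "video");
  }
  return url.toString();
}

// Usage in the browser (assumes the iframe from the snippet above exists):
// document.getElementById("my-iframe").src = buildUnithSrc("ORG-ID", "HEAD-ID", "YOUR_API_KEY");
```

Using URL and searchParams keeps the query string correctly encoded if a key contains reserved characters.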
2. Azure Speech Key Configuration
Step 1: Get Your Credentials
- Log in to Azure Portal
- Create a Speech resource (Cognitive Services)
- Copy your:
  - Key
  - Region
Step 2: Add to Template
Replace these lines in your template (below):
const speechKey = "YOUR_AZURE_SPEECH_KEY";
const serviceRegion = "YOUR_AZURE_REGION"; // e.g. "eastus"
Do not expose real keys in production environments.
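One common way to keep the key out of the page is to have your own backend exchange it for a short-lived authorization token and pass only that token to the browser. The sketch below assumes such a backend exists: the endpoint path and its { token, region } response shape are hypothetical, and the function name createSpeechConfigFromToken is introduced here for illustration. SpeechConfig.fromAuthorizationToken is the Speech SDK call that accepts a token in place of a key.

```javascript
// Sketch: build a SpeechConfig from a short-lived token instead of a raw key.
// Assumptions: `tokenUrl` is a hypothetical endpoint on your own backend that
// returns JSON shaped like { token, region }; `SpeechSDK` is the Azure Speech
// SDK object already loaded on the page; `fetchFn` defaults to the global fetch.
async function createSpeechConfigFromToken(SpeechSDK, tokenUrl, fetchFn = fetch) {
  const response = await fetchFn(tokenUrl);
  if (!response.ok) {
    throw new Error(`Token request failed with status ${response.status}`);
  }
  const { token, region } = await response.json();
  return SpeechSDK.SpeechConfig.fromAuthorizationToken(token, region);
}

// Usage (hypothetical endpoint path):
// const speechConfig = await createSpeechConfigFromToken(SpeechSDK, "/api/speech-token");
```

Tokens expire, so a long-running page would need to request a fresh one periodically.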
3. Auto Language Detection
Configure your supported languages:
const autoDetectSourceLanguageConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromLanguages([
"en-US", "fr-FR", "es-ES"
]);
Limit: Azure allows a maximum of 4 languages for auto-detect.
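Because the limit is easy to exceed when the language list comes from configuration, it can help to validate the list before handing it to the SDK. The helper below is a minimal sketch; normalizeLanguages and MAX_AUTO_DETECT_LANGUAGES are names introduced here, and the limit value is the one stated in this guide.

```javascript
const MAX_AUTO_DETECT_LANGUAGES = 4; // limit stated in this guide

// Hypothetical helper: trim, de-duplicate, and enforce the language limit.
function normalizeLanguages(languages) {
  const unique = [...new Set(languages.map((lang) => lang.trim()))].filter(Boolean);
  if (unique.length === 0) {
    throw new Error("At least one language is required");
  }
  if (unique.length > MAX_AUTO_DETECT_LANGUAGES) {
    throw new Error(
      `Too many languages: ${unique.length} (max ${MAX_AUTO_DETECT_LANGUAGES})`
    );
  }
  return unique;
}

// Usage in the template (SpeechSDK must already be loaded):
// const autoDetectSourceLanguageConfig =
//   SpeechSDK.AutoDetectSourceLanguageConfig.fromLanguages(
//     normalizeLanguages(["en-US", "fr-FR", "es-ES"])
//   );
```

Inside a recognized handler, the detected locale can then be read with SpeechSDK.AutoDetectSourceLanguageResult.fromResult(e.result).language, for example to show it next to the transcript.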
4. Transcript Display
(OPTIONAL) Live updates as you speak:
transcriptEl.innerText = "Transcript: " + transcriptBuffer;
5. Message Delivery
After recognition ends, this is called:
iframe.contentWindow.postMessage({
event: "DH_MESSAGE",
payload: { message: finalMessage }
}, "https://chat.unith.ai");
Configure the Digital Human to accept external events as defined here. This can also be done directly via the advanced modification window in interFace.
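The delivery step can be wrapped in a small function so that empty transcripts are never sent. This is a sketch rather than the template's actual code: sendToDigitalHuman and UNITH_ORIGIN are names introduced here, while the event name, payload shape, and target origin are copied from the call above.

```javascript
const UNITH_ORIGIN = "https://chat.unith.ai";

// Hypothetical helper: send a final transcript to the Digital Human iframe.
// Returns true if a message was posted, false if there was nothing to send.
function sendToDigitalHuman(iframe, transcript) {
  const finalMessage = (transcript || "").trim();
  if (!finalMessage || !iframe || !iframe.contentWindow) {
    return false;
  }
  iframe.contentWindow.postMessage(
    { event: "DH_MESSAGE", payload: { message: finalMessage } },
    UNITH_ORIGIN
  );
  return true;
}

// Usage in the browser:
// sendToDigitalHuman(document.getElementById("my-iframe"), transcriptBuffer);
```

Passing the explicit origin instead of "*" means the browser delivers the message only if the iframe is still showing a page from chat.unith.ai.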
Silence Handling
After 2 seconds of silence, the recognizer will stop:
function resetSilenceTimer() {
  // Cancel any pending timer so each call restarts the 2-second countdown.
  clearTimeout(silenceTimer);
  silenceTimer = setTimeout(() => {
    status.innerText = "Status: Silence detected. Stopping recognition...";
    recognizer.stopContinuousRecognitionAsync();
  }, 2000);
}
Modify this if you want always-on behavior.
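One possible way to make the timeout adjustable, including an always-on setting, is to wrap the timer in a small factory. The sketch below is an illustration and not part of the template; createSilenceTimer and its parameters are hypothetical names.

```javascript
// Hypothetical factory: a restartable silence timer.
// `onSilence` runs after `timeoutMs` without a reset. Pass a timeoutMs of 0
// (or less) to disable the timeout for always-on behavior.
function createSilenceTimer(onSilence, timeoutMs = 2000) {
  let handle = null;
  return {
    // Call whenever speech is recognized to restart the countdown.
    reset() {
      clearTimeout(handle);
      if (timeoutMs > 0) {
        handle = setTimeout(onSilence, timeoutMs);
      }
    },
    // Call when recognition is stopped manually.
    cancel() {
      clearTimeout(handle);
      handle = null;
    },
  };
}

// Usage in the template (assumes `recognizer` exists):
// const silence = createSilenceTimer(() => recognizer.stopContinuousRecognitionAsync(), 2000);
// recognizer.recognizing = () => silence.reset();
```

Clearing the previous handle on every reset ensures that only one countdown is pending at a time.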
Customization Options
| Task | How |
|---|---|
| Change languages | Edit the fromLanguages array |
| Change UNITH Digital Human | Modify iframe src |
| Customize UI | Change button or layout |
| Disable silence timeout | Remove resetSilenceTimer() logic |
Setup Checklist
| Task | Complete? |
|---|---|
| Embed iframe with correct URL | ⬜ |
| Add Azure speechKey and region | ⬜ |
| Configure languages | ⬜ |
| Replace placeholder vad.js | ⬜ |
| Test in browser | ⬜ |
Template & Example Files
- example template.html
- vad.js