Text Splitter
Overview
The Text Splitter is a crucial mechanism that breaks down longer messages into smaller, more manageable chunks. This process significantly improves the readability, speed, and responsiveness of the Digital Human by allowing responses to be delivered incrementally.
By default, the UNITH platform automatically engages the Text Splitter for all Digital Humans in non-streaming mode. However, when you use a plugin, you take full control of the response processing chain, meaning you are responsible for implementing the text splitting logic on your end.
Plugin Implementation
When building a plugin, your core function is to respond synchronously to UNITH's requests. This requires you to receive the complete response text from your source (e.g., an LLM), split it into smaller chunks, and then send those chunks back to the platform in the required format.
Quick Start Guide
Here's a quick guide to help you implement text splitting capabilities within your plugin.
First, you'll need a function to perform the actual splitting. This example uses a regular expression to split the text after common sentence-ending punctuation.

import re

def split_response_by_delimiter(response_message):
    # Split after each delimiter; the lookbehind keeps the punctuation with its chunk.
    # Splitting on a zero-width match requires Python 3.7 or later.
    delimiters = ['.', '!', '?', ':']
    regex_pattern = '|'.join('(?<={})'.format(re.escape(delimiter)) for delimiter in delimiters)
    chunks = re.split(regex_pattern, response_message)
    # Trim whitespace and drop chunks that are empty after trimming.
    return [chunk.strip() for chunk in chunks if chunk.strip()]

Next, after you've generated the complete response text, you can use the split_response_by_delimiter function to prepare the message for the Digital Human. Remember that the plugin_response must be a list of PluginMessage objects.

# This snippet runs inside your plugin's request handler.
plugin_response = []
response = response.content
if response is not None:
    # VALID RESPONSE: one text message per chunk
    response_chunks = split_response_by_delimiter(response)
    for chunk in response_chunks:
        plugin_response.append(
            PluginMessage(
                type="text",
                payload=Payload(type="text", message=chunk))
        )
else:
    # FALLBACK MESSAGE
    plugin_response.append(
        PluginMessage(
            type="text",
            payload=Payload(type="text", message="I'm sorry, but I can't retrieve the information."))
    )
return plugin_response

By adding this logic to your plugin, you ensure that the Digital Human receives the response in manageable segments, which improves both responsiveness and the user experience.
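To make the splitter's behavior concrete, the sketch below runs the same lookbehind-based approach on sample text. It also shows one caveat: because the pattern splits immediately after every delimiter, a decimal such as "2.5" is cut in two. The second function, split_on_whitespace_after_punctuation, is a hypothetical variant (not part of the UNITH platform) that only splits when the punctuation is followed by whitespace. Treat it as one possible adjustment, not a recommended standard.

```python
import re


def split_response_by_delimiter(response_message):
    """Split right after '.', '!', '?' or ':' and drop empty chunks."""
    delimiters = ['.', '!', '?', ':']
    regex_pattern = '|'.join('(?<={})'.format(re.escape(d)) for d in delimiters)
    chunks = re.split(regex_pattern, response_message)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def split_on_whitespace_after_punctuation(response_message):
    """Hypothetical variant: split only where punctuation is followed by whitespace."""
    chunks = re.split(r'(?<=[.!?:])\s+', response_message.strip())
    return [chunk for chunk in chunks if chunk]


print(split_response_by_delimiter(
    "Hello! I am your Digital Human. How can I help you today?"))
# ['Hello!', 'I am your Digital Human.', 'How can I help you today?']

print(split_response_by_delimiter("Version 2.5 is out."))
# ['Version 2.', '5 is out.']  (the decimal point is treated as a sentence end)

print(split_on_whitespace_after_punctuation("Version 2.5 is out. Try it now!"))
# ['Version 2.5 is out.', 'Try it now!']
```

Which variant fits depends on your content: the first produces shorter chunks, while the second keeps numbers and similar tokens intact.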
The UNITH API's POST body and response are formatted as a list. This list is designed to support multiple parts for a single message, which can be of different data types (e.g., text, image, or other metadata).
If an error occurs, you are responsible for returning either a fallback message or another valid PluginMessage to the platform.
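As a rough sketch of that error handling, the example below wraps the call to the text source so that any failure still yields a valid list. The Payload and PluginMessage dataclasses here are stand-ins written for this example. They mirror only the fields used in the snippet above (type, payload, message) and are an assumption, not the platform's actual definitions. The names build_plugin_response, text_message, and FALLBACK_TEXT are likewise hypothetical.

```python
import re
from dataclasses import dataclass, asdict


# Stand-in types for illustration only; use the real classes from your plugin.
@dataclass
class Payload:
    type: str
    message: str


@dataclass
class PluginMessage:
    type: str
    payload: Payload


FALLBACK_TEXT = "I'm sorry, but I can't retrieve the information."
SPLIT_PATTERN = r'(?<=\.)|(?<=!)|(?<=\?)|(?<=:)'


def text_message(text):
    """Wrap a string in a single text PluginMessage."""
    return PluginMessage(type="text", payload=Payload(type="text", message=text))


def build_plugin_response(get_response_text):
    """Always return a non-empty list of PluginMessage objects."""
    try:
        text = get_response_text()
    except Exception:
        # The source (e.g., an LLM call) failed: answer with the fallback.
        return [text_message(FALLBACK_TEXT)]
    if not text or not text.strip():
        # Nothing usable came back: answer with the fallback.
        return [text_message(FALLBACK_TEXT)]
    chunks = [c.strip() for c in re.split(SPLIT_PATTERN, text) if c.strip()]
    return [text_message(c) for c in chunks]


def failing_source():
    raise TimeoutError("source did not answer in time")


print([asdict(m) for m in build_plugin_response(lambda: "Hi there! How are you?")])
print([asdict(m) for m in build_plugin_response(failing_source)])
```

Catching every exception is a deliberate simplification here. In a real plugin you would probably log the error and narrow the exception types.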