Using the Toxicity Shield in Salesforce Enhanced Chat
First published: 2026.03.10 | Last revision: 2026.03.10
What is the Toxicity Shield?
The Toxicity Shield is an automated, LLM-based moderation tool. It evaluates each message and assigns it a toxicity score. Depending on that score, it can then take certain actions.
“Toxicity” in this context means harmful or unpleasant content, such as messages that are abusive, aggressive, or demeaning.
How does the Toxicity Shield work?
The Toxicity Shield can be broken down into two parts:
- Assessing a message and assigning it a toxicity score.
- Using this score as a trigger for certain actions (see "What Toxicity Shield does" below).
The Toxicity Shield runs when the message is ready to be delivered to its recipient. This means that it runs after the translation is performed.
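The ordering described above can be sketched as follows. This is only an illustration: `process_message`, `fake_translate`, and `fake_score` are hypothetical names, not part of any Language IO or Salesforce API, and the stand-in functions exist only to show that scoring happens on the translated text.

```python
from typing import Callable, Tuple


def process_message(
    text: str,
    translate: Callable[[str], str],
    score_toxicity: Callable[[str], float],
) -> Tuple[str, float]:
    """Translate first, then score the translated text (hypothetical flow)."""
    translated = translate(text)        # translation is performed first
    score = score_toxicity(translated)  # part 1: assess the translated message
    return translated, score            # part 2 (actions) would branch on the score


# Stand-ins for illustration only; a real deployment calls hosted services.
def fake_translate(text: str) -> str:
    return text.upper()


def fake_score(text: str) -> float:
    # Matches only the "translated" (upper-case) form of the text.
    return 0.9 if "RUDE" in text else 0.1


translated, score = process_message("a rude remark", fake_translate, fake_score)
print(translated, score)
```

Because `fake_score` only matches upper-case text, the example returns a high score only when the scorer receives the translated message, which mirrors the order described above.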
Enabling the Toxicity Shield
Enabling Toxicity Shield for Language IO Enhanced Chat is a three-step process.
- Language IO must enable Toxicity Shield for Enhanced Chat on your Language IO account. To request this, contact Language IO Support.
- Your Language IO Customer Self-Service Portal (CSSP) admin must configure Toxicity Shield tolerances and actions for your Language IO account within the CSSP. For more information, see Configuring the Toxicity Filter in the Customer Self-Service Portal.
- After Language IO has enabled the feature and your CSSP admin has configured tolerances and actions, you must enable the corresponding "Toxicity filter" setting in the Language IO Enhanced Chat Settings of your Salesforce org.
When the feature is enabled, agents cannot turn it off.
Use and possible actions
What Toxicity Shield does
Toxicity Shield assesses the toxicity of translated messages from both agents and end users. Depending on the toxicity of a message relative to your tolerance settings, its translation is flagged with a colored dot icon: yellow for low, orange for medium, or red for high toxicity.
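As a rough sketch of how a score could map to a colored dot, assume the score is a number and that tolerances are three ascending thresholds. Both are assumptions made for illustration: the actual score range and the way tolerances are stored in the CSSP are not described here, and `Tolerances`, `classify`, and `dot_color` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tolerances:
    """Hypothetical thresholds; real tolerance settings live in the CSSP."""
    low: float
    medium: float
    high: float


DOT_COLORS = {"low": "yellow", "medium": "orange", "high": "red"}


def classify(score: float, tolerances: Tolerances) -> Optional[str]:
    """Return 'low', 'medium', 'high', or None when the message is not flagged."""
    if score >= tolerances.high:
        return "high"
    if score >= tolerances.medium:
        return "medium"
    if score >= tolerances.low:
        return "low"
    return None


def dot_color(score: float, tolerances: Tolerances) -> Optional[str]:
    """Map a score to the dot color shown next to the translation."""
    level = classify(score, tolerances)
    return DOT_COLORS[level] if level is not None else None


example = Tolerances(low=0.3, medium=0.6, high=0.8)  # made-up values
print(dot_color(0.65, example))
```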
If the toxicity of a translated message is medium or high:
- For inbound messages (from the end user to the agent):
  - If “Rewrite inbound toxic messages…” is enabled, Toxicity Shield suggests a rewritten message with reduced toxicity.
  - Agents see the sparkle icon next to the rewritten message.
  - The original message, its translation, and the colored dot icon are hidden beneath the eye (source) icon, away from the agent’s immediate view.
- For outbound messages (from the agent to the end user):
  - If “Provide suggested rewrite…” is enabled, the translation of the original message is not sent. Toxicity Shield suggests a rewritten message with reduced toxicity.
    - If the agent accepts the suggested message, or edits and sends it, the agent sees the sparkle icon next to the sent message. The translation of the sent message (sent to the end user) and the original message and its translation (not sent to the end user) are hidden beneath the eye (source) icon.
    - If the agent rejects the suggested message, a "message-not-delivered" placeholder is added to the conversation. The suggested message and its translation, and the original message and its translation, are hidden beneath the eye (source) icon.
  - If “Provide suggested rewrite…” is disabled, the message is not sent. The agent is prompted to rewrite their message. A "message-not-delivered" placeholder is added to the conversation. The original message and its translation are hidden beneath the eye (source) icon.
    - If the agent edits and sends their message, the agent sees the sparkle icon next to the sent message. The translation of the sent message (sent to the end user) and the original message and its translation (not sent to the end user) are hidden beneath the eye (source) icon.
In all cases, the end user is never made aware that a message has been flagged or rewritten by Toxicity Shield.
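The branching described in this section can be summarized as a small decision function. This is a sketch only: the names `Direction`, `AgentChoice`, and `shield_outcome` are hypothetical, the returned strings are informal summaries instead of real UI events, and the inbound case with the rewrite setting disabled is an assumption because this article does not describe it.

```python
from enum import Enum
from typing import Optional


class Direction(Enum):
    INBOUND = "inbound"    # end user to agent
    OUTBOUND = "outbound"  # agent to end user


class AgentChoice(Enum):
    ACCEPT = "accept"
    EDIT_AND_SEND = "edit_and_send"
    REJECT = "reject"


def shield_outcome(
    direction: Direction,
    severity: Optional[str],  # None, "low", "medium", or "high"
    rewrite_enabled: bool,
    agent_choice: Optional[AgentChoice] = None,
) -> str:
    """Summarize what happens to one translated message (illustrative only)."""
    if severity not in ("medium", "high"):
        # Unflagged and low-toxicity messages are not rewritten or blocked.
        return "deliver translation"

    if direction is Direction.INBOUND:
        if rewrite_enabled:
            return "show suggested rewrite with sparkle icon; original under eye icon"
        # Assumption: not covered by the article.
        return "show flagged translation"

    # Outbound messages: the original translation is never sent as-is.
    if rewrite_enabled:
        if agent_choice is None:
            return "hold translation; suggest rewrite to agent"
        if agent_choice is AgentChoice.REJECT:
            return "add message-not-delivered placeholder"
        return "send rewritten message; show sparkle icon"

    if agent_choice is AgentChoice.EDIT_AND_SEND:
        return "send edited message; show sparkle icon"
    return "block message; prompt agent to rewrite; add message-not-delivered placeholder"


print(shield_outcome(Direction.OUTBOUND, "high", True, AgentChoice.REJECT))
```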
Examples
Flagged message (showing the tooltip text from an orange (medium severity) icon)
Rewritten message (showing the tooltip text from the LLM icon)
Viewing the original message and its translation
Limitations
- When the Toxicity Shield is enabled, it adds about one second of latency to every translation. A rewrite adds about one more second.
- Each message that is rewritten and retranslated incurs a credit cost.
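For rough capacity planning, the latency figures above can be turned into a back-of-the-envelope estimate. The one-second values are the approximate figures quoted in this section, not measured guarantees, and `added_latency_seconds` is a hypothetical helper.

```python
BASE_LATENCY_S = 1.0     # approximate overhead per translated message
REWRITE_LATENCY_S = 1.0  # approximate extra overhead per rewritten message


def added_latency_seconds(messages: int, rewrites: int) -> float:
    """Estimate total latency added across a conversation, in seconds."""
    if messages < 0 or rewrites < 0 or rewrites > messages:
        raise ValueError("expected 0 <= rewrites <= messages")
    return messages * BASE_LATENCY_S + rewrites * REWRITE_LATENCY_S


# A 20-message conversation in which 3 messages are rewritten.
print(added_latency_seconds(20, 3))  # 23.0
```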
Apps & scope
The Toxicity Shield is currently available for the following integration:
- SF Enhanced Chat
  - Applies to outbound (Agent to Customer) messages
  - Applies to inbound (Customer to Agent) messages