Using the Toxicity Shield in Salesforce Enhanced Chat

Introduced in: v7.19 (CSSP), v1.21 (Salesforce Enhanced Chat)
First published: 2026.03.10 | Last revision: 2026.03.10

What is the Toxicity Shield?

The Toxicity Shield is an automated, LLM-based moderation tool. It evaluates each message and assigns it a toxicity score. Depending on that score, it can then take certain configured actions.

“Toxicity” in this context means harmful or unpleasant content, such as messages that are abusive, aggressive, or demeaning.

Content warning: For illustration purposes, this page contains examples with strong or abusive language.

How does the Toxicity Shield work?

The Toxicity Shield can be broken down into two parts:

Assessing a message and assigning it a toxicity score.
Using this score as a trigger for certain actions (see "What Toxicity Shield does").

The Toxicity Shield runs when the message is ready to be delivered to its recipient. This means that it runs after the translation is performed.
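The ordering described above can be pictured with a minimal sketch. Everything here is hypothetical and for illustration only: `process_message`, `AssessedMessage`, and the stub `fake_translate` and `fake_assess` functions are not part of any Language IO or Salesforce API. The only point is that assessment runs on the translated text, after translation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AssessedMessage:
    original: str
    translation: str
    toxicity_level: Optional[str]  # hypothetical labels: "low", "medium", "high", or None


def process_message(
    text: str,
    translate: Callable[[str], str],
    assess_toxicity: Callable[[str], Optional[str]],
) -> AssessedMessage:
    # Translation is performed first.
    translation = translate(text)
    # Part 1: the translated message is assessed and given a toxicity level.
    level = assess_toxicity(translation)
    # Part 2 (acting on the level) is left to the caller in this sketch.
    return AssessedMessage(original=text, translation=translation, toxicity_level=level)


# Stub functions that record the order in which they are called.
calls: List[str] = []


def fake_translate(text: str) -> str:
    calls.append("translate")
    return text.upper()  # stand-in for a real translation


def fake_assess(translated: str) -> Optional[str]:
    calls.append("assess")
    return "medium" if "RUDE" in translated else None


result = process_message("rude words", fake_translate, fake_assess)
```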

Enabling the Toxicity Shield

Enabling Toxicity Shield for Language IO Enhanced Chat is a three-step process.

First, Language IO must enable Toxicity Shield for Enhanced Chat for your Language IO account. Submit a request to Language IO Support to enable the feature.
Your Language IO Customer Self-Service Portal (CSSP) admin must configure Toxicity Shield tolerances and actions for your Language IO account within the CSSP. For more information, see Configuring the Toxicity Filter in the Customer Self-Service Portal.
After Language IO enables the feature and your CSSP admin has configured tolerances and actions, you must enable the corresponding "Toxicity filter" setting in the Language IO Enhanced Chat Settings of your Salesforce org.

When the feature is enabled, agents cannot turn it off.

Use and Possible actions

What Toxicity Shield does

Toxicity Shield assesses the toxicity of translated messages from both agents and end users. Depending on the toxicity of a message relative to your tolerance settings, its translation is flagged with a colored dot icon (yellow for low, orange for medium, or red for high toxicity).

Note: For an overview of tolerance levels, see Configuring the Toxicity Filter in the Customer Self-Service Portal.
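As a minimal sketch of the flagging rule above, the mapping from toxicity level to dot color can be written as a lookup. The function names and the string labels for the levels are assumptions made for this example; the score thresholds behind each level are set by your tolerance settings in the CSSP and are not modeled here.

```python
from typing import Optional

# Dot colors as described in this article.
DOT_COLOR = {"low": "yellow", "medium": "orange", "high": "red"}


def dot_color(level: Optional[str]) -> Optional[str]:
    """Return the dot color for a toxicity level, or None when nothing is flagged."""
    if level is None:
        # Assumption: a message below the configured tolerance gets no dot.
        return None
    return DOT_COLOR[level]


def triggers_rewrite_flow(level: Optional[str]) -> bool:
    """Only medium and high toxicity lead to the rewrite behavior described below."""
    return level in ("medium", "high")
```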

If the toxicity of a translated message is medium or high:

For inbound messages (from the end user to the agent)
- If “Rewrite inbound toxic messages…” is enabled, Toxicity Shield suggests a rewritten message with reduced toxicity.
- Agents see the sparkle icon next to the rewritten message.
- The original message, its translation, and the colored dot icon are hidden beneath the eye (source) icon, away from the agent’s immediate view.
For outbound messages (from the agent to the end user)
- If “Provide suggested rewrite…” is enabled, the translation of the original message is not sent. Toxicity Shield suggests a rewritten message with reduced toxicity.
  - If the agent accepts, or edits and sends, the suggested message, the agent sees the sparkle icon next to the sent message. The sent message translation (sent to the end user), and the original message and its translation (not sent to the end user), are hidden beneath the eye (source) icon.
  - If the agent rejects the suggested message, this action adds a "message-not-delivered" placeholder to the conversation. The suggested message and its translation, and the original message and its translation, are hidden beneath the eye (source) icon.
- If “Provide suggested rewrite…” is disabled, the message is not sent. The agent is prompted to rewrite their message. A "message-not-delivered" placeholder is added to the conversation. The original message and its translation are hidden beneath the eye (source) icon.
  - If the agent edits and sends their message, the agent sees the sparkle icon next to the sent message. The sent message translation (sent to the end user), and the original message and its translation (not sent to the end user), are hidden beneath the eye (source) icon.

In all cases, the end user is never made aware that a message has been flagged or rewritten by Toxicity Shield.
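The rules above amount to a small decision tree, sketched below. This is an illustration, not the product's implementation: the `Outcome` names, the `decide` function, and the string labels are hypothetical. This article does not describe what happens to an inbound medium or high toxicity message when the rewrite setting is disabled, so the sketch assumes the message is only flagged.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    NO_ACTION = "no_action"
    FLAG_ONLY = "flag_only"
    INBOUND_SHOW_REWRITE = "inbound_show_rewrite"
    OUTBOUND_SUGGEST_REWRITE = "outbound_suggest_rewrite"
    OUTBOUND_BLOCK_AND_PROMPT = "outbound_block_and_prompt"


def decide(direction: str, level: Optional[str], rewrite_enabled: bool) -> Outcome:
    """Pick the outcome for a translated message.

    direction: "inbound" (end user to agent) or "outbound" (agent to end user).
    level: None, "low", "medium", or "high".
    rewrite_enabled: state of the relevant rewrite setting for that direction.
    """
    if direction not in ("inbound", "outbound"):
        raise ValueError(f"unknown direction: {direction!r}")
    if level is None:
        return Outcome.NO_ACTION  # assumption: nothing is flagged
    if level == "low":
        return Outcome.FLAG_ONLY  # yellow dot only
    if level not in ("medium", "high"):
        raise ValueError(f"unknown toxicity level: {level!r}")

    if direction == "inbound":
        if rewrite_enabled:
            # The agent sees the rewritten message with the sparkle icon.
            return Outcome.INBOUND_SHOW_REWRITE
        # Assumption: not described in the article.
        return Outcome.FLAG_ONLY

    if rewrite_enabled:
        # The original translation is held back and a rewrite is suggested.
        return Outcome.OUTBOUND_SUGGEST_REWRITE
    # The message is not sent and the agent is prompted to rewrite it.
    return Outcome.OUTBOUND_BLOCK_AND_PROMPT
```

For example, `decide("outbound", "high", rewrite_enabled=False)` returns `Outcome.OUTBOUND_BLOCK_AND_PROMPT`, which corresponds to the "message-not-delivered" placeholder case.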

Note: Glossary imposition is not performed on rewritten messages. Glossary terms imposed in the original translation may be kept, with only the toxic content removed in the rewrite. However, if a glossary term is itself considered toxic, it may be rewritten.

Examples

[Screenshot: Flagged message, showing the tooltip text from an orange (medium severity) icon]

[Screenshot: Rewritten message, showing the tooltip text from the LLM icon]

[Screenshot: Viewing the original message and its translation]

Limitations

When enabled, the Toxicity Shield adds about one second of latency to every translation. A rewrite adds about one more second.
Each message that is rewritten and retranslated incurs a credit cost.
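As a rough worked example of the latency figures above, the sketch below adds the two approximate one-second costs. The function name and constants are assumptions made for illustration, and the values are the approximate figures quoted in this article, not guarantees.

```python
ASSESSMENT_LATENCY_S = 1.0  # approximate extra time per translation
REWRITE_LATENCY_S = 1.0     # approximate extra time when a rewrite occurs


def estimated_added_latency_seconds(rewritten: bool) -> float:
    """Estimate the latency the Toxicity Shield adds to one translated message."""
    extra = ASSESSMENT_LATENCY_S
    if rewritten:
        extra += REWRITE_LATENCY_S
    return extra
```

Under these figures, a message that is assessed but not rewritten picks up about one extra second, and a rewritten message about two.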

Apps & scope

The Toxicity Shield is currently available for the following integration:

Salesforce Enhanced Chat
- Applies to outbound (Agent to Customer) messages
- Applies to inbound (Customer to Agent) messages
