Configuring the Toxicity Shield in the Customer Self-Service Portal
Introduced in: v7.19 (CSSP), v1.21 (Salesforce Enhanced Chat)
First published: 2026.03.10 | Last revision: 2026.04.10
What is the Toxicity Shield?
The Toxicity Shield is an automated, LLM-based moderation tool. It evaluates each message for toxicity and assigns it a score. Depending on that score and on your settings, it can then take action on the message.
“Toxicity” in this context means harmful or unpleasant content, such as messages that are abusive, aggressive, or demeaning.
Content warning: For illustration purposes, this page contains examples with strong or abusive language.
Note: The Toxicity Shield is not available on the EU-server CSSP (https://uk.golinguist.com/) at this time.
Configuring the Toxicity Shield in the Customer Self-Service Portal
Administrators can configure the Toxicity Shield in the Customer Self-Service Portal (CSSP).
Enabling the Toxicity Shield
Enabling Toxicity Shield is a multi-step process.
- First, Language IO must enable Toxicity Shield for Enhanced Chat for your Language IO account. Submit a request to Language IO Support to enable the feature.
- You must configure Toxicity Shield tolerances and actions for your Language IO account within the CSSP (see below).
- (For Enhanced Chat) After Language IO enables the feature and you have configured tolerances and actions, you must enable the corresponding "Toxicity filter" setting in the Language IO Enhanced Chat Settings of your Salesforce org.
Configuring the Toxicity Shield
- To access the configuration screen, go to Integrations > Toxicity Filter:
- The main configuration screen opens. You can configure inbound (user to agent) and outbound (agent to user) settings independently:
- Click either option to expand it. The settings layout is identical for both.
- You can set a general tolerance on a scale of 0 to 100 (for granular tolerances, see the “Use category specific thresholds…” step below).
A tolerance of 0 is the most sensitive setting (messages are filtered very aggressively), and a tolerance of 100 is the least sensitive (most messages are let through). The toxicity score is shown in the agent’s UI through color-coded indicators (yellow: low, orange: medium, red: high).
- (Optional) Select “Soften toxic messages…” to have the filter rewrite messages that are deemed toxic. If this option is not selected, the original message is delivered and only the color-coded indicators are shown. This option is selected by default.
- (Optional) If you need more granular filtering, select “Use category specific thresholds…”. A section opens with dedicated filters for six sub-categories of speech. The general tolerance score still applies when granular scores are set.
- When you set granular category tolerance scores in addition to the general tolerance score, if a message exceeds any of these tolerances, granular or general, the content is filtered as toxic.
In other words:
- If the general tolerance is met, but one of the granular tolerances is exceeded, the content is filtered.
- If the granular tolerances are met, but the general tolerance is exceeded, the content is filtered.
- The severity indicators are not based on absolute tolerance levels: they are relative to the tolerance levels that you set.
For example, if your tolerance is set to 30, and an incoming message scores a 40, this may trigger a high toxicity/red indicator. However, if your tolerance is set at 40, this same message with a score of 40 may only trigger a medium toxicity/orange indicator.
- (Optional) You can test your set tolerance levels with the Preview tool in your target languages of choice, and adjust accordingly until you find the level that suits your needs.
Note: In Enhanced Chat, the toxicity level is evaluated after the message is translated. In the CSSP, the content is evaluated directly, with no translation step beforehand. As a result, the same content can receive a different toxicity level in Enhanced Chat than in the CSSP.
- For example, you can set a language (this is the language of your example text; the Preview tool does not translate), and enter a sentence in that language that contains an expletive:
- The preview tool returns the analysis of the sentence, its score, and a suggested replacement:
- When you are ready, click Save to save your settings.
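The way the general and granular tolerances combine, as described in the configuration steps above, can be sketched in a few lines of Python. This is an illustration of the rule only, not Language IO's implementation: the function name, the category names ("insult", "profanity"), and the assumption that "exceeds" means a strictly greater score on the same 0 to 100 scale are all hypothetical.

```python
from typing import Dict, Optional


def is_filtered(
    general_score: float,
    general_tolerance: float,
    category_scores: Optional[Dict[str, float]] = None,
    category_tolerances: Optional[Dict[str, float]] = None,
) -> bool:
    """Return True if a message would be treated as toxic.

    Assumption: a score "exceeds" a tolerance when it is strictly
    greater than it. Category names are hypothetical placeholders.
    """
    # The general tolerance always applies, even when granular ones are set.
    if general_score > general_tolerance:
        return True

    category_scores = category_scores or {}
    category_tolerances = category_tolerances or {}

    # Exceeding any single category tolerance is enough to filter the message.
    return any(
        category_scores.get(name, 0.0) > tolerance
        for name, tolerance in category_tolerances.items()
    )


# General tolerance met, one granular tolerance exceeded: filtered.
print(is_filtered(20, 30, {"insult": 50}, {"insult": 40}))

# Granular tolerances met, general tolerance exceeded: filtered.
print(is_filtered(35, 30, {"insult": 10}, {"insult": 40}))

# All tolerances met: not filtered.
print(is_filtered(20, 30, {"insult": 10, "profanity": 5},
                  {"insult": 40, "profanity": 25}))
```

The sketch covers only the filter decision. It does not model the color-coded severity indicators, which are relative to the tolerances you set.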