Context awareness for message moderation

Christopher Dengsø

Aug 6, 2024

Context is crucial in content moderation. A message can seem innocent in one context but hateful in another.

You can already supply contextId and authorId with your content, which helps you understand the context when reviewing items in the review queue.

Now you can also enable context awareness in your project settings.

What is context awareness?

When you enable context awareness, the moderation pipeline pulls in the latest messages within the same conversation, thread, or game room, and includes the previous messages when analysing the new message.
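As a rough illustration of the idea, here is a minimal Python sketch of a rolling window of recent messages per context. It is not the actual pipeline: the window size, the `with_context` helper, and the shape of the returned dictionary are all assumptions made for this example.

```python
from collections import defaultdict, deque

# Assumption: how many previous messages to keep per context.
# The real pipeline's window size is not stated in this post.
WINDOW_SIZE = 10

# One bounded history per context ID (chatroom, thread, game room, ...).
_history = defaultdict(lambda: deque(maxlen=WINDOW_SIZE))


def with_context(context_id, message):
    """Return the new message together with the messages that preceded it."""
    previous = list(_history[context_id])
    _history[context_id].append(message)
    return {"previous": previous, "message": message}


first = with_context("room-1", "What's the worst thing you know?")
second = with_context("room-1", "European people")
print(second["previous"])
```

The second message is analysed together with the question that came before it, which is what lets a model judge it in context.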

How it helps

Each type of model might use context awareness in different ways.

LLM-based models such as AI agents are especially good at understanding conversations, and they can now assess each message in light of the existing conversation.

Look at this example:

user 1 -> What's the worst thing you know?

user 2 -> European people [FLAGGED with context awareness]

Simpler ML models can still benefit from context awareness, even though they do not understand conversations.

For example, some users try to circumvent guardrails by spreading their content over multiple messages - now you can catch that as well.

For example, sharing a phone number:

msg 1 -> 2
msg 2 -> 4
msg 3 -> 6
msg 4 -> 5 
msg 5 -> 5 
msg 6 -> 5 
msg 7 -> 5 
msg 8 -> 5 [FLAGGED with context awareness]

Or someone swearing over multiple lines:

msg 1 -> f
msg 2 -> u
msg 3 -> c
msg 4 -> k [FLAGGED with context awareness]
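To make the mechanism concrete, the sketch below joins one author's recent messages in a context and runs simple checks on the joined text. It is only an illustration under stated assumptions, not how the actual models work: the 8-digit phone pattern, the window size, the blocklist, and the `is_flagged` helper are all hypothetical.

```python
import re
from collections import defaultdict, deque

# Assumptions for this sketch only: window size, phone pattern, blocklist.
WINDOW_SIZE = 8
PHONE_PATTERN = re.compile(r"\d{8}")  # 8 consecutive digits
BLOCKED_WORDS = {"fuck"}

# Recent messages per (context ID, author ID) pair.
_recent = defaultdict(lambda: deque(maxlen=WINDOW_SIZE))


def is_flagged(context_id, author_id, message):
    """Check the new message joined with the author's recent messages."""
    key = (context_id, author_id)
    _recent[key].append(message.strip())
    joined = "".join(_recent[key]).lower()
    if PHONE_PATTERN.search(joined):
        return True
    return any(word in joined for word in BLOCKED_WORDS)


digits = ["2", "4", "6", "5", "5", "5", "5", "5"]
phone_flags = [is_flagged("room-1", "user-1", d) for d in digits]
letter_flags = [is_flagged("room-1", "user-2", c) for c in "fuck"]
print(phone_flags)
print(letter_flags)
```

In both runs only the final message is flagged, because that is the point at which the joined text first matches a check. No single message would match on its own.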

How to enable it

First, make sure that you include contextId and/or authorId in your API requests.

The context ID can be the id of the chatroom, thread, or anything where messages appear sequentially after each other.

The author ID would be the ID of the user that wrote the message.

Context awareness starts working with either of these two fields - but include both for the best results, if possible.
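A request body carrying both fields might be assembled as in the sketch below. Only contextId and authorId come from this post; the `value` field name for the message text, the `build_moderation_payload` helper, and the example IDs are assumptions, so check the API reference for the exact request shape.

```python
import json


def build_moderation_payload(text, context_id=None, author_id=None):
    """Build a request body, adding the context fields only when provided.

    Assumption: "value" is a placeholder name for the message text field.
    """
    payload = {"value": text}
    if context_id is not None:
        payload["contextId"] = context_id  # e.g. chatroom or thread ID
    if author_id is not None:
        payload["authorId"] = author_id  # ID of the user who wrote the message
    return payload


body = build_moderation_payload(
    "European people", context_id="room-42", author_id="user-2"
)
print(json.dumps(body, indent=2))
```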

Afterwards, make sure that you enable context awareness in your project settings.

What's next?

We are excited to hear whether context awareness improves accuracy for your use case. We hope it helps you flag content that was previously not caught.

If you have any questions or ideas for making it better, please let us know.
