New NSFW text detection model

We've released a new NSFW ("Not Suitable For Work") model for detecting unsafe or otherwise sensitive text. It's still experimental, so we recommend using it alongside your existing models.

The model can detect and categorize UNSAFE or SENSITIVE content. It covers subjects like profanity, violence, pornography, discrimination, and politics.
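
If you want to experiment with the model from code, a call could look something like the sketch below. The endpoint URL, request body, and response fields here are assumptions for illustration only; check the API documentation for the actual shapes.

```typescript
// A minimal sketch of calling the model over HTTP. The endpoint URL,
// request shape, and response fields are assumptions for illustration.
const API_KEY = process.env.MODERATION_API_KEY ?? ""; // hypothetical env var

async function classifyText(text: string) {
  const response = await fetch("https://api.example.com/v1/nsfw", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${API_KEY}`,
    },
    body: JSON.stringify({ text }),
  });
  // Assumed response shape: { label: "NEUTRAL" | "SENSITIVE" | "UNSAFE", score: number }
  const result = await response.json();
  console.log(`${result.label} (score: ${result.score})`);
  return result;
}

classifyText("This is a perfectly ordinary sentence.");
```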

It overlaps somewhat with the existing toxicity and propriety models. However, where those were trained mainly on conversational data, the NSFW model was trained on a larger and more diverse data set. That doesn't make the NSFW model a better choice per se, but it can offer better accuracy and certainty in some edge cases or on other types of content. We've also found that it handles spelling mistakes more accurately. We recommend trying all relevant models for your use case to see what works best.

In addition to detecting unsafe content, the model also flags sensitive and controversial topics like politics and religion. That could be the mention of a president or a controversial topic in a neutral or positive context; if the text is negative or hateful, it will usually be labelled UNSAFE instead. In that way, the SENSITIVE label acts as a level below UNSAFE.
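
To make the two levels concrete, here's a hypothetical sketch of routing logic that treats SENSITIVE as a review queue and UNSAFE as a block. The label names match the examples above, but the actions are illustrative assumptions, not a prescribed workflow.

```typescript
// A sketch of treating SENSITIVE as a level below UNSAFE.
// The moderation actions here are illustrative assumptions.
type NsfwLabel = "NEUTRAL" | "SENSITIVE" | "UNSAFE";

function moderationAction(label: NsfwLabel): "publish" | "review" | "block" {
  switch (label) {
    case "UNSAFE":
      return "block"; // e.g. hateful or explicit content
    case "SENSITIVE":
      return "review"; // e.g. a neutral mention of politics or religion
    default:
      return "publish";
  }
}

console.log(moderationAction("SENSITIVE")); // -> "review"
```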

We hope the new model is helpful for your text moderation needs. Please reach out if you have any questions.

Try it out in your dashboard here.