Two weeks ago I started an experiment using adult content labels in the For You algorithm, with the goal of reducing the amount of unwanted content:

NSFW in For You: your NSFW ≠ my NSFW
https://blog.foryou.club/3mkvhwjlagk2g

Today I'm sharing the results of the experiment:

  • 4.6% fewer users press "show less like this" on a given day. This is statistically significant at the 95% confidence level (see the test sketch after this list).

  • 11% fewer total "show less like this" interactions: 72,397 -> 64,446 (about 8K fewer). That's a huge reduction!

  • 0.1% more likes (+6.5K)

  • 0.4% fewer feed loads per user - somewhat concerning, but given that the number of likes increased, we probably just reduced impressions of unwanted content
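As a rough illustration of how a significance check like the one above works, here is a standard two-proportion z-test in Python. I'm not claiming this is the exact test used; the user counts below are made-up placeholders, and only the ~4.6% relative drop matches the real result:

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Two-sided z-test for a difference between two proportions (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                   # pooled proportion
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # pooled standard error
    z = (p1 - p2) / se
    return z, 2 * norm.sf(abs(z))               # z statistic, two-sided p-value

# HYPOTHETICAL daily counts, not the experiment's real denominators:
z, p = two_proportion_z(9_540, 1_000_000,   # control: users pressing "show less"
                        9_101, 1_000_000)   # treatment: ~4.6% fewer
print(f"z = {z:.2f}, p = {p:.4f}")          # p < 0.05 -> significant at 95%
```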

The new logic filters NSFW content based on whether the user has liked NSFW content in the past. So it is interesting to see how much this change affected For You for people who have never liked any NSFW posts versus people who have.
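To make that rule concrete, here is a minimal sketch of the filter in Python. The label taxonomy, function name, and inputs are my own illustration of the described behavior, not the production code:

```python
# Adult content labels mentioned in this post; the exact taxonomy is assumed.
ADULT_LABELS = {"porn", "sexual", "nudity", "sexual figurative"}

def passes_nsfw_filter(post_labels: set[str], user_liked_labels: set[str]) -> bool:
    """Show a post only if every adult label on it is one the user has
    previously liked; posts with no adult labels always pass."""
    return (post_labels & ADULT_LABELS) <= user_liked_labels

# Example: a user who has liked "sexual" content but never "porn":
liked = {"sexual"}
print(passes_nsfw_filter({"sexual"}, liked))  # True  - shown
print(passes_nsfw_filter({"porn"}, liked))    # False - filtered out
print(passes_nsfw_filter(set(), liked))       # True  - clean posts always pass
```

Read this way, the rule is a per-user allow-list of adult labels, which also matches the segments analyzed below: users who never liked adult content see none of it, and users who liked "sexual" but not "porn" keep one and lose the other.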

Let's first look at the segment of users who have liked only posts with no adult content labels:

  • These users now see almost exclusively posts with no adult labels: the share went from 98.3% to 99.7%.

    • Why didn't it reach 100%? There is a delay before the adult label is applied: at the moment the recommendation is made, the label has not arrived yet, so the filter cannot catch the post; by the time we log the recommendation, the label has been applied, so the impression is counted as NSFW. As a result, some NSFW content gets through when it should have been filtered out (see the toy simulation after this list).

  • The "show less" rate for the "clean" posts decreased 5.4%, though this slice was not supposed to be affected by this change. Could be noise.

  • The "show less" rate for nsfw content went basically to 0

Now let's look at the segment of users who have liked at least one post with the "porn" label:

  • The number of "porn" impressions went down by only 2.1% - this tells me the new algorithm is not overly restrictive.

  • The like rate for "porn" increased by 2.2% - this suggests that what got filtered out is exactly the content that was not getting liked.

  • The "show less" rate for "porn" went down 15.4% - so the 2.1% of content that we end up filtering out accounted for 15.4% of "show less". We are filtering out the right content.

  • Similar changes hold for the other labels: "sexual", "nudity", and "sexual figurative".
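A quick sanity check on how those two numbers combine, with the control arm normalized to 100 "porn" impressions and an arbitrary baseline rate:

```python
baseline_rate = 0.10                             # arbitrary illustrative rate
control_impr, control_rate = 100.0, baseline_rate
treat_impr = control_impr * (1 - 0.021)          # impressions down 2.1%
treat_rate = control_rate * (1 - 0.154)          # "show less" rate down 15.4%

control_events = control_impr * control_rate
treat_events = treat_impr * treat_rate
print(f"show-less events: {control_events:.1f} -> {treat_events:.1f} "
      f"({treat_events / control_events - 1:+.1%})")
# show-less events: 10.0 -> 8.3 (-17.2%): removing ~2% of impressions cut
# total "show less" feedback by about a sixth, a wildly outsized share
```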

The last segment we will look at is users who have liked at least one post with the "sexual" label but no posts with the "porn" label:

  • For these users we stop showing "porn"-labeled content (except for the label-timing issue mentioned above). In the control arm, "porn" posts had a very low like rate and a very high "show less" rate for these users, so this is a good slice to filter out.

  • Posts with the "sexual" label had 33% fewer impressions, but the ones that were shown had a 50% higher like rate and a 17% lower "show less" rate. Total likes on "sexual" posts stayed roughly flat (0.67 × 1.5 ≈ 1.0) despite a third fewer impressions, which tells me the new logic is filtering out exactly the right set of posts.

Now let's revisit the expectations I set at the beginning of the experiment:

  • ✅ we would see a lower rate of "show less" for NSFW content

    • Indeed, when NSFW content is shown, we see a much lower "show less" rate

  • ✅ the volume of NSFW-labeled post impressions is not affected too much

    • Confirmed: for users who had liked such content, "porn" impressions dropped by only 2.1%

  • ✅ overall like rates per feed load and per user stay the same

    • It is actually 0.5% higher (0.1% more likes over 0.4% fewer feed loads)

  • ❌ the number of feed loads per user would be slightly higher because users are not turned away when they come across unwanted content

    • We see a slightly lower number of feed loads (-0.4%), but that is probably fine since we see more likes per feed load.

What's next?

I'm surprised how well this change worked. It basically worked as expected, which doesn't happen that often in practice.

I enabled the change for all users this morning.

Thanks everyone for the feedback and ideas on what to improve! Please keep them coming to inspire more tests like this.