The ultimate guide to content moderation

Many of the very best things to be found online are user-generated.

And so are many of the very worst.

If user-generated content is important to your online platform, content moderation is critical to creating experiences that feel safe and secure—and is make-or-break for growing your user base.

This guide will help you evaluate what the best content moderation approach is for your platform. It will answer your fundamental questions and link to more detailed explanations.

Let’s begin.

Table of contents:

  1. What is content moderation?
    • How do platforms handle content moderation?
  2. Why is content moderation important?
    • Why is content moderation so challenging?
    • How is the metaverse changing content moderation?
  3. What is trust and safety?
    • How transparent should trust and safety be?
  4. What are the main content moderation types?
    • Pre-moderation
    • Post-moderation
    • Reactive moderation
    • Distributed moderation
  5. What is automated content moderation?
    • Why use automated content moderation?
  6. What is human content moderation?
    • How should you hire, train, manage and protect your human content moderators?
  7. What content moderation tools do moderators need?
  8. Should you outsource your content moderation?


What is content moderation?

Content moderation is the monitoring and screening of user-generated content by online platforms. The aim of content moderation is to determine if this content adheres to the specific rules and guidelines of the platform—and to remove it if it doesn’t.

Today, all sorts of online platforms rely on user-generated content. Examples include social media platforms, online marketplaces, sharing economy apps, dating sites, communities, forums and chat rooms.

The content uploaded to these platforms can be problematic in all sorts of ways. It can be offensive, obscene, illegal, upsetting, fraudulent, misleading or simply (in the form of spam) irrelevant and irritating.

Some examples of problematic content include hate speech, trolling, flaming, spamming, graphic content (depicting sexual abuse, child abuse and other violent or upsetting acts), propaganda, misinformation and content containing fraudulent links.

As an online platform, you need to be able to rapidly identify such content before it does damage—both to your users and your platform’s reputation.

Once identified, you may then choose to remove the content, tag it with a warning and/or allow your platform users to block or filter it themselves.

How do platforms handle content moderation?

Depending on the size of your user base, you’ll carry out this screening and moderating process through a combination of algorithmic AI tools, user reporting and human review.

But however you do it, you can be sure it’s going to be tough.

Content moderation is already a huge challenge for online platforms (and the bigger the platform, the bigger the challenge)—and the emergence of the metaverse promises to swell this challenge to even more epic proportions.

But as daunting as this is, for most online platforms content moderation is completely unavoidable. Here’s why …


Why is content moderation important?

User-generated content is big business.

It helps brands convey authenticity, establish brand loyalty, and grow communities; it acts as a trust signal; it helps increase conversions and influence purchasing decisions. For platforms like YouTube and Reddit, user-generated content is the main or sole attraction for visitors.

But all this highly valuable user-generated content comes at a price.

From the point of view of users, content moderation is important because it protects us from accidentally viewing disturbing, misleading, inflammatory or dangerous content.

From the point of view of platforms, content moderation performs a number of vital ethical and commercial functions:

  • Protecting your users and the wider community from harm. We’ve all seen the grave repercussions disinformation and socially divisive content can have on the world.
  • Protecting your platform’s reputation. If your platform isn’t a pleasant or trustworthy place to be, customers will quickly go elsewhere (and not come back).
  • Protecting your brand identity. What does it say about your brand if your platform is (even unwittingly) hosting offensive content? Nothing good. Customers expect you to deal with it quickly, as do advertisers, who don’t want their products tarnished by association.

Why is content moderation so challenging?

Not only do you need to make decisions that keep your platform clean while avoiding the appearance of censorship or bias, but you also need to make these decisions incredibly quickly—and at an often huge scale.

Although Facebook sits at the extreme end when it comes to sheer quantity of user-generated content, its example helps illustrate the scale of the challenge.

Facebook employs at least 15,000 content moderators (most of them outsourced) to review posts, pictures and videos that have been flagged by AI or reported by users. Between them, these moderators have to review about 3 million posts every day, which works out to roughly 200 posts per moderator per day.

In the first quarter of 2020, Facebook removed 39.5 million pieces of content containing adult nudity or sexual activity, 25.5 million containing violent or graphic content, and 11 million relating to terrorism and organized hate.

Unless you work for Twitter or TikTok, you’re unlikely to be dealing with numbers like that. But your numbers are likely big enough to cause you problems.

How is the metaverse changing content moderation?

User-generated content is fundamentally changing with the emergence of the metaverse—and this change brings a whole new set of challenges for content moderators.

Many are predicting the metaverse will move us from a Web 2.0 model, in which platforms govern user content from the top-down, to a Web 3.0 model, where user content is distributed peer-to-peer.

If this happens, it will be a lot more difficult to moderate user behavior.

What’s more, in the metaverse, content moderation will also become conduct moderation. Users in the metaverse take a virtual-physical form as digital avatars, and can perform offensive or harassing actions such as groping and stalking—in real time.

From the perspective of AI models, physical actions like these are harder to detect. They are often more subtly nuanced than words, and the models have not yet been trained on anything like the volume of data that exists for language.

And because the metaverse will be experienced by many in the form of virtual reality, users who are harassed or abused will experience it in a more visceral way than they would in a traditional computer or game environment.

Moderating the metaverse is, then, a big challenge. Andrew Bosworth, the CTO of Meta, has said that doing so at any meaningful scale is “practically impossible.”

Nevertheless, given the economic incentives involved (McKinsey believes the metaverse has the potential to generate up to $5 trillion by 2030), it’s a challenge that can’t be ducked.

So to summarize: content moderation is very important (and is set to become even more important) and very challenging (and set to become even more challenging).

The good news is that there are strategies and technologies to help your business face these challenges. And we’re about to outline them all: starting with trust and safety.


What is trust and safety?

Trust and safety provides a framework that underpins the work of content moderation.

Typically defined, updated and upheld by a trust and safety team or business function, its principles help online platforms reduce the risk of their users being exposed to harm, be that outright fraud or the harm that comes when other users violate community guidelines.

That’s the ‘safety’ part. The ‘trust’ comes through helping to create a culture of care, protection and positive experiences around and within a company’s platforms and services.

For tips on setting up and managing your trust and safety team and framework, read our article on trust and safety.

How transparent should trust and safety be?

In recent years, many companies, particularly social networks like Facebook and Twitter, have been heavily criticized for—and lost consumer trust over—a lack of transparency around content moderation.

Users of these platforms have been confused, disturbed and angered by vague rules around acceptable and unacceptable content, unclear explanations of moderation decisions, and automated removal of posts that seem permissible within guidelines.

To ensure your customers don’t lose trust in your platform, you need to make content moderation as transparent as possible.

You can do this by regularly reviewing your own moderation policies and publicly disclosing the actions you decide to take. It’s also a good idea to publish a set of community guidelines and a content moderation policy your platform users and employees can refer to.

Here are a few tips for putting together these guidelines:

  • Your expectations and rules should be clear, comprehensive and accessible to all, including all users and everyone involved in content moderation within the business, from human moderators to data scientists developing moderation algorithms
  • Include a clear statement of purpose, examples of good and bad behaviors, and the process for reporting bad behaviors
  • To avoid ambiguity or misunderstanding, you should cover all the languages spoken by your platform users—and include differences based on cultural and geographic appropriateness


What are the main content moderation types?

There are four main types of content moderation. Each type comes with its own set of benefits and risks, and each suits particular types of platform better than others.


Pre-moderation

User-submitted content is placed in a moderator’s queue before being approved and made visible to all (or disapproved and blocked).

This method offers platforms a high degree of control over what appears on their site, making it well-suited to platforms where there is a high degree of legal risk.

This method can be problematic because it slows down the process of publishing content—which can be particularly damaging to platforms whose users enjoy sharing content in a rapid-fire, conversational manner.

As this method uses human moderators to approve all content, it’s expensive and difficult to scale. Once your community grows past a certain threshold, your moderators won’t be able to keep up and your platform will develop bottlenecks.


Post-moderation

User-submitted content is immediately displayed publicly and placed in a queue for moderators to approve or remove.

In the age of high-speed internet, users expect immediacy. Post-moderation gives it to them—and has the advantage of not impeding fast-moving communities.

There are issues, though: because it depends on human moderators, post-moderation can become prohibitively expensive as communities grow. And because content is published without pre-screening, legal responsibility for hosting that content may fall on the platform.
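To make the difference between pre- and post-moderation concrete, here is a minimal Python sketch. It is an illustration only: the Post and ModerationQueue names are hypothetical, and a real platform would add persistence, moderator assignment and audit logging. The only thing that changes between the two modes is whether a post is visible while it waits in the queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str
    visible: bool = False


class ModerationQueue:
    """Hypothetical queue of posts awaiting a human decision."""

    def __init__(self, pre_moderation: bool):
        self.pre_moderation = pre_moderation
        self.pending = deque()

    def submit(self, post: Post) -> None:
        # Pre-moderation hides the post until a moderator approves it;
        # post-moderation publishes it immediately and reviews it later.
        post.visible = not self.pre_moderation
        self.pending.append(post)

    def review_next(self, approve: bool) -> Post:
        # A moderator takes the oldest pending post and decides its fate.
        post = self.pending.popleft()
        post.visible = approve
        return post
```

In both modes every post still passes through a human, which is why both become expensive as a community grows.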

Reactive moderation

This method relies on community members to flag content that breaches platform rules or is otherwise undesirable. Typically, platforms will attach a reporting button to each piece of user-generated content for this purpose.

Reactive moderation can act as a safety net to back up pre- or post-moderation methods. It can also be used as the sole method of moderation.

This method has the advantage of scaling with your platform’s community and, theoretically at least, absolving you of responsibility for problematic content.

But there are of course big risks involved in allowing such content to stay up on your site for any period of time. You might not consider yourself responsible for this content, but your reputation could take a hit regardless.
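One common way to wire up a reporting button is to count distinct reporters per post and escalate once a threshold is reached. The Python sketch below assumes exactly that; the ReportTracker name and the threshold of three reports are illustrative choices, not a recommendation.

```python
from collections import defaultdict

REPORT_THRESHOLD = 3  # assumed value; tune it for your own platform


class ReportTracker:
    """Hypothetical tracker for user reports on individual posts."""

    def __init__(self, threshold: int = REPORT_THRESHOLD):
        self.threshold = threshold
        # Maps each post ID to the set of user IDs who reported it.
        self.reporters = defaultdict(set)

    def report(self, post_id: str, user_id: str) -> bool:
        """Record a report; return True once the post needs review."""
        # Using a set means repeat reports from one user count only once.
        self.reporters[post_id].add(user_id)
        return len(self.reporters[post_id]) >= self.threshold
```

Counting distinct reporters, rather than raw clicks, makes it harder for a single user to force content into review.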

Distributed moderation

In this democratic method, responsibility for moderating every piece of user-generated content is distributed among a number of people.

For example, a platform may have a rating system that enables (and obliges) members to vote on whether a piece of user-generated content adheres to the platform’s guidelines.

Alternatively, this responsibility can be given to platform employees, with staff voting on the acceptability of submissions and an average score being used to determine whether submissions should be reviewed or not.

Because most platforms don’t trust their own members to self-moderate, and staff-voting can lead to fractious internal divisions, distributed moderation is rarely used.
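The voting approach described above boils down to averaging votes and comparing the result with a cut-off. Here is a minimal Python sketch under assumed conventions: each vote is 1 (acceptable) or 0 (not acceptable), and the needs_review name and the 0.5 cut-off are hypothetical.

```python
from statistics import mean


def needs_review(votes: list[int], min_average: float = 0.5) -> bool:
    """Return True if a submission should be sent for review.

    Each vote is 1 (acceptable) or 0 (not acceptable).
    """
    if not votes:
        # With no votes there is nothing to average, so play it safe.
        return True
    return mean(votes) < min_average
```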

None of these methods is perfect, and all four run up against the same big challenge: balancing the need for speed (crucial in an age of shrinking attention spans and mass competition) with the need to protect users and platforms from harmful content.

Squaring this particular circle is where automated content moderation comes in.


What is automated content moderation?

Automated content moderation is performed by artificial intelligence (AI) and machine learning (ML) models.

Ideally, these models are built from platform-specific data, and the more platform-specific they are, the more accurate you can expect their content moderation of your platform to be.

Why use automated content moderation?

In theory, automated content moderation offers the best of both worlds for platforms. Users can submit content that goes live near-instantly, while the community and platform are generally protected from problematic content.

But of course, automated content moderation isn’t perfect, either, and comes with its own set of advantages and limitations.

You can read more about automated content moderation—and its pros and cons—in depth in our article on Automated Content Moderation.

But to cut to the chase: for all its impressive scalability and efficiency, AI is far from the perfect tool for content moderation, and using it in isolation is likely to result in errors and reputational blowback.

For the foreseeable future, content moderation must rely on a degree of human content moderation.
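One way to combine the two is to let a model handle the clear-cut cases and pass everything else to a person. The Python sketch below assumes a model that returns a score between 0 and 1 for how likely a piece of content is to break the rules. The route function and both thresholds are hypothetical and would need tuning against your own data.

```python
def route(score: float,
          remove_above: float = 0.95,
          approve_below: float = 0.20) -> str:
    """Decide what happens to content given a model's violation score."""
    if score >= remove_above:
        return "remove"        # the model is confident the content is bad
    if score < approve_below:
        return "approve"       # the model is confident the content is fine
    return "human_review"      # ambiguous cases go to a human moderator
```

Widening the gap between the two thresholds sends more content to humans, which costs more but reduces the number of automated mistakes.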


What is human content moderation?

When it comes to screening potentially problematic content, humans (for all their flaws) still have many advantages over machines.

Humans can empathize with other humans—and with the complicated range and mixture of emotions humans feel—and so can detect subtle contextual nuances in user-generated content that today’s best algorithms can’t. They can also pick up on cultural references that would entirely elude even the cleverest AI.

There are also the perceptions of your customers to consider. Many of us still feel distrust toward AI.

Having human moderators for your customers to interact with helps your customers feel more connected to your business—often at precisely the moment when they’re most likely to turn against you: when something they post has been taken down or censored.

How should you hire, train, manage and protect your human content moderators?

Moderating content is stressful and important work. Human content moderators have to be carefully selected, trained and managed.

For advice on putting together an effective content moderation team, read our article: How to Structure and Run a Content Moderation Team.

They also have to be protected. The work of a content moderator can be mentally and emotionally taxing and all too often traumatizing.

There are many ways to protect your moderators, several of them grounded in behavioral science.

You can find out all about them in our article Protecting the Mental Health of Your Content Moderators.

Aside from managing your team, you’ll also need to think about equipping them with a range of content moderation tools.

What content moderation tools do moderators need?

Your team can use automated content moderation tools to screen user-generated content and flag anything potentially unacceptable for their attention. These tools include:

  • Computer vision: Uses object recognition to screen images and video (both uploaded and live-streamed) for problematic content such as nudity, self-harm, gore, alcohol and drugs, weapons, obscene gestures, and culturally defined inappropriateness. It also analyzes text within images.
  • Audio algorithms: These are required to detect inappropriate audio elements in videos. Using speech-to-text models, they transcribe audio into human-readable text, which can then be analyzed.
  • Natural Language Processing (NLP): NLP algorithms analyze and moderate text for problematic content. They are becoming ever more sophisticated and can already identify a text’s meaning, emotional charge and even tone.

When it comes to moderating unambiguously unacceptable content, your team can deploy automated filters:

  • Word filters to filter, ‘star out’ and replace banned words or block posts containing those words entirely
  • An IP ban list to block repeat offenders (this is particularly useful when dealing with spammers)

And for dealing with common incoming customer inquiries:

  • Conversational AI: to engage in conversations with your customers at scale, with more complex queries getting handed off to your human content moderators
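The word filter and IP ban list mentioned above are the simplest of these tools to picture. Here is a minimal Python sketch of both. The banned words and the IP address are placeholders (the address comes from a range reserved for documentation), and the star_out and is_blocked names are hypothetical.

```python
import re

BANNED_WORDS = {"darn", "heck"}   # placeholder word list
BANNED_IPS = {"203.0.113.7"}      # placeholder address from a documentation range


def star_out(text: str, banned: set[str] = BANNED_WORDS) -> str:
    """Replace each banned word with asterisks of the same length."""
    if not banned:
        return text
    # \b matches whole words only, so "darned" is left untouched.
    words = "|".join(re.escape(word) for word in sorted(banned))
    pattern = re.compile(rf"\b({words})\b", re.IGNORECASE)
    return pattern.sub(lambda match: "*" * len(match.group(0)), text)


def is_blocked(ip: str, banned_ips: set[str] = BANNED_IPS) -> bool:
    """Return True if the address is on the ban list."""
    return ip in banned_ips
```

Simple filters like these are easy to evade with creative spelling or a new IP address, which is one reason they work best alongside the other tools in this list.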

As you can see, having an in-house content moderation team (on top of requiring you to train, manage and protect that team) will oblige you to implement and manage a range of products.

The effort and cost involved in this might give you cause to consider outsourcing your content moderation. But is that the right move?


Should you outsource your content moderation?

Given the scale of the task of content moderation, and the high degree of specialized training and care human content moderators need, it’s not surprising that many online platforms prefer to outsource content moderation to a business process outsourcing (BPO) provider.

Depending on factors such as the size and regional scope of your business, outsourcing could be the way to go. That said, having an in-house content moderation team gives you more direct control over the moderation process and makes instant communication between you and your moderators easier.

On the other hand, an in-house team can also cost a lot more to hire and care for: aside from the cost of standing up a new function, funding recruitment and paying salaries, you’ll also have to pay for your team members’ overhead, employee benefits and training. (An outsourced team will be well-trained going in.)

What’s more, if you have to moderate content across different regions and languages, a BPO could get you up and running in particular regions quickly. They can also ensure your content moderation teams are available 24/7.

Read our article on Outsourcing Content Moderation for a more detailed breakdown of the benefits and risks of outsourcing, plus advice on what to consider when selecting a company to outsource your content moderation to.


So there you have it.

That was a high-level overview of everything you need to know about content moderation. Remember to look at our more detailed articles and discover a host of insights and best practices.

If you think your business could use a helping hand with content moderation, you should get in touch.

Our robust processes and efficient in-house technologies allow you to scale your content moderation capabilities with demand.
Our content moderation solution combines tailor-made automation, augmented human content moderation and real-time monitoring to completely protect your platform’s user experience.
Our team of content moderation experts moderates 1 billion pieces of content a year in more than 25 languages. Thanks to our highly flexible operations, they’re available 24/7. And you can feel good about our teams: we take rigorous steps to protect their mental health.

Sounds good?

Contact Us