In a recent press briefing, Facebook said that technology has started to play a crucial role in content moderation on the platform.
The shift is meant to help prioritize reported content. While around 15,000 human reviewers still work across more than 50 time zones, Artificial Intelligence (AI) is also helping, primarily by proactively removing content that goes against Facebook’s policies.
Between April and June this year, 95 percent of the content Facebook removed was identified and removed by its technology without anyone needing to report it. That included 99.6 percent of fake accounts, 99.8 percent of spam, 99.5 percent of violent and graphic content, 98.5 percent of terrorist content, and 99.3 percent of child nudity and sexual exploitation content.
Furthermore, they shared that they now prioritize content that needs reviewing based on several factors such as virality, severity, and likelihood of violation.
Categorizing content in this way, regardless of when it was shared on Facebook or whether it was reported by a user or detected by their technology, allows them to get to the highest severity content first.
It also means the reviewers in their Global Operations team spend more time on complex content issues where thorough judgment is required and less time on lower severity reports that technology is capable of handling.
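To make the idea concrete, here is a minimal sketch of how a review queue could rank content by the three factors mentioned above. Facebook did not disclose its actual scoring method, so everything here is an assumption for illustration: the `priority_score` function, the weights, the 0-to-1 scales, and the content IDs are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedItem:
    # heapq is a min-heap, so we store the negated score to pop the
    # highest-priority item first.
    sort_key: float
    content_id: str = field(compare=False)


def priority_score(virality, severity, violation_likelihood,
                   weights=(0.3, 0.5, 0.2)):
    """Combine the three factors (each assumed to be in [0, 1]) into one score.

    The weights are hypothetical; the article does not say how the
    factors are actually combined.
    """
    w_virality, w_severity, w_likelihood = weights
    return (w_virality * virality
            + w_severity * severity
            + w_likelihood * violation_likelihood)


def build_review_queue(items):
    """Build a heap from (content_id, virality, severity, likelihood) tuples.

    Note that the time of posting and the source of the flag (user report
    or automated detection) play no part in the ordering.
    """
    heap = []
    for content_id, virality, severity, likelihood in items:
        score = priority_score(virality, severity, likelihood)
        heapq.heappush(heap, QueuedItem(-score, content_id))
    return heap


def next_for_review(heap):
    """Pop the ID of the item with the highest priority score."""
    return heapq.heappop(heap).content_id


if __name__ == "__main__":
    queue = build_review_queue([
        ("spam-post", 0.2, 0.1, 0.9),
        ("violent-video", 0.8, 0.9, 0.7),
        ("borderline-meme", 0.9, 0.3, 0.4),
    ])
    while queue:
        print(next_for_review(queue))
```

With these made-up weights, the high-severity item is reviewed first even though another item is more viral, which mirrors the stated goal of reaching the highest severity content first.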
Although technology is playing a gradually increasing role in the way Facebook moderates content, for certain posts the company still combines technology with reports from the community and human review to check content against its Community Standards.
This is done to ensure the context of the post being reviewed is better understood. The technology created for this is called Whole Post Integrity Embeddings (WPIE).
Lastly, they shared details about a newly developed technology called XLM-R, which can understand text in multiple languages. The model is trained in one language and can then be used with other languages without the need for additional training data or content examples.