Meta's new AI deepfake strategy: Increased labeling, reduced content removal

Meta has unveiled updates to its guidelines regarding AI-generated content and manipulated media in response to feedback from its Oversight Board. Starting the upcoming month, Meta plans to attach a "Made with AI" badge to a broader array of content, including deepfakes, to alert users to the AI involvement. It will also provide extra context for content altered in ways that could significantly mislead the public on critical issues.

This initiative may result in a higher volume of content being labeled, especially in a year teeming with elections globally. However, Meta will only label deepfakes that meet "industry standard AI image indicators" or when the content uploader acknowledges its AI-generated nature. AI-generated content not fitting these criteria might not be labeled.

Meta is opting for a strategy that leans more toward transparency and context rather than removal for AI-generated and manipulated media. This approach likely means that more such content will stay on platforms like Facebook and Instagram, marked with labels instead of being taken down.

Another significant shift is that Meta will discontinue the removal of content purely based on its manipulated video policy starting in July. This change comes as Meta navigates the complex terrain of content moderation in light of legal challenges and regulatory demands, such as the European Union's Digital Services Act.

Feedback from Meta's Oversight Board has spurred these policy updates. The Board had called for a reevaluation of Meta's strategy towards AI-generated content, highlighting the need for a more inclusive approach that captures the evolving nature of AI-generated media beyond just videos. As a result, Meta is working towards refining its labeling process, incorporating a "Made with AI" label not just for videos but also for audio and images that are AI-generated or significantly altered. This more comprehensive labeling strategy aims to better inform the public and provide context, without outright removing manipulated content that doesn't violate other community standards.