Problem Statement

Tracing the Origin of a Social Media Post

Social media networks are continuously developing and enhancing automatic systems that can recognise and reject sexually explicit content at the time of upload. However, there are situations when such content, including content that contains nudity, is not caught at upload and ends up online. It typically goes viral very quickly, after which others share it, or download it and repost it on their own social media profiles.

PS Number: PSDAT001

Domain Bucket: Data Analytics
Category: Software
Dataset: NA

Create a solution that can determine who was the first to post a given fragment of text, a picture, or a video on a specific social networking platform. Please keep in mind that before reposting it from their accounts, some users might have duplicated it and made a few tiny changes.

Background of the Problem

This problem statement should result in a solution that can identify perpetrators who first posted sexually explicit abuse content on social media platforms. Perpetrators often intentionally post sexually explicit images and videos of individuals on various social media platforms with the sole intent of causing harassment, humiliation and distress to the victim.

Objective

Given a piece of text, an image or a video snippet as input, build a solution that can identify the person who first posted it online on a particular social media platform. Please bear in mind that people could have copied it and made minor modifications before reposting it from their accounts. Participants are expected to obtain suitable data required to work on this problem statement on their own.
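
As a rough illustration of the text case only, the objective can be sketched as near-duplicate matching followed by picking the earliest timestamp. This is a minimal sketch and not a prescribed approach: the Post record, the function names, the word-shingle Jaccard similarity and the 0.5 threshold are all assumptions made for the example, and it presumes that post timestamps from the platform are available and trustworthy. Images and videos would need a different similarity measure, which is not shown here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Post:
    """Hypothetical record for one post; the field names are assumptions."""
    user: str
    text: str
    posted_at: datetime


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of a lower-cased text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def earliest_match(query: str, posts: Iterable[Post],
                   threshold: float = 0.5) -> Optional[Post]:
    """Return the earliest post whose text is a near-duplicate of the query.

    The threshold is an illustrative value, not a tuned one.
    """
    q = shingles(query)
    matches = [p for p in posts if jaccard(q, shingles(p.text)) >= threshold]
    # min() with default=None handles the case where nothing matches.
    return min(matches, key=lambda p: p.posted_at, default=None)
```

With a list of Post records collected from one platform, earliest_match(query_text, posts) returns the oldest post that is sufficiently similar to the query, or None if no post passes the threshold. Lightly edited reposts still share most of their word shingles with the original, which is why a set-overlap measure is used here instead of exact string equality.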

Summary

Social media platforms are constantly trying to create and improve automatic mechanisms that identify and reject sexually explicit content at the time of upload. However, such content, including content containing nudity, sometimes goes undetected at upload and gets published. It tends to go viral quickly, after which others share it, or download it and post it again from their own social media accounts.

Social media platforms take down such posts when they are reported, but by then several copies have already been made and are in circulation. It quickly becomes unclear who posted the content first, and the original perpetrator takes advantage of this ambiguity to evade detection. It is therefore critical to identify the person who first posted such distressing content online.