Automated classifiers are tools used to detect harmful content, such as hate speech. As safety measures, they can significantly reduce the harm people experience online. Researchers also use these tools to measure how a change to a platform (for example, a change to its rules or the removal of certain content or users) affects the frequency of hate speech.