Working paper

Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate

Abstract:: Detecting online hate is a complex task, and low-performing detection models have harmful consequences when used for sensitive applications such as content moderation. Emoji-based hate is a key emerging challenge for online hate detection. We present HatemojiCheck, a test suite of 3,930 short-form statements that allows us to evaluate how detection models perform on hateful language expressed with emoji. Using the test suite, we expose weaknesses in existing hate detection models. To address these weaknesses, we create the HatemojiTrain dataset using an innovative human-and-model-in-the-loop approach. Models trained on these 5,912 adversarial examples perform substantially better at detecting emoji-based hate, while retaining strong performance on text-only hate. Both HatemojiCheck and HatemojiTrain are made publicly available.

Cite this record

APA Style

Kirk, H., Vidgen, B., Röttger, P., & Hale, S. A. (2021). Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate.

MLA Style

Kirk, H., et al. Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate. 2021.

Chicago Style

Kirk, H, B Vidgen, P Röttger, and SA Hale. 2021. “Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate.”

Access Document

Files:: Kirk_et_al_2021_Hatemoji_a_test_suite--.pdf

(Version of record, 1.2MB)

Authors

Kirk, H

Role:: Author

Vidgen, B

Role:: Author

Röttger, P

Role:: Author

Hale, SA

Institution:: University of Oxford
Division:: SSD
Sub department:: Oxford Internet Institute
Oxford college:: Hertford College
Role:: Author
ORCID:: 0000-0002-6894-4951

Funder:: Volkswagen-Stiftung

Grant:: 92136

Publication date:: 2021-08-12

Language:: English
Keywords:: SBTMR
Pubs id:: 1190679
Local pid:: pubs:1190679
Deposit date:: 2021-08-12

Terms of use

Copyright holder:: Kirk et al.
Copyright date:: 2021

Licence:: Terms and Conditions of Use for Oxford University Research Archive
