Conference item

Detecting East Asian prejudice on social media

Abstract:
During COVID-19, concerns have heightened about the spread of aggressive and hateful language online, especially hostility directed against East Asia and East Asian people. We report on a new dataset and the creation of a machine learning classifier that categorizes social media posts from Twitter into four classes: Hostility against East Asia, Criticism of East Asia, Meta-discussions of East Asian prejudice, and a neutral class. The classifier achieves a macro-F1 score of 0.83. We then conduct an in-depth, ground-up error analysis and show that the model struggles with edge cases and ambiguous content. We provide the 20,000-tweet training dataset (annotated by experienced analysts), which also contains several secondary categories and additional flags. We also provide the 40,000 original annotations (before adjudication), the full codebook, annotations for COVID-19 relevance and East Asian relevance and stance for 1,000 hashtags, and the final model.
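For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a macro-F1 score is generally computed for a four-class problem: F1 is calculated per class and the unweighted mean is taken, so rare classes count as much as common ones. The label strings and the function name `macro_f1` are illustrative assumptions, not identifiers from the released dataset or model, and this is not the authors' evaluation code.

```python
# Hypothetical label names standing in for the four classes in the abstract.
LABELS = ["hostility", "criticism", "meta_discussion", "neutral"]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Return the unweighted mean of per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined here as 0 when the class
        # never appears in either the gold labels or the predictions.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)
```

Because every class contributes equally to the average, a classifier that performs well on the neutral class but poorly on a smaller class will see its macro-F1 pulled down noticeably.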
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.18653/v1/2020.alw-1.19

Authors


Institution:
University of Oxford
Division:
SSD
Department:
Oxford Internet Institute
Role:
Author


Publisher:
Association for Computational Linguistics
Host title:
Proceedings of the Fourth Workshop on Online Abuse and Harms
Pages:
162-172
Publication date:
2020-11-29
Event title:
Fourth Workshop on Online Abuse and Harms (WOAH 2020)
Event location:
Virtual event
Event website:
https://www.aclweb.org/portal/content/fourth-workshop-online-abuse-and-harms
Event start date:
2020-11-20
Event end date:
2020-11-20
DOI:


Language:
English
Pubs id:
1644217
Local pid:
pubs:1644217
Deposit date:
2024-08-01
