Conference item
Assessing robustness of text classification through maximal safe radius computation
- Abstract:
- Neural network NLP models are vulnerable to small modifications of the input that maintain the original meaning but result in a different prediction. In this paper, we focus on robustness of text classification against word substitutions, aiming to provide guarantees that the model prediction does not change if a word is replaced with a plausible alternative, such as a synonym. As a measure of robustness, we adopt the notion of the maximal safe radius for a given input text, which is the minimum distance in the embedding space to the decision boundary. Since computing the exact maximal safe radius is not feasible in practice, we instead approximate it by computing a lower and upper bound. For the upper bound computation, we employ Monte Carlo Tree Search in conjunction with syntactic filtering to analyse the effect of single and multiple word substitutions. The lower bound computation is achieved through an adaptation of the linear bounding techniques implemented in tools CNN-Cert and POPQORN, respectively for convolutional and recurrent network models. We evaluate the methods on sentiment analysis and news classification models for four datasets (IMDB, SST, AG News and NEWS) and a range of embeddings, and provide an analysis of robustness trends. We also apply our framework to interpretability analysis and compare it with LIME.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Publication website:
- https://www.aclweb.org/anthology/2020.findings-emnlp.266
Authors
- Publisher:
- Association for Computational Linguistics
- Host title:
- Findings of the Association for Computational Linguistics: EMNLP 2020
- Pages:
- 2949–2968
- Publication date:
- 2020-11-01
- Acceptance date:
- 2020-09-15
- Event title:
- 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 16 – 20 November 2020, online
- Event location:
- Online
- Event website:
- https://2020.emnlp.org/
- Event start date:
- 2020-11-16
- Event end date:
- 2020-11-20
- Language:
- English
- Pubs id:
- 1136154
- Local pid:
- pubs:1136154
- Deposit date:
- 2020-10-05
Terms of use
- Copyright holder:
- Association for Computational Linguistics
- Copyright date:
- 2020
- Rights statement:
- © 2020 Association for Computational Linguistics. Licensed under a Creative Commons Attribution 4.0 International License.
- Notes:
- This paper was presented at the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 16–20 November 2020, online. The paper is available online from the Association for Computational Linguistics at: https://www.aclweb.org/anthology/2020.findings-emnlp.266/
- Licence:
- CC Attribution (CC BY)