4.1 Applications

This section introduces several case studies that apply robustness techniques to real-world applications such as image and text classification. Verified (or certified) robustness is a principled notion in the study of robust learning: it is defined mathematically, yet it can be achieved in realistic applications with careful engineering. Let us see how researchers bridge this gap between theory and practice.

4.1.1 Certified Robustness in Text Classification

Two papers accepted at EMNLP 2019 (Jia et al. 2019; Huang et al. 2019) both focus on certified robustness of text classifiers, and they even share the same technique for bounding the worst-case loss, the so-called Interval Bound Propagation (IBP), which first appeared in Dvijotham et al. (2018) with applications to image classification. I list their names in the following; a minimal sketch of the interval arithmetic behind IBP follows the list:

  • Certified Robustness to Adversarial Word Substitutions, Robin Jia et al., Stanford University.
  • Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation, Po-Sen Huang et al., DeepMind.
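
To make IBP concrete, here is a minimal NumPy sketch of propagating elementwise input bounds through one affine layer followed by a monotone activation; all shapes and names here are illustrative, not taken from either paper. Applying this step layer by layer yields bounds on the logits, from which an upper bound on the worst-case loss can be read off.

```python
import numpy as np

def ibp_affine(lower, upper, W, b):
    """Propagate elementwise bounds [lower, upper] through x -> W @ x + b.

    Standard interval arithmetic: the output center is W @ center + b,
    and the output radius is |W| @ radius.
    """
    center = (lower + upper) / 2.0
    radius = (upper - lower) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius
    return out_center - out_radius, out_center + out_radius

def ibp_relu(lower, upper):
    """ReLU is monotone, so it maps interval endpoints to interval endpoints."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

# Toy usage: bounds on a 3-dim input propagated through one hidden layer.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 3)), rng.normal(size=4)
x = rng.normal(size=3)
lower, upper = x - 0.1, x + 0.1          # L-infinity ball around x
l1, u1 = ibp_relu(*ibp_affine(lower, upper, W, b))
assert np.all(l1 <= u1)
```

A prediction is then verified if the lower bound of the true-class logit exceeds the upper bounds of all other logits, since no input inside the bounds can change the argmax.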

4.1.1.1 Adversaries in Text Classification

The task of text classification is defined in the standard classification setting, with \(\mathcal{Y}\) as the label set and \(\mathcal{X}\) as the input domain. Specifically, each \(\mathbf{x} = (x_1, \dots, x_m) \in \mathcal{X}\) is a discourse (a sentence or paragraph) consisting of a sequence of discrete symbols from a vocabulary, that is, \(x_i \in \mathcal{V}\). In these works, the adversary replaces each symbol \(x_i\) with one drawn from a set of allowed substitutions \(S(x_i)\) (e.g., synonyms), and a classifier is certified robust on \(\mathbf{x}\) only if its prediction is unchanged on every input in the resulting combinatorial neighborhood.
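
An exact certificate could, in principle, be computed by brute force over this neighborhood. The sketch below, with a hypothetical `substitutions` table and `predict` function (neither is from the papers), makes the combinatorics explicit: the neighborhood has \(\prod_i (1 + |S(x_i)|)\) elements, so exhaustive checking quickly becomes intractable, which is why IBP-style bounds are attractive.

```python
from itertools import product

def substitution_neighborhood(x, substitutions):
    """Yield every sentence reachable by replacing each word x_i with a word
    from its allowed substitution set S(x_i); keeping x_i itself is allowed.

    `substitutions` is a hypothetical dict mapping a word to its list of
    allowed replacements.
    """
    choices = [[w] + substitutions.get(w, []) for w in x]
    for perturbed in product(*choices):  # (1+|S(x_1)|) * ... * (1+|S(x_m)|) items
        yield list(perturbed)

def is_certified(x, y, predict, substitutions):
    """Exact but exponential-time check: `predict` (a stand-in for any
    classifier) must return label y on every perturbed input."""
    return all(predict(p) == y
               for p in substitution_neighborhood(x, substitutions))

# Toy usage with a trivial length-based "classifier".
subs = {"good": ["great", "fine"], "movie": ["film"]}
x = ["good", "movie"]
print(is_certified(x, 2, lambda s: len(s), subs))  # True: length never changes
```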

4.1.2 Adversarial Examples for Natural Language

This part summarizes several works on generating (natural) adversarial examples for natural language inputs.

  • HotFlip (Ebrahimi et al. 2017): a white-box attack that uses the gradient of the loss with respect to the one-hot input representation to estimate, to first order, which single character or word flip increases the loss the most, and extends single flips to multi-edit attacks with beam search (a scoring sketch follows this list).
  • Paraphrase-based adversarials (Ribeiro, Singh, and Guestrin 2018): generates semantically equivalent adversaries via paraphrasing and generalizes the successful ones into semantically equivalent adversarial rules (SEARs), simple replacement rules that expose systematic model failures.
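
As a rough illustration of the first idea, the core of HotFlip's scoring rule is a first-order Taylor approximation of the loss change caused by swapping one token's embedding for another. The sketch below is a hedged reconstruction of that word-level scoring step (the array names and shapes are assumptions, and the full method additionally runs beam search over multiple flips).

```python
import numpy as np

def hotflip_scores(embed_grad, embeddings, token_ids):
    """First-order estimate of the loss increase from every candidate flip.

    embed_grad:  (m, d) gradient of the loss w.r.t. each input embedding
    embeddings:  (V, d) embedding matrix over the vocabulary
    token_ids:   (m,)   current token indices

    Flipping position i from e_i to e_j changes the loss by approximately
    (e_j - e_i) . grad_i; we score all positions and candidates at once.
    """
    current = embeddings[token_ids]                               # (m, d)
    gain = embed_grad @ embeddings.T                              # (m, V): e_j . grad_i
    gain -= np.sum(embed_grad * current, axis=1, keepdims=True)   # minus e_i . grad_i
    return gain

def best_flip(embed_grad, embeddings, token_ids):
    """Return the (position, new token) pair with the largest estimated gain."""
    gain = hotflip_scores(embed_grad, embeddings, token_ids)
    pos, tok = np.unravel_index(np.argmax(gain), gain.shape)
    return int(pos), int(tok)

# Toy usage with random embeddings and gradients.
rng = np.random.default_rng(0)
V, m, d = 100, 5, 16
emb = rng.normal(size=(V, d))
grad = rng.normal(size=(m, d))
ids = rng.integers(0, V, size=m)
print(best_flip(grad, emb, ids))
```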