4.1 Applications

This section introduces several case studies that apply robustness techniques to real-world applications such as image and text classification. Verified (or certified) robustness is a principled notion in the study of robust learning: it is defined mathematically, yet it can be achieved in realistic applications with careful engineering. Let us see how researchers bridge this gap between theory and practice.

4.1.1 Certified Robustness in Text Classification

Two papers accepted at EMNLP 2019 (Jia et al. 2019; Huang et al. 2019) both focus on certified robustness of text classifiers, and they even share the same technique for bounding the worst-case loss, the so-called Interval Bound Propagation (IBP), which first appeared in Dvijotham et al. (2018) with applications to image classification. I list their names in the following; a minimal sketch of the interval arithmetic behind IBP follows the list:

  • Certified Robustness to Adversarial Word Substitutions, Robin Jia et al., Stanford University.
  • Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation, Po-Sen Huang et al., DeepMind.
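
To make IBP concrete, here is a minimal NumPy sketch of propagating elementwise input bounds through one affine layer followed by a monotone activation; all shapes and names here are illustrative, not taken from either paper. Applying this step layer by layer yields bounds on the logits, from which an upper bound on the worst-case loss can be read off.

```python
import numpy as np

def ibp_affine(lower, upper, W, b):
    """Propagate elementwise bounds [lower, upper] through x -> W @ x + b.

    Standard interval arithmetic: the output center is W @ center + b,
    and the output radius is |W| @ radius.
    """
    center = (lower + upper) / 2.0
    radius = (upper - lower) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius
    return out_center - out_radius, out_center + out_radius

def ibp_relu(lower, upper):
    """ReLU is monotone, so it maps interval endpoints to interval endpoints."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

# Toy usage: bounds on a 3-dim input propagated through one hidden layer.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 3)), rng.normal(size=4)
x = rng.normal(size=3)
lower, upper = x - 0.1, x + 0.1          # L-infinity ball around x
l1, u1 = ibp_relu(*ibp_affine(lower, upper, W, b))
assert np.all(l1 <= u1)
```

A prediction is then verified if the lower bound of the true-class logit exceeds the upper bounds of all other logits, since no input inside the bounds can change the argmax.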

4.1.1.1 Adversaries in Text Classification

The task of text classification is defined in the standard classification setting, with \(\mathcal{Y}\) as the label set and \(\mathcal{X}\) as the input domain. Specifically, each \(\mathbf{x} = (x_1, \dots, x_m) \in \mathcal{X}\) is a discourse (a sentence or paragraph) consisting of a sequence of discrete symbols from a vocabulary, that is, \(x_i \in \mathcal{V}\). In these works, the adversary replaces each symbol \(x_i\) with one drawn from a set of allowed substitutions \(S(x_i)\) (e.g., synonyms), and a classifier is certified robust on \(\mathbf{x}\) only if its prediction is unchanged on every input in the resulting combinatorial neighborhood.
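
An exact certificate could, in principle, be computed by brute force over this neighborhood. The sketch below, with a hypothetical `substitutions` table and `predict` function (neither is from the papers), makes the combinatorics explicit: the neighborhood has \(\prod_i (1 + |S(x_i)|)\) elements, so exhaustive checking quickly becomes intractable, which is why IBP-style bounds are attractive.

```python
from itertools import product

def substitution_neighborhood(x, substitutions):
    """Yield every sentence reachable by replacing each word x_i with a word
    from its allowed substitution set S(x_i); keeping x_i itself is allowed.

    `substitutions` is a hypothetical dict mapping a word to its list of
    allowed replacements.
    """
    choices = [[w] + substitutions.get(w, []) for w in x]
    for perturbed in product(*choices):  # (1+|S(x_1)|) * ... * (1+|S(x_m)|) items
        yield list(perturbed)

def is_certified(x, y, predict, substitutions):
    """Exact but exponential-time check: `predict` (a stand-in for any
    classifier) must return label y on every perturbed input."""
    return all(predict(p) == y
               for p in substitution_neighborhood(x, substitutions))

# Toy usage with a trivial length-based "classifier".
subs = {"good": ["great", "fine"], "movie": ["film"]}
x = ["good", "movie"]
print(is_certified(x, 2, lambda s: len(s), subs))  # True: length never changes
```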

4.1.2 Adversarial Examples for Natural Language

This part summarizes several works on generating (natural) adversarial examples for natural language inputs.

  • HotFlip (Ebrahimi et al. 2017): a white-box attack that uses the gradient of the loss with respect to the one-hot input representation to estimate, to first order, which single character or word flip increases the loss the most, and extends single flips to multi-edit attacks with beam search (a scoring sketch follows this list).
  • Paraphrase-based adversarials (Ribeiro, Singh, and Guestrin 2018): generates semantically equivalent adversaries via paraphrasing and generalizes the successful ones into semantically equivalent adversarial rules (SEARs), simple replacement rules that expose systematic model failures.
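
As a rough illustration of the first idea, the core of HotFlip's scoring rule is a first-order Taylor approximation of the loss change caused by swapping one token's embedding for another. The sketch below is a hedged reconstruction of that word-level scoring step (the array names and shapes are assumptions, and the full method additionally runs beam search over multiple flips).

```python
import numpy as np

def hotflip_scores(embed_grad, embeddings, token_ids):
    """First-order estimate of the loss increase from every candidate flip.

    embed_grad:  (m, d) gradient of the loss w.r.t. each input embedding
    embeddings:  (V, d) embedding matrix over the vocabulary
    token_ids:   (m,)   current token indices

    Flipping position i from e_i to e_j changes the loss by approximately
    (e_j - e_i) . grad_i; we score all positions and candidates at once.
    """
    current = embeddings[token_ids]                               # (m, d)
    gain = embed_grad @ embeddings.T                              # (m, V): e_j . grad_i
    gain -= np.sum(embed_grad * current, axis=1, keepdims=True)   # minus e_i . grad_i
    return gain

def best_flip(embed_grad, embeddings, token_ids):
    """Return the (position, new token) pair with the largest estimated gain."""
    gain = hotflip_scores(embed_grad, embeddings, token_ids)
    pos, tok = np.unravel_index(np.argmax(gain), gain.shape)
    return int(pos), int(tok)

# Toy usage with random embeddings and gradients.
rng = np.random.default_rng(0)
V, m, d = 100, 5, 16
emb = rng.normal(size=(V, d))
grad = rng.normal(size=(m, d))
ids = rng.integers(0, V, size=m)
print(best_flip(grad, emb, ids))
```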