3.4 Understanding internal representation

Neural networks learn continuous hidden representations as a byproduct of their main supervised or unsupervised task. Many works try to shed light on one question: what interpretable information is learned and encoded in these hidden representations? This article tries to group these works systematically and to identify the frontiers of this pursuit.

Representation probing is one of the most frequently used techniques for explaining representations; see Table SM1 in (Belinkov and Glass 2019) for an enumerative review of recent efforts in this direction. However, a recent EMNLP 2019 best paper runner-up (Hewitt and Liang 2019) argues that the capacity of the probe itself should be taken into account when using a constructed probing task. We discuss the benefits and criticisms of probing together below.
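
To make the capacity concern concrete, here is a minimal sketch of a probe evaluated next to a control task, in the spirit of (Hewitt and Liang 2019). It rests on several assumptions that are not from the original papers: the hidden representations are synthetic, the probe is a ridge-regression classifier chosen for brevity, and all function names are hypothetical. The control task gives every word type one fixed random label, and selectivity is the probe's accuracy on the real task minus its accuracy on the control task.

```python
import numpy as np


def add_bias(X):
    # Append a constant column so the probe can learn an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_linear_probe(X, y, num_classes, l2=1e-3):
    # Ridge regression onto one-hot labels: a simple, low-capacity probe.
    # (An assumption for this sketch, not the probe used in the cited work.)
    Xb = add_bias(X)
    Y = np.eye(num_classes)[y]
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)


def probe_accuracy(W, X, y):
    # Predict the class with the highest linear score.
    pred = np.argmax(add_bias(X) @ W, axis=1)
    return float(np.mean(pred == y))


def make_control_labels(word_ids, vocab_size, num_classes, rng):
    # Control task: every word type gets one fixed, randomly drawn label.
    type_to_label = rng.integers(0, num_classes, size=vocab_size)
    return type_to_label[word_ids]


def run_demo(seed=0, n=600, dim=16, num_classes=3, vocab_size=50):
    rng = np.random.default_rng(seed)

    # Synthetic stand-in for hidden states: class-dependent mean plus noise.
    y = rng.integers(0, num_classes, size=n)
    means = 3.0 * rng.normal(size=(num_classes, dim))
    X = means[y] + rng.normal(size=(n, dim))

    # Word identities, drawn independently of X in this toy setup.
    word_ids = rng.integers(0, vocab_size, size=n)
    y_control = make_control_labels(word_ids, vocab_size, num_classes, rng)

    # Hold out the last third of the tokens for evaluation.
    split = (2 * n) // 3
    X_tr, X_te = X[:split], X[split:]

    W_task = fit_linear_probe(X_tr, y[:split], num_classes)
    W_ctrl = fit_linear_probe(X_tr, y_control[:split], num_classes)

    task_acc = probe_accuracy(W_task, X_te, y[split:])
    control_acc = probe_accuracy(W_ctrl, X_te, y_control[split:])
    return {
        "task_acc": task_acc,
        "control_acc": control_acc,
        "selectivity": task_acc - control_acc,
    }


if __name__ == "__main__":
    print(run_demo())
```

In this toy setup the representations carry no word identity, so the probe should have little to memorize on the control task. Real contextual representations do encode word identity, which is why a more expressive probe may score well on the control task too, and why a high task accuracy alone may say more about the probe than about the representation.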