Rejection series, part 3: OpenMax

paper: Towards Open Set Deep Networks. CVPR


Closed set recognition, by its nature, must pick one of the known classes as its prediction. In real-world settings, however, a recognition system must learn to reject unknown/unseen classes at testing time.

To address this, the authors propose a new model layer, OpenMax, which estimates the probability that an input sample comes from an unknown class.

A key element of estimating the unknown probability is adapting Meta-Recognition concepts to the activation patterns in the penultimate layer of the network.
So the keywords are meta-recognition and activation pattern/vector.


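Many approaches detect unknowns with a threshold: they assume that an unknown input will receive a low probability/confidence from softmax. In practice, however, many "fooling" or "rubbish" images also receive high probability/confidence scores, for example adversarial images produced by adversarial methods. As the authors note later, thresholding does not actually reject unknown inputs; it rejects uncertain predictions.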
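To make the last point concrete, here is a minimal sketch of softmax thresholding. The function names, the logits, and the threshold values are all made up for illustration; none of them come from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_threshold(logits, threshold=0.5):
    """Return the predicted class index, or None (reject) if confidence is low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None

# Hypothetical logits: an ambiguous input and a "fooling" input.
ambiguous = [1.0, 1.0, 1.0]   # all classes equally likely -> rejected
fooling = [12.0, 0.5, 0.2]    # one very large logit -> accepted with high confidence

print(predict_with_threshold(ambiguous))      # rejected: the prediction is uncertain
print(predict_with_threshold(fooling, 0.9))   # accepted, even if the image is rubbish
```

The threshold only looks at how peaked the softmax output is, so any input that produces one large logit passes, regardless of whether it resembles the training data.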

OpenMax incorporates likelihood of the recognition system failure. This likelihood is used to estimate the probability for a given input belonging to an unknown class. For this estimation, we adapt the concept of Meta-Recognition[22, 32, 9] to deep networks. We use the scores from the penultimate layer of deep networks (the fully connected layer before SoftMax, e.g., FC8) to estimate if the input is “far” from known training data. We call scores in that layer the activation vector(AV).
A short summary of how OpenMax is implemented will come later, on a second pass.

A key insight in our opening deep networks is noting that “open space risk” should be measured in feature space rather than in pixel space.
An important point: for open deep networks, open space risk should be considered in feature space rather than in pixel space. In other words, whether the network judges an input to be unknown should be decided from the features.
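A minimal sketch of what "far in feature space" could look like: average the activation vectors of one class, then measure how far a new activation vector is from that mean. The numbers are invented, and plain Euclidean distance is an assumption made here for simplicity.

```python
import math

def mean_activation_vector(avs):
    """Element-wise mean of a list of activation vectors (one per training sample)."""
    n = len(avs)
    return [sum(col) / n for col in zip(*avs)]

def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical activation vectors of three training images of one class.
cat_avs = [[9.0, 3.0, 0.5], [8.0, 4.0, 0.0], [10.0, 2.0, 1.0]]
cat_mean = mean_activation_vector(cat_avs)

typical = [9.5, 3.0, 0.5]   # similar pattern across all classes
odd = [9.0, -4.0, 6.0]      # same top activation, very different overall pattern

print(euclidean(typical, cat_mean))
print(euclidean(odd, cat_mean))
```

Both inputs have a large activation for the first class, so a softmax threshold would treat them similarly; the distance to the class mean separates them because it uses the whole activation pattern.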

We show that an extreme-value meta-recognition inspired distance normalization process on the overall activation patterns of the penultimate network layer provides a rejection probability for OpenMax normalization for unknown images, fooling images and even for many adversarial images.
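The following is a simplified sketch in the spirit of this recalibration step, not a reproduction of the paper's algorithm. It assumes that a per-class "far from this class" score in [0, 1] is already available (for example a Weibull CDF of the distance to the class mean). The name `openmax_like`, the rank weighting, and the default `alpha` are all assumptions for illustration.

```python
import math

def softmax(values):
    """Numerically stable softmax."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def openmax_like(av, far_scores, alpha=2):
    """Redistribute activation mass from the top-alpha classes to an unknown class.

    av:         activations for the known classes (the activation vector)
    far_scores: assumed per-class scores in [0, 1]; 1 means "far from that class"
    Returns probabilities where index 0 is the unknown class.
    """
    ranked = sorted(range(len(av)), key=lambda j: av[j], reverse=True)
    weights = [1.0] * len(av)
    for rank, j in enumerate(ranked[:alpha]):
        rank_weight = (alpha - rank) / alpha   # 1 for the top class, then decreasing
        weights[j] = 1.0 - rank_weight * far_scores[j]
    revised = [v * w for v, w in zip(av, weights)]
    # Whatever was removed from the known classes becomes the unknown activation.
    unknown = sum(v * (1.0 - w) for v, w in zip(av, weights))
    return softmax([unknown] + revised)

av = [10.0, 2.0, 1.0]
near = openmax_like(av, [0.0, 0.0, 0.0])   # input close to the class means
far = openmax_like(av, [1.0, 1.0, 1.0])    # input far from every class mean
print(near)
print(far)
```

When every score is 0 the known-class activations are left untouched; when the scores are 1, most of the top activation moves to the unknown slot, so the unknown class wins the softmax.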

Open set deep networks

Building on the concepts of open space risk, we seek to choose a layer (feature space) in which we can build a compact abating probability model that can be thresholded to limit open space risk.
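Building on the concept of open space risk, the goal is to choose a layer (a feature space) in which a compact abating probability model can be built; thresholding that model limits open space risk.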

Multi-class meta-recognition

> Prior work on meta-recognition used the final system scores, analyzed their distribution based on Extreme Value Theory (EVT) and found these distributions follow Weibull distribution.

To understand this part, I think one first needs to understand extreme value theory (EVT).

From Wikipedia: It seeks to assess, from a given ordered sample of a given random variable, the probability of events that are more extreme than any previously observed.

Next comes the Weibull distribution, one type of extreme value distribution.

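So, as I understand it, the Weibull distribution here is what you get when you take only the most extreme examples from the whole distribution (sampling the top-n scores) and fit a distribution to them.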
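As a rough illustration of "take the top-n values and fit a Weibull", here is a stdlib-only sketch. It fits a plain two-parameter Weibull by maximum likelihood, which is a simplification and not necessarily how the paper's tooling fits the tail. The helper names (`tail`, `fit_weibull`, `weibull_cdf`) and the sample data are assumptions.

```python
import math
import random

def tail(values, n):
    """The n largest values, i.e. the extreme end of the sample."""
    return sorted(values, reverse=True)[:n]

def fit_weibull(samples):
    """Maximum-likelihood fit of shape k and scale lam to positive, non-identical samples."""
    top = max(samples)
    ys = [s / top for s in samples]   # rescale into (0, 1] so y ** k cannot overflow
    mean_log = sum(math.log(y) for y in ys) / len(ys)

    def g(k):
        # The MLE shape is the root of g; g is increasing in k.
        w = [y ** k for y in ys]
        return sum(wi * math.log(y) for wi, y in zip(w, ys)) / sum(w) - 1.0 / k - mean_log

    lo, hi = 1e-6, 1.0
    while g(hi) < 0 and hi < 1e6:     # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):              # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    lam = top * (sum(y ** k for y in ys) / len(ys)) ** (1.0 / k)
    return k, lam

def weibull_cdf(x, k, lam):
    """P(X <= x) for a Weibull with shape k and scale lam."""
    return 1.0 - math.exp(-((x / lam) ** k)) if x > 0 else 0.0

if __name__ == "__main__":
    random.seed(0)
    # Hypothetical distances of training samples to their class mean.
    distances = [random.weibullvariate(3.0, 2.0) for _ in range(1000)]
    k, lam = fit_weibull(tail(distances, 20))
    print(k, lam, weibull_cdf(max(distances), k, lam))
```

The CDF of the fitted tail model can then serve as a score for how extreme a new distance is compared with the largest distances seen in training.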


We take the approach that the network values from penultimate layer (hereafter the Activation Vector (AV)), are not an independent per-class score estimate, but rather they provide a distribution of what classes are “related.”
That is, the authors treat the penultimate-layer values (the Activation Vector) not as independent per-class score estimates, but as a distribution describing which classes are related to one another.

interpretation of activation vector