- Abstract: This paper considers probability distributions of penultimate activations in deep classification networks. We first identify a dual relation between the activations and the weights of the final fully connected layer: training the networks with the cross-entropy loss makes their (normalized) penultimate activations follow a von Mises-Fisher distribution for each class, which is parameterized by the weights of the final fully connected layer. From this analysis, we derive a probability density function of penultimate activations per class. This generative model allows us to synthesize activations of classification networks without feeding images forward through them. We also demonstrate experimentally that our generative model of penultimate activations can be applied to real-world tasks such as knowledge distillation and class-conditional image generation.
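To make the idea of synthesizing activations without a forward pass concrete, here is a minimal sketch of sampling unit-norm activations from a per-class von Mises-Fisher (vMF) distribution built from final-layer weights. The abstract does not state the exact parameterization, so the mapping used here is an assumption: the mean direction is taken as the normalized class weight vector, and the concentration as the weight norm times a hypothetical `activation_norm` factor. The sampler is the standard rejection scheme of Wood (1994), not necessarily the authors' procedure, and the names `vmf_params_from_weights` and `sample_vmf` are illustrative.

```python
import numpy as np


def vmf_params_from_weights(W, activation_norm=1.0):
    """Map final-layer weights (num_classes x dim) to assumed vMF parameters.

    Assumption (not taken from the abstract): mean direction = w_c / ||w_c||,
    concentration = ||w_c|| * activation_norm.
    """
    norms = np.linalg.norm(W, axis=1)
    mu = W / norms[:, None]
    kappa = norms * activation_norm
    return mu, kappa


def sample_vmf(mu, kappa, n, rng):
    """Draw n unit vectors from vMF(mu, kappa) with Wood's rejection sampler."""
    d = mu.shape[0]
    b = (-2.0 * kappa + np.sqrt(4.0 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (d - 1) * np.log(1.0 - x0**2)

    # Sample the component along the mean direction by rejection.
    w = np.empty(n)
    filled = 0
    while filled < n:
        m = n - filled
        z = rng.beta((d - 1) / 2.0, (d - 1) / 2.0, size=m)
        t = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        u = rng.uniform(size=m)
        ok = kappa * t + (d - 1) * np.log(1.0 - x0 * t) - c >= np.log(u)
        k = int(ok.sum())
        w[filled:filled + k] = t[ok]
        filled += k

    # Sample a uniform direction orthogonal to the first axis.
    v = rng.normal(size=(n, d - 1))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    tangent = np.sqrt(np.clip(1.0 - w**2, 0.0, None))[:, None] * v
    x = np.concatenate([w[:, None], tangent], axis=1)

    # Householder reflection that maps the first axis onto mu.
    e1 = np.zeros(d)
    e1[0] = 1.0
    h = e1 - mu
    hn = h @ h
    if hn > 1e-12:
        x = x - 2.0 * np.outer(x @ h, h) / hn
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(10, 64))  # hypothetical final-layer weights
    mu, kappa = vmf_params_from_weights(W, activation_norm=30.0)
    fake_activations = sample_vmf(mu[3], kappa[3], n=5, rng=rng)
    print(fake_activations.shape)
```

Samples drawn this way lie on the unit sphere, so they stand in for normalized activations only; recovering unnormalized activations would additionally require a model of the activation norm, which this sketch leaves out.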