Hyperbolic Learning with Supervision from any Granularity
Abstract: Supervised classification commonly follows a one-vs-rest paradigm where each sample belongs to one category from a set of independent classes. In real-world settings, classes are typically not independent, but organized hierarchically from coarse-grained to fine-grained. More pressingly, people naturally annotate at different levels of granularity, depending on their expertise, biases, or data quality. What should be the correct label of a picture of a bird? Is it \emph{animal}, \emph{bird}, \emph{albatross}, or \emph{Laysan albatross}? What if one annotator is an ornithologist and the other has little bird knowledge? Similarly, if two pictures of a \emph{Laysan albatross} differ in blurriness, we tend to annotate blurry ones more generically, as we are unsure of the details that differentiate classes at the finest levels. Currently, many annotations are removed, ignored, or reassigned because they do not match the required granularity. Instead of viewing the world as a flat, independent collection of concepts, this paper strives to perform supervised learning with labels at any granularity. We propose a hyperbolic embedding space, where classes are hierarchically organized as prototypes. We introduce a coarse-to-fine Busemann approach, where images are optimized toward the correct region of the hyperbolic embedding space by projecting their labels -- which can be as precise or generic as desired -- to ideal prototypes on the boundary of the Poincaré ball. Experiments show that our approach improves multi-granular classification and outperforms the current state-of-the-art, which views different granularities as independent rather than as levels of a connected tree.
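The supervision signal sketched in the abstract relies on the Busemann function of the Poincaré ball, which measures how far an embedding inside the ball is from an ideal point on its boundary. A minimal sketch of that standard formula is below; the prototype `p` and embedding `x` are hypothetical values chosen for illustration, not the paper's actual prototypes.

```python
import numpy as np

def busemann(x, p):
    """Busemann function on the Poincare ball for an ideal point p (||p|| = 1):
    b_p(x) = log( ||p - x||^2 / (1 - ||x||^2) ).
    Minimizing b_p(x) pulls the embedding x toward the ideal prototype p."""
    return np.log(np.sum((p - x) ** 2) / (1.0 - np.sum(x ** 2)))

# Hypothetical ideal prototype on the boundary (unit norm) and an embedding inside the ball.
p = np.array([1.0, 0.0])
x = np.array([0.3, 0.1])
print(busemann(x, p))  # smaller values mean x lies closer to the prototype's direction
```

Note that the function is zero at the origin and becomes increasingly negative as `x` approaches `p`, which is what makes it usable as a distance-like training objective toward boundary prototypes.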
Submission Number: 1198