HOPE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts
Abstract: Compositional Zero-Shot Learning (CZSL) has emerged as an essential paradigm in machine learning, aiming to overcome the constraints of traditional zero-shot learning by incorporating compositional thinking into its method-ology. Conventional zero-shot learning has difficulty managing unfamiliar combinations of seen and unseen classes because it depends on pre-defined class embeddings. In contrast, Compositional Zero-Shot Learning leverages the inherent hierarchies and structural connections among classes, creating new class representations by combining at-tributes, components, or other semantic elements. In our paper, we propose a novel framework that for the first time combines the Modern Hopfield Network with a Mixture of Experts (HOPE) to classify the compositions of previously unseen objects. Specifically, the Modern Hopfield Network creates a memory that stores label prototypes and identifies relevant labels for a given input image. Subsequently, the Mixture of Expert models integrates the image with the appropriate prototype to produce the final composition classi-fication. Our approach achieves SOTA performance on sev-eral benchmarks, including MIT-States and UT-Zappos. We also examine how each component contributes to improved generalization.
External IDs:dblp:conf/wacv/DatMNBB25
Loading