AADNet: A Human Mind Inspired Multi-modal Framework for Object Concept Learning

Chao Tang, Xinhai Chang

Published: 2025, Last Modified: 28 Feb 2026
Venue: SMC 2025
License: CC BY-SA 4.0
Abstract: Object concept learning, the task of defining objects through visual perception, has seen significant progress with neural networks. However, existing approaches often focus solely on classification and overlook how objects relate to explicit features such as affordance, specific attributes, and geometry. To bridge this gap, we redefine object concept learning to include the detection of explicit features and their relationships to object names, which together form what we call "concepts". We propose AADNet, a human-inspired framework that incorporates modules for affordance detection, attribute analysis, and depth estimation. These explicit features are integrated using transformer-based methods and transformed into a single vector that encapsulates the essential information defining the object’s concept. This vector is then used for object classification. By emulating human learning processes, AADNet’s structure offers a more holistic approach to concept formation. Extensive experiments validate its effectiveness.
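The pipeline described in the abstract (three explicit feature streams, transformer-based fusion, one concept vector, then classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify layer types, dimensions, or pooling, so everything below is an assumption made for illustration. In particular, the function `fuse_and_classify`, the parameter names `Wq`, `Wk`, `Wv`, `Wc`, the single-head self-attention, the mean pooling, and the random weights and inputs are all hypothetical stand-ins for trained modules and real detector outputs.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def fuse_and_classify(affordance, attributes, depth, params):
    """Fuse three feature vectors into one concept vector and classify it.

    Assumes all three inputs already share the same dimension d.
    """
    # Treat each explicit feature as one token of a length-3 sequence.
    tokens = [affordance, attributes, depth]
    queries = [matvec(params["Wq"], t) for t in tokens]
    keys = [matvec(params["Wk"], t) for t in tokens]
    values = [matvec(params["Wv"], t) for t in tokens]
    d = len(queries[0])

    # Single-head scaled dot-product self-attention across the three tokens.
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)]
        )

    # Mean-pool the attended tokens into one "concept" vector (assumption).
    concept = [sum(tok[j] for tok in attended) / len(attended) for j in range(d)]

    # Linear classifier over the concept vector, followed by softmax.
    probs = softmax(matvec(params["Wc"], concept))
    return concept, probs


# Toy usage with random inputs in place of real module outputs.
rng = random.Random(0)
d, num_classes = 4, 3
params = {
    "Wq": random_matrix(d, d, rng),
    "Wk": random_matrix(d, d, rng),
    "Wv": random_matrix(d, d, rng),
    "Wc": random_matrix(num_classes, d, rng),
}
affordance = [rng.random() for _ in range(d)]
attributes = [rng.random() for _ in range(d)]
depth = [rng.random() for _ in range(d)]

concept, probs = fuse_and_classify(affordance, attributes, depth, params)
predicted = max(range(num_classes), key=lambda i: probs[i])
```

Here `concept` plays the role of the single vector the abstract describes, and `predicted` is the index of the most probable class. A practical version would replace the random inputs with the outputs of the affordance, attribute, and depth modules and learn the weights end to end.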