Towards Human-like Machine Vision: Representing Part-Whole Relationship with hierarchically correlated neuronal activation in neural networks
Keywords: Part-whole hierarchy, neural symbolic, neural syntax, attractor dynamics, binding problem, neuroAI, object-centric representation, active perception
TL;DR: Building up "neural syntax" from "neural words" in neural networks through a dynamically formed hierarchical correlation structure.
Abstract: Representing hierarchical structure is a key problem that characterizes the gap between current neural networks and human-like intelligence. While the human brain flexibly extracts part-whole hierarchies from unstructured sensory input, how a neural network with fixed connection weights can flexibly capture such compositional structure is still an open question. Most efforts in the machine learning field focus on slot-based methods to temporarily tackle the problem. In this paper, we provide new insights into this challenge without resorting to the “slot” idea. From an interdisciplinary viewpoint that combines neuroscientific hypotheses and machine learning models, we propose the Composer, which dynamically “correlates” its distributed neural activation into an emergent, implicit hierarchical structure to represent the part-whole hierarchy of objects. The observed representation is consistent with the widely discussed “neural syntax” in neuroscience. Therefore, we hope the Composer sheds light on a new paradigm for developing human-like vision and building up compositional structure without “slots”. We also introduce quantitative measures to evaluate parsing quality, which show that the Composer can parse a range of synthetic scenes of different complexities. By incorporating advanced machine learning models such as LLMs or diffusion models into the paradigm, the Composer has the potential to scale to real-world datasets in the future. Taken together, we believe the Composer can inspire and inform future innovations and development towards artificial general intelligence (AGI).
Submission Number: 5