Dynamic Inference and Top-down Attention in a Hierarchical Classification Network

Published: 01 Jan 2024, Last Modified: 07 Mar 2025 · ICPR (8) 2024 · CC BY-SA 4.0
Abstract: This paper introduces a new network topology that seamlessly integrates dynamic inference cost with a top-down attention mechanism, addressing two significant gaps in traditional deep learning models. Drawing inspiration from human perception, we combine sequential processing of generic low-level features with parallelism and nesting of high-level features. This design not only reflects a finding from recent neuroscience research regarding spatially and contextually distinct neural activations in the human cortex, but also introduces a method for generating efficient ‘experts’: the ability to select only the high-level features of task-relevant categories. In certain cases, nearly all unnecessary high-level features can be bypassed, significantly reducing inference cost. We believe this paves the way for future network designs that are lightweight and adaptable, making them suitable for a wide range of applications, from compact edge devices to large-scale cloud deployments. Our proposed topology also comes with a built-in top-down attention mechanism, which allows processing to be influenced by either enhancing or inhibiting category-specific high-level features, drawing parallels to the selective attention mechanism observed in human cognition. Using targeted external signals, we experimentally enhanced predictions across all tested models/experts. In terms of dynamic inference, our methodology can exclude up to 73.5% of parameters and 88.7% of giga multiply-accumulate (GMAC) operations; analysis against comparative baselines shows an average reduction of 40% in parameters and 8% in GMACs across the cases we evaluated.
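To make the two mechanisms described above concrete, the following is a minimal sketch (not the authors' implementation; all dimensions, category names, and function names are hypothetical): a shared low-level trunk is processed sequentially, while per-category high-level branches run in parallel and can be individually skipped (dynamic inference) or scaled by an external top-down signal (attention).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared low-level trunk: generic features, always computed.
W_low = rng.standard_normal((8, 16))

# Hypothetical category-specific high-level branches, one per class,
# evaluated in parallel and independently skippable.
CATEGORIES = ["cat", "dog", "car"]
W_high = {c: rng.standard_normal((16, 1)) for c in CATEGORIES}

def infer(x, active=None, gain=None):
    """Run the trunk, then only the selected high-level branches.

    active: subset of categories to evaluate (dynamic inference);
            branches not listed are skipped, saving their parameters
            and multiply-accumulate operations entirely.
    gain:   per-category multiplicative top-down signal (>1 enhances,
            <1 inhibits), mimicking selective attention.
    """
    active = CATEGORIES if active is None else active
    gain = gain or {}
    h = np.maximum(x @ W_low, 0.0)  # generic low-level features (ReLU)
    return {c: float(gain.get(c, 1.0) * (h @ W_high[c])) for c in active}

x = rng.standard_normal(8)
full = infer(x)                        # all branches evaluated
expert = infer(x, active=["cat"])      # 'expert': a single branch
boosted = infer(x, gain={"cat": 2.0})  # top-down enhancement of one class
```

In this toy setting, restricting `active` to one category evaluates one high-level branch instead of three, which is the spirit of the parameter/GMAC exclusion the abstract reports; the `gain` dictionary stands in for the external enhancement/inhibition signals.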