Abstract: Highlights
• We have developed a Latent Attribute Augmented Network (LAAN), which focuses on discriminative local visual regions to capture fine-grained characteristics.
• Different from previous methods that utilize explicit semantic knowledge, we design an auxiliary memory to learn latent attribute prototypes automatically.
• We have constructed a transformer-based knowledge interaction module, which enables the information fusion among the local image regions and latent attributes, thereby enhancing the model’s ability to process complex and detailed information.
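The interaction between local image regions and a memory of latent attribute prototypes can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single-head, scaled dot-product cross-attention with a residual connection, omits learned projections and training, and uses random vectors in place of real region features and learned prototypes. The function name `attend_to_prototypes` and all sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend_to_prototypes(regions, prototypes):
    """Hypothetical helper: each region feature queries the prototype memory.

    Returns the region features augmented with an attention-weighted mixture
    of prototypes (residual sum), plus the attention weights per region.
    """
    dim = len(prototypes[0])
    scale = 1.0 / math.sqrt(dim)
    augmented, weights_per_region = [], []
    for region in regions:
        # Similarity of this region to every prototype, normalized to sum to 1.
        weights = softmax([dot(region, p) * scale for p in prototypes])
        # Attention-weighted mixture of the prototype vectors.
        mixture = [
            sum(w * p[d] for w, p in zip(weights, prototypes))
            for d in range(dim)
        ]
        # Residual connection: region feature plus attended attribute mixture.
        augmented.append([r + m for r, m in zip(region, mixture)])
        weights_per_region.append(weights)
    return augmented, weights_per_region


# Assumed toy sizes; real region features would come from a visual backbone.
rng = random.Random(0)
num_regions, num_prototypes, dim = 4, 3, 8
regions = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_regions)]
# Stand-in for the auxiliary memory; in training these would be learned.
prototypes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_prototypes)]

augmented, attn = attend_to_prototypes(regions, prototypes)
```

In this sketch the prototypes play the role of keys and values and the regions play the role of queries; a full transformer-based module would presumably add learned projections, multiple heads, and feed-forward layers.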
External IDs: dblp:journals/ijon/HuZJY25