Abstract: This paper describes LOCL (Learning Object-Attribute (O-A) Composition using Localization), which generalizes compositional zero-shot learning to objects in cluttered, more
realistic settings. The problem of unseen O-A associations has been well studied in the
field; however, the performance of existing methods is limited in challenging scenes. In
this context, our key contribution is a modular approach to localizing objects and attributes of interest in a weakly supervised context that generalizes robustly to unseen
configurations. Localization coupled with a composition classifier significantly outperforms state-of-the-art (SOTA) methods, with an improvement of about 12% on currently
available challenging datasets. Further, the modularity enables the localized feature extractor to be used with existing O-A compositional learning methods to improve
their overall performance.