Fine-Grained Perception in Panoramic Scenes: A Novel Task, Dataset, and Method for Object Importance Ranking

Published: 2025, Last Modified: 28 Oct 2025, AAAI 2025, CC BY-SA 4.0
Abstract: Existing Salient Object Ranking (SOR) aims to infer the ranking of salient objects based on their degree of saliency. However, it focuses only on salient objects and neglects non-salient ones. This coarse-grained ranking limits the performance of downstream tasks. For instance, in image retrieval, focusing solely on the relationships between salient objects is insufficient for fine-grained scene analysis, so retrieved results may not satisfy user requirements. High-quality retrieval requires fine-grained analysis, making it essential to rank non-salient objects as well. Based on this need, we propose a new task: Fine-grained Object Importance Ranking in 360 Scenes (FOIR-360), which focuses on predicting the relative importance of "ALL objects" at the instance level. By taking all objects into account, our task refines the original "coarse-grained" ranking to a "fine-grained" level. Currently, the main challenge for this new task is the lack of supervised data for model training or even for model testing. Therefore, we propose a novel weakly supervised method to address the shortage of datasets. Furthermore, to the best of our knowledge, no suitable annotation protocol exists for this new task. The main reason is that annotating fine-grained rankings is extremely difficult, especially in panoramic scenes that contain numerous instances, where even humans are unable to determine which one is more important than the others. As a first attempt, we introduce a new annotation protocol designed to highlight the ranking of objects that are non-salient yet still important. Based on this protocol, we construct the first fine-grained 360Rank dataset. In summary, the new task, weakly supervised method, annotation protocol, and dataset have the potential to drive advancements in the field.