HierVision: Standardized and Reproducible Hierarchical Sources for Vision Datasets

Tejaswi Kasarla; Ruthu Hulikal Rooparaghunath; Stefano D'Arrigo; Gowreesh Mago; Abhishek Jha; Melika Ayoughi; Swasti Shreya Mishra; Ana Manzano Rodríguez; Teng Long; Mina Ghadimi Atigh; Max van Spengler; Pascal Mettes

Published: 09 Jul 2025, Last Modified: 07 Sept 2025. Venue: BEW 2025 Oral. License: CC BY 4.0

Keywords: hierarchy collection, hierarchical learning, hyperbolic learning

Abstract: deep learning, whether the problem is supervised, self-supervised, or multi-modal in nature. The real world, however, is not binary, but governed by hierarchies. Hierarchies provide key information about the semantic relations between concepts, about which mistakes to avoid, and about the inherent organization of vision and language itself. Hierarchical learning, therefore, has a long history in computer vision and has gained further traction with the rise of hyperbolic deep learning. Currently, however, hierarchies are neither standardized nor centrally organized. Instead, such knowledge is scattered across various repositories, with inconsistent formatting, organization, and availability. The lack of a central hub for hierarchies in vision datasets harms the utility and reproducibility of hierarchical learning. This paper introduces HierVision, a central hub for hierarchical knowledge in vision datasets. This hub contains 60+ hierarchical sources, spanning actions, concepts, fine-grained categories, vision-language, and more. We outline a uniform coding of the hierarchies and procedures to embed them in existing pipelines. With this hub, we hope to positively impact the broad use and re-use of hierarchies for deep learning in computer vision.

Track: Full paper (8 pages excluding references, same as main conference requirements)

Git: https://github.com/tkasarla/HierVision

Submission Number: 10
