Keywords: Compositionality, Generalization, Dataset, Benchmark, Objects
TL;DR: A novel data generator and benchmark, with unmatched control and flexibility, for studying compositional and systematic generalization in vision-like domains, with experimental results showcasing failures of state-of-the-art models.
Abstract: The ability to compose learned concepts and apply them in novel settings is central to human intelligence, but remains a key challenge for state-of-the-art machine learning models. To address this issue, we introduce COGITAO, a modular data-generation framework for evaluating compositional and systematic generalization in object-centric domains. Drawing inspiration from ARC-AGI’s environment and problem setting, COGITAO constructs rule-based tasks that are solved by applying a set of transformations to objects in grid-based environments. It supports composition over a set of 28 interoperable transformations at adjustable composition depth, along with extensive control over grid parametrization and object properties. This flexibility enables the creation of millions of unique task rules (surpassing existing datasets by several orders of magnitude) across a broad range of difficulties, while allowing virtually unlimited sample generation per rule. Alongside open-sourcing our data-generation framework, we release benchmark datasets and provide baseline results with several state-of-the-art architectures that incorporate inductive biases well suited to compositionality, such as diffusion-based Transformers (LLaDA) and recurrent Transformers with Adaptive Computation Time (Universal Transformer/PonderNet). Despite strong in-domain performance, these models consistently fail to generalize to novel combinations of familiar elements, highlighting a persistent challenge in compositional and systematic generalization that COGITAO makes it possible to characterize precisely.
Primary Area: datasets and benchmarks
Submission Number: 17688