Keywords: Abstraction and Reasoning Corpus (ARC), analogical reasoning, synthetic datasets, GIF images, benchmark improvement
TL;DR: GIFARC is an analogy-inspired ARC dataset synthesized from GIF images that provides explicit human-intuitive analogies, significantly enhancing AI systems' abstract reasoning capabilities and improving solver accuracy on the ARC benchmark.
Abstract: The Abstraction and Reasoning Corpus (ARC) poses a stringent test of general AI capabilities, requiring solvers to infer abstract patterns from only a handful of examples. Despite substantial progress in deep learning, state-of-the-art models still achieve accuracy rates of merely 40–55% on the 2024 ARC Competition, indicative of a significant gap between their performance and human-level reasoning. In this work, we seek to bridge that gap by introducing an analogy-inspired ARC dataset, GIFARC. Leveraging vision-language models (VLMs), we synthesize new ARC-style tasks from a variety of GIF images that include analogies. Each new task is paired with a ground-truth analogy, providing an explicit mapping between visual transformations and everyday concepts. By embedding robust human-intuitive analogies into ARC-style tasks, GIFARC guides AI agents to adopt analogical reasoning approaches, facilitating more concise and human-understandable solutions. We empirically demonstrate that GIFARC improves task-solving performance by aligning model reasoning with human analogical problem-solving strategies.
Primary Area: datasets and benchmarks
Submission Number: 18480