Reflection System for the Abstraction and Reasoning Corpus

Kiril Bikov; Mikel Bober-Irizar; Soumya Banerjee

Published: 20 Dec 2024, Last Modified: 30 Dec 2024

Venue: AI4Research @ AAAI 2025 Poster

License: CC BY 4.0

Keywords: broad generalization in visual puzzles

Abstract: The Abstraction and Reasoning Corpus (ARC) benchmarks broad generalization in artificial intelligence, and presents a significant challenge to existing machine learning models and program synthesis solvers. In this work, we introduce a Reflection System for ARC. It combines Large Language Models (LLMs) and a program synthesis solver based on a Domain Specific Language (DSL). We analyse the accuracy of LLMs on ARC and demonstrate unsatisfactory results. We create AugARC, an augmented ARC benchmark, which consistently improves the performance of LLMs compared to the normal ARC benchmark. Using augmented ARC data, we fine-tune LLMs and observe a significant gain in ARC accuracy after training. By utilizing reflection, we combine LLMs and a previous DSL solver into our Reflection System for abstraction and reasoning. The proposed Reflection System motivates research to advance previous ARC attempts by combining the advantages of LLMs and program synthesis solvers with reflection.
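The abstract does not say which augmentations AugARC applies, so the following is only a minimal sketch of one plausible kind of augmentation: geometric transforms (rotations and a horizontal flip) applied consistently to every grid in a task. The helper names `rotate90`, `flip_horizontal`, `transform_task`, and `augment_task` are hypothetical and are not taken from the paper; the only thing assumed from ARC itself is the public task layout, a dictionary with `train` and `test` lists of `input`/`output` grids.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]              # a 2-D grid of colour indices
Task = Dict[str, List[Dict[str, Grid]]]  # {"train": [...], "test": [...]}


def rotate90(grid: Grid) -> Grid:
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip_horizontal(grid: Grid) -> Grid:
    """Mirror a grid left-to-right."""
    return [row[::-1] for row in grid]


def transform_task(task: Task, fn: Callable[[Grid], Grid]) -> Task:
    """Apply the same transform to every grid in a task.

    Inputs and outputs must be transformed together, otherwise the
    underlying rule linking them would no longer hold.
    """
    return {
        split: [{key: fn(grid) for key, grid in pair.items()} for pair in pairs]
        for split, pairs in task.items()
    }


def augment_task(task: Task) -> List[Task]:
    """Return 8 variants of a task: 4 rotations, each with and without a flip.

    ASSUMPTION: this set of transforms is illustrative only; it is not
    claimed to match the augmentations used to build AugARC.
    """
    variants = []
    for flip in (False, True):
        current = transform_task(task, flip_horizontal) if flip else task
        for _ in range(4):
            variants.append(current)
            current = transform_task(current, rotate90)
    return variants


example_task: Task = {
    "train": [{"input": [[1, 2], [3, 4]], "output": [[4, 3], [2, 1]]}],
    "test": [{"input": [[5, 6], [7, 8]]}],
}
variants = augment_task(example_task)
```

Here `variants[0]` is the unmodified task and the remaining seven entries are transformed copies. Whether a given transform preserves a task's rule depends on the task (for example, a rule defined in terms of "gravity" may not survive a rotation), so a real augmentation pipeline would need to account for that.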

Archival Option: The authors of this submission do *not* want it to appear in the archival proceedings.

Submission Number: 14
