[AML] T$^5$-ARC: Test-Time Training for Transductive Transformer Models in the ARC-AGI Challenge

THU 2024 Winter AML Submission 10 Authors

11 Dec 2024 (modified: 19 Dec 2024) · CC BY 4.0
Keywords: Language Models and Reasoning, Test-Time Training, Artificial General Intelligence
Abstract: The Abstraction and Reasoning Corpus (ARC-AGI) benchmark has emerged as a key challenge for evaluating machine intelligence because it emphasizes skill generalization on novel tasks. Recent advances primarily use Test-Time Training (TTT) to enhance the reasoning ability of Large Language Models (LLMs). In this project, we focus on TTT for transductive models (end-to-end transformer models that directly generate the output grid) and develop our pipeline following SOTA methods; it consists of three steps: Base Model Training, TTT, and Active Inference. We conduct detailed experiments to investigate the effects of different base models, TTT strategies, data and model sizes, and the use of negative training samples. Notably, a small transformer model trained from scratch and equipped with TTT achieves performance comparable to SOTA LLMs, revealing the potential of specialized models for certain reasoning tasks.
Submission Number: 10
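
To make the per-task adaptation step concrete, the snippet below is a minimal PyTorch sketch of a TTT loop for a transductive model: a copy of the base model is briefly fine-tuned on augmented versions of the test task's own demonstration pairs before predicting the test output grid. The model architecture, grid size, augmentations, and hyperparameters (`TinyTransductiveModel`, `GRID`, `augment`, `steps`, `lr`) are illustrative assumptions, not the configuration used in the submission.

```python
# Minimal sketch of a per-task Test-Time Training (TTT) loop for a transductive
# ARC model.  All names, sizes, and hyperparameters here are illustrative
# stand-ins, not the authors' actual setup.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

GRID = 10          # assume fixed 10x10 grids (padded) for simplicity
NUM_COLORS = 10    # ARC grids use colors 0-9

class TinyTransductiveModel(nn.Module):
    """Toy stand-in for an end-to-end transformer mapping input grid -> output grid."""
    def __init__(self, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(NUM_COLORS, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, NUM_COLORS)

    def forward(self, grid):                  # grid: (B, GRID, GRID) integer colors
        tokens = grid.flatten(1)              # (B, GRID*GRID)
        h = self.encoder(self.embed(tokens))  # (B, GRID*GRID, d_model)
        return self.head(h)                   # per-cell color logits

def augment(x, y):
    """Apply the same random rotation and color permutation to a demo pair."""
    k = torch.randint(0, 4, ()).item()
    perm = torch.randperm(NUM_COLORS)
    return perm[torch.rot90(x, k, (-2, -1))], perm[torch.rot90(y, k, (-2, -1))]

def test_time_train(base_model, demos, steps=100, lr=1e-4):
    """Fine-tune a copy of the base model on the test task's own demonstration pairs."""
    model = copy.deepcopy(base_model)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        x, y = demos[torch.randint(0, len(demos), ()).item()]
        x_aug, y_aug = augment(x, y)
        logits = model(x_aug.unsqueeze(0))
        loss = F.cross_entropy(logits.view(-1, NUM_COLORS), y_aug.flatten())
        opt.zero_grad(); loss.backward(); opt.step()
    return model

# Usage on one (synthetic) test task: adapt on the demos, then predict the test grid.
base = TinyTransductiveModel()
demos = [(torch.randint(0, NUM_COLORS, (GRID, GRID)),
          torch.randint(0, NUM_COLORS, (GRID, GRID))) for _ in range(3)]
adapted = test_time_train(base, demos)
test_input = torch.randint(0, NUM_COLORS, (GRID, GRID))
prediction = adapted(test_input.unsqueeze(0)).argmax(-1).view(GRID, GRID)
```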