Pathfinder-X2: A Challenging Dataset for Evaluating Large Language Models on Long-Range Dependencies
Keywords: large language models, sequence modeling, artificial intelligence
TL;DR: A new and more challenging dataset for long-range dependencies in large language models
Abstract: The rapid progress of large language models has led to impressive results in a
wide array of tasks. However, there remains a need for increasingly challenging
datasets to evaluate these models’ ability to handle long-range dependencies.
In this paper, we present Pathfinder-X2, a novel dataset that builds upon the
Pathfinder and Pathfinder-X datasets. Pathfinder-X2 comprises 512x512 pixel
images designed to test large language models’ capacity to segment a specific
dashed white line, or “snake”, with a circle at its tip from among a collection of
similar distractor snakes. We demonstrate that the increased image resolution and
complexity of Pathfinder-X2 present a substantially more challenging task for
large language models, contributing to the ongoing development and assessment
of such models.
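
As a rough illustration of the task setup described above, the following sketch builds a toy 512x512 image containing one target snake with a circle at its tip plus several distractor snakes, and then flattens the image into a pixel sequence. It is not the authors' generator: the function names, the random-walk drawing, the dash and gap lengths, the number of distractors, and the row-major flattening are all assumptions made for illustration only.

```python
import numpy as np

SIDE = 512  # image side length stated in the abstract


def draw_dashed_snake(rng, n_dashes=12, dash_len=8, gap_len=5):
    """Draw one dashed path as a random walk; return (mask, tip pixel).

    All parameters are hypothetical choices, not values from the paper.
    """
    mask = np.zeros((SIDE, SIDE), dtype=bool)
    pos = rng.uniform(64, SIDE - 64, size=2)  # (row, col) start position
    angle = rng.uniform(0, 2 * np.pi)
    tip = (0, 0)
    for _ in range(n_dashes):
        angle += rng.uniform(-0.5, 0.5)  # gentle turn before each dash
        step = np.array([np.sin(angle), np.cos(angle)])
        for t in range(dash_len):  # light up the pixels of this dash
            r, c = np.clip(np.rint(pos + t * step), 0, SIDE - 1).astype(int)
            mask[r, c] = True
            tip = (r, c)  # last lit pixel so far
        # skip the gap to reach the start of the next dash
        pos = np.clip(pos + (dash_len + gap_len) * step, 0, SIDE - 1)
    return mask, tip


def make_toy_sample(seed=0, n_distractors=5, tip_radius=4):
    """Return (image, label): a uint8 image and the target snake's mask."""
    rng = np.random.default_rng(seed)
    target, tip = draw_dashed_snake(rng)
    image = target.copy()
    for _ in range(n_distractors):
        distractor, _ = draw_dashed_snake(rng)
        image |= distractor
    # filled circle marking the tip of the target snake
    rows, cols = np.ogrid[:SIDE, :SIDE]
    image |= (rows - tip[0]) ** 2 + (cols - tip[1]) ** 2 <= tip_radius ** 2
    return image.astype(np.uint8) * 255, target


image, label = make_toy_sample()
sequence = image.reshape(-1)  # assumed row-major flattening into a sequence
print(image.shape, sequence.shape[0])
```

Under the flattening assumption, a single image becomes a sequence of 512 * 512 = 262,144 pixel values, which gives a sense of the range over which a model would need to relate the marked tip to the rest of its snake.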