Pathfinder-X2: A Challenging Dataset for Evaluating Large Language Models on Long-Range Dependencies

10 Apr 2023 (modified: 08 May 2023)
Submitted to ICRA-23 Workshop on Pretraining4Robotics
Readers: Everyone
Keywords: large language models, sequence modeling, artificial intelligence
TL;DR: A new and more challenging dataset for long-range dependencies in large language models
Abstract: The rapid progress of large language models has led to impressive results across a wide array of tasks. However, increasingly challenging datasets are still needed to evaluate these models' ability to handle long-range dependencies. In this paper, we present Pathfinder-X2, a novel dataset that builds upon the Pathfinder and Pathfinder-X datasets. Pathfinder-X2 comprises 512×512-pixel images designed to test a model's capacity to segment one specific "snake" (a dashed white line with a circle at its tip) among a collection of similar distractor snakes. We demonstrate that the increased image resolution and complexity of Pathfinder-X2 make the task substantially more challenging for large language models, contributing to the ongoing development and assessment of such models.