TimeSeriesGym: A Scalable Benchmark for (Time Series) Machine Learning Engineering Agents

Published: 22 Sept 2025, Last Modified: 03 Jan 2026
Venue: WiML @ NeurIPS 2025
License: CC BY 4.0
Keywords: LLM Agents, Time Series, Scalable Benchmarking, Fine-Grained Evaluation
Submission Number: 259