Position: What's the next frontier for Data-centric AI? Data Savvy Agents!

Published: 06 Mar 2025, Last Modified: 20 Apr 2025ICLR 2025 Workshop Data Problems PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: data-savvy, agents, data-centric, LLMs
TL;DR: data-savvy capabilities should be a top priority in the design of agents
Abstract: The recent surge in AI agents that autonomously communicate, collaborate with humans and use diverse tools has unlocked promising opportunities in various real-world settings. However, a vital aspect remains underexplored: how agents handle data. Agents cannot achieve scalable autonomy without the ability to dynamically acquire, process, and continually evolve their data ecosystems to navigate complex and changing environments. In this position paper, we argue that data-savvy capabilities should be a top priority in the design of agentic systems to ensure reliable real-world deployment. Specifically, we propose four key capabilities to realize this vision: (1) Proactive data acquisition: enabling agents to autonomously gather task-critical knowledge or solicit human input to address data gaps; (2) Sophisticated data processing: requiring context-aware and flexible handling of diverse data challenges and inputs; (3) Interactive test data synthesis: shifting from static benchmarks to dynamically generated interactive test data for agent evaluation; and (4) Continual adaptation: empowering agents to iteratively refine their data and background knowledge to adapt to shifting environments. While current agent research predominantly emphasizes reasoning, we hope this work inspires a broader reflection on the role of data-savvy agents as the next frontier in data-centric AI.
Submission Number: 30
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview