Abstract: Recent advances in agentic systems for data analysis have emphasized automating insight generation through multi-agent frameworks and orchestration layers. While these systems effectively manage tasks such as query translation, data transformation, and visualization, they often overlook the structured reasoning process underlying analytical thinking. Reasoning large language models (LLMs) used for multi-step problem solving are trained as general-purpose problem solvers; as a result, their reasoning steps do not adhere to fixed processes for specific tasks. Real-world data analysis, however, requires a consistent cognitive workflow: interpreting vague goals, grounding them in contextual knowledge, constructing abstract plans, and adapting execution based on intermediate outcomes. We introduce I2I-STRADA (Information-to-Insight via Structured Reasoning Agent for Data Analysis), an agentic architecture designed to formalize this reasoning process. I2I-STRADA models how analysis unfolds through modular sub-tasks that reflect the cognitive steps of analytical reasoning. Evaluations on the DABstep and DABench benchmarks show that I2I-STRADA outperforms prior systems in planning coherence and insight alignment, highlighting the importance of structured cognitive workflows in agent design for data analysis.
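To make the workflow described in the abstract concrete, the following is a minimal, hypothetical Python sketch of a structured cognitive pipeline (interpret goal, ground in context, build an abstract plan, adapt execution). All names here (AnalysisState, interpret_goal, etc.) are illustrative assumptions and do not reflect the I2I-STRADA implementation, which is not distributed with this submission.

```python
# Hypothetical illustration only -- not the authors' I2I-STRADA code.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AnalysisState:
    goal: str                                      # possibly vague user objective
    context: dict = field(default_factory=dict)    # grounded domain knowledge
    plan: List[str] = field(default_factory=list)  # abstract analysis steps
    findings: List[str] = field(default_factory=list)


def interpret_goal(state: AnalysisState) -> AnalysisState:
    # Turn a vague request into an explicit analytical question.
    state.context["question"] = f"clarified form of: {state.goal}"
    return state


def ground_in_context(state: AnalysisState) -> AnalysisState:
    # Attach the schema/domain knowledge needed to answer the question.
    state.context["schema"] = ["transactions", "merchants"]  # placeholder metadata
    return state


def build_plan(state: AnalysisState) -> AnalysisState:
    # Construct an abstract plan before any execution takes place.
    state.plan = ["load data", "transform", "aggregate", "summarize"]
    return state


def execute_and_adapt(state: AnalysisState) -> AnalysisState:
    # Execute steps one at a time; a real agent would inspect each
    # intermediate outcome and revise the remaining plan when needed.
    while state.plan:
        step = state.plan.pop(0)
        outcome = f"completed: {step}"  # stand-in for tool/code execution
        state.findings.append(outcome)
        if outcome.startswith("failed") and "repair step" not in state.plan:
            state.plan.insert(0, "repair step")  # adaptive replanning hook
    return state


WORKFLOW: List[Callable[[AnalysisState], AnalysisState]] = [
    interpret_goal, ground_in_context, build_plan, execute_and_adapt,
]


def run(goal: str) -> AnalysisState:
    state = AnalysisState(goal=goal)
    for stage in WORKFLOW:
        state = stage(state)
    return state


if __name__ == "__main__":
    print(run("Which merchant category drives the most fee disputes?").findings)
```

In an actual system each stage would call an LLM and execution tools; the point of the sketch is that the stage ordering is fixed, which is what distinguishes a structured cognitive workflow from free-form LLM reasoning.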
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: data analysis, reasoning, large language model, planning
Contribution Types: NLP engineering experiment, Data analysis
Languages Studied: English natural language
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
A2 Elaboration: Risks associated with LLMs require guardrails that are specific to application and organizational requirements. Our work covers only the algorithmic view.
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Datasets used for benchmarking are mentioned under Sections 4.1 and 4.2. We are not distributing the code we created for our agent.
B2 Discuss The License For Artifacts: N/A
B2 Elaboration: We are not distributing any artifacts as part of this work. We have utilized open-source datasets for benchmarking.
B3 Artifact Use Consistent With Intended Use: N/A
B3 Elaboration: We are not distributing any artifacts as part of this work. We have utilized open-source datasets for benchmarking as per their intended use.
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B4 Elaboration: We have utilized open-source datasets that have been used for benchmarking in various other works.
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Section 4
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 4.1 - Anthropic's Claude 3.5 Sonnet was used to implement our agentic framework. The model is accessed via AWS Bedrock APIs, which do not require dedicated server/GPU provisioning.
C2 Experimental Setup And Hyperparameters: N/A
C2 Elaboration: N/A since we use default hyperparameters with the Bedrock APIs.
C3 Descriptive Statistics: Yes
C3 Elaboration: Sections 4.1 and 4.2
C4 Parameters For Packages: Yes
C4 Elaboration: In Section 4.1 we mention the model used.
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: Yes
Submission Number: 1461