SessionIntentBench: A Multi-task Inter-Session Intention-Shift Modeling Benchmark for E-commerce Customer Behavior Understanding

ACL ARR 2025 July Submission321 Authors

27 Jul 2025 (modified: 01 Sept 2025) | ACL ARR 2025 July Submission | Readers: Everyone | CC BY 4.0
Abstract: Session history is a common way of recording user interaction behavior across multiple products throughout a browsing activity. For example, if a user clicks on a product webpage and then leaves, it might be because certain features do not satisfy the user, which serves as an important indicator of on-the-spot user preferences. However, all prior works fail to capture and model customer intention effectively because they exploit the available information insufficiently and use only surface-level information such as descriptions and titles. There is also a lack of data and a corresponding benchmark for explicitly modeling intention in e-commerce product purchase sessions. To address these issues, we introduce the concept of an intention tree and propose a dataset curation pipeline. Together, we construct a sibling multimodal benchmark, SessionIntentBench, that evaluates L(V)LMs' capability to understand inter-session intention shift through four subtasks. With 1,952,177 intention entries, 1,132,145 session intention trajectories, and 13,003,664 available tasks mined from 10,905 sessions, we provide a scalable way to exploit existing session data for customer intention understanding. We conduct human annotation to collect ground-truth labels for a subset of the collected data, forming a gold evaluation set. Extensive experiments on the annotated data further confirm that current L(V)LMs fail to capture and utilize intention in the complex session setting. Further analysis shows that injecting intention enhances LLMs' performance.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking; NLP datasets; evaluation methodologies; evaluation; reproducibility
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English
Previous URL: https://openreview.net/forum?id=hctZFyJDjZ
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: Previous reviewers and the area chair are not responding to the authors' responses. Their reviews are biased toward the scope of this paper, and very few of their comments provide valuable advice that would help improve it. We would like to request a new set of reviewers to ensure a fair evaluation.
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Ethics Statement
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: References
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: Ethics Statement
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Ethics Statement
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Ethics Statement
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: 5 Evaluations and Analyses
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: 5.2 Baselines and Model Selections
C2 Experimental Setup And Hyperparameters: N/A
C3 Descriptive Statistics: Yes
C3 Elaboration: 5.5 Error Analyses
C4 Parameters For Packages: Yes
C4 Elaboration: 5.2 Baselines and Model Selections
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: Yes
D1 Elaboration: Appendix B. Annotation Details
D2 Recruitment And Payment: Yes
D2 Elaboration: Appendix B. Annotation Details
D3 Data Consent: Yes
D3 Elaboration: 3.2 Dataset
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: Yes
D5 Elaboration: Appendix B. Annotation Details
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 321