Abstract: Process-driven dialogue systems, which operate under strict predefined process constraints, are essential in customer service and equipment maintenance scenarios. Although Large Language Models (LLMs) have shown remarkable progress in dialogue and reasoning, they still struggle to solve these strictly constrained dialogue tasks. To address this challenge, we construct the **P**rocess **F**low **Dial**ogue (**PFDial**) dataset, which contains 12,705 high-quality dialogue instructions derived from 440 flowcharts containing 5,055 process nodes. Based on the PlantUML specification, each UML flowchart is converted into atomic dialogue units, i.e., structured five-tuples. Experimental results demonstrate that a 7B model trained with merely 800 samples and a 0.5B model trained on the full dataset can both surpass 90\% accuracy. Additionally, the 8B model can outperform GPT-4o by up to 43.88\%, with an average gain of 11.00\%. We further evaluate models' performance on challenging backward transitions in process flows and conduct an in-depth analysis of a wide range of dataset formats to reveal their impact on model performance in handling decision and sequential branches.
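The conversion step in the abstract (PlantUML flowchart edges into atomic five-tuple dialogue units) can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual pipeline: the tuple schema `(source, source_type, condition, target, target_type)` and the helper `edges_to_tuples` are assumptions made for illustration.

```python
import re

# Matches PlantUML-style edges of the form "A --> B" or "A --> B : label".
EDGE_RE = re.compile(r"^\s*(\w+)\s*-->\s*(\w+)\s*(?::\s*(.+))?$")

def edges_to_tuples(plantuml_lines, decision_nodes):
    """Hypothetical converter: turn flowchart edges into five-tuples.

    Each tuple is (source, source_type, condition, target, target_type),
    where node types distinguish decision branches from sequential steps.
    This schema is an illustrative assumption, not PFDial's real format.
    """
    tuples = []
    for line in plantuml_lines:
        m = EDGE_RE.match(line)
        if not m:
            continue  # skip non-edge lines such as @startuml / @enduml
        src, dst, cond = m.group(1), m.group(2), m.group(3)
        src_type = "decision" if src in decision_nodes else "sequential"
        dst_type = "decision" if dst in decision_nodes else "sequential"
        tuples.append((src, src_type, cond or "", dst, dst_type))
    return tuples

# Toy flowchart for a repair-service dialogue.
uml = [
    "@startuml",
    "Greet --> CheckWarranty",
    "CheckWarranty --> Repair : in warranty",
    "CheckWarranty --> Quote : out of warranty",
    "@enduml",
]
units = edges_to_tuples(uml, decision_nodes={"CheckWarranty"})
```

Here `units` would contain one five-tuple per edge, with the two labeled edges out of `CheckWarranty` marked as decision branches and the unlabeled edge as a sequential step.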
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: task-oriented; LLM
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 4044