Abstract: Large language models (LLMs) have gained considerable attention due to their remarkable generalization capabilities, as exemplified by ChatGPT and GPT-4. However, these models exhibit limitations in specific domains, such as local life service scenarios, owing to insufficient relevant knowledge and considerable disparities between local life industry data and general data. To address this issue, we first introduce a 170GB domain-specific corpus, LocalEvolve, for unsupervised continued pretraining. Second, we employ a low-rank adaptation approach to train a customized LLM, LocalAdapt, for local life service scenarios. Notably, we design a Multi-Task mapping system that transforms structured industry data into various Fundamental Reasoning Units (FRUs). Our LocalAdapt model demonstrates superior few-shot performance across different local life tasks compared to baseline models. Extensive empirical analysis further confirms the effectiveness of FRUs.
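The low-rank adaptation (LoRA) approach mentioned in the abstract can be illustrated with a minimal sketch: the frozen pretrained weight W is augmented with a trainable low-rank product B·A, so only a small fraction of parameters is updated. The dimensions, rank, and initialization below are illustrative assumptions, not the paper's actual LocalAdapt configuration.

```python
import numpy as np

# Minimal LoRA sketch: effective weight is W + B @ A, with rank r << d.
# All sizes here are toy values for illustration only.
d, k, r = 64, 64, 4
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))          # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-initialized

def adapted_forward(x):
    # Only A and B would receive gradients during fine-tuning;
    # W stays frozen throughout.
    return x @ (W + B @ A).T

x = rng.standard_normal((2, k))
# With B zero-initialized, the adapted model initially matches the base model.
assert np.allclose(adapted_forward(x), x @ W.T)
```

Because B starts at zero, training begins from the base model's behavior, and the adapter adds only 2·d·r parameters per layer instead of d·k.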
Paper Type: short
Research Area: Generation
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data analysis
Languages Studied: English, Chinese