Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective
Abstract: Large language models (LLMs) operate in two learning modes: fine-tuning (FT) and in-context learning (ICL). We ask which mode exhibits greater language proficiency, and whether their inductive biases in pattern recognition differ. We propose three desiderata for the comparison: (D1) a precise specification of the learning task, (D2) equal resource allocation to FT and ICL, and (D3) a comparable evaluation metric to identify the better mode. Several prior studies attempted to compare FT and ICL without satisfying all three desiderata, yielding mixed and inconclusive findings. To satisfy these desiderata, we propose a formal language learning task in which syntactic pattern recognition is the main focus. We also introduce a discriminative test for language proficiency, enabling a direct comparison of FT and ICL.
Empirically, we find that (a) FT achieves greater language proficiency than ICL on in-distribution generalization, but both perform equally well on out-of-distribution generalization; (b) their inductive biases, measured as the correlation of generated strings, are usually similar, but the similarity decreases as language learning improves; and (c) unlike FT, ICL performance differs substantially across models of varying sizes and families, and is sensitive to the tokens used in the languages. Thus, our controlled setup reveals subtle behavioral differences between FT and ICL that are difficult to capture in natural language datasets.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking, evaluation methodologies, metrics, evaluation
Contribution Types: Model analysis & interpretability, Reproduction study, Data resources, Data analysis
Languages Studied: Synthetic formal languages, English
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
Software: zip
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Section: Limitations and Ethics Statement
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 2, 3, 5
B2 Discuss The License For Artifacts: No
B2 Elaboration: We use publicly available datasets and models, and additionally contribute datasets based on synthetic formal languages.
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Section: Ethics Statement
B4 Data Contains Personally Identifying Info Or Offensive Content: No
B4 Elaboration: Section 3 and Ethics Statement. We use synthetic formal languages with no semantics (and no personally identifying information) involved.
B5 Documentation Of Artifacts: Yes
B5 Elaboration: Section 3
B6 Statistics For Data: Yes
B6 Elaboration: Section 3 and Appendix B
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 3 and Appendix B
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Section 3, 5, and Appendix B
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 3, 5, and Appendix C
C4 Parameters For Packages: Yes
C4 Elaboration: Section 3 and Appendix B
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D1 Elaboration: Not applicable
D2 Recruitment And Payment: N/A
D2 Elaboration: Not applicable
D3 Data Consent: N/A
D3 Elaboration: Not applicable
D4 Ethics Review Board Approval: N/A
D4 Elaboration: Not applicable
D5 Characteristics Of Annotators: N/A
D5 Elaboration: Not applicable
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: No
E1 Elaboration: We used AI assistants only for grammar and spelling correction during code and paper writing.
Author Submission Checklist: yes
Submission Number: 523