Aligning Large Language Models via Chain-of-Thought Reasoning

Anonymous

16 Oct 2023 · ACL ARR 2023 October Blind Submission · Readers: Everyone
Abstract: Chain-of-Thought (CoT) prompting strengthens the reasoning abilities of Large Language Models (LLMs), eliciting step-by-step solutions to complex reasoning tasks. However, these capabilities emerge only in models with billions of parameters, which represents a barrier to entry for many users who must operate at a smaller scale, i.e., with Small Language Models (SLMs). Although many companies release LLMs of the same family with a reduced number of parameters, these models sometimes produce misleading answers and are unable to deliver CoT reasoning. In this paper, we investigate the alignment of reasoning abilities from larger to smaller Language Models. In particular, using an Instruction-tuning-CoT approach, that is, Instruction-tuning enriched with CoT demonstrations, we analyze the impact on downstream abilities. Hence, we instruct a smaller Language Model using outputs generated by more robust models, belonging to the same family or not, and we analyze the resulting impact and divergences. Results obtained on four question-answering benchmarks show that SLMs can be instructed to reason via CoT demonstrations produced by LLMs.
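The pipeline the abstract describes (collecting CoT outputs from a larger teacher model and converting them into instruction-tuning examples for a smaller student) can be sketched minimally as follows. This is an illustrative assumption of the data format, not the paper's exact template; the field names, prompt wording, and example contents are all hypothetical.

```python
# Hypothetical sketch: turning teacher-generated CoT demonstrations into
# instruction-tuning examples for a smaller student model.
# The template and field names are assumptions, not the paper's exact format.

def format_cot_example(question, rationale, answer):
    """Build one instruction-tuning training pair from a CoT demonstration."""
    prompt = f"Question: {question}\nLet's think step by step."
    completion = f"{rationale}\nTherefore, the answer is {answer}."
    return {"prompt": prompt, "completion": completion}

# Teacher (LLM) outputs collected offline; contents are illustrative only.
demonstrations = [
    {
        "question": "If a pen costs 2 dollars, how much do 3 pens cost?",
        "rationale": "Each pen costs 2 dollars, so 3 pens cost 3 * 2 = 6 dollars.",
        "answer": "6 dollars",
    },
]

dataset = [format_cot_example(**d) for d in demonstrations]
print(dataset[0]["prompt"])
print(dataset[0]["completion"])
```

The resulting prompt/completion pairs would then be fed to a standard supervised fine-tuning loop for the student SLM, so that the student learns to emit the rationale before the final answer rather than the answer alone.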
Paper Type: long
Research Area: Efficient/Low-Resource Methods for NLP
Contribution Types: Approaches to low-resource settings, Approaches to low-compute settings (efficiency), Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.