LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models

Published: 10 Jul 2024, Last Modified: 26 Aug 2024
Venue: COLM
License: CC BY 4.0
Research Area: Inference algorithms for LMs
Keywords: Reasoning
TL;DR: We propose a new evaluation, library, and analysis of step-by-step reasoning with large language models
Abstract: Reasoning is a pivotal skill in the evolution of Large Language Models (LLMs), and constructing step-by-step reasoning chains is essential for enhancing their reasoning abilities. Despite a rich array of recent research aimed at deriving improved reasoning chains from LLMs, two major challenges hinder progress in this field: the lack of effective methods to evaluate reasoning chains, and the absence of systematic analysis of reasoning algorithms. In this work, we introduce RICE, a novel LLM-based approach for automated evaluation of reasoning chains, which autonomously constructs a detailed list of evaluation criteria to help itself recognize intermediate reasoning mistakes. This fully automatic method proves to be more precise than existing metrics and offers a complementary angle to conventional answer-based evaluations. For the second challenge, we present a formulation that connects a wide range of existing reasoning algorithms. Based on this formulation, we develop LLM Reasoners, a modular library for step-by-step reasoning algorithms that enables users to specify problem domains and reasoning strategies with minimal effort. With the help of the new metric and library, we conduct a comprehensive study of the factors contributing to a reasoning algorithm, including the reward, the exploration strategy, the world model, and the prompt format, with interesting findings unveiled through RICE.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 1013