A large language model-driven reward design framework via dynamic feedback for reinforcement learning

Published: 01 Jan 2025, Last Modified: 02 Aug 2025 · Knowl. Based Syst. 2025 · CC BY-SA 4.0
Abstract: Highlights
• We introduce CARD, an LLM-based framework for reward code design and refinement.
• Our method lowers human costs, token usage, and training time.
• Results show that our method outperforms baselines and exceeds the human oracle.