Keywords: Retrieval-Augmented Generation, Advertisement Rewriting, Proximal Policy Optimization, Reinforcement Learning, Large Language Models, Content Ranking, Advertisement Ranking, Content Optimization
TL;DR: Rewriting ads with a PPO-tuned LLM measurably boosts their retrieval rank and inclusion in RAG pipelines without changing the underlying retriever.
Abstract: Search algorithms and user-query relevance modeling have given LLMs the ability to return relevant information, yet the effect of content phrasing on ad visibility remains underexplored. We investigate how LLM-based rewriting of advertisements can improve their ranking in retrieval systems and their inclusion in generated LLM responses, without modifying the retrieval model itself. We introduce a supervised fine-tuning framework with a custom loss that balances semantic relevance and content fidelity. To evaluate effectiveness, we propose two metrics: $\Delta$MRR@K (ranking improvement) and $\Delta$DIR@K (inclusion-frequency improvement). Our approach offers a scalable method for optimizing ad phrasing, enhancing visibility in retrieval-based LLM workflows. Experiments across both instruction-based and few-shot prompting demonstrate that PPO-trained models outperform both prompt engineering and supervised fine-tuning in most cases, achieving up to 2.79 $\Delta$DIR@5 and 0.0073 $\Delta$MRR@5 under instruction-based prompting. These results highlight the importance of pre-retrieval ad phrasing, prompt format, and reinforcement learning for effective ad rewriting in LLM-integrated retrieval systems.
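The following is a minimal sketch of how the two proposed metrics could be computed, under the assumption that $\Delta$MRR@K is the change in mean reciprocal rank of the target ad (ranks beyond K scored as zero) and $\Delta$DIR@K is the change in the fraction of queries whose generated response includes the ad when the top-K retrieved documents are used as context. The exact definitions and scaling in the paper may differ (e.g., the reported 2.79 $\Delta$DIR@5 suggests a non-fractional scale), and all function and variable names here are illustrative rather than taken from the submission.

```python
# Illustrative sketch (not the paper's code): Delta-MRR@K and Delta-DIR@K
# comparing original vs. rewritten ads over a set of evaluation queries.
from typing import Optional, Sequence


def mrr_at_k(ranks: Sequence[Optional[int]], k: int) -> float:
    """Mean reciprocal rank; ranks are 1-based, None or rank > k scores 0."""
    scores = [1.0 / r if r is not None and r <= k else 0.0 for r in ranks]
    return sum(scores) / len(scores)


def dir_at_k(included: Sequence[bool]) -> float:
    """Document inclusion rate: fraction of queries whose generated answer
    contains the target ad when the top-k context is used."""
    return sum(included) / len(included)


def delta_metrics(orig_ranks, new_ranks, orig_incl, new_incl, k=5):
    """Improvement of the rewritten ads over the originals."""
    d_mrr = mrr_at_k(new_ranks, k) - mrr_at_k(orig_ranks, k)
    d_dir = dir_at_k(new_incl) - dir_at_k(orig_incl)
    return d_mrr, d_dir


# Example with 4 hypothetical queries: retrieval rank of the ad before/after
# rewriting, and whether the ad surfaced in the generated response.
d_mrr, d_dir = delta_metrics(
    orig_ranks=[8, 3, None, 6],
    new_ranks=[4, 1, 7, 2],
    orig_incl=[False, True, False, False],
    new_incl=[True, True, False, True],
    k=5,
)
print(f"Delta-MRR@5 = {d_mrr:.4f}, Delta-DIR@5 = {d_dir:.4f}")
```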
Submission Number: 62