Leveraging LLM-based sentiment analysis for portfolio optimization with proximal policy optimization

Published: 19 Jun 2025, Last Modified: 12 Jul 2025
4th Muslims in ML Workshop co-located with ICML 2025 (Oral)
License: CC BY 4.0
Submission Track: Track 1: Machine Learning Research by Muslim Authors
Keywords: reinforcement learning, proximal policy optimization, stock portfolio optimization, sentiment analysis
Abstract: Reinforcement learning (RL) offers adaptive solutions to portfolio optimization, yet standard methods such as proximal policy optimization (PPO) rely exclusively on historical price data and overlook the impact of investor sentiment. We introduce sentiment-augmented PPO (SAPPO), a reinforcement learning framework that incorporates real-time sentiment signals extracted from Refinitiv financial news. Daily sentiment scores are generated using LLaMA 3.3. SAPPO integrates these signals into the PPO advantage function via a sentiment-weighted term, enabling allocation strategies that respond to both price movements and market sentiment. Experiments on a three-asset portfolio demonstrate that SAPPO increases the Sharpe ratio from 1.55 to 1.90 and reduces drawdowns relative to PPO. The optimal configuration uses a sentiment influence parameter $\lambda = 0.1$, as validated through ablation studies and statistically significant $t$-tests ($p < 0.001$). These findings show that sentiment-aware reinforcement learning improves trading performance and offers a robust alternative to purely price-based strategies.
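The abstract states that SAPPO adds a sentiment-weighted term to the PPO advantage function with influence parameter $\lambda = 0.1$, but does not give the exact functional form. The sketch below assumes a simple additive formulation, $\hat{A}'_t = \hat{A}_t + \lambda \, s_t$, where $s_t$ is the daily sentiment score; the function names and the toy inputs are illustrative only, not the authors' implementation.

```python
import numpy as np


def sentiment_weighted_advantage(advantages, sentiment_scores, lam_sent=0.1):
    """Augment PPO advantages with a sentiment-weighted term (assumed additive form).

    advantages       : per-timestep advantage estimates (e.g. from GAE)
    sentiment_scores : daily sentiment scores aligned with the same timesteps
    lam_sent         : sentiment influence parameter (lambda = 0.1 in the abstract)
    """
    advantages = np.asarray(advantages, dtype=np.float64)
    sentiment_scores = np.asarray(sentiment_scores, dtype=np.float64)
    return advantages + lam_sent * sentiment_scores


def ppo_clipped_surrogate(ratio, adv, clip_eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized)."""
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    return np.minimum(unclipped, clipped).mean()


if __name__ == "__main__":
    adv = np.array([0.4, -0.2, 0.1])       # toy advantage estimates
    sent = np.array([0.8, -0.5, 0.2])      # toy daily sentiment scores
    ratio = np.array([1.05, 0.97, 1.10])   # toy policy probability ratios

    aug_adv = sentiment_weighted_advantage(adv, sent, lam_sent=0.1)
    print("augmented advantages:", aug_adv)
    print("clipped surrogate   :", ppo_clipped_surrogate(ratio, aug_adv))
```

Under this assumed form, positive news sentiment inflates the advantage of the actions taken that day, nudging the policy toward allocations aligned with prevailing sentiment, while $\lambda$ controls how strongly sentiment overrides the purely price-based signal.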
Submission Number: 2