Prioritizing Financially Informative Comments via Group Relative Policy Optimization

ACL ARR 2025 May Submission 7503 Authors

20 May 2025 (modified: 29 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: We propose a reinforcement learning (RL) framework for ranking and prioritizing social media comments to inform algorithmic trading decisions. Focusing on Twitter, a high-frequency platform for market discourse, we introduce a market-aligned reward signal that links comment relevance directly to real-world financial outcomes rather than to shallow engagement metrics such as likes or retweets. To address sparse and delayed feedback, we adopt Group Relative Policy Optimization (GRPO), a sample-efficient RL method that eliminates the need for a critic network.
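
As an illustration of how GRPO can avoid a critic network, the sketch below computes group-relative advantages: each sampled output is scored against the mean and standard deviation of its own sampling group instead of against a learned value estimate. This is a minimal sketch of the general GRPO idea, not the authors' implementation. The function name `group_relative_advantages`, the `eps` stabilizer, the use of the population standard deviation, and the reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    The group mean plays the role of the baseline that a critic network
    would otherwise estimate. `eps` (an assumption) avoids division by
    zero when every reward in the group is identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical market-aligned rewards for four candidate rankings
# sampled for the same batch of comments (values are made up).
rewards = [0.02, -0.01, 0.00, 0.03]
advantages = group_relative_advantages(rewards)
```

Rankings that earn more than the group average receive a positive advantage and are reinforced; those below the average receive a negative one.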

Paper Type: Short

Research Area: NLP Applications

Research Area Keywords: Financial/business NLP

Contribution Types: NLP engineering experiment

Languages Studied: English

Submission Number: 7503
