Prioritizing Financially Informative Comments via Group Relative Policy Optimization

ACL ARR 2025 May Submission7503 Authors

20 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: We propose a reinforcement learning (RL) framework for ranking and prioritizing social media comments to inform algorithmic trading decisions. Focusing on Twitter, a high-frequency platform for market discourse, we introduce a market-aligned reward signal that directly links comment relevance to real-world financial outcomes—bypassing shallow engagement metrics such as likes or retweets. To address the challenges of sparse and delayed feedback, we adopt Group Relative Policy Optimization (GRPO), a sample-efficient RL method that eliminates the need for a critic network.
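The critic-free aspect of GRPO mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly described formulation in which each sampled output's reward is normalized against the mean and standard deviation of its own group, so no learned value network is needed as a baseline. The function name, the `eps` parameter, and the reward values are hypothetical and are not taken from the submission.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps).

    The group mean acts as the baseline, which is what replaces the critic.
    `eps` (an assumed safeguard) avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical market-aligned rewards for four candidate rankings
# sampled for the same set of comments (illustrative values only).
rewards = [0.02, -0.01, 0.00, 0.03]
advantages = group_relative_advantages(rewards)
```

Outputs that score above their group's average receive a positive advantage and are reinforced, while those below receive a negative one. How the submission actually defines its reward and group sampling is not stated in the abstract.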
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: Financial/business NLP
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 7503