OpenReview.net
Yu Yue
Researcher, Seed, ByteDance Inc.
Joined March 2025
Names
Yu Yue (Preferred), YuYue
Emails
****@bytedance.com (Confirmed)
Personal Links
LinkedIn
Career & Education History
Researcher, Seed, ByteDance Inc. (bytedance.com), 2020 – Present
MS student, Electronic engineering, Peking University (pku.edu.cn), 2017 – 2020
Advisors, Relations & Conflicts
No relations added
Expertise
No areas of expertise listed
Publications
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents
Jiawei Wang, Jiacai Liu, Yuqian Fu, Yingru Li, Xintao Wang, Yuan Lin, Yu Yue, Lin Zhang, Yang Wang, WANG KE
ICML 2026 regular
Readers: Everyone
PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier
Yuhua Jiang, Yuwen Xiong, Yufeng Yuan, Chao Xin, Wenyuan Xu, YuYue, Qianchuan Zhao, Lin Yan
MATH-AI 2025 Poster
Readers: Everyone
Risk-Sensitive Reinforcement Learning for Alleviating Exploration Dilemmas in Large Language Models
Yuhua Jiang, Jiawei Huang, Yufeng Yuan, Xin Mao, YuYue, Qianchuan Zhao, Lin Yan
MATH-AI 2025 Poster
Readers: Everyone
PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier
Yuhua Jiang, Yuwen Xiong, Yufeng Yuan, Chao Xin, Wenyuan Xu, YuYue, Qianchuan Zhao, Lin Yan
ICLR 2026 Conference Desk Rejected Submission
Readers: Everyone
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents
Jiawei Wang, Jiacai Liu, Yuqian Fu, Yingru Li, Xintao Wang, Yuan Lin, Lin Zhang, YuYue, Yang Wang, WANG KE
Submitted to ICLR 2026
Readers: Everyone
Risk-Sensitive Reinforcement Learning for Alleviating Exploration Dilemmas in Large Language Models
Yuhua Jiang, Jiawei Huang, Yufeng Yuan, Xin Mao, YuYue, Qianchuan Zhao, Lin Yan
ICLR 2026 Poster
Readers: Everyone
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu, Zheng Zhang, Ruofei Zhu, Yufeng Yuan, Xiaochen Zuo, YuYue, Weinan Dai, Tiantian Fan, Gaohong Liu, Juncai Liu, LingJun Liu, Xin Liu, Haibin Lin, Zhiqi Lin, Bole Ma, Guangming Sheng, Yuxuan Tong, Chi Zhang, Mofan Zhang, Ru Zhang, et al. (16 additional authors not shown)
NeurIPS 2025 poster
Readers: Everyone
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
Wei Shen, Guanlin Liu, YuYue, Ruofei Zhu, Qingping Yang, Chao Xin, Lin Yan
NeurIPS 2025 poster
Readers: Everyone
Co-Authors
Bole Ma
Chao Xin
Chengyi Wang
Chi Zhang
Gaohong Liu
Guangming Sheng
Guanlin Liu
Haibin Lin
Hang Zhu
Hao Zhou
Hongli Yu
Jiacai Liu
Jiangjie Chen
Jiawei Huang
Jiawei Wang
Jiaze Chen
Jingjing Liu
Jinhua Zhu
Juncai Liu
Lin Yan
Lin Zhang
LingJun Liu
Mingxuan Wang
Mofan Zhang
Qianchuan Zhao
View all 54 co-authors