OpenReview
.net
Yuki Ichihara
PhD student, Nara Institute of Science and Technology, Japan
Joined
September 2023
Names
Yuki Ichihara (Preferred), YUKI ICHIHARA
Emails
****@is.naist.jp (Confirmed), ****@gmail.com (Confirmed)
Personal Links
Homepage
DBLP
Semantic Scholar
ACL Anthology
Career & Education History
PhD student, Nara Institute of Science and Technology, Japan (naist.jp), 2025 – Present
MS student, Nara Institute of Science and Technology (naist.jp), 2023 – 2025
Advisors, Relations & Conflicts
No relations added
Expertise
Large Language Models, 2025 – Present
Reinforcement Learning, 2024 – Present
Publications
Consensus Group Relative Policy Optimization for Text Generation
Yuki Ichihara, Yuu Jinnai, Kaito Ariu, Eiji Uchibe
CoRR 2026
Readers: Everyone
MO-GRPO: Mitigating Reward Hacking of Group Relative Policy Optimization on Multi-Objective Problems
Yuki Ichihara, Yuu Jinnai, Tetsuro Morimura, Mitsuki Sakamoto, Ryota Mitsuhashi, Eiji Uchibe
CoRR 2025
Readers: Everyone
Theoretical Guarantees for Minimum Bayes Risk Decoding
Yuki Ichihara, Yuu Jinnai, Kaito Ariu, Tetsuro Morimura, Eiji Uchibe
CoRR 2025
Readers: Everyone
Evaluation of Best-of-N Sampling Strategies for Language Model Alignment
Yuki Ichihara, Yuu Jinnai, Tetsuro Morimura, Kaito Ariu, Kenshi Abe, Mitsuki Sakamoto, Eiji Uchibe
CoRR 2025
Readers: Everyone
Evaluation of Best-of-N Sampling Strategies for Language Model Alignment
Yuki Ichihara, Yuu Jinnai, Tetsuro Morimura, Kenshi Abe, Kaito Ariu, Mitsuki Sakamoto, Eiji Uchibe
Trans. Mach. Learn. Res. 2025
Readers: Everyone
Auto-Weighted Group Relative Preference Optimization for Multi-Objective Text Generation Tasks
Yuki Ichihara, Yuu Jinnai
EMNLP (Industry Track) 2025
Readers: Everyone
Evaluation of Best-of-N Sampling Strategies for Language Model Alignment
Yuki Ichihara, Yuu Jinnai, Tetsuro Morimura, Kenshi Abe, Kaito Ariu, Mitsuki Sakamoto, Eiji Uchibe
Accepted by TMLR
Readers: Everyone
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees
Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, YUKI ICHIHARA, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo
RLSW 2024 TalkPoster
Readers: Everyone
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees
Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo
CoRR 2024
Readers: Everyone
Co-Authors
Akiyoshi Sannai
Eiji Uchibe
Kaito Ariu
Kenshi Abe
Masahiro Kato
Mitsuki Sakamoto
Ryota Mitsuhashi
Sho Sonoda
Soichiro Nishimori
Tadashi Kozuno
Tetsuro Morimura
Toshinori Kitamura
Wataru Kumagai
Yutaka Matsuo
Yuu Jinnai