OpenReview.net
sheng zhao
Researcher, Speech, Microsoft
Joined October 2018
Names
sheng zhao (Preferred), Sheng Zhao
Emails
****@microsoft.com (Confirmed), ****@microsoft.com (Confirmed)
Personal Links
Homepage
Google Scholar
Career & Education History
Researcher, Speech, Microsoft (microsoft.com), 2003 – Present
MS student, Computer Science, Tsinghua University (tsinghua.edu.cn), 2000 – 2003
Advisors, Relations & Conflicts
No relations added
Expertise
Speech recognition (Present)
machine translation (Present)
Speech synthesis (2000 – Present)
Publications
FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates
Jiaqi Li, Yao Qian, Yuxuan Hu, leying zhang, Xiaofei Wang, Heng Lu, Manthan Thakker, Jinyu Li, sheng zhao, Zhizheng Wu
ICLR 2026 Poster
Readers: Everyone
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
leying zhang, Yao Qian, Xiaofei Wang, Manthan Thakker, Dongmei Wang, Jianwei Yu, Haibin Wu, Yuxuan Hu, Jinyu Li, Yanmin Qian, sheng zhao
NeurIPS 2025 poster
Readers: Everyone
Exploring the Potential of Large Multimodal Models as Effective Alternatives for Pronunciation Assessment
Ke Wang, Lei He, Kun Liu, Yan Deng, Wenning Wei, Sheng Zhao
CoRR 2025
Readers: Everyone
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training
Xinfa Zhu, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie
CoRR 2025
Readers: Everyone
CosyAudio: Improving Audio Generation with Confidence Scores and Synthetic Captions
Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Xi Wang, Sheng Zhao, Lei Xie
CoRR 2025
Readers: Everyone
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
Sanyuan Chen, Shujie LIU, Long Zhou, Eric Liu, Xu Tan, Jinyu Li, sheng zhao, Yao Qian, Furu Wei
Submitted to ICLR 2025
Readers: Everyone
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis
Detai Xin, Xu Tan, Kai Shen, Zeqian Ju, Dongchao Yang, Yuancheng Wang, Shinnosuke Takamichi, Hiroshi Saruwatari, Shujie LIU, Jinyu Li, sheng zhao
ICLR 2025 Conference Desk Rejected Submission
Readers: Everyone
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment
Bing Han, Long Zhou, Shujie LIU, Sanyuan Chen, Lingwei Meng, Yanmin Qian, Eric Liu, sheng zhao, Jinyu Li, Furu Wei
Audio Imagination: NeurIPS 2024 Workshop
Readers: Everyone
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
leying zhang, Yao Qian, Long Zhou, Shujie LIU, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, sheng zhao, Michael Zeng
OpenReview Archive Direct Upload
Readers: Everyone
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation
Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie LIU, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, sheng zhao, Michael Zeng
OpenReview Archive Direct Upload
Readers: Everyone
View all 165 publications
Co-Authors
Aakash Lakhera
Anni Tang
Arul Menezes
Bai-gen Cai
Bing Han
Bohan Li
Botao Yu
Brendan Walsh
Canrun Li
Cathy H. Wu
Chen Zhang
Chengyi Wang
Chenpeng Du
Chenxu Hu
Chenyang Le
Chetan Matad
Chin-Ju Lo
Chong Luo
Chuanxin Tang
Chun Yuan
Chung-Hsien Tsai
Chunyu Wang
Congru Yuan
Dacheng Yin
DanLing Jiang
View all 253 co-authors