sheng zhao

Researcher, Speech, Microsoft

Joined

October 2018

Names

sheng zhao (Preferred)

Sheng Zhao

Emails

****@microsoft.com (Confirmed)

Personal Links

Homepage

Google Scholar

Career & Education History

Researcher

Speech, Microsoft (microsoft.com)

2003 – Present

MS student

Computer Science, Tsinghua University (tsinghua.edu.cn)

2000 – 2003

Advisors, Relations & Conflicts

No relations added

Expertise

Speech recognition

Present

Machine translation

Present

Speech synthesis

2000 – Present

Publications

FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates
Jiaqi Li, Yao Qian, Yuxuan Hu, leying zhang, Xiaofei Wang, Heng Lu, Manthan Thakker, Jinyu Li, sheng zhao, Zhizheng Wu
- ICLR 2026 Poster
- Readers: Everyone
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
leying zhang, Yao Qian, Xiaofei Wang, Manthan Thakker, Dongmei Wang, Jianwei Yu, Haibin Wu, Yuxuan Hu, Jinyu Li, Yanmin Qian, sheng zhao
- NeurIPS 2025 Poster
- Readers: Everyone
Exploring the Potential of Large Multimodal Models as Effective Alternatives for Pronunciation Assessment
Ke Wang, Lei He, Kun Liu, Yan Deng, Wenning Wei, Sheng Zhao
- CoRR 2025
- Readers: Everyone
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training
Xinfa Zhu, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie
- CoRR 2025
- Readers: Everyone
CosyAudio: Improving Audio Generation with Confidence Scores and Synthetic Captions
Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Xi Wang, Sheng Zhao, Lei Xie
- CoRR 2025
- Readers: Everyone
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
Sanyuan Chen, Shujie LIU, Long Zhou, Eric Liu, Xu Tan, Jinyu Li, sheng zhao, Yao Qian, Furu Wei
- Submitted to ICLR 2025
- Readers: Everyone
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis
Detai Xin, Xu Tan, Kai Shen, Zeqian Ju, Dongchao Yang, Yuancheng Wang, Shinnosuke Takamichi, Hiroshi Saruwatari, Shujie LIU, Jinyu Li, sheng zhao
- ICLR 2025 Conference Desk Rejected Submission
- Readers: Everyone
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment
Bing Han, Long Zhou, Shujie LIU, Sanyuan Chen, Lingwei Meng, Yanmin Qian, Eric Liu, sheng zhao, Jinyu Li, Furu Wei
- Audio Imagination: NeurIPS 2024 Workshop
- Readers: Everyone
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
leying zhang, Yao Qian, Long Zhou, Shujie LIU, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, sheng zhao, Michael Zeng
- OpenReview Archive Direct Upload
- Readers: Everyone
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation
Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie LIU, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, sheng zhao, Michael Zeng
- OpenReview Archive Direct Upload
- Readers: Everyone

