OpenReview.net
Kevin J. Shih
Research Scientist, NVIDIA
Joined September 2018
Names
Kevin J. Shih (Preferred), Kevin Jonathan Shih
Emails
****@illinois.edu (Confirmed), ****@nvidia.com (Confirmed), ****@gmail.com (Confirmed)
Personal Links
Google Scholar
DBLP
Career & Education History
Research Scientist, NVIDIA (nvidia.com), 2017 – Present
PhD student, University of Illinois, Urbana Champaign (illinois.edu), 2011 – 2017
Advisors, Relations & Conflicts
Coauthor: Bryan Plummer (Present)
Coworker: Anthea Li (2020 – Present)
Coworker: Bryan Catanzaro (2017 – Present)
Coauthor: Rafael Valle (2017 – Present)
Coauthor: Saurabh Singh (2015 – Present)
Advisor: Derek Hoiem (2011 – 2017)
Coauthor: Arun Mallya (2015 – 2015)
Coauthor: Derek Hoiem (2011 – 2015)
Expertise
Vision-Language (Present)
Visual Attention (Present)
Video Prediction (Present)
Fine-Grained Object Classification (Present)
Keypoint Localization (Present)
Object Detection (Present)
Unsupervised Landmarks (Present)
Generative Modeling (2019 – Present)
Audio Synthesis (2019 – Present)
Visual Question Answering (2015 – 2018)
Publications
Audio-to-Audio Schrodinger Bridges
Kevin J. Shih, Zhifeng Kong, Weili Nie, Arash Vahdat, Sang-gil Lee, Joao Felipe Santos, Ante Jukić, Rafael Valle, Bryan Catanzaro
AI4Music
Readers: Everyone
Fugatto 1: Foundational Generative Audio Transformer Opus 1
Rafael Valle, Rohan Badlani, Zhifeng Kong, Sang-gil Lee, Arushi Goel, Sungwon Kim, Joao Felipe Santos, Shuqi Dai, Siddharth Gururani, Aya Aljafari, Alexander H. Liu, Kevin J. Shih, Ryan Prenger, Wei Ping, Chao-Han Huck Yang, Bryan Catanzaro
ICLR 2025 Poster
Readers: Everyone
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling
Nannan Li, Kevin J. Shih, Bryan A. Plummer
ICLR 2025 Conference Withdrawn Submission
Readers: Everyone
P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting
Sungwon Kim, Kevin J. Shih, Rohan Badlani, Joao Felipe Santos, Evelina Bakhturina, Mikyas T. Desta, Rafael Valle, Sungroh Yoon, Bryan Catanzaro
NeurIPS 2023 poster
Readers: Everyone
Unsupervised Disentanglement of Pose, Appearance and Background from Images and Videos
Aysegul Dundar, Kevin J. Shih, Animesh Garg, Robert Pottorff, Andrew Tao, Bryan Catanzaro
16 Nov 2022
OpenReview Archive Direct Upload
Readers: Everyone
Collecting The Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures
Nannan Li, Kevin J. Shih, Bryan A. Plummer
22 Sept 2022 (modified: 15 Jan 2026)
ICLR 2023 Conference Withdrawn Submission
Readers: Everyone
RAD-TTS: Parallel Flow-Based TTS with Robust Alignment Learning and Diverse Synthesis
Kevin J. Shih, Rafael Valle, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro
Published: 15 Jun 2021, Last Modified: 05 May 2023
INNF+ 2021 poster
Readers: Everyone
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle, Kevin J. Shih, Ryan Prenger, Bryan Catanzaro
Published: 12 Jan 2021, Last Modified: 12 Oct 2025
ICLR 2021 Poster
Readers: Everyone
Graphical Contrastive Losses for Scene Graph Parsing
Ji Zhang, Kevin J. Shih, Ahmed Elgammal, Andrew Tao, Bryan Catanzaro
2019 (modified: 10 Nov 2022)
CVPR 2019
Readers: Everyone
Learning Interpretable Spatial Operations in a Rich 3D Blocks World
Yonatan Bisk, Kevin J. Shih, Yejin Choi, Daniel Marcu
2018 (modified: 16 Jul 2019)
AAAI 2018
Readers: Everyone
View all 16 publications
Co-Authors
Adrian Lancucki
Ahmed Elgammal
Alexander H. Liu
Andrew Tao
Animesh Garg
Ante Jukić
Arash Vahdat
Arushi Goel
Aya Aljafari
Aysegul Dundar
Bryan A. Plummer
Bryan Catanzaro
Chao-Han Huck Yang
Daniel Marcu
David Tarjan
Derek Hoiem
Evelina Bakhturina
Fitsum A. Reda
Guilin Liu
Ian Endres
Ji Zhang
Joao Felipe Santos
Johnston Jiaa
Jon Barker
Mikyas T. Desta
View all 44 co-authors