Abstract: Video conferencing tools have seen a significant increase in usage in the past few years, but they still consume significant bandwidth, from $\sim 100$ kbps to a few Mbps. In this work, we present Txt2Vid-Web: a practical, web-based, low-bandwidth video conferencing platform that builds upon the Txt2Vid work [1]. We introduce multiple improvements over the existing Txt2Vid framework: implementing it on a browser application stack, which makes it much more accessible and portable; reducing the implementation complexity of the platform via WebGL; and implementing a new WebGL shader for ConvTranspose in ONNX Runtime. Together, these enable it to run in the web browsers of modern laptops (Fig. 1a). We use WebRTC to establish peer-to-peer data channels over network connections and utilize SDP’s multimedia negotiation scheme over SRTP connections, allowing our platform to provide high-quality video calls for the majority of connections and to fall back to Txt2Vid when bandwidth limitations overconstrain SDP’s chosen codecs. We verified our platform via a subjective study ($n = 126$) comparing five different audio-video (AV) contents compressed via standard codecs and via Txt2Vid-Web. We chose bitrates of {6 kbps, 10 kbps} for encoding the audio and bitrates of {15 kbps, 35 kbps, 100 kbps} for encoding the video using standard AV codecs. Results show that, at a similar quality of experience, our platform requires $100$ to $500\times$ less bandwidth than H.264 and VP9 as video codecs and Opus as the audio codec (Fig. 1b). We envision that our platform can open up many novel applications. Towards this end, we open-source both our new tool as a GitHub repository (https://github.com/tpulkit/txt2vid_browser) and our subjective study dataset (https://tinyurl.com/SubjectiveStudyDataset).