Open-source ultra-low bitrate compression for video conferencing without a camera
18 Sep 2023
Content Everywhere Stage 2
Free to Attend
Hall 5
Showfloor Stages
Over the years, many video codecs have been developed to meet the needs of different broadcasting applications. These codecs deliver excellent quality at medium-to-low bitrates, but quality degrades at very low bitrates. There has been considerable interest in replacing these methods with machine-learning approaches. Using open-source software, Collabora has developed a compression pipeline for face video broadcasting that achieves the same visual quality as H.264 while using a fraction of the bandwidth. On the sender side, a speech-to-text model transcribes the audio feed. On the receiver side, a generative text-to-speech model recovers audio from the text, and a lip-syncing model then reconstructs the face to match the generated audio. This enables communication at lower bitrates in remote terrain with limited bandwidth, and frees bandwidth for error correction during broadcasting. We'll present the pipeline and its use case for the broadcasting world.
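To make the data flow concrete, here is a minimal Python sketch of the sender and receiver stages described in the abstract. It is an illustration under stated assumptions, not Collabora's implementation: the names `Packet`, `sender`, `receiver` and `bitrate_bps` are hypothetical, and the three `fake_*` functions are trivial stand-ins for the real speech-to-text, text-to-speech and lip-syncing models. The point it shows is that only the transcribed text crosses the network.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Packet:
    """The only data sent over the network in this sketch: UTF-8 text."""
    payload: bytes


def sender(audio: bytes, transcribe: Callable[[bytes], str]) -> Packet:
    """Sender side: transcribe the audio feed and ship the text."""
    text = transcribe(audio)
    return Packet(payload=text.encode("utf-8"))


def receiver(
    packet: Packet,
    synthesize: Callable[[str], bytes],
    lipsync: Callable[[bytes, bytes], List[bytes]],
    reference_face: bytes,
) -> Tuple[bytes, List[bytes]]:
    """Receiver side: regenerate audio from text, then animate the face."""
    text = packet.payload.decode("utf-8")
    audio = synthesize(text)
    frames = lipsync(reference_face, audio)
    return audio, frames


def bitrate_bps(packet: Packet, duration_s: float) -> float:
    """Average bits per second needed to send the packet over duration_s."""
    return len(packet.payload) * 8 / duration_s


# Hypothetical stand-ins for the real models (assumptions, not real APIs).
def fake_transcribe(audio: bytes) -> str:
    return "hello from a remote site"


def fake_synthesize(text: str) -> bytes:
    # One dummy audio byte per character of text.
    return bytes(len(text))


def fake_lipsync(face: bytes, audio: bytes) -> List[bytes]:
    # One copy of the reference face per dummy audio byte.
    return [face for _ in audio]


if __name__ == "__main__":
    pkt = sender(b"\x00" * 32000, fake_transcribe)
    audio, frames = receiver(pkt, fake_synthesize, fake_lipsync, b"face")
    print(len(pkt.payload), "bytes sent,", bitrate_bps(pkt, 2.0), "bit/s")
    print(len(frames), "frames reconstructed on the receiver")
```

In a real system the stand-ins would be replaced by actual models, and the reference face would have to be available to the receiver ahead of time; how that happens is not specified in the abstract and is assumed here.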