Friday, February 16, 2024

Efficient ML/DSP Algorithms for Real-Time Speech Communication Systems
Jean-Marc Valin
Xiph.Org Foundation

Time: 1:30 p.m.



Abstract: In recent years, machine learning techniques have brought significant improvements to the state of the art in speech and audio processing. Among potential applications, real-time communication systems impose more stringent requirements, both on processing latency and on computational resources. In this talk, I will discuss how these constraints can be satisfied while still achieving very high quality. I will then introduce some of our work on improving the Opus audio codec using machine learning. I will discuss how to improve coded speech quality at low bitrates and how to increase robustness to packet loss through Deep REDundancy (DRED).

Biography: Jean-Marc Valin received his Ph.D. in Electrical Engineering from the University of Sherbrooke in 2005. He then worked as a postdoc at the CSIRO ICT Centre in Sydney, Australia. He is the primary author of the Speex speech codec and one of the main authors of the Opus audio codec. He also contributed to the AV1 video codec. He has volunteered with the Xiph.Org Foundation since 2002. His research interests include real-time communication systems, speech and audio coding, and applications of machine learning to speech and audio processing.

Note: The presentation will be given in French, via Zoom only.