AI Vision

Facial emotion recognition may improve automatic speech translation

Meta AI recently published a new framework, AV-HuBERT, to improve automatic speech recognition by tracking lip movements, in effect combining speech and vision, two of the traditional areas of artificial intelligence. By incorporating data on both visual lip movement and spoken language, AV-HuBERT aims to bring artificial assistants closer to human-level speech perception (see the Meta AI blog post). Anyone who has ever dealt with a voice assistant would welcome any improvement in this user…
