Researchers at Imperial College London have developed a new AI model for visual speech recognition (VSR): it recognizes speech by “reading” lips, and it does so in multiple languages.
According to an article in Nature Machine Intelligence, the improved algorithm outperforms some previously proposed models.
Despite active development in the VSR field, the scientists say the capabilities of visual speech recognition remain quite limited, because most existing datasets cover only English-language broadcasts. This drastically narrows the potential user base of VSR systems, which is why the Imperial College researchers taught their AI to “read” lips in other languages as well.
“We have found that we can use the same methods to train VSR models in more than just English,” explained Pingchuan Ma, one of the scientists. “Our model takes raw images as input and then automatically learns from lip movements what information needs to be extracted from those images to perform VSR tasks.”