OpenAI introduced the open-source speech recognition system Whisper
OpenAI has introduced Whisper, an open-source speech recognition system that transcribes audio in multiple languages.
According to the announcement, the model was trained on 680,000 hours of multilingual and multitask data collected from the web. As a result, the system handles distinctive accents, background noise, and technical jargon, the researchers said.
Whisper transcribing an English audio clip with a strong accent. Source: OpenAI.
According to the developers, Whisper demonstrated strong speech recognition results in about 10 languages.
The company believes the model will be useful to AI researchers studying the robustness, capabilities, limitations, and biases of modern models.
“Whisper is also potentially very useful as an automatic speech recognition solution for developers, especially for English speech recognition,” OpenAI said.
The researchers acknowledged that the model has its own limitations, especially in text prediction. Because “noisy” data was used in the training set, Whisper's transcriptions may include words that were never actually spoken. The developers suggested this stems from the model simultaneously trying to predict the next word in the audio and to transcribe the audio itself.
Whisper also does not perform equally well across languages. The system makes more errors for speakers of languages that are underrepresented in the training data.
The model's source code is available on GitHub.
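For readers who want to try the released model, a minimal sketch of transcribing a file with the open-source `openai-whisper` Python package is shown below. The `load_model` and `transcribe` calls are part of the package's public API; the model size (`"base"`) and the file name `audio.mp3` are illustrative choices, not values from the announcement.

```python
# Minimal sketch: transcribing an audio file with the open-source Whisper
# package (pip install openai-whisper). Guarded so the snippet degrades
# gracefully when the package is not installed.
try:
    import whisper  # the openai-whisper package

    model = whisper.load_model("base")      # model sizes range from "tiny" to "large"
    result = model.transcribe("audio.mp3")  # path to any local audio file (assumed name)
    print(result["text"])                   # the recognized transcript
except ImportError:
    print("openai-whisper is not installed; run: pip install openai-whisper")
```

First use of `load_model` downloads the model weights, so expect a delay and network traffic on the initial run.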
Recall that in September OpenAI enabled face editing in DALL-E 2. However, the developers banned uploading images of famous people to the system.
In January, the organization introduced a less toxic version of GPT-3 that produces fewer offensive statements, less misinformation, and fewer errors overall.