Meta unveils speech generation AI: Voicebox

The AI model does far more than convert text to speech.

Cover art/illustration via CryptoSlate. Image includes combined content which may include AI-generated content.

Meta, the parent company of Facebook and Instagram, announced a speech-generation AI model called Voicebox on June 16.

The company said Voicebox could generate speech from text and noted that the model could match an audio style based on a sample just two seconds long.

Voicebox can also convert a text sample to another language and, given a separate speech sample, read the translated text in the speaker’s original voice. This capability supports six languages: English, French, German, Spanish, Polish, and Portuguese.

The AI model can additionally edit existing recordings to remove background noise. More generally, it can create speech that is modeled on diverse speech samples.

Voicebox could serve a range of users

Meta said that Voicebox and other similar AI models could allow virtual assistants and non-player characters in its metaverse to have realistic voices. The tool could also be of use to content creators and to users with accessibility needs, it said.

Meta said that Voicebox is currently a research project. It did not say when the model might become publicly available, but it shared a demo video.

Meta announced several consumer AI tools earlier in June, revealed details about its AI chips in May, and discussed internal AI applications in an April investor call.
