Meta releases SeamlessM4T AI model capable of translating nearly 100 languages

This technology will help users from all over the world communicate in the metaverse, says Mark Zuckerberg

By Web Desk

Published August 22, 2023

The image is an illustration depicting Meta Translation. —Meta/File

Less may be lost in translation now that Facebook's parent company Meta has introduced an AI model that can translate and transcribe speech in numerous languages, a potential building block for products that enable real-time communication across language barriers.

The company said in a blog post that its SeamlessM4T model combines technology previously available only in separate models. It can translate between text and speech in nearly 100 languages and perform full speech-to-speech translation for 35 languages.

According to CEO Mark Zuckerberg, these technologies will help users from all over the world communicate in the metaverse, a collection of interconnected virtual worlds, on which he is pinning the company's future.

Meta is making the model available to the public for non-commercial use, the blog post said.

Meta has released a flurry of mostly free AI models this year, including a large language model called Llama that competes directly with the proprietary models offered by Google and Microsoft-backed OpenAI.

Zuckerberg says an open AI ecosystem works in Meta's favor: the company stands to gain more from effectively crowdsourcing the development of user-facing tools for its social platforms than from charging for access to the models.

However, Meta faces the same legal questions as the rest of the industry over the training data used to create its models.

In a complaint filed in July against Meta and OpenAI, comedian Sarah Silverman and two other authors claimed that the companies had used their works as training data without their consent.

In a research paper, Meta researchers said they had gathered audio training data for the SeamlessM4T model from 4 million hours of "raw audio originating from a publicly available repository of crawled web data," without identifying the repository.

A Meta spokesperson did not respond to questions on the provenance of the audio data.

According to the research paper, the text data came from datasets created last year that drew on content from Wikipedia and related websites.