Meta open sources early-stage AI translation tool that works across 200 languages

bnew

Veteran
Joined
Nov 1, 2015
Messages
68,764
Reputation
10,607
Daps
185,939

Meta’s ambitions to build a ‘universal translator’ continue
By James Vincent Jul 6, 2022, 9:02am EDT

Illustration by Alex Castro / The Verge

Social media conglomerate Meta has created a single AI model capable of translating across 200 different languages, including many not supported by current commercial tools. The company is open-sourcing the project in the hopes that others will build on its work.

The AI model is part of an ambitious R&D project by Meta to create a so-called “universal speech translator,” which the company sees as important for growth across its many platforms — from Facebook and Instagram, to developing domains like VR and AR. Machine translation not only allows Meta to better understand its users (and so improve the advertising systems that generate 97 percent of its revenue) but could also be the foundation of a killer app for future projects like its augmented reality glasses.

THE MODEL’S TRANSLATIONS DEFINITELY WON’T BE FLAWLESS

Experts in machine translation told The Verge that Meta’s latest research was ambitious and thorough, but noted that the quality of some of the model’s translations would likely be well below that of better-supported languages like Italian or German.

“The major contribution here is data,” Professor Alexander Fraser, an expert in computational linguistics at LMU Munich in Germany, told The Verge. “What is significant is 100 new languages [that can be translated by Meta’s model].”

Meta’s achievements stem, somewhat paradoxically, from both the scope and focus of its research. While most machine translation models handle only a handful of languages, Meta’s model is all-encapsulating: it’s a single system able to translate in more than 40,000 different directions between 200 different languages. But Meta is also interested in including “low-resource languages” in the model — languages with fewer than 1 million publicly-available translated sentence-pairs. These include many African and Indian languages not usually supported by commercial machine translation tools.
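The "directions" figure is simple counting: each ordered (source, target) pair of distinct languages is one direction. A minimal sketch of that arithmetic follows, assuming every covered language can serve as both source and target; the function name is illustrative and not taken from Meta's code.

```python
def translation_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs where source != target."""
    return num_languages * (num_languages - 1)


# Exactly 200 languages give 200 * 199 directions.
print(translation_directions(200))  # 39800

# A count slightly above 200 crosses the 40,000 mark
# (202 is an illustrative value, not a figure from the article).
print(translation_directions(202))  # 40602
```

With exactly 200 languages the count is 39,800, so the "more than 40,000" figure is either rounded or reflects a language count slightly above 200. That reading is an inference from the arithmetic, not something the article states.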

Meta AI research scientist Angela Fan, who worked on the project, told The Verge that the team was inspired by the lack of attention paid to such lower-resource languages in this field. “Translation doesn’t even work for the languages we speak, so that’s why we started this project,” said Fan. “We have this inclusion motivation of like — ‘what would it take to produce translation technology that works for everybody’?”

Fan says the model, described in a research paper, is already being tested to support a project that helps Wikipedia editors translate articles into other languages. The techniques developed in creating the model will also be integrated into Meta’s translation tools soon.

HOW DO YOU JUDGE A TRANSLATION?

Translation is a difficult task at the best of times, and machine translation can be notoriously flaky. When applied at scale on Meta’s platforms, even a small number of errors can produce disastrous results — as, for example, when Facebook mistranslated a post by a Palestinian man from “good morning” to “hurt them,” leading to his arrest by Israeli police.

To evaluate the quality of the new model’s output, Meta created a test dataset consisting of 3001 sentence-pairs for each language covered by the model, each translated from English into a target language by someone who is both a professional translator and native speaker.

The researchers ran these sentences through their model, and compared the machine’s translation with the human reference sentences using a benchmark common in machine translation known as BLEU (which stands for BiLingual Evaluation Understudy).

META’S MODEL DELIVERS IMPROVED BENCHMARKS, BUT THEY CAN’T TELL THE WHOLE STORY

BLEU allows researchers to assign numerical scores measuring the overlap between pairs of sentences, and Meta says its model produces an improvement of 44 percent in BLEU scores across supported languages (compared to previous state-of-the-art work). However, as is often the case in AI research, judging progress based on benchmarks requires context.
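To make the idea of "overlap between pairs of sentences" concrete, here is a simplified, single-reference, sentence-level BLEU in Python. It is a sketch under stated assumptions (whitespace tokenization, n-grams up to length 4, no smoothing) and not the evaluation code Meta used; production BLEU is normally computed over a whole corpus with standardized tokenization.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU score in the range [0, 1]."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped overlap: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(simple_bleu("the cat sat on the mat", reference))      # 1.0
print(simple_bleu("the cat sat on a mat", reference))        # between 0 and 1
print(simple_bleu("a feline rested on the rug", reference))  # 0.0
```

The last call shows why the numbers need context: a reasonable paraphrase shares no four-word sequence with the single reference, so it scores zero even though a human reader might accept it.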

Although BLEU scores allow researchers to compare the relative progress of different machine translation models, they do not offer an absolute measure of software’s ability to produce human-quality translations.

Remember: Meta’s dataset consists of 3001 sentences, and each has been translated only by a single individual. This provides a baseline for judging translation quality, but the total expressive power of an entire language cannot be captured by such a small sliver of actual language. This problem is in no way limited to Meta — it’s something that affects all machine translation work, and is particularly acute when assessing low-resource languages — but it shows the scope of the challenges facing the field.

Christian Federmann, a principal research manager who works on machine translation at Microsoft, said the project as a whole was “commendable” in its desire to expand the scope of machine translation software to lesser-covered languages, but noted that BLEU scores by themselves can only provide a limited measure of output quality.

“Translation is a creative, generative process which may result in many different translations which are all equally good (or bad),” Federmann told The Verge. “It is impossible to provide general levels of ‘BLEU score goodness’ as they are dependent on the test set used, its reference quality, but also inherent properties of the language pair under investigation.”

Fan said that BLEU scores had also been complemented with human evaluation, and that this feedback was very positive and produced some surprising reactions.

“One really interesting phenomenon is that people who speak low-resource languages often have a lower bar for translation quality because they don’t have any other tool,” said Fan, who is herself a speaker of a low-resource language, Shanghainese. “They’re super generous, and so we actually have to go back and say ‘hey, no, you need to be very precise, and if you see an error, call it out.’”

THE POWER IMBALANCES OF CORPORATE AI

Working on AI translation is often presented as an unambiguous good, but creating this software comes with particular difficulties for speakers of low-resource languages. For some communities, the attention of Big Tech is simply unwelcome: they don’t want the tools needed to preserve their language in anyone’s hands but their own. For others, the issues are less existential, but more concerned with questions of quality and influence.

SOME COMMUNITIES JUST DON’T WANT BIG TECH CONTROLLING THEIR LANGUAGE

Meta’s engineers explored some of these questions by conducting interviews with 44 speakers of low-resource languages. These interviewees raised a number of positive and negative effects of opening up their languages to machine translation.

One positive, for example, is that such tools allow speakers to access more media and information. They can be used to translate rich resources, like English-language Wikipedia and educational texts. At the same time, though, if low-resource language speakers consume more media generated by speakers of better-supported languages, this could diminish the incentives to create such materials in their own language.

Balancing these issues is challenging, and the problems encountered even within this recent project show why. Meta’s researchers note, for example, that of the 44 low-resource language speakers they interviewed to explore these questions, the majority were “immigrants living in the US and Europe, and about a third of them identify as tech workers,” meaning their perspectives are likely different from those of their home communities and biased from the start.

Professor Fraser of LMU Munich said that despite this, the research was certainly conducted “in a way that is becoming more of involving native speakers” and that such efforts were “laudable.”

“Overall, I’m glad that Meta has been doing this. More of this from companies like Google, Meta, and Microsoft, all of whom have substantial work in low resource machine translation, is great for the world,” said Fraser. “And of course some of the thinking behind why and how to do this is coming out of academia as well, as well as the training of most of the listed researchers.”

Fan said Meta attempted to preempt many of these social challenges by broadening the expertise they consulted on the project. “I think when AI is developing it’s often very engineering — like, ‘Okay, where are my computer science PhDs? Let’s get together and build it just because we can.’ But actually, for this, we worked with linguists, sociologists, and ethicists,” she said. “And I think this kind of interdisciplinary approach focuses on the human problem. Like, who wants this technology to be built? How do they want it to be built? How are they going to use it?”

Just as important, says Fan, is the decision to open-source as many elements of the project as possible — from the model to the evaluation dataset and training code — which should help redress the power imbalance inherent in a corporation working on such an initiative. Meta also offers grants to researchers who want to contribute to such translation projects but are unable to finance their own projects.

“I think that’s really, really important, because it’s not like one company will be able to holistically solve the problem of machine translation,” said Fan. “It’s everyone — globally — and so we’re really interested in supporting these types of community efforts.”
 

Meta Debuts SeamlessM4T, the Swiss Army Knife of Translation Models


It recognizes speech (that is, automatically — as in automatic speech recognition). It translates speech into speech (or text), and text into text (or speech) — in 100+ languages. Meta’s new Massively Multilingual & Multimodal Machine Translation (SeamlessM4T) is the Swiss army knife of language models. Proud parent Meta introduced the new model in a blog post published on August 22, 2023.

The SeamlessM4T launch follows a number of language technology announcements by Meta over the past 12 months. These include low resource massively multilingual MT in mid 2022, massively multilingual speech translation in May 2023, and multilingual speech model Voicebox in June 2023. The social media giant is spending considerable resources on tackling the language problem of its metaverse vision.

On X, one observer described SeamlessM4T as “revolutionary” and called it a “game-changer.” Another gushed, “It’s not just a tool; it’s a step towards a world where everyone can be understood, regardless of language.”


“The code switching support of SeamlessM4T is pretty cool!” shared a fan with a sense of humor. “It doesn’t do very well with my French or Japanese, but then again neither is very good.”

One Dr. Hubertus Becker questioned the model’s reliability for critical translations, noting, “It’s concerning that an experimental demo can alter the meaning of input words.”

Kalev Leetaru, reporting on SeamlessM4T’s performance in translating Weibo social media posts, cited inconsistent results.

“For some posts it yields translations that compare favorably to both NMT and LLM translations, but with the added cost of having to use language-specific punctuation rules to split into sentences to translate a sentence at a time,” Leetaru explained. “For other posts, it yields subpar translations that can remove or truncate key details, suggesting promise but that it is not quite ready for production use.”

Better than Whisper?

Of course, the more than 60 authors behind the August 22, 2023 paper introducing SeamlessM4T believe in what they dubbed “the first multilingual system” to translate from and into English for both speech and text.


If the stats behind SeamlessM4T’s training seem somewhat disparate, that might be because the model required training in so many (formerly) separate and siloed tasks. Similarly, the number of languages handled by the model varies by task.

SeamlessM4T can provide automatic speech recognition (ASR) for almost 100 languages; speech-to-text (STT) translation for nearly 100 input and output languages; speech-to-speech translation and text-to-speech translation for nearly 100 input languages and 36 output languages (including English); and traditional “text” translation for close to 100 languages.

According to the authors, Meta’s motivation for the new model was to work around the existing separate systems that can complete the above tasks — but generally perform well in only one modality per system.

SeamlessM4T, by contrast, reportedly achieves state-of-the-art results for all these languages while offering “multitask support” in a single model. The paper also asserts that SeamlessM4T outperforms its previous SOTA competitors, namely Whisper and AudioPaLM-2.

Meta has publicly released the contributions to its new model, and encourages researchers and developers to build on this first iteration.
 