Meta’s new AI lets people make chatbots. They’re using it for sex.

Meta’s new AI lets people make chatbots. They’re using it for sex.​

From X-rated chats to cancer research, “open-source” models are challenging tech giants’ control over the AI revolution — to the promise or peril of society​

By Pranshu Verma
and
Will Oremus
June 26, 2023 at 6:00 a.m. EDT



Allie is an 18-year-old with long brown hair who boasts “tons of sexual experience.” Because she “lives for attention,” she’ll “share details of her escapades” with anyone for free.
But Allie is fake, an artificial intelligence chatbot created for sexual play — which sometimes carries out graphic rape and abuse fantasies.



While firms like OpenAI, Microsoft and Google rigorously train their AI models to avoid a host of taboos, including overly intimate conversations, Allie was built using open-source technology — code that’s freely available to the public and has no such restrictions. Based on a model created by Meta, called LLaMA, Allie is part of a rising tide of specialized AI products anyone can build, from writing tools to chatbots to data analysis applications.

Advocates see open-source AI as a way around corporate control, a boon to entrepreneurs, academics, artists and activists who can experiment freely with transformative technology.

“The overall argument for open-source is that it accelerates innovation in AI,” said Robert Nishihara, CEO and co-founder of the start-up Anyscale, which helps companies run open-source AI models.
Anyscale’s clients use AI models to discover new pharmaceuticals, reduce the use of pesticides in farming, and identify fraudulent goods sold online, he said. Those applications would be pricier and more difficult, if not impossible, if they relied on the handful of products offered by the largest AI firms.

Yet that same freedom could also be exploited by bad actors. Open-source models have been used to create artificial child pornography using images of real children as source material. Critics worry it could also enable fraud, cyber hacking and sophisticated propaganda campaigns.
Earlier this month, a pair of U.S. senators, Richard Blumenthal (D-Conn.) and Josh Hawley (R-Mo.), sent a letter to Meta CEO Mark Zuckerberg warning that the release of LLaMA might lead to “its misuse in spam, fraud, malware, privacy violations, harassment, and other wrongdoing and harms.” They asked what steps Meta was taking to prevent such abuse.

Allie’s creator, who spoke on the condition of anonymity for fear of harming his professional reputation, said commercial chatbots such as Replika and ChatGPT are “heavily censored” and can’t offer the type of sexual conversations he desires. With open-source alternatives, many based on Meta’s LLaMA model, the man said he can build his own, uninhibited conversation partners.

“It’s rare to have the opportunity to experiment with ‘state of the art’ in any field,” he said in an interview.
Allie’s creator argued that open-source technology benefits society by allowing people to build products that cater to their preferences without corporate guardrails.
“I think it’s good to have a safe outlet to explore,” he said. “Can’t really think of anything safer than a text-based role-play against a computer, with no humans actually involved.”

On YouTube, influencers offer tutorials on how to build “uncensored” chatbots. Some are based on a modified version of LLaMA, called Alpaca AI, which Stanford University researchers released in March, only to remove it a week later over concerns of cost and “the inadequacies of our content filters.”

Nisha Deo, a spokeswoman for Meta, said the particular model referenced in the YouTube videos, called GPT-4 x Alpaca, “was obtained and made public outside of our approval process.” Representatives from Stanford did not return a request for comment.
Open-source AI models, and the creative applications that build on them, are often published on Hugging Face, a platform for sharing and discussing AI and data science projects.
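To illustrate how low that barrier is, here is a minimal sketch, assuming Python with the Hugging Face transformers library installed, of downloading an openly shared model from the Hub and generating text with it. The repo id openlm-research/open_llama_7b is used purely as an illustrative example of a LLaMA-style open model; any openly licensed causal language model on the Hub would work the same way.

# Minimal sketch: pull an openly shared model from the Hugging Face Hub and generate text.
# The repo id below is illustrative; substitute any open causal LM hosted on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openlm-research/open_llama_7b"  # illustrative LLaMA-style open model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Open-source models can be run locally because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Once the weights are on a user’s own machine, nothing in a pipeline like this enforces the usage restrictions that hosted services such as ChatGPT apply, which is the crux of the debate the article describes.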
During a Thursday House science committee hearing, Clem Delangue, Hugging Face’s CEO, urged Congress to consider legislation supporting and incentivizing open-source models, which he argued are “extremely aligned with American values.”


In an interview after the hearing, Delangue acknowledged that open-source tools can be abused. He noted a model intentionally trained on toxic content, GPT-*****, that Hugging Face had removed. But he said he believes open-source approaches allow for both greater innovation and more transparency and inclusivity than corporate-controlled models.
“I would argue that actually most of the harm today is done by black boxes,” Delangue said, referring to AI systems whose inner workings are opaque, “rather than open-source.”
Hugging Face’s rules don’t prohibit AI projects that produce sexually explicit outputs. But they do prohibit sexual content that involves minors, or that is “used or created for harassment, bullying, or without explicit consent of the people represented.” Earlier this month, the New York-based company published an update to its content policies, emphasizing “consent” as a “core value” guiding how people can use the platform.


As Google and OpenAI have grown more secretive about their most powerful AI models, Meta has emerged as a surprising corporate champion of open-source AI. In February it released LLaMA, a language model that’s less powerful than GPT-4, but more customizable and cheaper to run. Meta initially withheld key parts of the model’s code and planned to limit access to authorized researchers. But by early March those parts, known as the model’s “weights,” had leaked onto public forums, making LLaMA freely accessible to all.
“Open source is a positive force to advance technology,” Meta’s Deo said. “That’s why we shared LLaMA with members of the research community to help us evaluate, make improvements and iterate together.”
Since then, LLaMA has become perhaps the most popular open-source model for technologists looking to develop their own AI applications, Nishihara said. But it’s not the only one. In April, the software firm Databricks released an open-source model called Dolly 2.0. And last month, a team based in Abu Dhabi released an open-source model called Falcon that rivals LLaMA in performance.


Marzyeh Ghassemi, an assistant professor of computer science at MIT, said she’s an advocate for open-source language models, but with limits.
Ghassemi said it’s important to make the architecture behind powerful chatbots public, because that allows people to scrutinize how they’re built. For example, if a medical chatbot was created on open-source technology, she said, researchers could see if the data it’s trained on incorporated sensitive patient information, something that would not be possible on chatbots using closed software.
But she acknowledges this openness comes with risk. If people can easily modify language models, they can quickly create chatbots and image makers that churn out disinformation, hate speech and inappropriate material of high quality.

Ghassemi said there should be regulations governing who can modify these products, such as a certifying or credentialing process.

“Like we license people to be able to use a car,” she said, “we need to think about similar framings [for people] … to actually create, improve, audit, edit these open-trained language models.”
Some leaders at companies like Google, which keeps its chatbot Bard under lock and key, see open-source software as an existential threat to their business, because the large language models that are available to the public are becoming nearly as proficient as theirs.
“We aren’t positioned to win this [AI] arms race and neither is OpenAI,” a Google engineer wrote in a memo posted by the tech site Semianalysis in May. “I’m talking, of course, about open source. Plainly put, they are lapping us … While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly.”
Nathan Benaich, a general partner at Air Street Capital, a London-based venture investing firm focused on AI, noted that many of the tech industry’s greatest advances over the decades have been made possible by open-source technologies — including today’s AI language models.

“If there’s only a few companies” building the most powerful AI models, “they’re only going to be targeting the biggest-use cases,” Benaich said, adding that the diversity of inquiry is an overall boon for society.
Gary Marcus, a cognitive scientist who testified to Congress on AI regulation in May, countered that accelerating AI innovation might not be a good thing, considering the risks the technology could pose to society.
“We don’t open-source nuclear weapons,” Marcus said. “Current AI is still pretty limited, but things might change.”
 


Meta’s battle with ChatGPT begins now​


Meta’s AI assistant is being put everywhere across Instagram, WhatsApp, and Facebook. Meanwhile, the company’s next major AI model, Llama 3, has arrived.​

By Alex Heath, a deputy editor and author of the Command Line newsletter. He has over a decade of experience covering the tech industry.

Apr 18, 2024, 11:59 AM EDT


Mark Zuckerberg announcing Meta’s AI assistant at Connect 2023. Image: Meta

ChatGPT kicked off the AI chatbot race. Meta is determined to win it.

To that end: the Meta AI assistant, introduced last September, is now being integrated into the search box of Instagram, Facebook, WhatsApp, and Messenger. It’s also going to start appearing directly in the main Facebook feed. You can still chat with it in the messaging inboxes of Meta’s apps. And for the first time, it’s now accessible via a standalone website at Meta.ai.

For Meta’s assistant to have any hope of being a real ChatGPT competitor, the underlying model has to be just as good, if not better. That’s why Meta is also announcing Llama 3, the next major version of its foundational open-source model. Meta says that Llama 3 outperforms competing models of its class on key benchmarks and that it’s better across the board at tasks like coding. Two smaller Llama 3 models are being released today, both in the Meta AI assistant and to outside developers, while a much larger, multimodal version is arriving in the coming months.

The goal is for Meta AI to be “the most intelligent AI assistant that people can freely use across the world,” CEO Mark Zuckerberg tells me. “With Llama 3, we basically feel like we’re there.”

In the US and a handful of other countries, you’re going to start seeing Meta AI in more places, including Instagram’s search bar. Image: Meta

How Google results look in Meta AI. Image: Meta

The Meta AI assistant is the only chatbot I know of that now integrates real-time search results from both Bing and Google — Meta decides when either search engine is used to answer a prompt. Its image generation has also been upgraded to create animations (essentially GIFs), and high-res images now generate on the fly as you type. Meanwhile, a Perplexity-inspired panel of prompt suggestions when you first open a chat window is meant to “demystify what a general-purpose chatbot can do,” says Meta’s head of generative AI, Ahmad Al-Dahle.

While it has only been available in the US to date, Meta AI is now being rolled out in English to Australia, Canada, Ghana, Jamaica, Malawi, New Zealand, Nigeria, Pakistan, Singapore, South Africa, Uganda, Zambia, and Zimbabwe, with more countries and languages coming. It’s a far cry from Zuckerberg’s pitch of a truly global AI assistant, but this wider release gets Meta AI closer to eventually reaching the company’s more than 3 billion daily users.

Meta AI’s image generation can now render images in real time as you type. Image: Meta

There’s a comparison to be made here to Stories and Reels, two era-defining social media formats that were both pioneered by upstarts — Snapchat and TikTok, respectively — and then tacked onto Meta’s apps in a way that made them even more ubiquitous.


Some would call this shameless copying. But it’s clear that Zuckerberg sees Meta’s vast scale, coupled with its ability to quickly adapt to new trends, as its competitive edge. And he’s following that same playbook with Meta AI by putting it everywhere and investing aggressively in foundational models.

“I don’t think that today many people really think about Meta AI when they think about the main AI assistants that people use,” he admits. “But I think that this is the moment where we’re really going to start introducing it to a lot of people, and I expect it to be quite a major product.”



“Compete with everything out there”​

The new web app for Meta AI. Image: Meta

Today Meta is introducing two open-source Llama 3 models for outside developers to freely use. There’s an 8-billion parameter model and a 70-billion parameter one, both of which will be accessible on all the major cloud providers. (At a very high level, parameters dictate the complexity of a model and its capacity to learn from its training data.)
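A parameter count is, concretely, the total number of trainable weights in a model, and it can be checked directly once a model is loaded. Here is a minimal sketch, assuming Python with transformers and PyTorch installed, using the small public GPT-2 model as a quick stand-in; the Llama 3 8B model would be counted the same way, just with far more weights.

# Minimal sketch: a model's "parameters" are its trainable weights, and counting them
# is straightforward once the model is loaded. GPT-2 small is used as a quick stand-in.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params:,} parameters")  # roughly 124 million for GPT-2 small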

Llama 3 is a good example of how quickly these AI models are scaling. The biggest version of Llama 2, released last year, had 70 billion parameters, whereas the coming large version of Llama 3 will have over 400 billion, Zuckerberg says. Llama 2 was trained on 2 trillion tokens (essentially the words, or units of basic meaning, that a model learns from), while the big version of Llama 3 was trained on over 15 trillion tokens. (OpenAI has yet to publicly confirm the number of parameters or tokens in GPT-4.)
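Tokens are equally concrete: a tokenizer splits text into the units a model is trained on, and the 15 trillion figure is simply how many such units Llama 3 saw during training. A minimal sketch, again assuming transformers is installed; the repo id meta-llama/Meta-Llama-3-8B follows Meta’s published naming but is gated behind license acceptance, so any other tokenizer on the Hub illustrates the same idea.

# Minimal sketch: tokenize a sentence to see the units that training-token counts refer to.
# Access to the Llama 3 repo is gated; swap in any other Hub tokenizer to try this.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
ids = tokenizer.encode("Llama 3 was trained on over 15 trillion tokens.")
print(len(ids))                              # number of tokens in this one sentence
print(tokenizer.convert_ids_to_tokens(ids))  # the token strings themselves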

A key focus for Llama 3 was meaningfully decreasing its false refusals, or the number of times a model says it can’t answer a prompt that is actually harmless. An example Zuckerberg offers is asking it to make a “killer margarita.” Another is one I gave him during an interview last year, when the earliest version of Meta AI wouldn’t tell me how to break up with someone.

Meta has yet to make the final call on whether to open source the 400-billion-parameter version of Llama 3 since it’s still being trained. Zuckerberg downplays the possibility of it not being open source for safety reasons.

“I don’t think that anything at the level that what we or others in the field are working on in the next year is really in the ballpark of those type of risks,” he says. “So I believe that we will be able to open source it.”

Before the most advanced version of Llama 3 comes out, Zuckerberg says to expect more iterative updates to the smaller models, like longer context windows and more multimodality. He’s coy on exactly how that multimodality will work, though it sounds like generating video akin to OpenAI’s Sora isn’t in the cards yet. Meta wants its assistant to become more personalized, and that could mean eventually being able to generate images in your own likeness.

Charts showing how Meta’s Llama 3 performs on benchmarks against competing models. Here, it’s worth noting that there isn’t yet a consensus on how to properly evaluate the performance of these models in a truly standardized way. Image: Meta

Meta gets hand-wavy when I ask for specifics on the data used for training Llama 3. The total training dataset is seven times larger than Llama 2’s, with four times more code. No Meta user data was used, despite Zuckerberg recently boasting that it’s a larger corpus than the entirety of Common Crawl. Otherwise, Llama 3 uses a mix of “public” internet data and synthetic AI-generated data. Yes, AI is being used to build AI.

The pace of change with AI models is moving so fast that, even if Meta is reasserting itself atop the open-source leaderboard with Llama 3 for now, who knows what tomorrow brings. OpenAI is rumored to be readying GPT-5, which could leapfrog the rest of the industry again. When I ask Zuckerberg about this, he says Meta is already thinking about Llama 4 and 5. To him, it’s a marathon and not a sprint.

“At this point, our goal is not to compete with the open source models,” he says. “It’s to compete with everything out there and to be the leading AI in the world.”
 