bnew

Veteran

Meta’s battle with ChatGPT begins now​


Meta’s AI assistant is being put everywhere across Instagram, WhatsApp, and Facebook. Meanwhile, the company’s next major AI model, Llama 3, has arrived.​

By Alex Heath, a deputy editor and author of the Command Line newsletter. He has over a decade of experience covering the tech industry.

Apr 18, 2024, 11:59 AM EDT


Mark Zuckerberg announcing Meta’s AI assistant at Connect 2023. Image: Meta

ChatGPT kicked off the AI chatbot race. Meta is determined to win it.

To that end: the Meta AI assistant, introduced last September, is now being integrated into the search box of Instagram, Facebook, WhatsApp, and Messenger. It’s also going to start appearing directly in the main Facebook feed. You can still chat with it in the messaging inboxes of Meta’s apps. And for the first time, it’s now accessible via a standalone website at Meta.ai.

For Meta’s assistant to have any hope of being a real ChatGPT competitor, the underlying model has to be just as good, if not better. That’s why Meta is also announcing Llama 3, the next major version of its foundational open-source model. Meta says that Llama 3 outperforms competing models of its class on key benchmarks and that it’s better across the board at tasks like coding. Two smaller Llama 3 models are being released today, both in the Meta AI assistant and to outside developers, while a much larger, multimodal version is arriving in the coming months.

The goal is for Meta AI to be “the most intelligent AI assistant that people can freely use across the world,” CEO Mark Zuckerberg tells me. “With Llama 3, we basically feel like we’re there.”

In the US and a handful of other countries, you’re going to start seeing Meta AI in more places, including Instagram’s search bar. Image: Meta

How Google results look in Meta AI. Image: Meta

The Meta AI assistant is the only chatbot I know of that now integrates real-time search results from both Bing and Google — Meta decides when either search engine is used to answer a prompt. Its image generation has also been upgraded to create animations (essentially GIFs), and high-res images now generate on the fly as you type. Meanwhile, a Perplexity-inspired panel of prompt suggestions when you first open a chat window is meant to “demystify what a general-purpose chatbot can do,” says Meta’s head of generative AI, Ahmad Al-Dahle.

While it has only been available in the US to date, Meta AI is now being rolled out in English to Australia, Canada, Ghana, Jamaica, Malawi, New Zealand, Nigeria, Pakistan, Singapore, South Africa, Uganda, Zambia, and Zimbabwe, with more countries and languages coming. It’s a far cry from Zuckerberg’s pitch of a truly global AI assistant, but this wider release gets Meta AI closer to eventually reaching the company’s more than 3 billion daily users.

Meta AI’s image generation can now render images in real time as you type. Image: Meta

There’s a comparison to be made here to Stories and Reels, two era-defining social media formats that were both pioneered by upstarts — Snapchat and TikTok, respectively — and then tacked onto Meta’s apps in a way that made them even more ubiquitous.

“I expect it to be quite a major product”

Some would call this shameless copying. But it’s clear that Zuckerberg sees Meta’s vast scale, coupled with its ability to quickly adapt to new trends, as its competitive edge. And he’s following that same playbook with Meta AI by putting it everywhere and investing aggressively in foundational models.

“I don’t think that today many people really think about Meta AI when they think about the main AI assistants that people use,” he admits. “But I think that this is the moment where we’re really going to start introducing it to a lot of people, and I expect it to be quite a major product.”



“Compete with everything out there”​

The new web app for Meta AI. Image: Meta

Today Meta is introducing two open-source Llama 3 models for outside developers to freely use. There’s an 8-billion parameter model and a 70-billion parameter one, both of which will be accessible on all the major cloud providers. (At a very high level, parameters dictate the complexity of a model and its capacity to learn from its training data.)
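To make "accessible on all the major cloud providers" concrete, here is a minimal sketch of loading the 8B model with the Hugging Face transformers library; the gated repo id and the chat-template call are assumptions based on the public Llama 3 release, not details from this article.

# Minimal sketch: running Llama 3 8B Instruct via Hugging Face transformers.
# Assumes access to the gated repo has been granted and a GPU is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed repo id from the public release
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "In one sentence, what is a parameter in a language model?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))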

Llama 3 is a good example of how quickly these AI models are scaling. The biggest version of Llama 2, released last year, had 70 billion parameters, whereas the coming large version of Llama 3 will have over 400 billion, Zuckerberg says. Llama 2 was trained on 2 trillion tokens (essentially the words, or units of basic meaning, that a model learns from), while the big version of Llama 3 was trained on over 15 trillion tokens. (OpenAI has yet to publicly confirm the number of parameters or tokens in GPT-4.)
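To make the "token" idea concrete, here is a tiny illustration with the open tiktoken library; the cl100k_base encoding is an arbitrary choice for demonstration and is not the tokenizer Meta uses for Llama.

# Tokens are the chunks of text a model is trained on; corpora are measured by counting them.
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # example encoding, not Llama's own tokenizer
text = "Meta says Llama 3 was trained on over 15 trillion tokens."
tokens = enc.encode(text)
print(len(tokens), tokens[:8])  # a short sentence already breaks into a dozen or so integer ids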

A key focus for Llama 3 was meaningfully decreasing its false refusals, or the number of times a model says it can’t answer a prompt that is actually harmless. An example Zuckerberg offers is asking it to make a “killer margarita.” Another is one I gave him during an interview last year, when the earliest version of Meta AI wouldn’t tell me how to break up with someone.

Meta has yet to make the final call on whether to open source the 400-billion-parameter version of Llama 3 since it’s still being trained. Zuckerberg downplays the possibility of it not being open source for safety reasons.

“I don’t think that anything at the level that what we or others in the field are working on in the next year is really in the ballpark of those type of risks,” he says. “So I believe that we will be able to open source it.”

Before the most advanced version of Llama 3 comes out, Zuckerberg says to expect more iterative updates to the smaller models, like longer context windows and more multimodality. He’s coy on exactly how that multimodality will work, though it sounds like generating video akin to OpenAI’s Sora isn’t in the cards yet. Meta wants its assistant to become more personalized, and that could mean eventually being able to generate images in your own likeness.

Charts showing how Meta’s Llama 3 performs on benchmarks against competing models.

Here, it’s worth noting that there isn’t yet a consensus on how to properly evaluate the performance of these models in a truly standardized way. Image: Meta

Meta gets hand-wavy when I ask for specifics on the data used for training Llama 3. The total training dataset is seven times larger than Llama 2’s, with four times more code. No Meta user data was used, despite Zuckerberg recently boasting that Meta’s trove of user data is a larger corpus than the entirety of Common Crawl. Otherwise, Llama 3 uses a mix of “public” internet data and synthetic AI-generated data. Yes, AI is being used to build AI.

The pace of change with AI models is moving so fast that, even if Meta is reasserting itself atop the open-source leaderboard with Llama 3 for now, who knows what tomorrow brings. OpenAI is rumored to be readying GPT-5, which could leapfrog the rest of the industry again. When I ask Zuckerberg about this, he says Meta is already thinking about Llama 4 and 5. To him, it’s a marathon and not a sprint.

“At this point, our goal is not to compete with the open source models,” he says. “It’s to compete with everything out there and to be the leading AI in the world.”
 

bnew

Veteran

The Languages AI Is Leaving Behind​

The generative-AI boom looks very different for non-English speakers.

By Damon Beres

Distorted view of a speaking mouth

Illustration by The Atlantic

APRIL 19, 2024

This is Atlantic Intelligence, a limited-run series in which our writers help you wrap your mind around artificial intelligence and a new machine age. Sign up here.

Generative AI is famously data-hungry. The technology requires huge troves of digital information—text, photos, video, audio—to “learn” how to produce convincingly humanlike material. The most powerful large language models have effectively “read” just about everything; when it comes to content mined from the open web, this means that AI is especially well versed in English and a handful of other languages, to the exclusion of thousands more that people speak around the world.

In a recent story for The Atlantic, my colleague Matteo Wong explored what this might mean for the future of communication. AI is positioned more and more as the portal through which billions of people might soon access the internet. Yet so far, the technology has developed in such a way that will reinforce the dominance of English while possibly degrading the experience of the web for those who primarily speak languages with less minable data. “AI models might also be void of cultural nuance and context, no matter how grammatically adept they become,” Matteo writes. “Such programs long translated ‘good morning’ to a variation of ‘someone has died’ in Yoruba,” David Adelani, a DeepMind research fellow at University College London, told Matteo, “because the same Yoruba phrase can convey either meaning.”

But Matteo also explores how generative AI might be used as a tool to preserve languages. The grassroots efforts to create such applications move slowly. Meanwhile, tech giants charge ahead to deploy ever more powerful models on the web—crystallizing a status quo that doesn’t work for all.

— Damon Beres, senior editor



Distorted view of a talking mouth

Illustration by Matteo Giuseppe Pani. Source: Getty.

The AI Revolution Is Crushing Thousands of Languages

By Matteo Wong

Recently, Bonaventure Dossou learned of an alarming tendency in a popular AI model. The program described Fon—a language spoken by Dossou’s mother and millions of others in Benin and neighboring countries—as “a fictional language.”

This result, which I replicated, is not unusual. Dossou is accustomed to the feeling that his culture is unseen by technology that so easily serves other people. He grew up with no Wikipedia pages in Fon, and no translation programs to help him communicate with his mother in French, in which he is more fluent. “When we have a technology that treats something as simple and fundamental as our name as an error, it robs us of our personhood,” Dossou told me.

The rise of the internet, alongside decades of American hegemony, made English into a common tongue for business, politics, science, and entertainment. More than half of all websites are in English, yet more than 80 percent of people in the world don’t speak the language. Even basic aspects of digital life—searching with Google, talking to Siri, relying on autocorrect, simply typing on a smartphone—have long been closed off to much of the world. And now the generative-AI boom, despite promises to bridge languages and cultures, may only further entrench the dominance of English in life on and off the web.

Read the full article.
 

bnew

Veteran

1/1
phi-3-mini: 3.8B model matching Mixtral 8x7B and GPT-3.5

Plus a 7B model that matches Llama 3 8B in many benchmarks.

Plus a 14B model.










1/5
Microsoft just released Phi-3

Phi-3 14B beats Llama-3 8B, GPT-3.5 and Mixtral 8x7B MoE in most of the benchmarks.

Even the Phi-3 mini beats Llama-3 8B in MMLU and HellaSwag.

2/5
More details and insights to follow in tomorrow's AI newsletter.

Subscribe now to get it delivered to your inbox first thing in the morning tomorrow: Unwind AI | Shubham Saboo | Substack

3/5
Research Paper:

4/5
This is an absolutely insane pace of open-source AI development.

5/5
True, all of this is happening so fast!!






1/2
phi-3 is out! never would have guessed that our speculative attempt at creating synthetic python code for phi-1 (following TinyStories) would eventually lead to a gpt-3.5-level SLM. defly addicted to generating synth data by now...

2/2
hf:









1/5
Amazing numbers. Phi-3 is topping GPT-3.5 on MMLU at 14B. Trained on 3.3 trillion tokens. They say in the paper 'The innovation lies entirely in our dataset for training - composed of heavily filtered web data and synthetic data.'

2/5
Small is so big right now!

3/5
phi-3-mini: 3.8B model matching Mixtral 8x7B and GPT-3.5

Plus a 7B model that matches Llama 3 8B in many benchmarks.

Plus a 14B model.

[2404.14219] Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

4/5
The prophecy has been fulfilled!


5/5
Wow.
'Phi-3-mini can be quantized to 4-bits so that it only occupies ≈ 1.8GB of memory. We tested the quantized model by deploying phi-3-mini on iPhone 14 with A16 Bionic chip running natively on-device and fully offline achieving more than 12 tokens per second.'
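That 1.8GB claim lines up with simple back-of-the-envelope arithmetic, sketched below; the calculation covers weights only and ignores activations and the KV cache.

# Rough check of the ~1.8 GB figure for 4-bit Phi-3-mini weights.
params = 3.8e9           # 3.8B parameters
bits_per_weight = 4      # 4-bit quantization
weight_bytes = params * bits_per_weight / 8
print(f"{weight_bytes / 2**30:.2f} GiB for weights alone")  # ~1.77 GiB before runtime overhead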












1/9
Run Microsoft Phi-3 locally in 3 simple steps (100% free and without internet):

2/9
1. Install Ollama on your Desktop

- Go to https://ollama.com/
- Download Ollama on your computer (works on Mac, Windows, and Linux)
- Open the terminal and type this command: 'ollama run phi3'

3/9
2. Install Open WebUI (a ChatGPT-like open-source UI)

- Make sure Docker is installed and running on your computer
- Go to https://docs.openwebui.com
- Install the Docker image of Open WebUI with a single command

4/9
3. Run the model locally like ChatGPT

- Open the ChatGPT-like UI locally by going to this link: http://localhost:3000
- Select the model from the top
- Query and ask questions like ChatGPT

This is running on Macbook M1 Pro 16GB machine.
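Once the steps above are done and Ollama is serving phi3, the web UI is optional; you can also hit Ollama's local HTTP API directly. A minimal sketch, assuming Ollama's documented default port and /api/generate endpoint:

# Query a locally running Ollama server for a phi3 completion.
# Assumes 'ollama run phi3' (or 'ollama pull phi3') has already been executed.
import requests  # pip install requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "phi3", "prompt": "Give one fun fact about small language models.", "stream": False},
    timeout=120,
)
print(resp.json()["response"])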

5/9
If you find this useful, RT to share it with your friends.

Don't forget to follow me
@Saboo_Shubham_ for more such LLMs tips and resources.

6/9
Run Microsoft Phi-3 locally in 3 simple steps (100% free and without internet):

7/9
Not checked yet.
@ollama was the first to push the updates!

8/9
That would be fine-tuning. You can try out
@monsterapis for nocode finetuning of LLMs.

9/9
Series of language models that pretty much outperformed llama-3 even with the small size.





Also 128k-instruct: microsoft/Phi-3-mini-128k-instruct-onnx · Hugging Face
Edit: All versions: Phi-3 - a microsoft Collection













 

PoorAndDangerous

Superstar
GPT has made me so much more productive at work and has increased my learning rate exponentially. Been using it for a lot of javascript; useful for some of the work I do for the banking industry as a lot of their cores use HTML and scripting.
 

bnew

Veteran










1/10
Can AI rewrite our human genome?

Today, we announce the successful editing of DNA in human cells with gene editors fully designed with AI. Not only that, we've decided to freely release the molecules under the
@ProfluentBio OpenCRISPR initiative.

Lots to unpack

2/10
AI has become increasingly pervasive in our daily lives from how we sift through information, produce content, and interact with the world. This marks a new chapter where AI is used to alter the fundamental blueprint of who we are - our DNA.

3/10
We were immediately drawn to gene editing due to the pressing societal needs, potential for one-and-done cures to disease, and the scientific challenge + complex biology involving protein, RNA, and DNA.

4/10
Our LLMs were trained on massive scale sequence and biological context to generate millions of diverse CRISPR-like proteins that do not occur in nature, thereby exponentially expanding virtually all known CRISPR families at-will.

5/10
We then focus on type II effector complexes, generating cas9-like proteins and gRNAs. These proteins are hundreds of mutations away from anything in nature.

6/10
We then characterized our generations in the wet lab and found that the AI-designed gene editors show comparable or improved activity and specificity relative to SpCas9, the prototypical gene editing effector. More characterization is underway but we're already impressed.

7/10
We also created an AI-designed base editor which exhibited really exciting performance in precise A->G edits.

8/10
The results point to a future where AI precisely designs what is needed to create a range of bespoke cures for disease. There is still much to build to achieve this vision. To spur innovation and democratization, we are freely releasing OpenCRISPR-1. Try it out!

9/10
This was truly a team effort across all disciplines of the company.
@jeffruffolo SNayfach JGallagher @AadyotB JBeazer RHussain JRuss JYip EHill @MartinPacesa @alexjmeeske PCameron and the broader Profluent team. If you want to build with us, join. We’re hiring.

10/10
Paper: https://biorxiv.org/content/10.1101/2024.04.22.590591v1 Blog: https://profluent.bio/blog/editing-the-human-genome-with-ai NYTimes: https://nytimes.com/2024/04/22/tech...-ios-share&referringSource=articleShare Press Release: https://businesswire.com/news/home/...AI-Created-and-Open-Source-Gene-Editor Access OpenCRISPR-1: OpenCRISPR


 

bnew

Veteran

Meet Phi-3: Microsoft’s New LLM That Can Run On Your Phone​

By Saptorshee Nag

April 23, 2024






Who thought that one day we would have tiny LLMs that can keep up with highly powerful ones such as Mixtral, Gemma, and GPT? Microsoft Research has announced a powerful family of small LLMs called the Phi-3 model family.

Highlights:

  • Microsoft unveils the Phi-3 model family, a collection of tiny LLMs that are highly capable and can run on smartphones.
  • The family is composed of three models: Phi-3-mini, Phi-3-small, and Phi-3-medium.
  • The models show impressive benchmark results and rival models like Mixtral 8x7B and GPT-3.5.



Microsoft’s Phi-3 LLM Family

Microsoft has leveled up its generative AI game once again. It released Phi-2 back in December 2023, with 2.7 billion parameters, providing state-of-the-art performance compared to base language models with fewer than 13 billion parameters.

However, many LLMs have been released since then that outperform Phi-2 on several benchmarks and evaluation metrics.

This is why Microsoft has released Phi-3 as its latest entry in the generative AI market, and the best thing about this model family is that you can run it on your smartphone!

So how powerful and efficient is this state-of-the-art model family? And what are its groundbreaking features? Let’s explore all these topics in-depth through this article.

Microsoft introduced the Model family in the form of three models: Phi-3-mini, Phi-3-small, and Phi-3-medium.

Let’s study all these models separately.

1) Phi-3-mini (3.8B)

Phi-3-mini is a 3.8 billion parameter language model trained on a large dataset of 3.3 trillion tokens. Despite its small size, its performance is comparable to that of larger models like Mixtral 8x7B and GPT-3.5.

Because of its modest size, Phi-3-mini can be quantized to 4 bits so that it requires only about 1.8GB of memory, which lets it run locally on a mobile device. Microsoft tested the quantized model on an iPhone 14 with an A16 Bionic chip, where it runs natively and entirely offline at more than 12 tokens per second.

Phi-3 running on iPhone

The phi-3-mini model uses a transformer decoder architecture with a default context length of 4K. It employs the same tokenizer as Llama 2, with a vocabulary size of 32,064, and is built on a similar block structure.

Thus, any package created for the Llama 2 model family can be directly adapted to phi-3-mini. The model uses 32 heads, 32 layers, and a hidden dimension of 3072.
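Those architecture numbers are enough to roughly reconstruct the headline parameter count. The sketch below is a back-of-the-envelope check; the MLP width of 8192 and the untied output embedding are assumptions taken from the public Hugging Face config, not from this article.

# Rough parameter-count check for phi-3-mini from its published architecture.
hidden, layers, vocab = 3072, 32, 32064
mlp_width = 8192                  # assumed from the public config
embed = vocab * hidden            # input embedding
lm_head = vocab * hidden          # output projection (assumed untied from the embedding)
attn_per_layer = 3 * hidden * hidden + hidden * hidden        # fused QKV plus output projection
mlp_per_layer = 2 * hidden * mlp_width + mlp_width * hidden   # gated up-projection plus down-projection
total = embed + lm_head + layers * (attn_per_layer + mlp_per_layer)
print(f"{total / 1e9:.2f}B parameters")  # ~3.8B, matching the headline figure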

What makes Phi-3-mini innovative is its training dataset, an enlarged version of the one used for Phi-2 that includes both synthetic data and heavily filtered web data. The model has also been optimized for robustness, safety, and chat format.

2) Phi-3-Small and Phi-3-Medium

Microsoft has also released Phi-3-small and Phi-3-medium, both of which are noticeably more powerful than Phi-3-mini. Phi-3-small, with its 7 billion parameters, uses the tiktoken tokenizer for better multilingual tokenization. It has a vocabulary of 100,352 tokens and a default context length of 8K.

The Phi-3-small model has 32 layers and a hidden size of 4096, following the typical decoder design of a 7B model class. Phi-3-Small uses a grouped-query attention system, where four queries share a single key, to reduce the KV cache footprint.

Additionally, phi-3-small alternates layers of dense attention with a novel blocksparse attention to further optimize KV cache savings while maintaining long-context retrieval performance. An extra 10% of multilingual data was also used for this model.
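The effect of that grouping is easy to see in numbers, since the KV cache scales with key/value heads rather than query heads. A rough sketch, with the head dimension and fp16 cache as illustrative assumptions:

# How grouped-query attention (4 queries per key/value head) shrinks the KV cache.
layers, q_heads, head_dim, fp16_bytes = 32, 32, 128, 2   # head_dim assumed as 4096 / 32
kv_heads = q_heads // 4                                  # four queries share one key/value head

def kv_cache_bytes_per_token(n_kv_heads):
    return 2 * layers * n_kv_heads * head_dim * fp16_bytes   # 2 = one K and one V tensor per layer

mha = kv_cache_bytes_per_token(q_heads)    # standard multi-head attention
gqa = kv_cache_bytes_per_token(kv_heads)   # grouped-query attention
print(f"MHA: {mha / 1024:.0f} KiB/token, GQA: {gqa / 1024:.0f} KiB/token ({mha // gqa}x smaller)")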

Using the same tokenizer and architecture as phi-3-mini, Microsoft researchers also trained phi-3-medium, a model with 14B parameters, on the same data for slightly more epochs (4.8T tokens in total, the same as phi-3-small). The model has an embedding dimension of 5120, 40 heads, and 40 layers.

Looking At the Benchmarks

The phi-3-mini, phi-3-small, and phi-3-medium models were tested on the typical open-source benchmarks that measure a model's reasoning ability (both common-sense and logical reasoning). They are compared against GPT-3.5, phi-2, Mistral-7B-v0.1, Mixtral-8x7B, Gemma 7B, and Llama-3-Instruct-8B.

Phi-3 Benchmarks

Phi-3-mini is well suited for mobile phone deployment, scoring 69% on the MMLU test and 8.38 on MT-bench.

With an MMLU score of 75.3, the 7-billion-parameter Phi-3-small performs better than Meta's newly released Llama 3 8B Instruct, which scores 66.

However, the biggest difference was observed when Phi-3-medium was compared to all the models. It beat several models, including Mixtral 8x7B, GPT-3.5, and even Meta's newly launched Llama 3, on benchmark metrics such as MMLU, HellaSwag, ARC-C, and Big-Bench Hard, and the margins were large: Phi-3-medium clearly outperformed all of its competitors.

This goes to show how powerful these tiny mobile LLMs are compared to large language models that need powerful GPUs and CPUs to operate. The benchmarks suggest that the Phi-3 model family will do quite well on coding-related tasks, common-sense reasoning tasks, and general-knowledge questions.

Are there any Limitations?

Even though it is remarkably powerful for its size and deployment target, the Phi-3 model family has one major limitation. Its size fundamentally restricts it on certain tasks, even though it shows language understanding and reasoning comparable to much larger models. For instance, its inability to store large amounts of factual knowledge causes it to perform worse on tests like TriviaQA.

“Exploring multilingual capabilities for Small Language Models is an important next step, with some initial promising results on phi-3-small by including more multilingual data. The use of carefully curated training data, targeted post-training, and improvements from red-teaming insights significantly mitigates these issues across all dimensions. However, there is significant work ahead to fully address these challenges.”

Microsoft also suggests a potential mitigation for this drawback: augmenting the model with a search engine can help compensate for these gaps. Furthermore, the model's proficiency is largely restricted to English, which underscores the need to explore multilingual capabilities for small language models.

Is Phi-3 Safe?

Phi-3-mini was developed in accordance with Microsoft's responsible AI guidelines.

The overall strategy included automated testing, evaluations across hundreds of RAI harm categories, red-teaming, and safety alignment in post-training. Safety post-training used several in-house datasets, along with datasets adjusted to reflect helpfulness and harmlessness preferences, to address the RAI harm categories.

To find further areas for improvement during the post-training phase, an independent red team at Microsoft iteratively analyzed phi-3-mini. Based on the red team's feedback, Microsoft curated new datasets that addressed its findings and refined the post-training dataset accordingly. The process significantly reduced the rate of harmful responses.

Phi-3 Safety measures


Conclusion

Are we on the verge of a new era for mobile LLMs? Phi-3 is here to answer that question. The mobile developer community will benefit greatly from the Phi-3 models, especially Small and Medium. Microsoft has also recently been working on VASA-1, an image-to-video model that is another big development in the generative AI space.
 

bnew

Veteran
















1/27
Llama 3 surprised everyone less than a week ago, but Microsoft just dropped Phi-3 and it's an incredibly capable small AI model.

We may soon see 7B models that can beat GPT-4. People are already coming up with incredible use cases.

10 wild examples:

2/27
1. Phi-3 Mini running on Raspberry Pi 5 h/t
@00_brad

3/27
Getting over 4 tokens per second on a Raspberry Pi 5 with Microsoft's Phi-3 Mini! Great model to run entirely locally! Model link in comments!

4/27
2. Phi-3 Mini with 128k context window on NVIDIA TensorRT-LLM

5/27
Announcing our collaboration to accelerate @Microsoft's new Phi-3 Mini open language model with NVIDIA TensorRT-LLM. https://nvda.ws/3xJ6zR0 Developers can try Phi-3 Mini with the 128K context window at Production-Ready APIs That Run Anywhere.

6/27
3. Phi-3 running locally on Vision Pro h/t
@ivanfioravanti

7/27
Apple MLX: Phi-3 running locally on a VisionPro with VisionOS 1.2 Beta 3!

Fully offline, pretty fast! 22.25 t/s

Credit to @awnihannun for the special quantized version for MLX

In the code I used displayEveryNTokens = 3 to make streaming more "continuous".

8/27
4. Comparing Llama 3 & Phi-3 using RAG h/t
@akshay_pachaar

9/27
Let's compare Llama-3 & Phi-3 using RAG:

10/27
5. Phi-3 Mini running on iPhone 14 Pro h/t
@wattmaller1

11/27
Well, I did it. I ran Phi 3 on a phone. It was slow the first time, but I guess it cached something because then it went faster, as seen below. That's an iPhone 14 Pro. It's Phi 3 mini, 4k context. Via llama.cpp library

12/27
6. Phi-3 running locally on iPhone
@ac_crypto

13/27
Phi-3 running locally on an iPhone using MLX

Fully offline, it’s fast!

Credit to @exolabs_ team @mo_baioumy h/t @awnihannun for the speedy model impl in MLX

14/27
7. RAG with Phi-3 on
@ollama h/t
@ashpreetbedi

15/27
RAG with Phi-3 on @ollama: I dont trust the benchmarks, so I recorded my very first test run. Completely unedited, each question asked for the first time. First impression is that it is good, very very good for its size.

Try it yourself: phidata/cookbook/llms/ollama/rag at main · phidatahq/phidata
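For anyone who wants to reproduce a stripped-down version of that experiment without the phidata cookbook, below is a bare-bones RAG sketch against a local Ollama server. The /api/embeddings endpoint, the nomic-embed-text embedding model, and the tiny in-memory document list are illustrative assumptions based on Ollama's public docs.

# Bare-bones RAG against a local Ollama server: embed, retrieve by cosine similarity, generate with phi3.
import requests
import numpy as np

OLLAMA = "http://localhost:11434"
docs = [
    "Phi-3-mini is a 3.8B parameter model trained on 3.3T tokens.",
    "Llama 3 8B was trained on over 15T tokens.",
]

def embed(text):
    r = requests.post(f"{OLLAMA}/api/embeddings", json={"model": "nomic-embed-text", "prompt": text})
    return np.array(r.json()["embedding"])

doc_vecs = [embed(d) for d in docs]

def answer(question):
    q = embed(question)
    sims = [float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v)) for v in doc_vecs]
    context = docs[int(np.argmax(sims))]
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    r = requests.post(f"{OLLAMA}/api/generate", json={"model": "phi3", "prompt": prompt, "stream": False})
    return r.json()["response"]

print(answer("How many tokens was Phi-3-mini trained on?"))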

16/27
8. Phi-3-mini-128-instruct on MLX h/t
@ShellZero

17/27
Phi3-mini-128-instruct on MLX. Blazing

Prompt - 131.917 tps
Generation - 43.387 tps

M3 Max - 64GB memory.

@awnihannun #mlx

18/27
9. Phi 3 SLM with
@LMStudioAI on Windows h/t
@jamie_maguire1

19/27
Running the new Phi 3 SLM with @LMStudioAI on Windows.

I like it.

Only using 3GB of RAM.

20/27
10. Phi-3 on iPhone 15 Pro
@awnihannun

21/27
Using MLX Swift to generate text with 4-bit Phi-3 on iPhone 15 Pro.

Fully on device, runs pretty fast.

Example here: https://github.com/ml-explore/mlx-swift-examples Also all MIT!

22/27
If you want to keep up with the latest AI developments and tools, subscribe to The Rundown it's FREE:

23/27
If you enjoyed this thread,

Follow me
@minchoi and please Bookmark, Like, Comment & Repost the first Post below to share with your friends:

24/27
Llama 3 surprised everyone less than a week ago, but Microsoft just dropped Phi-3 and it's an incredibly capable small AI model.

We may soon see 7B models that can beat GPT-4. People are already coming up with incredible use cases.

10 wild examples:

25/27
We are going to see many agentic workflow apps and designs, then running them on phone

26/27
Seeing a lot of focus shift towards smaller, mobile offline AI with these models.

27/27
Good coding benchmarks, thanks for sharing


 

bnew

Veteran




1/4
No one is talking about this major LLM from China.

2 days ago, SenseTime launched SenseNova 5.0, which according to the report (translated from Chinese):

- Beats GPT-4T on nearly all benchmarks
- Has a 200k context window
- Is trained on more than 10TB tokens
- Has major advancements in knowledge, mathematics, reasoning, and coding capabilities

Crazy how much is happening in the world of AI in China that's going completely under the radar.

2/4
H/t to
@Ghost_Z12 for spotting this.

Here's the source (it's in Chinese): "SenseTime rolls out a deluxe all-in-one lineup of large models! Shows a King of Fighters demo beating GPT-4 and debuts 'text-to-video' for the first time, with WPS and Xiaomi assisting on site" - Zhidongxi (智东西)

3/4
Sounds like we need to accelerate

4/4
A new model is coming this year 100%, but not sure if it'll be called GPT-5

Sam Altman on the Lex Fridman pod in March:

'We will release an amazing model this year. I don't know what we will call it'







SenseTime launches SenseNova 5.0 with comprehensive updates and the industry-leading "Cloud-to-Edge" full-stack large model product matrix​

2024-04-24

23 April 2024, Shanghai – SenseTime launched its latest Large Model, the SenseNova 5.0, at its Tech Day event in Shanghai. With its cutting-edge technology accelerating the development of Generative AI, SenseTime also launched the industry-leading "Cloud-To-Edge" full-stack large model product matrix that is scalable and applicable across various scenarios.

Dr. Xu Li, Chairman of the Board and CEO of SenseTime, said, “In our pursuit to push the boundaries of SenseNova’s capabilities, SenseTime remains guided by the Scaling Law as we build upon our Large Model based on this three-tier architecture: Knowledge, Reasoning, and Execution (KRE)."


Dr. Xu Li, Chairman of the Board and CEO of SenseTime, introduced the advancements of the SenseNova 5.0 Large Model at the event.



SenseNova 5.0: Linguistic, creative and scientific capabilities greatly improved; multimodal interactions added



Since its debut in April 2023, the SenseNova Large Model has reached its fifth iteration. SenseNova 5.0 was trained on more than 10TB of tokens, including a large amount of synthetic data. It adopts a Mixture-of-Experts architecture, enabling an effective context window of approximately 200,000 tokens during inference. The major advancements in SenseNova 5.0 focus on knowledge, mathematics, reasoning, and coding capabilities.

In terms of linguistic and creative capabilities, the creative writing, reasoning, and summarization abilities of SenseNova 5.0 have significantly improved. Given the same knowledge input, it provides better comprehension, summarization, and question answering, offering strong support for vertical applications such as education and the content industries. On the scientific side, SenseNova 5.0 boasts best-in-class mathematical, coding, and reasoning capabilities, providing a solid foundation for applications in finance and data analysis.

SenseNova 5.0 is also equipped with superior multimodal capabilities in product applications. It supports high-definition image parsing and understanding, as well as text-to-image generation. In addition, it extracts complex data across documents and summarizes answers to questions, demonstrating strong multimodal interaction capability. At present, SenseNova 5.0's graphical and textual perception ranks first by aggregate score on MMBench, an authoritative multimodality benchmark. It has also achieved high scores on other well-known multimodal leaderboards such as MathVista, AI2D, and ChartQA.



The industry-leading full-stack large model edge-side product matrix

SenseTime also launched the industry-leading edge-side full-stack large model product matrix, which includes the SenseTime Edge-side Large Model for terminal devices, and the SenseTime Integrated Large Model (Enterprise) edge device that can be applied to fields such as finance, coding, healthcare and government services.

The inference speed of the SenseNova Edge-side Large Language Model has achieved industry-leading performance. It can generate 18.3 words per second on mid-range platforms and an impressive 78.3 words per second on flagship platforms.

The diffusion model has also achieved the fastest inference speed in the industry. Inference with the edge-side LDM-AI image diffusion technology takes less than 1.5 seconds on a mainstream platform, and it supports output of high-definition images at resolutions of 12 megapixels and above, as well as image editing functions such as proportional, free-form, and rotational image expansion.


SenseTime conducted a live demonstration of its SenseNova Edge-side Large Model on image expansion.

The SenseTime Integrated Large Model (Enterprise) edge device was developed in response to the growing demand for AI from key fields such as finance, coding, healthcare and government services. Compared to other similar products, the device performs accelerated searches at only 50 percent CPU utilization, and reduces inference costs by approximately 80 percent.

Innovating product applications in the AI 2.0 era with ecosystem partners to further boost productivity

SenseTime has partnered with Kingsoft Office since 2023, leveraging SenseNova Large Model to empower the latter’s WPS 365 as a smart office platform that boosts office productivity and overall efficiency.

In the financial sector, Haitong Securities and SenseTime jointly released a full-stack large model for the industry. Through the large model, both parties facilitated business operations in areas such as intelligent customer service, compliance and risk control, and business development office assistants. They also jointly explored cutting-edge industry applications such as smart investment advisory and aggregation of public sentiments, to realize the full-stack capability of large models in the securities industry.

In the transportation industry, SenseTime’s large model technology is deployed in the smart cabin of the Xiaomi SU7 vehicle, providing car owners with an intelligent and enhanced driving experience.

SenseTime firmly takes the lead into the AGI era with text-to-video in the pipeline

SenseTime also displayed its breakthrough text-to-video platform, where users will soon be able to generate a video from a detailed description or even a few phrases. In addition, the characters' costumes, hairstyles, and scenarios can be preset to maintain the stylistic consistency of the video content.
 

bnew

Veteran



1/3
It's been a week since LLaMA 3 dropped.

In that time, we've:
- extended context from 8K -> 128K
- trained multiple ridiculously performant fine-tunes
- got inference working at 800+ tokens/second

If Meta keeps releasing OSS models, closed providers won't be able to compete.

2/3
Not yet, though I'm sure some version of it will be at some point!

3/3
I believe that’s just a product decision by the API providers. No reason that can’t be extended. At HyperWrite, we often offer 5000 to 10,000 token outputs to users.





1/1
It's been exactly one week since we released Meta Llama 3, in that time the models have been downloaded over 1.2M times, we've seen 600+ derivative models on
@HuggingFace and much more.

More on the exciting impact we're already seeing with Llama 3 A look at the early impact of Meta Llama 3








1/4
A 262k-token context finetune of Llama 3 8B:
2/4
The longer your input the more memory you need.
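Most of that memory pressure comes from the KV cache, which grows linearly with context length. A rough estimate using Llama 3 8B's published attention layout (32 layers, 8 key/value heads, 128-dim heads); the fp16 cache is an assumption:

# Rough KV-cache size estimate for Llama 3 8B at a 262k-token context.
layers, kv_heads, head_dim, fp16_bytes = 32, 8, 128, 2
context = 262_144
per_token = 2 * layers * kv_heads * head_dim * fp16_bytes   # one K and one V entry per layer
total = per_token * context
print(f"{per_token / 1024:.0f} KiB per token -> {total / 2**30:.0f} GiB of KV cache at 262k tokens")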

3/4
If post-finetuned well, then the model will remain good on 8k and be quite good beyond that.

4/4
Nobody knows. But RoPE scaling is quite effective even without post-pretraining. I think the best quality will be in the official long-context model. In the meantime such unofficial models will do the job.
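As a concrete illustration of the RoPE-scaling idea mentioned above, the transformers library lets you override a Llama model's rotary-embedding scaling at load time. This is a minimal sketch; the linear scaling type and the factor of 4 are arbitrary illustrative choices, not what the 262k fine-tune actually used, and the exact rope_scaling schema varies a bit across transformers versions.

# Stretch Llama 3's RoPE positions at load time (roughly 8k -> 32k usable context).
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    rope_scaling={"type": "linear", "factor": 4.0},  # illustrative; quality improves further with fine-tuning
    device_map="auto",
)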


 

bnew

Veteran



1/3
Apple Has Open-Sourced Their On-Device Language Models And They Aren't Very Good!

Apple has uncharacteristically been open-sourcing its work around language models! Kudos to them for that

However, their models are really bad. Compare the 3B model's MMLU, which is 24.8, to Phi-3-mini's MMLU, which is 68.8!

Apple's models are not useful in the real world, with an MMLU of 24.8!

Thanks to open source, they can use Phi-3 from Microsoft on their devices—at least until they train better small models in the future.

2/3
Yes, it's more about the new arch than anything else...but I think it would be better if they had SOTA numbers...

It could also be that they got scooped by the Phi-3 folks

3/3
We should welcome Apple to the open-source ecosystem.

IMO, they just got scooped


 

bnew

Veteran

















1/13
Introducing OpenBioLLM-Llama3-70B & 8B: The most capable openly available Medical-domain LLMs to date!

Outperforms industry giants like GPT-4, Gemini, Meditron-70B, Med-PaLM-1, and Med-PaLM-2 in the biomedical domain.

OpenBioLLM-70B delivers SOTA performance, setting a new state-of-the-art for models of its size.
The OpenBioLLM-8B model even surpasses GPT-3.5, Gemini, and Meditron-70B!

Today's release is just the beginning! In the coming months, we'll be introducing:

- Expanded medical domain coverage
- Longer context windows
- Better benchmarks
- Multimodal capabilities

Medical-LLM Leaderboard: Open Medical-LLM Leaderboard - a Hugging Face Space by openlifescienceai

#gpt #gpt4 #gemini #medical #llm #chatgpt #opensource #llama3 #meta

2/13
Fine-tuning details

The fine-tuning process was conducted in two phases to optimize the model's performance:

- Fine-tuned using the LLama-3 70B & 8B models as the base
- Utilized the Direct Preference Optimization: Your Language Model is Secretly a Reward Model (DPO) …
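For readers unfamiliar with DPO, below is a minimal sketch of a DPO fine-tuning loop using the open-source trl library. It is purely illustrative: the preference dataset name is a hypothetical placeholder, the hyperparameters are arbitrary, this is not the OpenBioLLM authors' training code, and trl's exact trainer arguments differ between versions.

# Illustrative DPO fine-tuning sketch with Hugging Face TRL (not OpenBioLLM's actual setup).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Hypothetical preference dataset with "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("your-org/medical-preference-pairs", split="train")

trainer = DPOTrainer(
    model=model,
    ref_model=None,     # trl builds the frozen reference copy itself when None
    args=TrainingArguments(output_dir="dpo-out", per_device_train_batch_size=1, num_train_epochs=1),
    beta=0.1,           # strength of the preference constraint
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()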

3/13
Dataset

Curating the custom dataset was a time-consuming process that spanned over ~4 months. We diligently collected data, collaborated with medical experts to review its quality, and filtered out subpar examples.

To enhance the dataset's diversity, we incorporated…

4/13
Results

OpenBioLLM-70B showcases remarkable performance, surpassing larger models such as GPT-4, Gemini, Meditron-70B, Med-PaLM-1, and Med-PaLM-2 across 9 diverse biomedical datasets.

Despite its smaller parameter count compared to GPT-4 & Med-PaLM, it achieves…

5/13
To gain a deeper understanding of the results, we also evaluated the top subject-wise accuracy of 70B.

6/13
Models
You can download the models directly from Huggingface today.

- 70B : aaditya/OpenBioLLM-Llama3-70B · Hugging Face

- 8B : aaditya/OpenBioLLM-Llama3-8B · Hugging Face



7/13
Here are the top medical use cases for OpenBioLLM-70B & 8B:

Summarize Clinical Notes

OpenBioLLM can efficiently analyze and summarize complex clinical notes, EHR data, and discharge summaries, extracting key information and generating concise, structured summaries

8/13
Answer Medical Questions

OpenBioLLM can provide answers to a wide range of medical questions.

9/13
Classification

OpenBioLLM can perform various biomedical classification tasks, such as disease prediction, sentiment analysis, medical document categorization

10/13
De-Identification

OpenBioLLM can detect and remove personally identifiable information (PII) from medical records, ensuring patient privacy and compliance with data protection regulations like HIPAA.

11/13
Advisory Notice!

While OpenBioLLM-70B & 8B leverages high-quality data sources, its outputs may still contain inaccuracies, biases, or misalignments that could pose risks if relied upon for medical decision-making without further testing and refinement.

The model's…

12/13
Thanks to
@malai_san for their guidance and the incredible team at @saamatechinc for their invaluable resources and support.

13/13
Thanks to
@winglian for the amazing Axolotl support and to @huggingface and @weights_biases for providing such awesome open-source tools :smile:

Thanks to
@Teknium1 for having a long discussion over Discord on fine-tuning and other topics. He is a really humble and awesome guy.…



 

bnew

Veteran




1/4
Outrageous OpenAI Gossip

Wed Apr 24

> Sam said “GPT6” for the first time ever in a serious sense at Stanford MS&E 472
> was a standing room only crowd with an overflow
> lines stretched 3 city blocks to the famed Oval
> it was the largest gathering of AI engineering talent since Covid-19

This comes on the heels of “95% [of AI startups] must die. We will steamroll you” on the @HarryStebbings podcast

> “GPT5, or whatever we choose to call it, is just that it’s going to be smarter”
=> confirms the name/potential architecture shift from “GPT”
> “we can say right now, with a high degree of scientific certainty, that GPT5 is going to be a lot smarter than GPT4, GPT6 is going to be a lot smarter than GPT5 and we are not going to get off this curve”
=> indicates GPT5 must be in final stages pre release as early research team has dialed off to GPT6 already

Hearsay
> indicated “true innovation lies in defining the paradigm shift beyond GPT4”
=> again confirming architecture shift for GPT5 beyond GPT
> “providing free, ad-free ChatGPT is how OpenAI positively influences society while pursuing their objectives”
=> Note the strong aversion to ad supported

See Zuck earnings call earlier this week on AI spend
> “we have a strong track record of monetizing effectively… by… introducing ads or paid content into AI interactions”

Here we see the core almost generational difference between the two leaders.
> if you believe in AGI
> you want AGI to be as honest as possible
> you do not want the AGI to be persuadable with a little bit of cash
> to make you buy Tide detergent
> or vote for Biden/Trump
> or because it wants a little more compute
> or decides to make paperclips

Zuck sells ads because Meta doesn’t believe AGI is possible.
Sam doesn’t because he does.

Who is right? Who has moral fiber and courage? Or who is just out for a buck?

The above was excerpted from my newsletter for next week. Subscribe link below.



2/4
Are you not entertained? Subscribe here:

3/4
Fair

4/4
This is not a moral fibre discussion and framing it as such is a mistake.

The question is how this is paid for. Mark has chosen a path; Sam will have to choose one.

Whatever path he chooses will be fraught with trade offs and compromises.

“Moral fibre” is not the point.


 

bnew

Veteran


1/1
Excited to announce Tau Robotics (
@taurobots ). We are building a general AI for robots. We start by building millions of robot arms that learn in the real world.

In the video, two robot arms are fully autonomous and controlled by a single neural network conditioned on different language instructions (four axes and five axes robot arms). The other two arms are teleoperated. The entire hardware cost in the video is about $1400. The video is at 1.5x speed.


 