Bard gets its biggest upgrade yet with Gemini {Google A.I / LLM}

bnew

Veteran
Joined
Nov 1, 2015
Messages
63,527
Reputation
9,702
Daps
173,500



1/11
@GoogleDeepMind
We’re releasing an updated Gemini 2.5 Pro (I/O edition) to make it even better at coding. 🚀

You can build richer web apps, games, simulations and more - all with one prompt.

In @GeminiApp, here's how it transformed images of nature into code to represent unique patterns 🌱



https://video.twimg.com/amplify_video/1919768928928051200/vid/avc1/1080x1920/taCOcXbyaVFwRWLw.mp4

2/11
@GoogleDeepMind
This latest version of Gemini 2.5 Pro leads on the WebDev Arena Leaderboard - which measures how well an AI can code a compelling web app. 🛠️

It also ranks #1 on @LMArena_ai in Coding.





3/11
@GoogleDeepMind
Beyond creating beautiful UIs, these improvements extend to tasks such as code transformation and editing as well as developing complex agents.

Now available to try in @GeminiApp, @Google AI Studio and @GoogleCloud’s #VertexAI platform. Find out more → Build rich, interactive web apps with an updated Gemini 2.5 Pro





4/11
@koltregaskes
Excellent, will we get the non-preview version at I/O?



5/11
@alialobai1
@jacksharkey11 they are cooking …



6/11
@laoddev
that is wild



7/11
@RaniBaghezza
Very cool



8/11
@burny_tech
Gemini is a gift: I can have 100 simple coding ideas per day and draft simple versions of them all



9/11
@thomasxdijkstra
@cursor_ai when



10/11
@shiels_ai
Unreal 🤯🤯🤯



11/11
@LarryPanozzo
Anthropic rn




To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196




1/21
@GeminiApp
We just dropped Gemini 2.5 Pro (I/O edition). It’s our most intelligent model that’s even better at coding.

Now, you can build interactive web apps in Canvas with fewer prompts.

Head to Gemini and select “Canvas” in the prompt bar to try it out, and let us know what you’re building in the comments.



https://video.twimg.com/amplify_video/1919768593987727360/vid/avc1/1920x1080/I7FL20DtXMKELQCF.mp4

2/21
@GeminiApp
Interact with the game from our post here: Gemini - let's use Noto Emoji font https://fonts.google.com/noto/specimen/Noto+Color+Emoji



3/21
@metadjai
Awesome! ✨



4/21
@accrued_int
it's like they are just showing off now ☺️



5/21
@ComputerMichau
For me 2.5 Pro is still experimental.



6/21
@arulPrak_
AI agentic commerce ecosystem for travel industry



7/21
@sumbios
Sweet



8/21
@AIdenAIStar
I'd say it is a good model. Made myself a Gemini defender game



https://video.twimg.com/amplify_video/1919783171723292672/vid/avc1/1094x720/Y9mPukwagcRIr7fK.mp4

9/21
@car_heroes
ok started trial. Basic Pac-Man works. Anything else useful so far is a blank screen after a couple of updates. It can't figure it out. New Mac, Sequoia 15.3.2 and Chrome Version 136.0.7103.92. I want this to work but I can't waste time on stuff that should work at launch.



10/21
@rand_longevity
this week is really heating up



11/21
@reallyoptimized
@avidseries You got your own edition! It's completely not woke, apparently.



12/21
@A_MacLullich
I could also make other simple clinical webapps to help with workflow. For example, if a patient with #delirium is distressed, this screen could help doctors and nurses to assess for causes. Clicking on each box would reveal more details.





13/21
@nurullah_kuus
Seems interesting, I'll give it a shot



14/21
@dom_liu__
I used Gemini 2.5 Pro to create a Dragon game, and it was so much fun! The code generation was fast, complete, and worked perfectly on the first try with no extra tweaks needed. I have a small question: is this new model using gemini-2.5-pro-preview-05-06?





15/21
@ai_for_success
Why is it showing Experimental?



16/21
@G33K13765260
damn. it fukked my entire code.. ran back to claude :smile:



17/21
@A_MacLullich
Would like to develop a 4AT #delirium assessment tool webapp too.

I already have @replit one here: http://www.the4AT.com/trythe4AT - would be nice to have a webapp option for people too.





18/21
@davelalande
I am curious about Internet usage. I mainly use X and AI, and I rarely traverse the web anymore. How many new websites are finding success, and is the rest of the world using the web like it's 1999? Will chat models build an app for one-time use with that chat session?



19/21
@arthurSlee
Using this solar system prompt, I initially got an error. However, after the fix, it did create the best-looking solar system in one prompt.

Gemini - Solar System Visualization HTML Page

Nice work. I also like how easy it is to share executing code.



20/21
@AI_Techie_Arun
Wow!!!! Amazing

But what's the I/O edition?



21/21
@JvShah124
Great 😃










1/11
@slow_developer
now this is very interesting...

the new gemini 2.5 pro model seems to have fallen behind in many areas

coding is the only thing it still handles well.

so, does that mean this model was built mainly for coding?





2/11
@Shawnryan96
I have not seen any issues in real world use. In fact image reasoning seems better



3/11
@slow_developer
I haven’t tried anything except the code, but this is a comparison chart against the previous version



4/11
@psv2522
It's not fallen behind; the new model is probably a distillation, trained much better for coding.



5/11
@slow_developer
much like what Anthropic did with 3.5 to their next updated version 3.6?



6/11
@sdmat123
That's how tuning works, yes. You can see the same kind of differences in Sonnet 3.7 vs 3.6.

3.7 in normal mode regressed quantitatively on MMLU and ARC even against 3.6's base-level reasoning. It is regarded as subjectively worse in many domains outside of coding.



7/11
@slow_developer
agree

[Quoted tweet]
much like what Anthropic did with 3.5 to their next updated version 3.6?


8/11
@mhdfaran
It’s interesting how coding is still the highlight here.



9/11
@NFTMentis
Wait - what?

Is this a response to the @OpenAI news re: @windsurf_ai?



10/11
@K_to_Macro
This shows the weakness of RL



11/11
@humancore_ai
I don’t care. I want one that is a beast at coding, there are plenty of general purpose ones.















1/11
@OfficialLoganK
Gemini 2.5 Pro just got an upgrade & is now even better at coding, with significant gains in front-end web dev, editing, and transformation.

We also fixed a bunch of function calling issues that folks have been reporting, it should now be much more reliable. More details in 🧵





2/11
@OfficialLoganK
The new model, "gemini-2.5-pro-preview-05-06", is the direct successor to and replacement of the previous version (03-25). If you are using the old model, no change is needed; requests should auto-route to the new version with the same price and rate limits.

Gemini 2.5 Pro Preview: even better coding performance- Google Developers Blog



3/11
@OfficialLoganK
And don't just take our word for it:

“The updated Gemini 2.5 Pro achieves leading performance on our junior-dev evals. It was the first-ever model that solved one of our evals involving a larger refactor of a request routing backend. It felt like a more senior developer because it was able to make correct judgement calls and choose good abstractions.”

– Silas Alberti, Founding Team, Cognition



4/11
@OfficialLoganK
Developers really like 2.5 Pro:

“We found Gemini 2.5 Pro to be the best frontier model when it comes to "capability over latency" ratio. I look forward to rolling it out on Replit Agent whenever a latency-sensitive task needs to be accomplished with a high degree of reliability.”

– Michele Catasta, President, Replit



5/11
@OfficialLoganK
Super excited to see how everyone uses the new 2.5 Pro model, and I hope you all enjoy a little pre-IO launch : )

The team has been super excited to get this into the hands of everyone so we decided not to wait until IO.



6/11
@JonathanRoseD
Does gemini-2.5-pro-preview-05-06 improve any other aspects other than coding?



7/11
@OfficialLoganK
Mostly coding !



8/11
@devgovz
Ok, what about 2.0 Flash with image generation? When will the experimental period end?



9/11
@OfficialLoganK
Soon!



10/11
@frantzenrichard
Great! How about that image generation?



11/11
@OfficialLoganK
: )









1/11
@demishassabis
Very excited to share the best coding model we’ve ever built! Today we’re launching Gemini 2.5 Pro Preview 'I/O edition' with massively improved coding capabilities. Ranks no.1 on LMArena in Coding and no.1 on the WebDev Arena Leaderboard.

It’s especially good at building interactive web apps - this demo shows how it can be helpful for prototyping ideas. Try it in @GeminiApp, Vertex AI, and AI Studio http://ai.dev

Enjoy the pre-I/O goodies !



https://video.twimg.com/amplify_video/1919778857193816064/vid/avc1/1920x1080/FtMuHzKJiZuaP5Uy.mp4

2/11
@demishassabis
It’s been amazing to see the response to Gemini 2.5 series so far - and we're continuing to rev in response to feedback, so keep it coming !

https://blog.google/products/gemini/gemini-2-5-pro-updates



3/11
@demishassabis
just a casual +147 elo rating improvement... no big deal 😀





4/11
@johnseach
Gemini is now the best coding LLM by far. It is excelling at astrophysics code where all others fail. Google is now the AI coding gold standard.



5/11
@WesRothMoney
love it!

I built a full city traffic simulator in under 20 minutes.

here's the timelapse from v1.0 to (almost) done.



https://video.twimg.com/amplify_video/1919886890997841920/vid/avc1/1280x720/neHj9PPTfPxeaU3U.mp4

6/11
@botanium
This is mind blowing 🤯



7/11
@_philschmid
Lets go 🚀



8/11
@A_MacLullich
Excited to try this - will be interesting to compare with others? Any special use cases?



9/11
@ApollonVisual
congrats on the update. I feel that coding-focused LLMs will accelerate progress exponentially



10/11
@JacobColling
Excited to try this in Cursor!



11/11
@SebastianKits
Loving the single-shot quality, but would love to see more work towards half-autonomous agentic usage. E.g when giving a task to plan and execute a larger MVP, 2.5 pro (and all other models) often do things in a bad order that leads to badly defined styleguides, not very cohesive view specs etc. This is not a problem of 2.5 pro, all models of various providers do this without excessive guidance.




 



1/4
@LechMazur
Gemini 2.5 Pro Preview (05-06) scores 42.5, compared to 54.1 for Gemini 2.5 Pro Exp (03-25) on the Extended NYT Connections Benchmark.

More info: GitHub - lechmazur/nyt-connections: Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words





2/4
@LechMazur
Mistral Medium 3 scores 12.9.





3/4
@akatzzzzz
code sloptimized



4/4
@ficlive
Big fan of your benchmarks. Can you test the 03-25 Preview as well, as that's where the big decline was for us?

[Quoted tweet]
Gemini 2.5 Pro Preview gives good results, but can't quite match the original experimental version.










1/3
@HCSolakoglu
Reviewing recent benchmark data for gemini-2.5-pro. Comparing the 05-07 to the 03-25, we see a roughly 4.2% lower Elo score on EQ-Bench 3 and about a 4.9% lower score on the Longform Creative Writing benchmark. Interesting shifts.





2/3
@HCSolakoglu
Tests & images via: @sam_paech



3/3
@MahawarYas27492
@OfficialLoganK @joshwoodward @DynamicWebPaige











1/7
@ChaseBrowe32432
Ran a few times to verify, seeing degraded performance on my visual physics reasoning benchmark for the new Gemini 2.5 Pro





2/7
@ChaseBrowe32432
cbrower



3/7
@random_wander_
nice benchmark! Qwen and Grok would be interesting.



4/7
@ChaseBrowe32432
Grok still has no API vision, and I haven't got to running Qwen because I don't know how to deal with providers being wishy-washy about precision



5/7
@figuret20
On most benchmarks this new version is worse. Check the official benchmark results for the new one vs. the old one: it's a downgrade on everything but WebDev Arena.



6/7
@ChaseBrowe32432
Where do you see official benchmark results? I thought they'd come with the new model card but I can still only see the old model card



7/7
@akatzzzzz
Worst timeline ever is overfitting to code slop and calling it AGI







1/2
@r0ck3t23
Performance Analysis: Gemini 2.5 Pro Preview vs Previous Version

Fascinating benchmark comparison here! The data reveals some interesting trends:

The Preview build (05-06) of Gemini 2.5 Pro shows notable improvements in coding metrics (+5.2% on LiveCodeBench code generation, +2.5% on Aider Polyglot code editing) compared to the earlier Experimental build (03-25).

However, there are modest performance decreases across most other domains:
- Math: -3.7% on AIME 2025
- Image understanding: -3.8% on Vibe-Eval
- Science: -1.0% on GPQA diamond
- Visual reasoning: -2.1% on MMMU

This raises interesting questions about optimization trade-offs. While it excels at code-related tasks, has this focus come at the expense of other capabilities? Or is this part of a broader optimization strategy that will eventually see improvements across all domains?





2/2
@Lowkeytyc00n1
That's an Improvement




 


The Google Gemini generative AI logo on a smartphone.

Image Credits: Andrey Rudakov/Bloomberg / Getty Images

AI

Google launches ‘implicit caching’ to make accessing its latest AI models cheaper


Kyle Wiggers

11:20 AM PDT · May 8, 2025

Google is rolling out a feature in its Gemini API that the company claims will make its latest AI models cheaper for third-party developers.

Google calls the feature “implicit caching” and says it can deliver 75% savings on “repetitive context” passed to models via the Gemini API. It supports Google’s Gemini 2.5 Pro and 2.5 Flash models.

That’s likely to be welcome news to developers as the cost of using frontier models continues to grow.

Caching, a widely adopted practice in the AI industry, reuses frequently accessed or pre-computed data from models to cut down on computing requirements and cost. For example, caches can store answers to questions users often ask of a model, eliminating the need for the model to re-create answers to the same request.

Google previously offered model prompt caching, but only explicit prompt caching, meaning devs had to define their highest-frequency prompts. While cost savings were supposed to be guaranteed, explicit prompt caching typically involved a lot of manual work.

Some developers weren’t pleased with how Google’s explicit caching implementation worked for Gemini 2.5 Pro, which they said could cause surprisingly large API bills. Complaints reached a fever pitch in the past week, prompting the Gemini team to apologize and pledge to make changes.

In contrast to explicit caching, implicit caching is automatic. Enabled by default for Gemini 2.5 models, it passes on cost savings if a Gemini API request to a model hits a cache.

“[W]hen you send a request to one of the Gemini 2.5 models, if the request shares a common prefix as one of previous requests, then it’s eligible for a cache hit,” explained Google in a blog post. “We will dynamically pass cost savings back to you.”

The minimum prompt token count for implicit caching is 1,024 for 2.5 Flash and 2,048 for 2.5 Pro, according to Google’s developer documentation, which is not a terribly big amount, meaning it shouldn’t take much to trigger these automatic savings. Tokens are the raw bits of data models work with, with a thousand tokens equivalent to about 750 words.
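Those minimums can be sanity-checked with a rough back-of-the-envelope estimate. The sketch below is not an official tool: the helper names are made up, and the conversion uses the ~750-words-per-1,000-tokens rule of thumb mentioned above, so treat the result as a ballpark only.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~750 words per 1,000 tokens (rule of thumb)."""
    return round(len(text.split()) / 0.75)

# Minimum prompt sizes for implicit-cache eligibility, per Google's docs.
MIN_TOKENS = {"gemini-2.5-flash": 1024, "gemini-2.5-pro": 2048}

def likely_cache_eligible(prompt: str, model: str) -> bool:
    """True if the prompt probably clears the implicit-cache minimum."""
    return estimate_tokens(prompt) >= MIN_TOKENS[model]

# A ~2,000-word prompt estimates to ~2,667 tokens, clearing the 2.5 Pro bar.
long_prompt = ("word " * 2000).strip()
print(likely_cache_eligible(long_prompt, "gemini-2.5-pro"))  # True
```

In other words, a couple of pages of context is already enough to become cache-eligible, which matches the article's point that the thresholds are not terribly big.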

Given that Google’s last claims of cost savings from caching fell short, there are some buyer-beware areas in this new feature. For one, Google recommends that developers keep repetitive context at the beginning of requests to increase the chances of implicit cache hits. Context that might change from request to request should be appended at the end, the company says.
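That ordering advice can be followed mechanically: keep the stable context first and append only the per-request part at the end. A minimal sketch under assumptions (the `build_prompt` helper and `system_docs` content are hypothetical, not part of any Gemini SDK):

```python
import os.path

def build_prompt(static_context: str, dynamic_part: str) -> str:
    """Put the unchanging context first so consecutive requests share a
    common prefix, which is what the implicit cache matches on."""
    return f"{static_context}\n\n{dynamic_part}"

# Hypothetical large, unchanging context (e.g. a product manual).
system_docs = "You are a support bot. Answer only from this manual: ..."

q1 = build_prompt(system_docs, "User question: how do I reset my password?")
q2 = build_prompt(system_docs, "User question: where is the billing page?")

# Both requests start with the same bytes, so the second is a cache-hit
# candidate; the request-specific text sits after the shared prefix.
shared = len(os.path.commonprefix([q1, q2]))
assert shared >= len(system_docs)
```

Had the question been placed first instead, the two requests would diverge at byte one and no prefix match would be possible.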

For another, Google didn’t offer any third-party verification that the new implicit caching system would deliver the promised automatic savings. So we’ll have to see what early adopters say.
 