Is it a wrap for Software Engineers? Devin autonomous AI software engineer...

Regular Developer

Supporter
Joined
Jun 2, 2012
Messages
9,983
Reputation
2,837
Daps
28,751
Reppin
NJ
I dunno, I'd still need to see a real-world application of this in the situations they want to apply it to. Doing this for coding challenges is good, but I want to see how it scales to an enterprise-level codebase written from scratch.

Edit: I do wonder, though, if the solution is making a language that has AI/LLM in mind first...:jbhmm:
 

Dameon Farrow

Superstar
Joined
Jan 19, 2014
Messages
16,076
Reputation
4,197
Daps
54,042
Once you sit and actually try AI in one of these languages you realize we are a long long way from AI taking away engineer jobs en masse. Right now folks are looking at it like, "look it works..." then you realize it's garbage at updating the software. All it can do is give you something to start with. Wanna add a feature? AI isn't gonna do it at the snap of a finger. And CEOs are not gonna sit and play with it to figure it out. Gonna need actual humans who know what they are looking at...
 

southpawstyle

Superstar
Joined
Aug 31, 2013
Messages
4,471
Reputation
1,470
Daps
16,043
Reppin
California
Dameon Farrow said:
Once you sit and actually try AI in one of these languages you realize we are a long long way from AI taking away engineer jobs en masse. Right now folks are looking at it like, "look it works..." then you realize it's garbage at updating the software. All it can do is give you something to start with. Wanna add a feature? AI isn't gonna do it at the snap of a finger. And CEOs are not gonna sit and play with it to figure it out. Gonna need actual humans who know what they are looking at...
Think about where we were 5 years ago. Think about where we are now. Think about what it will look like in 5 more years
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
68,705
Reputation
10,592
Daps
185,702
Gemini 2.5 Flash has insane potential... (Google Keeps WINNING)



Channel Info Matthew Berman Subscribers: 460K subscribers

 

bnew

Veteran
Joined
Nov 1, 2015
Messages
68,705
Reputation
10,592
Daps
185,702

Posted: 5:21 PM PDT · April 29, 2025

Satya Nadella and Mark Zuckerberg

Image Credits: Maxwell Zeff


Microsoft CEO says up to 30% of the company’s code was written by AI


During a fireside chat with Meta CEO Mark Zuckerberg at Meta’s LlamaCon conference on Tuesday, Microsoft CEO Satya Nadella said that 20% to 30% of code inside the company’s repositories was “written by software” — meaning AI.

Nadella gave the figure after Zuckerberg asked roughly how much of Microsoft’s code is AI generated today. The Microsoft CEO said the company was seeing mixed results in AI-generated code across different languages, with more progress in Python and less in C++.

Microsoft CTO Kevin Scott previously said he expects 95% of all code to be AI generated by 2030.

When Nadella threw the question back at Zuckerberg, the Meta CEO said he didn’t know how much of Meta’s code is being generated by AI.

On Microsoft rival Google’s earnings call last week, CEO Sundar Pichai said AI was generating more than 30% of the company’s code. Of course, it’s unclear how exactly Microsoft and Google are measuring what’s AI generated versus not, so these figures are best taken with a grain of salt.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
68,705
Reputation
10,592
Daps
185,702
A research preview of Codex in ChatGPT

Channel Info OpenAI Subscribers: 1.52M subscribers

Description
Greg Brockman, Jerry Tworek, Joshua Ma, Hanson Wang, Thibault Sottiaux, Katy Shi, and Andrey Mishchenko introduce and demo Codex in ChatGPT.


Commented on Fri May 16 15:52:14 2025 UTC

Where is it in ChatGPT? Please don't tell me it's not for EU again.


│ Commented on Fri May 16 15:53:25 2025 UTC

http://chatgpt.com/codex - but you need to be a "Pro" subscriber; it does not support Plus members yet.
 

RageKage

Superstar
Joined
May 24, 2022
Messages
3,953
Reputation
2,165
Daps
13,171
Reppin
Macragge
Feel like I'm in a good niche where I work: we produce hw that interacts with our sw. For a large number of the things I do, you physically need to contact or interface with the hw to make things work, just as our customers must do.

AI is a tool for us and not a replacement.

Think I'm far enough along that I'll retire before the fukkery fully hits the fan, but times are changing for the younger folks.

Embedded programming/firmware should be something for folks to consider gravitating towards instead of pure software, as one possible path.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
68,705
Reputation
10,592
Daps
185,702

OpenAI introduces Codex, its first full-fledged AI agent for coding



It replicates your development environment and takes up to 30 minutes per task.

Samuel Axon – May 16, 2025 1:38 PM

A place to enter a prompt, set parameters, and click code or ask


The interface for OpenAI's Codex in ChatGPT. Credit: OpenAI


We've been expecting it for a while, and now it's here: OpenAI has introduced an agentic coding tool called Codex in research preview. The tool is meant to allow experienced developers to delegate rote and relatively simple programming tasks to an AI agent that will generate production-ready code and show its work along the way.

Codex is a unique interface (not to be confused with the Codex CLI tool introduced by OpenAI last month) that can be reached from the side bar in the ChatGPT web app. Users enter a prompt and then click either "code" to have it begin producing code, or "ask" to have it answer questions and advise.

Whenever it's given a task, that task is performed in a distinct container that is preloaded with the user's codebase and is meant to accurately reflect their development environment.

To make Codex more effective, developers can include an "AGENTS.md" file in the repo with custom instructions, for example to contextualize and explain the code base or to communicate standardizations and style practices for the project—kind of a README.md but for AI agents rather than humans.
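As a rough illustration of the AGENTS.md idea described above, here is a minimal Python sketch that writes such a file into a repo directory. The section titles, the wording of the instructions, the `src/` and `tests/` layout, the `write_agents_file` helper, and the choice to put the file at the top level of the repo are all assumptions made for this example; the article only says the file carries custom instructions that explain the codebase and its style practices.

```python
from pathlib import Path
import tempfile

# Hypothetical contents: the headings and rules below are illustrative only.
# The article says the file can explain the codebase and communicate
# standardizations and style practices, but does not give a template.
AGENTS_MD = """\
# AGENTS.md

## Project overview
- `src/` holds the application code; `tests/` holds the test suite.

## Conventions
- Follow the existing code style and keep functions small.
- Run the test suite before proposing changes.
"""


def write_agents_file(repo_root: Path, text: str = AGENTS_MD) -> Path:
    """Write AGENTS.md into the given directory (assumed to be the repo root)."""
    path = repo_root / "AGENTS.md"
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Use a throwaway directory so the sketch never touches a real repo.
    with tempfile.TemporaryDirectory() as tmp:
        created = write_agents_file(Path(tmp))
        print(created.read_text(encoding="utf-8"))
```

In practice the file would just be written by hand and committed alongside the README; the script form only makes the example self-contained.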

Codex is built on codex-1, a fine-tuned variation of OpenAI's o3 reasoning model that was trained using reinforcement learning on a wide range of coding tasks to analyze and generate code, and to iterate through tests along the way.



OpenAI's announcement post about Codex is filled with objection handling to tackle the common refrains against AI coding agents; based on older tools and models, many developers accurately point out that LLM coding tools (especially when used for vibe coding instead of just for code completion or as an advisor) have been known to produce scripts that don't follow standards, are opaque or difficult to debug, or are insecure.

The fine tuning that led to codex-1 is meant to address these concerns in part, and it's also key that Codex shows its thinking and work every step of the way as it goes through its tasks (which can take anywhere from one to 30 minutes to complete). All that said, OpenAI notes that "it still remains essential for users to manually review and validate all agent-generated code before integration and execution."

Codex is available in a research preview, but it's rolling out to all ChatGPT Pro, Enterprise, and Team users now. Plus and Edu support is coming at a later date. For now, "users will have generous access at no additional cost for the coming weeks" so that they "can explore what Codex can do," but OpenAI says it intends to introduce rate limits and a new pricing scheme later.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
68,705
Reputation
10,592
Daps
185,702

Windsurf Launches SWE-1: A Frontier AI Model Family for End-to-End Software Engineering


By Asif Razzaq

May 16, 2025

In a move that signals a deeper convergence of AI and software engineering, Windsurf has announced the launch of SWE-1, its first family of AI models purpose-built for the entire software development lifecycle. Unlike traditional code generation models, SWE-1 is designed to assist with real-world software engineering workflows, handling everything from incomplete code states to multi-surface task orchestration.

This marks a pivotal shift in Windsurf’s evolution—from offering AI-powered developer tools to designing proprietary models engineered for the complexity and nuance of real software engineering environments.

Beyond Code Generation: Engineering-Native Intelligence


While many AI systems are optimized for static code completion, SWE-1 is architected around a more realistic premise: that codebases are often incomplete, tasks span multiple tools, and developers frequently operate across asynchronous contexts.

“Writing code is just a small part of the job,” said Varun Mohan, CEO and co-founder of Windsurf. “To accelerate the entire development process by 99%, we needed models that are native to the workflows engineers actually face.”

By training on partially written programs, multi-modal workflows, and evolving interaction states, SWE-1 models are positioned not as code assistants, but as systems engineers—capable of understanding context, continuity, and complexity.

The SWE-1 Family: Three Models, One Unified Vision


The SWE-1 release includes three distinct models tailored to different use cases within the developer ecosystem:

  • SWE-1: The flagship model, optimized for long-range context, multi-tool reasoning, and advanced workflows. It’s designed to support tasks that span beyond single-turn completions, including debugging, architecture exploration, and integration across tools. According to Windsurf, its performance is competitive with models like Claude 3.5 Sonnet and GPT-4.1, at a more favorable cost-to-performance ratio.
  • SWE-1-lite: A streamlined variant replacing Windsurf’s earlier Cascade Base model. It’s built for efficiency while retaining high contextual fidelity, making it well-suited for IDE integrations and mid-tier deployments.
  • SWE-1-mini: A latency-optimized model designed to power real-time predictive suggestions inside Windsurf’s own developer environment (Tab). It excels at fast, passive completions and local tasks.

All three models are natively integrated into Windsurf’s platform, enabling fluid interaction across coding interfaces, terminals, documentation, and system tooling.



Flow Awareness: Aligning with the Developer’s State of Mind


A cornerstone of SWE-1 is its flow awareness, a capability that allows the models to reason over time, track developer intentions, and maintain contextual coherence across tool boundaries.

Rather than treating tasks as isolated prompts, SWE-1 maintains awareness of the broader engineering flow. This includes recognizing the state of a project, anticipating downstream requirements, and syncing across multiple surfaces. The result is a model that feels less like a chatbot and more like an embedded engineering collaborator.

Benchmarking Against Frontier Models


Internally, Windsurf evaluated SWE-1 against leading general-purpose LLMs. Results show competitive performance with Claude 3.5 Sonnet in tasks requiring tool usage, multi-hop reasoning, and planning. However, Windsurf emphasizes that SWE-1 achieves this while being more cost-efficient and better aligned with developer-native workflows.

This cost-conscious design, paired with engineering-focused training data, offers an alternative to large generalist models that are often expensive to run and less attuned to software workflows.

Conclusion


SWE-1’s release represents a broader trend in AI: the specialization of models to suit domain-specific needs. As software development grows increasingly complex—with cloud deployments, dev environments, and AI tools in constant motion—models like SWE-1 offer a path forward that is both pragmatic and powerful.

By embedding deep knowledge of how software is built, maintained, and evolved, Windsurf positions SWE-1 not just as a coding tool—but as a system-level AI collaborator that understands the rhythms and realities of modern software engineering.





 

bnew

Veteran
Joined
Nov 1, 2015
Messages
68,705
Reputation
10,592
Daps
185,702
ChatGPT's mysterious 'Summit' model one-shotting a streaming site


Posted on Sun Jul 27 21:14:03 2025 UTC


Not sure what OpenAI is cooking, but if what's been leaking out from WebDev Arena is anything to go by they may be set to cook the competition...

...or at least finally give Sonnet/Opus a serious run for their money.

Source: https://twiiit.com/chatgpt21/status/1949307106038878208 | https://nitter.poast.org/chatgpt21/status/1949307106038878208 | https://xcancel.com/chatgpt21/status/1949307106038878208 | Chris @chatgpt21, Twitter Profile | TwStalker






1/23
@chatgpt21
Summit when asked to make a streaming website

Wow!



https://video.twimg.com/amplify_video/1949307045288574976/vid/avc1/980x720/oFoa900qcWp2PYCj.mp4

2/23
@ShimazuSystems
points:
- looks terrible
- has mashed in features like an early 2000s website
- is easily identifiable as Vibe Coded
- not even going to begin to speculate how awful security is
- likely has significant hardcoded placeholder features
- backend for YouTube for example is huge



3/23
@chatgpt21
This is one shot lol, this was literally impossible 6 months ago to do one shot



4/23
@giffmana
MonkaS



5/23
@chatgpt21
pls give cracked OS llama 700B



6/23
@techikansh
this is some insane sorcery type shyt, how many turns for this output?



7/23
@Sherveen
Meh, I just asked each of the existing foundational apps to do this in their canvas and they all did fine.



8/23
@Automager
Was this one-shotted? What was the prompt?



9/23
@sortaAutistic
Actually nuts



10/23
@mrgshum
Now put it in production



11/23
@cube__lol
I’m so excited man.



12/23
@cube__lol
How can I get access to this rn? Someone get me in the loop pls.



13/23
@MaxPazow
Summit and Lobster are cooking for me on webdev arena. Pretty cool.



14/23
@Prometheons
These "shadow" models are getting out of hand..

..saw a Lobster one shot of a neural network visualisation in threejs that looked insane.



15/23
@ChiaraCervetta
Wow, this looks unbelievable! 🤩 🤯 🌟



16/23
@SirOligopoly
Where is this from



17/23
@syrax0o0
That's crazy, how many prompts did it take? Was it one shot?



18/23
@bennyj504
Link to Summit



19/23
@dirtdogg03
Where does the video comes from??🤔



20/23
@RodFdw
@chatgpt21 what are your thoughts on Summit vs. Zenith, so far? 👀



21/23
@KirovDoc
Fake



22/23
@xw33bttv
Now lets see our boy Nectarine's streaming website



23/23
@ReturnHumans
AI bros discovering templates.



