Will Large Language Models End Programming?

bnew

Veteran
Joined
Nov 1, 2015
Messages
44,161
Reputation
7,364
Daps
134,090

ARTIFICIAL INTELLIGENCE

Will Large Language Models End Programming?


Published on November 14, 2023

By Aayush Mittal

Last week marked a significant milestone for OpenAI, as it unveiled GPT-4 Turbo at OpenAI DevDay. A standout feature of GPT-4 Turbo is its expanded context window of 128,000 tokens, a substantial leap from GPT-4's 8,000. This lets it process 16 times as much text as its predecessor in a single prompt, equivalent to around 300 pages of text.

This advancement ties into another significant development: the potential impact on the landscape of SaaS startups.

OpenAI's ChatGPT Enterprise, with its advanced features, poses a challenge to many SaaS startups. These companies, which have been offering products and services around ChatGPT or its APIs, now face competition from a tool with enterprise-level capabilities. ChatGPT Enterprise's offerings, like domain verification, SSO, and usage insights, directly overlap with many existing B2B services, potentially jeopardizing the survival of these startups.

In his keynote, OpenAI's CEO Sam Altman revealed another major development: the extension of GPT-4 Turbo's knowledge cutoff. Unlike GPT-4, which had information only up to 2021, GPT-4 Turbo is updated with knowledge up until April 2023, marking a significant step forward in the AI's relevance and applicability.

ChatGPT Enterprise stands out with features like enhanced security and privacy, high-speed access to GPT-4, and extended context windows for longer inputs. Its advanced data analysis capabilities, customization options, and removal of usage caps make it a superior choice to its predecessors. Its ability to process longer inputs and files, along with unlimited access to advanced data analysis (the tool previously known as Code Interpreter), further solidifies its appeal, especially among businesses previously hesitant due to data security concerns.

The era of manually crafting code is giving way to AI-driven systems, trained instead of programmed, signifying a fundamental change in software development.

The mundane tasks of programming may soon fall to AI, reducing the need for deep coding expertise. Tools like GitHub Copilot and Replit’s Ghostwriter, which assist in coding, are early indicators of AI's expanding role in programming, suggesting a future where AI extends beyond assistance to fully managing the programming process. Imagine the common scenario where a programmer forgets the syntax for reversing a list in a particular language. Instead of searching through online forums and articles, the programmer gets immediate assistance from Copilot and stays focused on the goal.
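To make the list-reversal scenario concrete, here is the kind of answer an assistant might surface. The choice of Python and the variable names are illustrative assumptions, not something the article specifies.

```python
# Three standard ways to reverse a list in Python.
items = [1, 2, 3, 4]

# 1. Slicing returns a new reversed list and leaves the original untouched.
reversed_copy = items[::-1]

# 2. reversed() returns an iterator; wrap it in list() to materialise it.
reversed_via_iter = list(reversed(items))

# 3. list.reverse() reverses in place and returns None,
#    so work on a copy if the original order still matters.
in_place = items.copy()
in_place.reverse()
```

The three forms differ in whether they copy or mutate, which is exactly the sort of detail that is easy to forget and quick to look up.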

Transitioning from Low-Code to AI-Driven Development

Low-code and no-code tools simplified the programming process, automating the creation of basic coding blocks and freeing developers to focus on the creative aspects of their projects. But as we step into this new AI wave, the landscape changes further. Simple user interfaces and the ability to generate code through straightforward commands like “Build me a website to do X” are revolutionizing the process.

AI's influence in programming is already huge. Just as early computer scientists moved from a focus on electrical engineering to more abstract concepts, future programmers may view detailed coding as obsolete. The rapid advancements in AI are not limited to text and code generation. In areas like image generation, diffusion models such as DALL-E 3 and those from Runway ML show massive improvements. See, for example, the tweet in which Runway showcases its latest feature.



Extending beyond programming, AI's impact on creative industries is set to be equally transformative. Jeffrey Katzenberg, a titan of the film industry and former chairman of Walt Disney Studios, has predicted that AI will significantly reduce the cost of producing animated films. According to a recent Bloomberg article, Katzenberg foresees a drastic 90% reduction in costs. This could include automating labor-intensive tasks such as in-betweening in traditional animation and rendering scenes, and even assisting with creative processes like character design and storyboarding.

The Cost-Effectiveness of AI in Coding

Cost Analysis of Employing a Software Engineer:
  1. Total Compensation: The average total compensation for a software engineer, including additional benefits, in tech hubs like Silicon Valley or Seattle is approximately $312,000 per year.

Daily Cost Analysis:
  1. Working Days Per Year: Considering there are roughly 260 working days in a year, the daily cost of employing a software engineer is around $1,200.
  2. Code Output: Assuming a generous estimate of 100 finalized, tested, reviewed, and approved lines of code per day, this daily output is the basis for comparison.

Cost Analysis of Using GPT-3 for Code Generation:
  1. Token Cost: The cost of using GPT-3, at the time of the video, was about $0.02 for every 1,000 tokens.
  2. Tokens Per Line of Code: On average, a line of code can be estimated to contain around 10 tokens.
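  3. Cost for 100 Lines of Code: The analysis puts the cost to generate 100 lines of code using GPT-3 at around $0.12. (At 10 tokens per line, 100 lines is 1,000 tokens, which at the rate above would cost closer to $0.02, so the $0.12 figure is the more conservative one.)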

Comparative Analysis:
  • Cost per Line of Code (Human vs. AI): Comparing the costs, generating 100 lines of code per day costs $1,200 when done by a human software engineer, as opposed to just $0.12 using GPT-3.
  • Cost Factor: This represents a cost factor difference of about 10,000 times, with AI being substantially cheaper.
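The arithmetic behind this comparison can be checked with a short sketch. All constants are the figures quoted above, taken at face value rather than independently verified, and the function and constant names are made up for illustration. Because 1,000 tokens at $0.02 per 1,000 tokens works out to $0.02 rather than the quoted $0.12, the sketch computes the ratio both ways.

```python
# Figures quoted in the analysis above (not independently verified).
ANNUAL_COMPENSATION_USD = 312_000
WORKING_DAYS_PER_YEAR = 260
LINES_PER_DAY = 100
TOKENS_PER_LINE = 10
PRICE_PER_1K_TOKENS_USD = 0.02
QUOTED_AI_COST_USD = 0.12  # the article's figure for 100 lines


def human_daily_cost() -> float:
    """Daily cost of one engineer, from annual compensation."""
    return ANNUAL_COMPENSATION_USD / WORKING_DAYS_PER_YEAR


def ai_cost_for_lines(lines: int) -> float:
    """Token cost of generating `lines` lines at the stated rate."""
    tokens = lines * TOKENS_PER_LINE
    return tokens / 1000 * PRICE_PER_1K_TOKENS_USD


if __name__ == "__main__":
    human = human_daily_cost()
    derived = ai_cost_for_lines(LINES_PER_DAY)
    print(f"human cost per day: ${human:,.0f}")
    print(f"AI cost from stated rate: ${derived:.2f}")
    print(f"ratio using derived cost: {human / derived:,.0f}x")
    print(f"ratio using quoted $0.12: {human / QUOTED_AI_COST_USD:,.0f}x")
```

Either way the gap is four to five orders of magnitude, so the conclusion does not hinge on which per-day AI cost is used. The comparison does assume that generated lines are as finished as the engineer's "finalized, tested, reviewed, and approved" lines, which is the weakest part of the estimate.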

This analysis points to the economic potential of AI in the field of programming. The low cost of AI-generated code compared to the high expense of human developers suggests a future where AI could become the preferred method for code generation, especially for standard or repetitive tasks. This shift could lead to significant cost savings for companies and a reevaluation of the role of human programmers, potentially focusing their skills on more complex, creative, or oversight tasks that AI cannot yet handle.

ChatGPT's versatility extends to a variety of programming contexts, including complex interactions with web development frameworks. Consider a scenario where a developer is working with React, a popular JavaScript library for building user interfaces. Traditionally, this task would involve delving into extensive documentation and community-provided examples, especially when dealing with intricate components or state management.

With ChatGPT, this process becomes streamlined. The developer can simply describe the functionality they aim to implement in React, and ChatGPT provides relevant, ready-to-use code snippets. This could range from setting up a basic component structure to more advanced features like managing state with hooks or integrating with external APIs. By reducing the time spent on research and trial-and-error, ChatGPT enhances efficiency and accelerates project development in web development contexts.

Challenges in AI-Driven Programming

As AI continues to reshape the programming landscape, it’s essential to recognize the limitations and challenges that come with relying solely on AI for programming tasks. These challenges underscore the need for a balanced approach that leverages AI's strengths while acknowledging its limitations.
  1. Code Quality and Maintainability: AI-generated code can sometimes be verbose or inefficient, potentially leading to maintenance challenges. While AI can write functional code, ensuring that this code adheres to best practices for readability, efficiency, and maintainability remains a human-driven task.
  2. Debugging and Error Handling: AI systems can generate code quickly, but they don't always excel at debugging or understanding nuanced errors in existing code. The subtleties of debugging, particularly in large, complex systems, often require a human's nuanced understanding and experience.
  3. Reliance on Training Data: The effectiveness of AI in programming is largely dependent on the quality and breadth of its training data. If the training data lacks examples of certain bugs, patterns, or scenarios, the AI’s ability to handle these situations is compromised.
  4. Ethical and Security Concerns: With AI taking a more prominent role in coding, ethical and security concerns arise, especially around data privacy and the potential for biases in AI-generated code. Ensuring ethical use and addressing these biases is crucial for the responsible development of AI-driven programming tools.

Balancing AI and Traditional Programming Skills

In future software development teams, a hybrid model may emerge. Product managers could translate requirements into directives for AI code generators. Human oversight might still be necessary for quality assurance, but the focus would shift from writing and maintaining code to verifying and fine-tuning AI-generated outputs. This change suggests a diminishing emphasis on traditional coding principles like modularity and abstraction, as AI-generated code need not adhere to human-centric maintenance standards.

In this new age, the role of engineers and computer scientists will transform significantly. They'll interact with LLMs, providing training data and examples to achieve tasks, shifting the focus from intricate coding to strategically working with AI models.

The basic unit of computation will shift from traditional processors to massive, pre-trained LLMs, marking a departure from predictable, static processes to dynamic, adaptive AI agents.

The focus is transitioning from creating and understanding programs to guiding AI models, redefining the roles of computer scientists and engineers and reshaping our interaction with technology.

The Ongoing Need for Human Insight in AI-Generated Code

The future of programming is less about coding and more about directing the intelligence that will drive our technological world.

The belief that natural language processing by AI can fully replace the precision and complexity of formal mathematical notations and traditional programming is, at best, premature. The shift towards AI in programming does not eliminate the need for the rigor and precision that only formal programming and mathematical skills can provide.

Moreover, the challenge of testing AI-generated code for problems that haven't been solved before remains significant. Techniques like property-based testing require a deep understanding of programming, a skill that AI, in its current state, cannot replicate or replace.
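For readers unfamiliar with property-based testing, a minimal sketch follows. It uses only Python's standard library. Dedicated libraries such as Hypothesis add smarter input generation and shrinking of failing cases. The function under test and all helper names are hypothetical, chosen only to illustrate the idea that the tester states properties that must hold for every input instead of listing expected outputs.

```python
import random


def reverse_list(xs):
    """Function under test; imagine this came from an AI assistant."""
    return xs[::-1]


def random_int_list(rng, max_len=20):
    """Generate a random list of integers as test input."""
    length = rng.randint(0, max_len)
    return [rng.randint(-1000, 1000) for _ in range(length)]


def check_reverse_properties(trials=500, seed=0):
    """Check properties that must hold for any input, not fixed examples."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = random_int_list(rng)
        ys = random_int_list(rng)
        # Property 1: reversing twice gives back the original list.
        assert reverse_list(reverse_list(xs)) == xs
        # Property 2: reversing preserves the multiset of elements.
        assert sorted(reverse_list(xs)) == sorted(xs)
        # Property 3: reversing a concatenation swaps the order of the parts.
        assert reverse_list(xs + ys) == reverse_list(ys) + reverse_list(xs)
    return trials
```

Choosing which properties pin down correct behaviour is the part that takes programming insight. The random generation itself is mechanical.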

In summary, while AI promises to automate many aspects of programming, the human element remains crucial, particularly in areas requiring creativity, complex problem-solving, and ethical oversight.
 

Professor Emeritus

Veteran
Poster of the Year
Supporter
Joined
Jan 5, 2015
Messages
48,554
Reputation
18,772
Daps
193,499
Reppin
the ether
I'm interested as to whether LLM will dramatically change how a good programmer is defined, and possibly in multiple ways.

Are there guys who wouldn't actually have been able to be considered a competent programmer in the previous economy, but who have particular creativity/insights that will allow them to employ LLM to program more effectively than "better programmers" do?

Or will competent use of AI programming correspond extremely closely with competent programming in general?

Or will it depend on the subject and profession?
 

bnew

i'm not a programmer but for the past year, i've been using AI/LLMs to write scripts, userscripts, and bookmarklets and to modify existing scripts or "broken/incomplete" scripts that i come across online. If i had used fiverr for all the initial programming requests and the additional adjustments i made, i probably would have spent well over $2,000.

automating things i repeatedly did has been absolutely incredible for me. i can only imagine what an amateur programmer can accomplish with this. i've shared some of the A.I.-generated code that improved my experience using this very site.

last sunday i started working on a bookmarklet that would transform some links related to embedded content.



i can only imagine the number of tasks that people would like to have automated but don't know how, or whether it's even possible for AI to help with. using LLMs needs to be taught in K-8 so young kids can capitalize on their own creativity and learn how to communicate better. prompting also has the side benefit of helping you communicate better, and i haven't felt this creative in years.
 

bnew


Computer Science > Computation and Language

[Submitted on 5 Mar 2024]

Design2Code: How Far Are We From Automating Front-End Engineering?

Chenglei Si, Yanzhe Zhang, Zhengyuan Yang, Ruibo Liu, Diyi Yang
Generative AI has made rapid advancements in recent years, achieving unprecedented capabilities in multimodal understanding and code generation. This can enable a new paradigm of front-end development, in which multimodal LLMs might directly convert visual designs into code implementations. In this work, we formalize this as a Design2Code task and conduct comprehensive benchmarking. Specifically, we manually curate a benchmark of 484 diverse real-world webpages as test cases and develop a set of automatic evaluation metrics to assess how well current multimodal LLMs can generate the code implementations that directly render into the given reference webpages, given the screenshots as input. We also complement automatic metrics with comprehensive human evaluations. We develop a suite of multimodal prompting methods and show their effectiveness on GPT-4V and Gemini Pro Vision. We further finetune an open-source Design2Code-18B model that successfully matches the performance of Gemini Pro Vision. Both human evaluation and automatic metrics show that GPT-4V performs the best on this task compared to other models. Moreover, annotators think GPT-4V generated webpages can replace the original reference webpages in 49% of cases in terms of visual appearance and content; and perhaps surprisingly, in 64% of cases GPT-4V generated webpages are considered better than the original reference webpages. Our fine-grained break-down metrics indicate that open-source models mostly lag in recalling visual elements from the input webpages and in generating correct layout designs, while aspects like text content and coloring can be drastically improved with proper finetuning.
Comments: Technical Report; The first two authors contributed equally
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
Cite as: arXiv:2403.03163 [cs.CL]
(or arXiv:2403.03163v1 [cs.CL] for this version)

Submission history

From: Chenglei Si
[v1] Tue, 5 Mar 2024 17:56:27 UTC (3,151 KB)


https://arxiv.org/pdf/2403.03163.pdf




Design2Code: How Far Are We From Automating Front-End Engineering?


Chenglei Si*1, Yanzhe Zhang*2, Zhengyuan Yang3, Ruibo Liu4, Diyi Yang1

1Stanford University, 2Georgia Tech, 3Microsoft, 4Google DeepMind


Abstract

Generative AI has made rapid advancements in recent years, achieving unprecedented capabilities in multimodal understanding and code generation. This has enabled a brand new paradigm of front-end development, where multimodal LLMs can potentially convert visual designs into code implementations directly, thus automating the front-end engineering pipeline. In this work, we provide the first systematic study on this visual design to code implementation task (dubbed Design2Code). We manually curate a benchmark of 484 real-world webpages as test cases and develop a set of automatic evaluation metrics to assess how well current multimodal LLMs can generate the code implementations that directly render into the given reference webpages, given the screenshots as input. We develop a suite of multimodal prompting methods and show their effectiveness on GPT-4V and Gemini Pro Vision. We also finetune an open-source Design2Code-18B model that successfully matches the performance of Gemini Pro Vision. Both human evaluation and automatic metrics show that GPT-4V is the clear winner on this task, where annotators think GPT-4V generated webpages can replace the original reference webpages in 49% of cases in terms of visual appearance and content; and perhaps surprisingly, in 64% of cases GPT-4V generated webpages are considered better than even the original reference webpages. Our fine-grained break-down metrics indicate that open-source models mostly lag in recalling visual elements from the input webpages and in generating correct layout designs, while aspects like text content and coloring can be drastically improved with proper finetuning.


Test Set Examples

We show some examples from our benchmark (for evaluation purposes; bottom two rows) in comparison with the synthetic data created by Huggingface (for training purposes; first row). Our benchmark contains diverse real-world webpages with varying levels of complexity. (Image files are replaced with a placeholder blue box.)

Example image.

Benchmark Performance: Automatic Metrics

For automatic evaluation, we consider high-level visual similarity (CLIP) and low-level element matching (block-match, text, position, color). We compare all the benchmarked models along these different dimensions.

Example image.

Benchmark Performance: Human Evaluation

We recruit human annotators to judge pairwise model output preference and report the Win/Tie/Lose rate against the baseline (Gemini Pro Vision Direct Prompting). We sample 100 examples and ask 5 annotators for each pairwise comparison, and we take the majority vote on each example.
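The aggregation described here can be sketched in a few lines of Python. This is not the authors' code. The function names and toy votes are made up, and the handling of examples where no label reaches a strict majority (counted as a tie below) is an assumption, since the text does not say how such cases are resolved.

```python
from collections import Counter


def majority_label(votes):
    """Return the label with a strict majority, else 'tie' (an assumption)."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else "tie"


def win_tie_lose_rates(votes_per_example):
    """Aggregate per-example majority labels into overall rates."""
    outcomes = Counter(majority_label(v) for v in votes_per_example)
    total = len(votes_per_example)
    return {k: outcomes.get(k, 0) / total for k in ("win", "tie", "lose")}


# Toy data: 4 examples with 5 hypothetical annotator votes each,
# where each vote compares a model's output against the baseline.
toy_votes = [
    ["win", "win", "win", "lose", "tie"],
    ["lose", "lose", "lose", "win", "win"],
    ["tie", "tie", "tie", "win", "lose"],
    ["win", "win", "lose", "lose", "tie"],  # no strict majority
]
rates = win_tie_lose_rates(toy_votes)
```

With five annotators and three possible labels, splits such as 2-2-1 are possible, so whatever tie-breaking rule is used can shift the reported rates slightly.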

Example image.

Model Comparison Examples

We present some case study examples to compare different prompting methods and different models.

Example image.


Example image.


Example image.

Additional GPT-4V Generation Examples

We present more examples of GPT-4V generated webpages in comparison with the original reference webpages. The original designs are on the left and the GPT-4V generated webpages are on the right. You can judge for yourself whether GPT-4V is ready to automate building webpages.

[Images: original reference webpages (testset_full) paired with GPT-4V outputs (gpt4v_visual_revision_prompting) for test examples 2, 13, 28, 33, and 61.]
 

bnew








1/8
Today we're excited to introduce Devin, the first AI software engineer.

Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork.

Devin is an autonomous agent that solves engineering tasks through the use of its own shell, code editor, and web browser.

When evaluated on the SWE-Bench benchmark, which asks an AI to resolve GitHub issues found in real-world open-source projects, Devin correctly resolves 13.86% of the issues unassisted, far exceeding the previous state-of-the-art model performance of 1.96% unassisted and 4.80% assisted.

Check out what Devin can do in the thread below.

3/8
1/4 Devin can learn how to use unfamiliar technologies.

4/8
2/4 Devin can contribute to mature production repositories.

5/8
3/4 Devin can train and fine tune its own AI models.

6/8
4/4 We even tried giving Devin real jobs on Upwork and it could do those too!

7/8
For more details on Devin, check out our blog post here: https://cognition-labs.com/blog
See Devin in action
If you have any project ideas, drop them below and we'll forward them to Devin.

8/8
We'd like to thank all our supporters who have helped us get to where we are today, including @patrickc, @collision, @eladgil, @saranormous, Chris Re, @eglyman, @karimatiyeh, @bernhardsson, @t_xu, @FEhrsam, @foundersfund, and many more.

If you’re excited to solve some of the…

 

bnew


The end of coding? Microsoft publishes a framework making developers merely supervise AI


Michael Petraeus

April 15, 2024




Two months ago, Jensen Huang, Nvidia’s CEO, dismissed the past 15 years of global career advice and advised that it’s not a good idea to learn to code anymore—at least not for most people.

Last month, Microsoft added its own contribution to the argument by releasing a research paper detailing AutoDev: an automated AI-driven development framework, in which human developers are relegated to the role of mere supervisors of artificial intelligence doing all of the actual software engineering work.

Goodbye, developers?

The authors have outlined—and successfully tested—a system of multiple AI agents interacting with each other as well as provided repositories to not only tackle complex software engineering tasks but also validate the outcomes on their own.

The role of humans, in their own words: “transforms from manual actions and validation of AI suggestions to a supervisor overseeing multi-agent collaboration on tasks, with the option to provide feedback. Developers can monitor AutoDev’s progress toward goals by observing the ongoing conversation used for communication among agents and the repository.”

In other words, instead of writing code, human developers would become spectators to the work of AI, interjecting whenever deemed necessary.

It’s really more akin to a management role, where you work with a team of people, guiding them towards the goals set for a project.


Overview of the AutoDev framework. Only the green input is provided by humans. / Image Credit: Microsoft

AutoDev workflow, outlining all of the actions that AI agents can perform on their own in pursuit of the desired output. / Image Credit: Microsoft

“We’ve shifted the responsibility of extracting relevant context for software engineering tasks and validating AI-generated code from users (mainly developers) to the AI agents themselves.”

But if that’s the case, do we need human developers anymore? And what sort of skills should they have or acquire to remain useful in this AI-enabled workplace?

Will machines require soft skills in the future?

The conclusion to this evolutionary process may be unsettling to many—particularly to those highly-talented but reclusive software engineers who prefer working alone, dreading social interaction.

Well, as it happens, social skills may soon be required to… interact with machines.

Since all conversational models are essentially mimicking human communication, AI tools will require similar skills of their users as other humans would.


Image Credit: VisualGeneration / depositphotos

Fortunately, nobody is planning to equip computers with human emotions, so at least this aspect of teamwork is unlikely to become a problem, but many developers who today simply write code will now have to specialise in explaining the work rather than executing it themselves.

This certainly wasn’t a challenge that most techies foresaw when they entered the field but may very soon become a do or die situation for them.

If you’re not an effective supervisor, instructing machines to do the right things, your value in most companies may go down and not up, despite your highly specialised knowledge.

Software engineering path has just become less predictable

There will still be jobs for human programmers, of course, but they are more likely to be available in the companies that make the technology underpinning AI. After all, some future development and maintenance will have to be done by humans, if only as a safety measure.

That said, the pool of vacancies for software engineering experts will be draining quickly, unless you’re a competent communicator with a dose of managerial skills to competently run your own team of AI agents.

Perhaps the worst thing of all is that the creeping of AI into the field makes it very unpredictable as to what skills you should master as a future developer.

This is because we’re still looking for an answer to the fundamental question—if AI replaces most of us, will there be any humans left competent enough to modify the code if things go wrong in the future?

We have some examples of that in ancient IT systems written in outdated languages that few people know well enough to manage, not to mention update.

Since the demand for expertise in e.g. COBOL has gone down with time, there simply aren’t enough people to tackle the problems of old financial or government systems, which often hold millions of critical records of customers and citizens.

One could easily see how AI could have a similar impact, just on a far larger scale.

It’s a chicken or egg situation: what comes first? You need development skills to understand what AI is doing but how do you learn if AI is doing everything?

If you no longer need to master hard programming skills in any field, how many people will be left to fix things if they go wrong and we’re overly dependent on artificial intelligence?

This isn’t just a problem of reduced attractiveness as a candidate who retains skills in technologies most companies no longer need, but of a fundamental lack of practice, since those skills would only be called on in rare emergencies.

You can’t be good at something you rarely do.

How do you plan for a career in tech?

It used to be simple: you specialised in a specific field, mastered the required tools and languages, continuously updated your skill set as the technology advanced, and you could expect to be a well-paid, sought-after professional eventually.

But now the question of how your technical expertise is valued against the ability to juggle AI bots that come up with solutions on their own flips all of that on its head.

Experts with years of practice are likely to remain in demand (a bit like an old mechanic still fixing modern cars today – they may be different, but many fundamentals remain the same, and his experience can’t easily be gained anymore). But young students in computer science will have a tough nut to crack picking between hard skills and competency in using AI tools to achieve the same or better outcome.

There will be some jobs for highly-specialised experts and many jobs for those simply interacting with chatbots. But those stuck in the middle will soon have to pick their future between the two.

The question is whether you have what it takes to climb to the top, or whether you descend to compete with Zoomers getting their AI-enabled coding lessons on TikTok.

Featured Image Credit: 123RF
 