Coding with GenAI assistance is like working with a brilliant but over-excited junior developer

Why Experienced Developers Are More Crucial Than Ever

It seems trite to say that Generative AI has been making waves in the world of software development, but the analogy is a good one as each wave comes so fast upon the previous and washes it away with another way of working. We’ve recently started working with Codeium and it feels like another step change. But is it a big wave or just another crest of the current flood? That’s probably enough of that analogy now, actually.

While these tools bring incredible capabilities to the table, reviewing their use has highlighted how the role of skilled developers has never been more critical. In this article, we’ll explore how AI is transforming coding, the strengths and limitations of these tools, and why software engineering expertise remains indispensable.

From ChatGPT to Integrated Coding Assistants

GenAI’s journey in coding began with general-purpose models like ChatGPT, which allowed developers to directly ask questions and receive code snippets in response. Over time, the tooling just within ChatGPT has improved quite a bit. Now it’s possible to maintain a parallel script while interacting in a conversational manner, preserving context and formatting for easy copying and modification.

Just in the last couple of weeks, ChatGPT has added a feature that reviews code and suggests improvements. At present, the ChatGPT approach is pretty slow to make updates, review points get forgotten after the first is addressed, and in some cases half the script gets forgotten. It’s likely this will be rapidly improved upon, but for now it takes a skilled developer to do their own reviewing and remove the mistakes the AI makes.

For a single-page script, this parallel approach can still be a quick way to put something together, though the lack of non-AI IDE capabilities like type checking means that these scripts are likely to need another stage of development before they can be used.

The Rise of AI Pair Programmers: Copilot and Beyond

A step up from just typing into a chat is the IDE integration. GitHub Copilot was one of the first widely adopted AI tools designed specifically for coding. Powered by OpenAI’s models, Copilot acts as a pair programmer, suggesting code snippets, completing functions, and even generating boilerplate code. This innovation spawned a wave of competitors and clones, leading to an entire ecosystem of AI-assisted development tools.

Tools that have iterated on Copilot include double.bot, which, in common with other recent tools, allows the use of more recent OpenAI and Anthropic models. It also improves on the IDE integration, with fluid movement between chat and code: changes can be discussed and reviewed in a chat before being applied and viewed as a diff. These tools also support highlighting a section and asking for a change to just that code.

This expansion in the ways of interacting with the AI supports developers of differing experience, and different parts of the code life-cycle:

Novice starting a script - can ask for the functionality they want and get a common approach and structure

Please write me a script to pull down a spreadsheet from google, loop through the lines and insert each into a database

Experienced coder starting a script - can ask for a given boilerplate approach to structure, including documentation and version control. The coder does not have to write this by hand, and the AI has a better starting point, as by itself it is unlikely to follow all the best practice that the experienced coder can prompt it to use

We’re going to run a script in AWS lambda in node.js 22 called sync-sheet-to-db. It should also be able to be run locally with an .env file for environment variables, and will have unit tests for the processing of rows. Create a boiler-plate structure with .gitignore, readme.md, .env.example and npm run scripts to run locally, run tests, zip and publish…

Novice asking for changes - even without knowing the language they can ask for changes and new features in the chat. That’s something that wouldn’t have been possible without AI. If the request is simple and similar to common requests others have made it might even work out of the gate

The spreadsheet has a header row. Don’t push that to the database - start on the next row

Experienced coder making small changes - directly editing the code, they will get auto-complete suggestions which they can accept or ignore, speeding up what they can already do. Or they can write comments detailing what they know is possible to do in a few lines and have more of the code filled in mostly accurately. You may notice that an experienced programmer might write the code for this faster than the comment, in which case the AI can fill in the comment instead

// Populate return array by filtering input rows, skipping any that are blank in all columns, or that fail validation of the first column having a date and all others whole numbers

Experienced coder making larger changes in a file - being able to highlight the area to affect limits the unrequested changes the AI can make

Refactor to extract validation to its own private function, WITHOUT changing the functionality.

Anyone making sweeping changes - refactoring an approach or even tidying up file structures can work for a handful of files, but will often need wider context than many AIs can handle

Refactor script to service, model and utility classes with each having clear responsibilities described in their header, maximising dependency injection at a class level.
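To make the comment-driven and refactoring prompts above more concrete, here is a minimal sketch of the kind of code an assistant might fill in beneath that filtering comment, with the validation already pulled out into its own function. Everything in it is an assumption for illustration: the function names are hypothetical, rows are assumed to be arrays of cell strings, and dates are assumed to be in YYYY-MM-DD form.

```javascript
// Hypothetical helpers; a row is assumed to be an array of cell strings.

// A row is blank when every cell is empty or only whitespace.
function isBlankRow(row) {
  return row.every((cell) => String(cell ?? "").trim() === "");
}

// Accepts only YYYY-MM-DD text that is also a real calendar date (assumed format).
function isIsoDate(cell) {
  const text = String(cell ?? "").trim();
  if (!/^\d{4}-\d{2}-\d{2}$/.test(text)) return false;
  const parsed = new Date(`${text}T00:00:00Z`);
  // Reject both unparseable dates and ones that roll over (e.g. 30 February).
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().startsWith(text);
}

// Whole numbers only: optional minus sign followed by digits.
function isWholeNumber(cell) {
  return /^-?\d+$/.test(String(cell ?? "").trim());
}

// Validation extracted to its own function: first column a date, all others whole numbers.
function isValidRow(row) {
  const [first, ...rest] = row;
  return isIsoDate(first) && rest.every((cell) => isWholeNumber(cell));
}

// Populate return array by filtering input rows, skipping any that are blank
// in all columns, or that fail validation.
function filterRows(rows) {
  return rows.filter((row) => !isBlankRow(row) && isValidRow(row));
}

// Usage: the caller drops the header row, then filters the rest.
const sheet = [
  ["Date", "Count", "Total"],
  ["2024-11-30", "3", "12"],
  ["", "", ""],
  ["2024-11-31", "4", "9"], // not a real date
  ["2024-12-01", "2.5", "7"], // not a whole number
];
console.log(filterRows(sheet.slice(1))); // keeps only the 2024-11-30 row
```

Even in something this small there are decisions the prompt never specified, such as the date format and whether negative numbers count, and those are the places where a developer’s review matters.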

The last two scenarios above highlight the current limits of these tools:

Over-eager changes
Lack of context

This last point is starting to get addressed…

The Context Problem: A Key Limitation of AI Coding

One of the limitations of generative AI is that, although it is trained on large amounts of data, it does not handle large inputs well, as anyone who has tried to load anything large into ChatGPT will attest. The current generation of AI development tools is starting to break down this barrier by allowing reasoning on whole code bases, opening up the possibility of more context-aware code that takes into account more of the system, and the ability to summarise more than a single script.

Some of these tools currently on the market:

Tabnine: Similar to Copilot, Tabnine provides AI-powered code completion but emphasizes privacy by allowing developers to train the tool on their own repositories.
Codeium: Promises to close the development iteration loop, in which improvements are made, tested and analysed to spot where further changes are needed. Codeium can automate entire sequences of improvements without intervention.
Amazon CodeWhisperer: Tailored for AWS developers, CodeWhisperer integrates seamlessly with Amazon’s ecosystem, offering context-aware suggestions optimized for cloud applications.

While approaches like these are resource-intensive in terms of AI runtime, each is making advances by training on custom code bases, by using an inference engine to determine intelligently what context to pull from at each step, or both.

The Double-Edged Sword of AI “Efficiency”

To return to the other limitation mentioned earlier, over-eager changes: AI’s ability to make rapid, wide-ranging changes is both a strength and a potential pitfall. Codeium and similar tools can:

Generate large-scale changes quickly, sometimes removing functionality or introducing unintended features. For instance, AI may infer that code "should look" a certain way based on patterns, leading to unnecessary additions.
Optimize locally while ignoring the broader scope of the application, resulting in fragmented or repeated functionality.

Here is an example of both happening at once:

I was recently coding up a backend API and frontend app involving recipes. After asking the AI to make some changes to the frontend design, I was unsurprised to find the API requests broken. But I was surprised to find it was because recipes now required a photo URL. I’d never asked for this, but because it’s something similar apps use, the helpful AI had added the functionality in - and only to the backend.

Finding what it had done and undoing the changes reminded me of working with junior developers who are very excited about adding functionality but lack the discipline to do one thing at a time or even tell you that they’ve decided to make a change.

Sticking to what’s expected

Any non-standard algorithm highlights the limitations of GenAI to aid with coding, at least with the current models. I’ve repeatedly had a carefully crafted algorithm replaced with something more generic (and wrong) even when specifically asking the tool to stick to the current approach. In these cases, we have to write the code ourselves and limit use of these assistants to just that - assisting. For anything truly novel, a coder has to actually write code.

Experienced developers play a crucial role in mitigating all of these risks. They can spot inconsistencies, avoid functionality loss through small, incremental changes, and leverage regression test suites to identify errors early.

AI-Assisted Meta-Development: Tests and Refactoring

AI tools can also help with meta-development tasks like creating and updating tests or refactoring code:

Test generation: AI can automatically write unit tests for new features, reducing the time spent on manual test creation.
Refactoring suggestions: Platforms like Mutable.ai identify opportunities to simplify or optimize code structures.

Test automation

Even in cases where AI cannot write your code for you, it can help with one of the most important and lengthy tasks - writing test cases. To be clear, it cannot do it by itself, as choice of tests and what the true result should look like needs to be done by a developer who understands the assignment, but the grunt-work can be delegated to AI.

However, these tasks require careful oversight. Without a clear understanding of good coding practices, AI-driven changes can inadvertently introduce technical debt or spaghetti code, leading to fragile and hard-to-maintain systems.

Why Developers Are More Important Than Ever

The rise of AI in coding doesn’t diminish the need for skilled developers; it amplifies it. Here’s why:

Quality Control: Developers must validate AI-generated code, ensuring it aligns with the project’s goals and avoids introducing bugs.
Big-Picture Thinking: AI often focuses on isolated tasks. Developers provide the strategic oversight needed to ensure changes work harmoniously within the entire system.
Minimizing Technical Debt: Writing maintainable, scalable code remains a human skill. Developers ensure that AI contributions don’t compromise long-term project health.

For example, an AI might refactor a function to improve performance but fail to consider its impact on downstream modules. A skilled developer would catch this and adjust the changes accordingly, preserving functionality and maintainability.

Conclusion: The Developer-AI Partnership

GenAI is revolutionising coding by saving time, reducing repetitive tasks, and offering insights that were previously unattainable. Tools like Codeium, Copilot, and others demonstrate the power of AI to augment human capabilities. However, they also underscore the continued importance of experienced developers who can guide and refine AI-driven processes.

In the end, the most successful projects will be those where developers and AI work together seamlessly, with developers providing the expertise and judgment that no AI can replace. Will AI start encroaching on more of these areas? Most definitely, and I look forward to fewer hallucinations and better context awareness. I also look forward to skipping the writing of boilerplate, test structures and readmes, and concentrating on designing and writing the interesting parts of code.


Rob Egginton

Head of Development