AI Coding Assistants Are Getting Worse

January 8th, 2026

Great, the hallucinations compile and run without errors.

Via: IEEE Spectrum:

In recent months, I’ve noticed a troubling trend with AI coding assistants. After two years of steady improvements, over the course of 2025, most of the core models reached a quality plateau, and more recently, seem to be in decline. A task that might have taken five hours assisted by AI, and perhaps ten hours without it, is now more commonly taking seven or eight hours, or even longer. It’s reached the point where I am sometimes going back and using older versions of large language models (LLMs).

Until recently, the most common problem with AI coding assistants was poor syntax, followed closely by flawed logic. AI-created code would often fail with a syntax error or snarl itself up in faulty structure. This could be frustrating: the solution usually involved manually reviewing the code in detail and finding the mistake. But it was ultimately tractable.

However, recently released LLMs, such as GPT-5, have a much more insidious method of failure. They often generate code that fails to perform as intended, but which on the surface seems to run successfully, avoiding syntax errors or obvious crashes. It does this by removing safety checks, or by creating fake output that matches the desired format, or through a variety of other techniques to avoid crashing during execution.

As any developer will tell you, this kind of silent failure is far, far worse than a crash. Flawed outputs will often lurk undetected in code until they surface much later. This creates confusion and is far more difficult to catch and fix.

Leave a Reply

You must be logged in to post a comment.