New Study Shows AI Coding Tools Boost Speed — While Hurting Code Quality

AI coding tools are still making many codebases worse, and a new Carnegie Mellon University study of Cursor points to falling code quality as the main problem.[arxiv]

What the study did

Researchers looked at 807 GitHub projects that adopted Cursor and compared them with 1,380 similar projects that did not. They measured how fast teams moved (commits, lines of code) and how healthy the code stayed (static analysis warnings, duplication, and code complexity) from January 2024 to August 2025.[arxiv]
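For a sense of how that kind of comparison works, the sketch below runs a toy difference‑in‑differences estimate on synthetic data: the change in a metric after adoption is compared against the change in the control group over the same period. This is only an illustration of the general design, not the paper's actual pipeline; the column names, numbers, and use of statsmodels are assumptions.

    # Toy illustration of a matched before/after comparison
    # (difference-in-differences style). NOT the paper's actual analysis:
    # all data is synthetic and column names are assumptions.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 400  # synthetic project-month observations

    df = pd.DataFrame({
        "treated": rng.integers(0, 2, n),  # 1 = project adopted the AI tool
        "post":    rng.integers(0, 2, n),  # 1 = observation after the adoption date
    })
    # Synthetic outcome: a quality metric that worsens only for adopters, after adoption.
    df["warnings"] = 100 + 30 * df["treated"] * df["post"] + rng.normal(0, 10, n)

    # The coefficient on the interaction term estimates the adoption effect.
    model = smf.ols("warnings ~ treated * post", data=df).fit()
    print(model.params["treated:post"])  # recovers roughly the +30 effect built in above

The point of the control group is that any trend affecting all projects (new releases, seasonal activity) is absorbed by the non‑adopting repos, so the interaction term isolates what changed only for adopters.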

Speed goes up, then the gains disappear

When teams turned on Cursor, they briefly got much faster. In the first month, lines of code jumped to roughly 3–5 times the expected level (about 281% more) and commits rose by about 55%, compared with what would have happened without Cursor. By the third month, both numbers had fallen back to baseline, so the speed boost was short‑lived.[arxiv]

Quality goes down and stays down

The real problem is what happens to quality. The study used SonarQube to track static analysis warnings and a measure of how complex the code is to understand (a short sketch of pulling the same metrics from a SonarQube server follows the list). After teams adopted Cursor:[arxiv]

  • Static analysis warnings went up by about 30% and stayed high instead of returning to earlier levels.[arxiv]
  • Code complexity went up by about 41% and also stayed high through the end of the study period.[arxiv]
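Teams that want to watch the same signals in their own repositories can pull them straight from a SonarQube server. The snippet below is a minimal sketch: the server URL, project key, and token are placeholders, and the metric keys are standard SonarQube measures but may vary with your server version.

    # Minimal sketch of reading quality metrics from a SonarQube server.
    # SONAR_URL, PROJECT_KEY and the token are placeholders (assumptions);
    # adjust the metric keys to whatever your SonarQube version exposes.
    import os
    import requests

    SONAR_URL = "https://sonarqube.example.com"   # hypothetical server
    PROJECT_KEY = "my-org:my-service"             # hypothetical project key
    TOKEN = os.environ["SONAR_TOKEN"]             # user token from your SonarQube account

    METRICS = "violations,code_smells,cognitive_complexity,complexity,duplicated_lines_density"

    resp = requests.get(
        f"{SONAR_URL}/api/measures/component",
        params={"component": PROJECT_KEY, "metricKeys": METRICS},
        auth=(TOKEN, ""),   # SonarQube accepts the token as the basic-auth username
        timeout=30,
    )
    resp.raise_for_status()

    for measure in resp.json()["component"]["measures"]:
        print(f'{measure["metric"]}: {measure["value"]}')

Logging these numbers on every main‑branch build, rather than checking a dashboard occasionally, is what makes the "goes up and stays up" pattern visible in your own history.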

This was not just because there was more code. The researchers controlled for project size and other factors, and still saw AI‑assisted projects becoming more complex and warning‑heavy than similar non‑AI projects. They also showed that this extra complexity later slowed the teams down, creating a feedback loop in which low quality drags down future speed.[arxiv]

Why this matters now

The data covers early 2024 to mid‑2025, when modern agent features and strong models were already available inside Cursor. This is not about old, weak models; it is about current tools that can edit many files, run tests, and change whole parts of a system with little human help. If newer, "smarter" models were going to fix earlier code quality issues, this would be a good place to see that improvement, but the study does not see it.[arxiv]

Some people say bad AI code is just the result of bad prompts or careless users. This study shows something stronger: even after matching similar projects and controlling for growth, AI‑using repos still gain complexity and warnings faster than those without AI. That means the tools themselves are helping create harder‑to‑maintain code.[arxiv]

What teams should do

You cannot just plug AI into your existing workflow and expect lasting benefits. You get a quick burst of output, followed by a codebase that is more complex and carries more potential issues, and then work slows down again. To use AI safely, teams need to put quality first:[arxiv]

  • Treat AI changes like changes from a junior developer: strong reviews and targeted tests before merging.
  • Set clear limits on allowed complexity and fail builds when those limits are broken, instead of only watching dashboards (see the sketch after this list).
  • Plan regular refactoring based on actual metrics (warnings, complexity, duplication), not just on calendar time.
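As an illustration of the second point, a complexity budget can be enforced as a failing check rather than a dashboard. The sketch below uses the radon package to score cyclomatic complexity in Python files; the threshold, source path, and choice of radon are assumptions, not something the study prescribes.

    # Minimal CI gate sketch: fail the build if any block's cyclomatic
    # complexity exceeds a budget. Uses the radon package; the threshold
    # and source directory are placeholders.
    import sys
    from pathlib import Path

    from radon.complexity import cc_visit

    MAX_COMPLEXITY = 10       # hypothetical per-block budget
    SOURCE_DIR = Path("src")  # hypothetical source tree

    failures = []
    for path in SOURCE_DIR.rglob("*.py"):
        code = path.read_text(encoding="utf-8")
        for block in cc_visit(code):  # each block is a function or class with a complexity score
            if block.complexity > MAX_COMPLEXITY:
                failures.append(f"{path}:{block.lineno} {block.name} (complexity {block.complexity})")

    if failures:
        print("Complexity budget exceeded:")
        print("\n".join(failures))
        sys.exit(1)  # non-zero exit fails the CI job

    print("All checked blocks are within the complexity budget.")

Off‑the‑shelf options such as xenon or SonarQube's own quality gates can enforce the same kind of budget without a custom script; the important part is that the limit blocks the merge instead of sitting on a dashboard.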

Without this, AI becomes a technical‑debt accelerator: it helps you write more code quickly, but much of that code makes the system worse over time.