- According to the report, the average pull request generated by AI has 10.83 problems, compared to 6.45 for human code.
- AI code actually scored better on typos, but its other shortcomings leave plenty of work for human reviewers.
- Microsoft's near-record patch count may reflect the sheer volume of code being produced rather than a higher rate of flaws.
According to new data from CodeRabbit, AI-generated code is actually more vulnerable to exploits than human-generated code, raising questions about the reliability of some tools.
Pull requests created with AI tools averaged 10.83 issues, compared to 6.45 issues for human-generated pull requests, ultimately leading to longer reviews and the potential for more bugs to make their way into the final product.
AI-generated pull requests not only had 1.7 times more issues overall, but also 1.4 times more critical issues. So these aren’t just minor problems.
AI-generated code is not as secure as you might think
Logic and correctness errors (1.75x), code quality and maintainability issues (1.64x), security flaws (1.57x), and performance problems (1.42x) were all more common in AI-generated code, and the report criticizes AI for introducing more serious errors that human reviewers then had to address.
Problems likely to be introduced by AI include poor password management, insecure object references, XSS vulnerabilities, and insecure deserialization.
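To illustrate one of the flaw classes the report names, here is a minimal sketch of insecure deserialization and a safer alternative. The function names and payload are hypothetical, not taken from the report; the point is that Python's `pickle.loads` can execute arbitrary code embedded in attacker-controlled bytes, while a constrained format like JSON cannot.

```python
import json
import pickle

# INSECURE (illustrative only): pickle can run arbitrary code during
# deserialization, so it must never be fed untrusted input.
def load_profile_insecure(raw: bytes) -> dict:
    return pickle.loads(raw)  # dangerous on attacker-controlled bytes

# Safer: JSON parsing cannot trigger code execution, and the result
# can be validated before use.
def load_profile_safe(raw: bytes) -> dict:
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data

if __name__ == "__main__":
    payload = json.dumps({"user": "alice"}).encode("utf-8")
    print(load_profile_safe(payload))
```

This is exactly the kind of pattern a reviewer has to catch: both functions "work" in a demo, and only the context (untrusted input) makes one of them a vulnerability.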
“AI coding tools dramatically increase productivity, but they also introduce predictable and measurable vulnerabilities that companies must actively mitigate,” said David Loker, AI director at CodeRabbit.
However, the news is not all bad, as AI improves efficiency from the earliest stages of code generation. AI-generated pull requests also had 1.76 times fewer typos and 1.32 times fewer testability issues.
So while the study highlights some of AI’s shortcomings, it also serves the important purpose of showing how humans and AI agents might interact in the future. Instead of replacing human workers, we’re seeing human work shift toward managing and assessing AI, with computers taking over some of the tedious tasks that slow humans down.
While Microsoft claims to have patched 1,139 CVEs in 2025, its second-highest annual total ever, that isn’t necessarily a bad sign. With AI, developers produce more code overall, so the proportion of questionable code may not be as high as these numbers initially suggest.
Add to that the fact that AI models like OpenAI’s GPT family are constantly being improved to produce more accurate results.
