Claude vs Gemini Across 4 Security Domains: A Dead Heat — and the Hardening 63% of AI Code Skips
In a test across four security domains, scored with ESLint security plugins whose rules map to CWEs, Claude Sonnet 4.6 and Gemini 2.5 Flash produced code with nearly identical security gaps. The scoreboard: one Gemini win, two ties, one split. More importantly, 63% of the 700 AI-generated functions examined contained vulnerabilities. The author argues that asking which model writes more secure code is the wrong question. The real problem is that developers typically prompt for features without any security hints, and code review then fails to catch the missing hardening steps. The plugins used are designed to catch exactly these bugs, and the results show both models skipping the same hardening steps, so AI-generated auth middleware that has already shipped likely carries the same gaps.
63% of AI-generated functions ship vulnerabilities, and code review misses them.
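To make "skipped hardening" concrete, here is a minimal sketch of one gap a feature-only prompt can leave in auth code. The summary does not list which steps the models skipped, so the choice of example (timing-safe token comparison, CWE-208) is an assumption, and the function names `naiveTokenCheck` and `hardenedTokenCheck` are hypothetical. It assumes a Node.js runtime and uses only the built-in `node:crypto` module.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Feature-only version (hypothetical): correct on the happy path, but `===`
// can return as soon as a character differs, so response time may leak how
// much of the token matched (CWE-208).
function naiveTokenCheck(presented: string, expected: string): boolean {
  return presented === expected;
}

// Hardened version (hypothetical): hash both values to fixed-length digests
// first, because timingSafeEqual throws when its inputs differ in length,
// then compare the digests in constant time.
function hardenedTokenCheck(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}
```

Both functions return the same boolean for every input, which is why a review focused on behavior tends to pass the first version. The difference only shows up to a linter rule or reviewer looking specifically for the comparison pattern.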