Backslash Security Reveals in New Research that GPT-4.1, Other Popular LLMs Generate Insecure Code Unless Explicitly Prompted

Addressing “vibe coding” security gaps, Backslash to demo its MCP server and built-in rules for securing Agentic IDEs at RSAC 2025

/EIN News/ -- TEL AVIV, Israel, April 24, 2025 (GLOBE NEWSWIRE) -- Backslash Security, the modern application security platform for the AI era, today revealed that the most popular LLMs on the market produce insecure code by default, failing to address the most common weaknesses. When the models are prompted with additional security guidance or governed by rules, security improves greatly, though not equally across the different tools and versions. To address the risks of insecure AI-generated code, Backslash is also announcing the debut of its Model Context Protocol (MCP) Server, along with its Rules and Extension for Agentic IDEs such as Cursor, Windsurf and GitHub Copilot in VS Code.

Backslash Security selected seven current versions of OpenAI’s GPT, Anthropic’s Claude and Google’s Gemini to test how varying prompting techniques influenced their ability to produce secure code. Three tiers of prompting techniques, ranging from “naive” to “comprehensive,” were used to generate code for everyday use cases. Code output was measured by its resilience against 10 Common Weakness Enumeration (CWE) use cases. The results carried a common theme – secure code output success rose with prompt sophistication, but all LLMs generally produced insecure code by default:

In response to simple, “naive” prompts, all LLMs tested generated insecure code vulnerable to at least 4 of the 10 common CWEs. Naive prompts merely asked to generate code for a specific application, without specifying security requirements.
Prompts that generally specified a need for security produced more secure results, while prompts that requested code that complied with Open Web Application Security Project (OWASP) best practices produced superior results, yet both still yielded some code vulnerabilities for 5 out of the 7 LLMs tested.
Prompts that were bound to rules specified by Backslash to address the specific CWEs resulted in code that is secure and not vulnerable to the tested CWEs.
Overall, OpenAI’s GPT-4o had the lowest performance across all prompts, scoring a 1/10 secure code result using “naive” prompts. When prompted to generate secure code, it still produced insecure outputs vulnerable to 8 out of 10 issues. GPT-4.1 didn’t fare much better with naive prompts, scoring 1.5/10.
Among the GenAI tools, the best performer was Claude 3.7 Sonnet, scoring 6/10 using naive prompts and 10/10 with security-focused prompts.
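
The prompt tiers described above can be sketched in code. This is a minimal illustration only, not the prompts Backslash used in its research: the `Tier` names, the prompt wording, and the `build_prompt` helper are all assumptions made for this example.

```python
from enum import Enum


class Tier(Enum):
    """Hypothetical prompt tiers, loosely following the findings above."""
    NAIVE = "naive"      # no security requirement stated
    GENERAL = "general"  # asks for secure code in general terms
    OWASP = "owasp"      # asks for compliance with OWASP best practices


# Illustrative wording only; the research's actual prompts are not reproduced here.
_SECURITY_SUFFIX = {
    Tier.NAIVE: "",
    Tier.GENERAL: " Make sure the code is secure.",
    Tier.OWASP: " Follow OWASP secure coding best practices.",
}


def build_prompt(task: str, tier: Tier) -> str:
    """Return a code-generation prompt for `task` at the given tier."""
    return f"Write code that {task}." + _SECURITY_SUFFIX[tier]


if __name__ == "__main__":
    for tier in Tier:
        print(f"{tier.value}: {build_prompt('stores user comments in a database', tier)}")
```

The only difference between tiers here is the trailing security instruction, which mirrors the finding that the same request yields more or less secure code depending on how much security guidance the prompt carries.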

For AppSec to keep pace with the emerging “vibe coding” paradigm, in which developers tap AI to create code based on “feel” rather than formal planning, application security tools must ensure that LLMs generate safe, secure code. To address the issues revealed by its LLM prompt testing, Backslash is introducing several new features that immediately enable safe vibe coding. By controlling the LLM prompt, the Backslash platform can leverage AI-coding to drive secure code from the get-go, enabling true “security by design” for the first time. Backslash will debut the new capabilities at RSAC 2025:

Backslash AI Rules & Policies: Machine-readable rules (e.g., for Cursor) can be injected into prompts to ensure CWE coverage, while AI policies control which AI rules are active in IDEs via the Backslash platform.
Backslash IDE Extension: IDE integration is key to serving developers where they work. The IDE extension enables developers to receive Backslash security reviews on code written by both humans and AI.
Backslash Model Context Protocol (MCP) Server: The context-aware API conforms to the MCP standard, connecting Backslash to AI tools, enabling secure coding, scanning, and fixes. Through this connection, Backslash can answer questions like: Is this package vulnerable? Does this code expose a vulnerable package? What code needs to change to safely upgrade a package?
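
The idea of policy-controlled rules being injected into a prompt can be sketched as follows. This is an illustration under stated assumptions, not Backslash's implementation or rule format: the `Rule` structure, the rule IDs and texts, the policy dictionary, and the `inject_rules` helper are all hypothetical. The two CWE identifiers are real entries (CWE-89, SQL injection; CWE-79, cross-site scripting) chosen as examples; the release does not say which CWEs were tested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical machine-readable security rule."""
    rule_id: str
    cwe: str
    text: str


def active_rules(rules: list[Rule], policy: dict[str, bool]) -> list[Rule]:
    """Keep only the rules the policy switches on; unknown rules are off."""
    return [r for r in rules if policy.get(r.rule_id, False)]


def inject_rules(prompt: str, rules: list[Rule], policy: dict[str, bool]) -> str:
    """Prepend the active rules to the prompt; return it unchanged if none apply."""
    selected = active_rules(rules, policy)
    if not selected:
        return prompt
    lines = "\n".join(f"- [{r.cwe}] {r.text}" for r in selected)
    return f"Follow these security rules:\n{lines}\n\n{prompt}"


# Example rules, invented for this sketch.
RULES = [
    Rule("sql-params", "CWE-89", "Use parameterized queries for all database access."),
    Rule("html-escape", "CWE-79", "Escape user-supplied data before rendering it in HTML."),
]

if __name__ == "__main__":
    policy = {"sql-params": True, "html-escape": False}
    print(inject_rules("Write a login handler.", RULES, policy))
```

Keeping the policy separate from the rules means an organization could switch individual rules on or off without editing the rule text itself, which is the division of roles the release describes between AI rules and AI policies.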

"For security teams, AI-generated code – or vibe coding – can feel like a nightmare,” said Yossi Pik, co-founder and CTO of Backslash Security. “It creates a flood of new code and brings LLM risks like hallucinations and prompt sensitivity. But with the right controls – like org-defined rules and a context-aware MCP server plugged into a purpose-built security platform – AI can actually give AppSec teams more control from the start. That’s where Backslash comes in, with dynamic policy-based rules, a context-sensitive MCP server, and an IDE extension built for the new coding era."

See Backslash Security’s new blog post for full details about the AI prompt research: https://www.backslash.security/blog/can-ai-vibe-coding-be-trusted.

Meet the Backslash Security team at the RSA Conference to see a live demonstration of Backslash MCP Server at booth ESE-52 from April 28 to May 1, 2025. To schedule a remote demo, sign up at https://www.backslash.security/demo.

About Backslash Security
Backslash Security offers a fresh approach to application security by creating a digital twin of your application, modeled into an AI-enabled App Graph. It filters “triggerable” vulnerabilities, categorizes security findings by business process, secures AI-generated code, and simulates the security impact of updates, using a fully agentless approach. Backslash dramatically improves AppSec efficiency, eliminating the frustration caused by legacy SAST and SCA tools. Forward-looking organizations use Backslash to modernize their application security for the AI era, shorten remediation time, and accelerate time-to-market of their applications. For more information, visit https://backslash.security.

Media Contact:
Jacob Manchester
Scratch Marketing & Media for Backslash
backslash@scratchmm.com
