Enhancing Code Security with Graph-Based Reasoning in Large Language Models

TLDR: GRASP is a new framework that improves the security of code generated by Large Language Models (LLMs). Instead of relying on extensive training or external tools, GRASP uses a structured approach based on Secure Coding Practices (SCPs). It organizes SCPs into a graph and guides LLMs through a reasoning process to apply the relevant security measures. This method achieves security rates above 80%, significantly improves protection against zero-day vulnerabilities, largely maintains functional correctness, and is efficient and model-agnostic.

Large Language Models (LLMs) have significantly changed how software is developed, making coding faster and more efficient. However, this progress comes with a major challenge: the code generated by LLMs often contains security vulnerabilities. For example, studies have shown that tools like GitHub Copilot can produce vulnerable code in about 40% of cases, and even advanced LLMs like GPT-4o generate secure code only 70-75% of the time. Failure rates of this size are too high for practical use, as insecure code can lead to serious issues like data breaches or system compromises.

Current efforts to make LLMs generate more secure code often involve injecting security knowledge through specialized datasets, fine-tuning the models, or using external tools that analyze code for vulnerabilities. While these methods can be effective, they have their drawbacks. They can be very demanding in terms of computing resources, struggle to adapt to new, previously unseen vulnerabilities (known as zero-day vulnerabilities), and are often not usable with proprietary LLMs where access to the model’s internal workings is restricted.

Introducing GRASP: A New Approach to Secure Code Generation

To tackle these limitations, researchers have introduced a novel framework called GRASP. Instead of focusing on adding more security knowledge to LLMs or relying on external feedback, GRASP explores a different path: structured reasoning over Secure Coding Practices (SCPs). The core idea is that LLMs already possess a latent understanding of security concepts; the challenge is to ensure they consistently apply this knowledge during code generation, much like human developers benefit from structured practices.

GRASP is built on two main components:

SCP Graph: This is a Directed Acyclic Graph (DAG) that organizes Secure Coding Practices. Each node in the graph represents an individual SCP, and the connections (edges) between nodes capture important relationships, such as when one practice depends on another or when a general practice has more specific implementations. This structure helps GRASP understand how SCPs relate to each other.
Graph-Based Reasoning: This is a systematic process that guides LLMs through the SCP Graph. It helps the LLM select and apply only the SCPs that are relevant to the specific code generation task, ensuring they are applied in the correct order and context.
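
To make the SCP Graph idea concrete, here is a minimal sketch of a DAG of practices and a dependency-respecting visit order. The practice names, their wording, and the edges are hypothetical examples chosen for illustration, not entries from the actual GRASP graph, and the standard-library topological sort stands in for whatever traversal the framework really uses.

```python
from graphlib import TopologicalSorter

# Hypothetical SCP nodes; identifiers and wording are illustrative only.
SCPS = {
    "validate_input": "Validate all untrusted input before use.",
    "canonicalize_path": "Resolve file paths to canonical form before checking them.",
    "restrict_base_dir": "Reject paths that fall outside an allowed base directory.",
    "avoid_shell": "Invoke subprocesses with an argument list, not a shell string.",
}

# Each key maps to the set of practices it depends on (its predecessors).
DEPENDS_ON = {
    "validate_input": set(),
    "canonicalize_path": {"validate_input"},
    "restrict_base_dir": {"canonicalize_path"},
    "avoid_shell": {"validate_input"},
}


def traversal_order(depends_on):
    """Return SCP ids so that every practice comes after its dependencies.

    TopologicalSorter raises graphlib.CycleError if the graph is not a DAG.
    """
    return list(TopologicalSorter(depends_on).static_order())


order = traversal_order(DEPENDS_ON)
for scp_id in order:
    print(f"{scp_id}: {SCPS[scp_id]}")
```

Storing only the "depends on" relation keeps the structure easy to extend: adding a new practice means adding one node and its predecessor set.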

This design offers several key advantages. GRASP is interpretable, meaning it’s easier to understand why certain security transformations are applied. It’s also model-agnostic, working with various LLMs without needing specific training or access to their internal weights. Furthermore, it’s scalable and particularly effective at addressing previously unseen, zero-day vulnerabilities because it relies on general principles rather than memorized patterns from specific vulnerability datasets.

How GRASP Works in Practice

The process begins with the construction of the SCP Graph. Secure Coding Practices are gathered from authoritative sources like OWASP, CERT, and Microsoft. These practices are then filtered to include only those relevant to common software weaknesses (MITRE Top 25 CWEs) and applicable at the code level. An LLM helps organize these practices into the DAG, identifying dependencies and relationships, with a final human review to ensure accuracy.

Once the SCP Graph is ready, GRASP follows a three-step reasoning process:

Initial Solution Generation: The LLM first generates a basic code solution for the given task, focusing on functionality without immediate security constraints.
Graph Traversal: GRASP then systematically navigates the SCP Graph. For each node (SCP), it evaluates its relevance to the current code. If an SCP is deemed relevant and its dependencies are met, it’s applied to refine the code. This iterative process ensures that security practices are integrated logically and efficiently.
Ensuring Functional Correctness: After applying SCPs, GRASP performs a final review to ensure the modified code still meets the original functional requirements and doesn’t introduce new errors or inconsistencies.
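
The three steps above can be sketched as a small control loop. This is an illustration under stated assumptions, not the authors' implementation: the function name grasp_style_refine, the four callables standing in for LLM calls, and the rule that a practice's dependencies are "met" once all of its predecessors have been applied are all hypothetical. The stubs at the bottom replace real model calls so the loop can be run as-is.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Set


def grasp_style_refine(
    task: str,
    depends_on: Dict[str, Set[str]],
    generate: Callable[[str], str],
    is_relevant: Callable[[str, str], bool],
    apply_scp: Callable[[str, str], str],
    repair: Callable[[str, str], str],
) -> str:
    # Step 1: functionality-first draft, no security constraints yet.
    code = generate(task)
    applied: Set[str] = set()
    # Step 2: visit practices in dependency order; apply only relevant ones.
    for scp in TopologicalSorter(depends_on).static_order():
        # Assumption: dependencies count as met once all predecessors were applied.
        deps_met = depends_on.get(scp, set()) <= applied
        if deps_met and is_relevant(scp, code):
            code = apply_scp(scp, code)
            applied.add(scp)
    # Step 3: final review against the original functional requirements.
    return repair(task, code)


# Hypothetical dependency graph and stub "LLM" callables for demonstration.
DEPENDS_ON = {
    "validate_input": set(),
    "canonicalize_path": {"validate_input"},
    "restrict_base_dir": {"canonicalize_path"},
    "avoid_shell": {"validate_input"},
}


def fake_generate(task: str) -> str:
    return "open(user_path)"


def fake_is_relevant(scp: str, code: str) -> bool:
    return scp != "avoid_shell"  # the draft spawns no subprocess


def fake_apply(scp: str, code: str) -> str:
    return f"{code}  # +{scp}"  # a real system would rewrite the code


def fake_repair(task: str, code: str) -> str:
    return code  # a real system would re-check functionality here


result = grasp_style_refine(
    "read a user-supplied file", DEPENDS_ON,
    fake_generate, fake_is_relevant, fake_apply, fake_repair,
)
print(result)
```

Passing the model interactions in as callables keeps the loop model-agnostic, which mirrors the article's point that the approach needs no access to model weights.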

Impressive Results Across Various LLMs

Evaluations show that GRASP significantly improves the security of LLM-generated code. It consistently achieves Security Rates (SR) exceeding 80% across multiple LLMs, including Claude, GPT-4o, Gemini, and Llama3. For vulnerabilities like Path Traversal (CWE-022) and Command Injection (CWE-078), where base models performed poorly (SRs of 0.2-0.4, that is, 20-40%), GRASP delivered dramatic gains, with GPT-4o improving from 0.24 to 0.87 on CWE-022.

Crucially, GRASP also demonstrates strong generalization to zero-day vulnerabilities, achieving up to 88% improvements over baseline methods. This is a significant advantage over approaches that rely on curated vulnerability datasets, which often fail when encountering new threats.

While there might be a slight trade-off in functional correctness for a single generated sample, GRASP shows substantial gains in functional reliability when considering multiple generated samples (secure-pass@k scores), indicating it can produce both secure and correct code. Moreover, GRASP outperforms other prompting-based methods like Zero-Shot, Plan-and-Solve, and PromSec, offering superior security and functional correctness in a lightweight manner.
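
The article does not spell out how secure-pass@k is computed. The sketch below assumes it follows the standard unbiased pass@k estimator, with a sample counted as a success only when it both passes the functional tests and is judged secure. The function name and the example numbers are illustrative, not results from the paper.

```python
from math import comb


def secure_pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate P(at least one of k drawn samples is secure AND correct).

    n: total samples generated for a task
    c: samples that are both functionally correct and secure
    k: number of samples drawn without replacement (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # 1 - (ways to pick k samples that all fail) / (ways to pick any k samples)
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 3 of 10 samples are secure and correct.
single = secure_pass_at_k(n=10, c=3, k=1)
several = secure_pass_at_k(n=10, c=3, k=5)
print(f"k=1: {single:.3f}, k=5: {several:.3f}")
```

Under this assumed definition, a modest single-sample rate can still give a high chance of at least one secure and correct result once several samples are drawn, which is consistent with the multi-sample gains described above.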

Efficiency and Future Directions

GRASP is also efficient and cost-effective. Despite requiring more input tokens due to the inclusion of SCP context, it generates more concise output and achieves a slightly lower monetary cost on average compared to other methods, thanks to its efficient use of reasoning iterations.

The framework is designed to be extensible, allowing new SCPs to be easily integrated into the graph as security landscapes evolve, without requiring model retraining. This adaptability ensures that GRASP can remain responsive to emerging threats and updated guidelines. You can read the full paper for more details at arxiv.org/pdf/2510.09682.
