Agentic AI for Smart Contract Security Audits: Introducing POCO

TLDR: POCO is an AI-powered framework that automates the creation of Proof-of-Concept (PoC) exploits for smart contract vulnerabilities. It takes natural-language descriptions from auditors and generates executable exploits using an agentic Reason-Act-Observe loop and specialized tools. Evaluated on real-world vulnerability reports, POCO outperforms prompting and workflow baselines, producing more well-formed and logically correct PoCs, which streamlines security audits and helps developers fix issues faster.

Smart contracts, self-executing agreements deployed on a blockchain, operate in a high-risk environment where a single vulnerability can cause substantial financial losses. As of October 2025, on-chain exploits have resulted in approximately $15 billion in losses. Security audits are therefore a critical part of the smart contract lifecycle, and within these audits, Proof-of-Concept (PoC) exploits play a vital role: they demonstrate that reported vulnerabilities are genuine, reproducible, and actionable, giving stakeholders concrete evidence.

However, the manual creation of these PoC exploits is often time-consuming, prone to errors, and constrained by tight audit schedules. This is where POCO comes in.

Introducing POCO: Agentic Exploit Generation

POCO is an agentic framework that automatically generates executable PoC exploits. It takes natural-language vulnerability descriptions written by auditors and autonomously crafts exploits compatible with the Foundry testing framework. Because they target Foundry, the exploits can be integrated directly into audit reports and other security tools, reducing the effort needed to produce high-quality PoCs during smart contract audits.

The core of POCO’s functionality lies in its agentic architecture, which employs a Reason–Act–Observe loop. This allows POCO to interact with a set of code-execution tools, making decisions, performing actions, and learning from the observed outcomes. It’s a self-correcting system that can iteratively refine its generated exploits until they are well-formed and logically correct.
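
To make the Reason–Act–Observe loop concrete, here is a minimal Python sketch. It is not POCO's implementation: the names Step, run_agent, scripted_policy, and fake_run_test are hypothetical, and a scripted policy stands in for the language model so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# All names here are hypothetical illustrations, not POCO's actual interfaces.
@dataclass
class Step:
    thought: str            # Reason: why the agent chose this action
    action: str             # Act: name of the tool to call, or "finish"
    argument: str           # input passed to the tool
    observation: str = ""   # Observe: what the tool returned


def run_agent(
    policy: Callable[[List[Step]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> List[Step]:
    """Loop until the policy decides to finish or the step budget runs out."""
    history: List[Step] = []
    for _ in range(max_steps):
        step = policy(history)                  # Reason over everything seen so far
        if step.action == "finish":
            history.append(step)
            break
        tool = tools.get(step.action)
        if tool is None:
            step.observation = f"unknown tool: {step.action}"
        else:
            step.observation = tool(step.argument)  # Act, then record the observation
        history.append(step)
    return history


def scripted_policy(history: List[Step]) -> Step:
    """Stand-in for an LLM: keep running the test until it reports a pass."""
    if history and history[-1].observation == "PASS":
        return Step("The exploit test passes, so stop.", "finish", "")
    attempt = len(history) + 1
    return Step(f"Attempt {attempt}: run the PoC test.", "run_test", str(attempt))


def fake_run_test(attempt: str) -> str:
    """Pretend the first two attempts fail to compile."""
    return "PASS" if int(attempt) >= 3 else "COMPILE_ERROR"


trace = run_agent(scripted_policy, {"run_test": fake_run_test})
for s in trace:
    print(s.action, s.observation)
```

In this toy run the agent sees two compilation errors, retries, and stops once the test passes. In a real agent the policy would be a model call that reads the error output and edits the exploit before trying again.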

How POCO Works

POCO operates with a set of specialized tools:

Basic Tools: For exploring and modifying its working environment, including file-system search, reading, and writing.
Planning Tool: A lightweight utility to track progress and manage tasks.
Smart-contract Tools: Specifically designed for Solidity contracts, these include tools to compile contracts and execute generated PoC exploits using Foundry’s Forge framework. This provides crucial feedback for POCO to self-correct.
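
The smart-contract tools can be pictured as thin wrappers around the Forge command line. The sketch below is an assumption about how such wrappers might look, not POCO's code: the function names are hypothetical, and only the standard `forge build` and `forge test --match-path` invocations are used.

```python
import subprocess
from typing import List, Tuple


def forge_build_cmd() -> List[str]:
    """Command that compiles the Foundry project in the current directory."""
    return ["forge", "build"]


def forge_test_cmd(poc_path: str) -> List[str]:
    """Command that runs only the given PoC test file, with verbose traces."""
    return ["forge", "test", "--match-path", poc_path, "-vvv"]


def run_tool(cmd: List[str], project_dir: str, timeout_s: int = 300) -> Tuple[bool, str]:
    """Run a command and return (succeeded, combined output) as agent feedback."""
    try:
        proc = subprocess.run(
            cmd, cwd=project_dir, capture_output=True, text=True, timeout=timeout_s
        )
    except FileNotFoundError:
        # Raised when the executable (or the working directory) does not exist.
        return False, f"could not start {cmd[0]}: not found"
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s}s"
    # A non-zero exit code covers both compilation errors and failing tests.
    return proc.returncode == 0, proc.stdout + proc.stderr
```

Returning the raw compiler or test output, rather than only a pass/fail flag, is what gives the agent something to reason about on its next step.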

All operations are conducted within isolated Docker containers. This not only ensures reproducibility but also mitigates potential security risks, preventing the agent from interacting with real on-chain contracts or accessing sensitive host data.
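
One way to get this kind of isolation is to wrap every tool call in a throwaway container with networking disabled. The sketch below only builds the `docker run` command. The image name poc-sandbox:latest and the /workspace mount point are hypothetical, and disabling the network is an assumption about how on-chain access could be blocked, not a description of POCO's actual container setup.

```python
from typing import List


def sandboxed_command(
    project_dir: str,
    inner_cmd: List[str],
    image: str = "poc-sandbox:latest",  # hypothetical image with Foundry preinstalled
) -> List[str]:
    """Build a docker invocation that runs inner_cmd inside an isolated container."""
    return [
        "docker", "run",
        "--rm",                            # discard the container after each run
        "--network", "none",               # no network, so no live chain or RPC access
        "-v", f"{project_dir}:/workspace", # expose only the audit project directory
        "-w", "/workspace",                # run the command from the project root
        image,
        *inner_cmd,
    ]


cmd = sandboxed_command("/tmp/audit-project", ["forge", "build"])
print(" ".join(cmd))
```

Mounting only the project directory keeps the rest of the host file system out of the agent's reach, and a fresh container per run helps keep results reproducible.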

Evaluation and Impact

The researchers evaluated POCO on a dataset of 23 real-world vulnerability reports, comparing it against prompting and workflow baselines. POCO consistently outperformed both baselines, generating more well-formed and logically correct PoCs. For instance, while the prompting baselines struggled with compilation failures, POCO’s agentic approach produced 50 well-formed PoC exploits across the evaluated models.

A key aspect of the evaluation was assessing the logical correctness of the generated PoCs. This was done by executing the PoCs against the corresponding security patches. A PoC was deemed correct if it successfully demonstrated the vulnerability on the unpatched code and was then prevented by the ground-truth mitigation patch. POCO demonstrated the highest number of logically correct exploits, proving its ability to identify genuine attack paths.
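
The correctness criterion above reduces to a two-run check: the PoC must succeed on the vulnerable code and fail once the patch is applied. A minimal sketch of that decision follows; the Verdict enum and judge_poc function are hypothetical names, not part of POCO.

```python
from enum import Enum


class Verdict(Enum):
    CORRECT = "exploit works before the patch and is blocked after it"
    NOT_TRIGGERED = "exploit does not succeed on the vulnerable code"
    NOT_MITIGATED = "exploit still succeeds on the patched code"


def judge_poc(passes_on_vulnerable: bool, passes_on_patched: bool) -> Verdict:
    """Classify a PoC from the outcome of running it against both code versions."""
    if not passes_on_vulnerable:
        # The PoC never demonstrates the issue, so the patched run is irrelevant.
        return Verdict.NOT_TRIGGERED
    if passes_on_patched:
        # The PoC passes regardless of the fix, so it likely tests something else.
        return Verdict.NOT_MITIGATED
    return Verdict.CORRECT


print(judge_poc(passes_on_vulnerable=True, passes_on_patched=False).name)
```

The second run is what separates a genuine exploit from a test that merely compiles and passes: a PoC that still succeeds after the ground-truth patch is not exercising the reported vulnerability.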

The study also explored the impact of vulnerability annotation detail on POCO’s performance. It found that more detailed, procedural annotations generally led to higher success rates in PoC generation. This highlights a crucial takeaway for security auditors: providing comprehensive descriptions of vulnerabilities significantly enhances the likelihood of obtaining accurate and effective PoCs, even with advanced AI tools.

In conclusion, POCO represents a significant advancement in smart contract security. By automating the generation of PoC exploits, it addresses a critical bottleneck in the auditing process, giving auditors verifiable evidence and developers actionable test cases for understanding and fixing security flaws more efficiently. The framework promises to improve smart contract security in a practical and cost-effective manner. Full details are available in the research paper.
