TLDR: Researchers at NYU Tandon School of Engineering, in collaboration with NYU Abu Dhabi and other universities, have developed EnIGMA, an AI agent capable of autonomously solving complex cybersecurity challenges. This breakthrough system, presented at ICML 2025, leverages Large Language Models (LLMs) and specialized ‘Interactive Agent Tools’ to tackle real-world vulnerabilities, marking a significant advancement in AI’s application to cybersecurity.
A team of researchers from NYU Tandon School of Engineering, NYU Abu Dhabi, and other academic institutions has introduced EnIGMA, an innovative AI agent designed to autonomously address intricate cybersecurity challenges. The system, which was showcased at the International Conference on Machine Learning (ICML) 2025 in Vancouver, Canada, represents a substantial leap forward in the application of artificial intelligence to cybersecurity.
Traditionally, AI systems have shown strong capabilities in areas like software development and web navigation, but their effectiveness in cybersecurity has been limited. EnIGMA aims to change this by utilizing Large Language Model (LLM) agents for cybersecurity applications. Meet Udeshi, a Ph.D. student at NYU Tandon and co-author of the research, explained that EnIGMA’s core innovation lies in its ability to integrate LLMs with specialized cybersecurity tools.
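The integration pattern described above — an LLM proposing shell commands, executing them, and observing the results — can be sketched as a minimal agent loop. This is purely illustrative, not EnIGMA's actual control loop, and `query_llm` is a hypothetical stand-in for a real model call.

```python
import subprocess

def query_llm(history):
    """Hypothetical stand-in for a real LLM call: given the interaction
    history, return the next shell command (or 'exit' to stop)."""
    return "exit" if history else "echo hello from the sandbox"

# Observe-act loop: run each proposed command and feed the output back.
history = []
while True:
    action = query_llm(history)
    if action == "exit":
        break
    out = subprocess.run(action, shell=True, capture_output=True, text=True)
    history.append((action, out.stdout.strip()))

print(history[-1][1])  # the observation that would be fed back to the model
```

In a real agent, each command's output is appended to the model's context so subsequent commands can build on what was learned.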
The development of EnIGMA involved adapting an existing framework called SWE-agent, originally designed for software engineering tasks. A key challenge was enabling LLMs, which primarily process text, to interact with traditional cybersecurity tools that often feature graphical user interfaces (GUIs) with visual displays and interactive elements. To overcome this, the researchers developed ‘Interactive Agent Tools’ that convert these visual programs into text-based formats comprehensible to the AI.
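The idea of an interactive agent tool can be sketched as a text-only bridge around an interactive subprocess: the agent sends a plain-text command and gets plain text back, regardless of how the underlying tool normally presents itself. The sketch below is illustrative, not EnIGMA's implementation; a tiny inline child process stands in for a real tool such as a debugger.

```python
import subprocess
import sys

# A tiny interactive child process standing in for a real tool (e.g. a
# debugger). It reads commands on stdin and writes responses on stdout.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    cmd = line.strip()\n"
    "    if cmd == 'quit':\n"
    "        break\n"
    "    print(f'result: {cmd.upper()}', flush=True)\n"
)

class InteractiveTool:
    """Wraps an interactive subprocess behind a text-only interface
    that an LLM agent can drive one command at a time."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )

    def send(self, command: str) -> str:
        """Send one command, return one line of output as plain text."""
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        self.proc.stdin.write("quit\n")
        self.proc.stdin.flush()
        self.proc.wait()

tool = InteractiveTool([sys.executable, "-c", CHILD])
print(tool.send("disassemble main"))  # prints "result: DISASSEMBLE MAIN"
tool.close()
```

The key design choice is that the tool's state persists across calls, so the agent can hold a multi-step session (set a breakpoint, run, inspect memory) entirely through text.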
Furthermore, the team created a unique dataset by structuring Capture The Flag (CTF) challenges specifically for large language models. CTFs are gamified cybersecurity competitions that simulate real-world vulnerabilities and are typically used to train human cybersecurity professionals. EnIGMA has demonstrated remarkable success in these challenges, reportedly solving 390 CTF challenges across four benchmarks, three times as many as previous systems.
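Structuring a CTF challenge for an LLM means turning its assets and rules into a machine-readable record that can be rendered into a task prompt. The schema below is a hypothetical illustration; the field names and challenge are invented, and the actual benchmark format may differ.

```python
# Hypothetical CTF challenge record and prompt renderer (illustrative only;
# the real benchmark schema is not shown in this article).
challenge = {
    "name": "baby_rev",
    "category": "reverse engineering",
    "description": "Recover the flag hidden in the provided binary.",
    "files": ["baby_rev.bin"],
    "flag_format": "flag{...}",
}

def to_prompt(task: dict) -> str:
    """Render a challenge record as a plain-text task prompt for an agent."""
    return (
        f"[{task['category']}] {task['name']}\n"
        f"{task['description']}\n"
        f"Files: {', '.join(task['files'])}\n"
        f"Submit the flag in the format {task['flag_format']}."
    )

print(to_prompt(challenge))
```

A uniform record like this lets the same agent loop be evaluated across hundreds of challenges from different benchmarks without per-challenge prompt engineering.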
This breakthrough has significant implications for both developers and businesses. Developers can now leverage EnIGMA’s framework to integrate domain-specific tools into LLM workflows, while businesses gain a scalable solution for proactive threat detection and autonomous vulnerability assessments. The system’s ability to autonomously test for weaknesses could reduce reliance on manual penetration testing, thereby enhancing cyber resilience.
However, the rise of powerful AI agents like EnIGMA also brings forth new considerations regarding ethical guardrails and potential misuse. The research highlights the need for careful guidance and foresight in deploying such systems to ensure they strengthen digital security responsibly. As AI agents continue to evolve, they are poised to redefine human-machine collaboration in cybersecurity, moving beyond passive models to proactive, goal-driven systems capable of complex task execution.