TLDR: The paper demonstrates that AI, specifically the AlphaEvolve LLM coding agent, can make significant discoveries in complexity theory. It achieved improved bounds for average-case hardness of MAX-CUT and MAX-Independent Set by constructing large Ramanujan graphs, and new inapproximability results for worst-case MAX-k-CUT by discovering novel gadget reductions. A key innovation was using AlphaEvolve itself to accelerate the verification of these complex combinatorial structures by up to 10,000 times, overcoming a major computational bottleneck.
A groundbreaking study explores the potential of artificial intelligence to uncover novel combinatorial structures, leading to significant advancements in complexity theory. The research, conducted by Ansh Nagda, Prabhakar Raghavan, and Abhradeep Thakurta, demonstrates how an AI coding agent named AlphaEvolve can push the boundaries of what’s provable in the realm of efficient algorithms.
The paper tackles two major areas within complexity theory, providing affirmative evidence that AI can indeed make non-trivial discoveries. The first area focuses on the average-case hardness for problems like MAX-CUT and MAX-Independent Set on specific types of random graphs. By leveraging AlphaEvolve, the researchers improved upon previous results, achieving near-optimal upper and conditional lower bounds. This was accomplished by constructing remarkably large and complex Ramanujan graphs, some with as many as 163 nodes, a scale previously unattainable by traditional methods. These graphs are crucial because their properties directly influence the computational hardness of these problems.
The second area of focus is the worst-case hardness of approximation for MAX-k-CUT problems, specifically for k=3 and k=4. Here, AlphaEvolve was instrumental in discovering new ‘gadget reductions.’ These gadgets are finite combinatorial structures that help translate the hardness of one problem (3LIN(k)) to another (MAX-k-CUT). The AI agent’s discoveries led to new inapproximability results, improving the state-of-the-art for MAX-4-CUT from 0.9883 to 0.987, and for gadget-based MAX-3-CUT from 0.9853 to 0.9649. These improvements mean that it is even harder than previously thought to find approximate solutions for these problems within certain factors.
A significant technical hurdle faced during the research was the immense computational cost of verifying the candidate constructions generated by AlphaEvolve. These verification procedures often required exponential time, posing a major bottleneck. In a remarkable display of AI’s capabilities, the researchers used AlphaEvolve itself to evolve and accelerate its own verification procedures. This led to speedups of up to 10,000 times, enabling the exploration of much larger and more complex structures. The team ensured the correctness of these faster verifiers through a combination of synthetic dataset checks and a separate LLM acting as a judge.
AlphaEvolve: A Framework for Discovery
The core of this work lies in AlphaEvolve, an LLM coding agent designed for scientific and algorithmic discovery. It operates on a ‘propose-test-refine’ (PTR) paradigm. In this framework, AlphaEvolve iteratively modifies a code snippet that generates combinatorial structures. An evaluation function, or ‘verifier,’ then scores these generated structures based on specific criteria. Finally, a large language model suggests new modifications to the code snippet, aiming to improve the fitness score. The efficiency of this process heavily relies on how quickly the verifier can assess the quality of a construction, highlighting why the speedup in verification was so critical.
Also Read:
- Language Models Tackle Complex Optimization Challenges End-to-End
- CogAtom: Building Advanced Math Problems to Elevate AI Reasoning
AI’s Role in Mathematical Research
The paper also delves into a broader discussion about the norms for assessing AI’s assistance in mathematical sciences. The authors categorize AI’s role into several scenarios, with their work falling under the category of ‘human uses AI-derived tools to generate better proof elements.’ They emphasize that while LLMs can summarize existing literature, directly generating complex proofs for unproven statements has seen limited success. Crucially, the combinatorial structures discovered by AlphaEvolve in this study come with computationally verifiable certificates of correctness, eliminating the need for human scrutiny of the AI’s output itself, provided the verification code is sound.
The scale of the problems tackled, such as finding Ramanujan graphs with 163 nodes or complex gadget reductions, was beyond the reach of simple ‘pencil-and-paper’ methods or even state-of-the-art traditional solvers like Mixed-Integer Programming (MIP) due to the exponential growth in complexity. This underscores the unique contribution of AlphaEvolve in exploring vast search spaces and discovering structures that would otherwise remain elusive.
This research provides compelling evidence that AI models with agentic feedback can generate intricate mathematical constructions, sometimes exhibiting signs of reasoning. It opens new avenues for exploring problems in extremal combinatorics and suggests a future where AI plays an increasingly integral role in accelerating scientific and mathematical discovery. For more details, you can read the full research paper here.


