TLDR: Constrained Adaptive Rejection Sampling (CARS) is a new algorithm for sampling language-model outputs that satisfy strict semantic or syntactic constraints. It uniquely combines the fidelity of rejection sampling (producing samples from the exact constrained distribution) with adaptive efficiency. CARS achieves this by recording invalid output prefixes and pruning them from future attempts, significantly reducing wasted computation and steadily improving acceptance rates. It has demonstrated superior performance in program fuzzing, molecular synthesis, and PDDL planning, offering a practical way to generate outputs that are both diverse and valid.
Language models (LMs) are becoming increasingly powerful, but their outputs often need to adhere to strict rules, whether for programming languages, data formats, or scientific applications. This challenge, known as constrained generation, has long posed a dilemma: either sample exactly from the constrained distribution at great computational expense, or generate valid outputs quickly at the cost of inaccuracy and bias.
Traditional methods have struggled with this trade-off. Rejection Sampling (RS), for instance, guarantees outputs that exactly match the desired distribution, but it is highly inefficient, often discarding the vast majority of generated content. Greedy Constrained Decoding (GCD) is fast because it enforces validity during generation, but it distorts the LM’s original probability distribution, which can lead to suboptimal results. Other approximate methods promise asymptotic accuracy but can produce biased samples early on and offer no clear stopping criterion.
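To make GCD’s bias concrete, here is a toy calculation. This is a hypothetical two-token model invented for illustration, not an example from the paper:

```python
from itertools import product

# Toy model: sequences of length 2 over tokens {"a", "b"}, with
# independent per-token probabilities. Constraint: "bb" is invalid.
p_tok = {"a": 0.1, "b": 0.9}
valid = lambda s: s != ("b", "b")
seqs = list(product("ab", repeat=2))

# Exact constrained distribution p(x | x valid): one global
# renormalization. This is what rejection sampling (and CARS) targets.
z = sum(p_tok[a] * p_tok[b] for a, b in seqs if valid((a, b)))
exact = {s: (p_tok[s[0]] * p_tok[s[1]] / z if valid(s) else 0.0) for s in seqs}

# GCD: renormalize over the locally allowed tokens at each step.
# (Both first tokens can still lead to a valid sequence, so step one
# needs no renormalization.) After a first "b", only "a" remains
# allowed and gets probability 1, inflating p("ba").
gcd = {}
for a, b in seqs:
    if not valid((a, b)):
        gcd[(a, b)] = 0.0
        continue
    second_allowed = [t for t in "ab" if valid((a, t))]
    gcd[(a, b)] = p_tok[a] * p_tok[b] / sum(p_tok[t] for t in second_allowed)

for s in seqs:
    print("".join(s), f"exact={exact[s]:.3f}", f"gcd={gcd[s]:.3f}")
# "ba": exact=0.474 but gcd=0.900 -- GCD over-weights prefixes whose
# invalid continuations were masked away.
```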
Introducing Constrained Adaptive Rejection Sampling (CARS)
A new approach, Constrained Adaptive Rejection Sampling (CARS), aims to resolve this fundamental tension. Developed by researchers from the University of Warsaw and the University of California San Diego, CARS offers a method that strictly improves the efficiency of rejection sampling without compromising the fidelity of the LM’s distribution. This means it generates samples that are both valid and truly representative of the language model’s underlying probabilities, conditioned on the given constraints.
How CARS Works
CARS builds on the idea of Adaptive Rejection Sampling but goes a significant step further. It begins by sampling from the language model without any constraints. As it generates sequences, it actively identifies and records any partial outputs (prefixes) that are guaranteed to violate the constraints. These invalid prefixes are stored in a data structure called a trie. Crucially, CARS then subtracts the probability mass associated with these invalid prefixes from future sampling attempts. This adaptive pruning ensures that the system never wastes computation by revisiting paths that have already been proven invalid. As a result, the acceptance rate of valid samples monotonically improves over time, and the final samples precisely follow the constrained distribution.
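The core loop can be sketched as follows. This is a minimal illustrative implementation, not the authors’ code: the trie node fields (`cond_prob`, `pruned`), the helper `prune`, and the interfaces `next_probs`, `dead`, and `valid` are all assumptions made for the sketch.

```python
import random
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    children: dict = field(default_factory=dict)  # token -> Node
    cond_prob: float = 0.0    # p(token | parent prefix), cached from the LM
    pruned: float = 0.0       # fraction of this node's conditional mass proven invalid
    parent: Optional["Node"] = None

def prune(node: Node) -> None:
    """Mark a prefix as dead and propagate the lost probability mass up the trie."""
    node.pruned = 1.0
    n = node.parent
    while n is not None:
        # Unvisited children are absent from the dict and contribute zero.
        n.pruned = sum(c.cond_prob * c.pruned for c in n.children.values())
        n = n.parent

def cars_sample(next_probs, dead, valid, root: Node, max_len: int = 32):
    """One CARS sampling attempt (assumed interfaces):
    next_probs(prefix) -> {token: p(token | prefix)}  (stand-in for the LM),
    dead(prefix)       -> True iff the prefix has no valid completion,
    valid(seq)         -> True iff the finished sequence satisfies the constraint.
    Returns an accepted sequence, or None on rejection -- in which case the
    offending prefix is pruned and will never be sampled again."""
    prefix, node = [], root
    for _ in range(max_len):
        probs = next_probs(prefix)
        # Reweight each token by its surviving (non-pruned) mass. This keeps
        # the proposal exactly proportional to the LM's distribution restricted
        # to prefixes not yet proven invalid, so accepted samples follow the
        # exact constrained distribution. (Assumes the constraint is
        # satisfiable, i.e. root.pruned < 1.)
        weights = {t: p * (1.0 - node.children[t].pruned)
                      if t in node.children else p
                   for t, p in probs.items()}
        tokens = list(weights)
        token = random.choices(tokens, weights=[weights[t] for t in tokens])[0]
        if token not in node.children:
            node.children[token] = Node(cond_prob=probs[token], parent=node)
        child = node.children[token]
        prefix.append(token)
        if token == "<eos>":
            if valid(prefix[:-1]):
                return prefix[:-1]   # accepted: an exact constrained sample
            prune(child)             # complete but invalid: prune and reject
            return None
        if dead(prefix):
            prune(child)             # provably dead prefix: prune and reject
            return None
        node = child
    return None                      # length cap reached; simply reject
```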
Imagine trying to generate a valid arithmetic expression like “1+0+1”. If the model initially tries to generate “0++”, CARS would identify “0++” as an invalid prefix. In subsequent generations, if the model reaches “0+”, CARS would prevent it from choosing another “+” token, effectively guiding it towards valid continuations.
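Using the sketch above, that arithmetic example plays out as follows. The uniform toy “LM” and the grammar checks are, again, illustrative assumptions rather than anything from the paper:

```python
random.seed(0)
TOKENS = ["0", "1", "+", "<eos>"]

def next_probs(prefix):
    # Stand-in for an LM forward pass: uniform over the toy vocabulary.
    return {t: 1.0 / len(TOKENS) for t in TOKENS}

def dead(prefix):
    # A prefix is dead iff it already breaks "digit (+ digit)*":
    # positions must alternate digit, '+', digit, '+', ...
    for i, t in enumerate(prefix):
        if i % 2 == 0 and t not in "01":
            return True
        if i % 2 == 1 and t != "+":
            return True
    return False

def valid(seq):
    # A complete expression must be nonempty and end on a digit, e.g. "1+0+1".
    return len(seq) % 2 == 1 and not dead(seq)

root = Node()
accepted = []
for _ in range(20):
    s = cars_sample(next_probs, dead, valid, root)
    if s is not None:
        accepted.append("".join(s))
print(accepted)
# The first time "0+" is extended with "+", the prefix "0++" is pruned:
# its trie node gets pruned = 1.0, so the "+" continuation of "0+"
# receives zero weight in every later attempt.
```

Because each rejection permanently removes probability mass from the proposal, the acceptance rate can only improve across attempts, which is exactly the monotonicity property described above.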
Key Advantages and Applications
The research paper highlights several key contributions of CARS:
- Exactness and Efficiency: CARS is the only exact method that achieves practical efficiency, making it feasible for real-world applications where other exact methods fail due to high computational cost.
- Improved Sample Diversity: By preserving the LM’s true distribution, CARS tends to produce a wider variety of valid samples compared to approximate methods.
- Lower Amortized Cost: Over many samples, the average number of LM forward passes required per valid output is significantly reduced.
CARS has been evaluated across a variety of domains, demonstrating its effectiveness:
- Program Fuzzing: In tasks like generating SQLite regression test files or XML parsing inputs, CARS-generated seeds led to higher branch coverage in fuzzers, indicating better exploration of code paths. This is crucial for finding bugs in software.
- Molecular Synthesis: For generating valid SMILES strings (a chemical language) for drug discovery, CARS showed substantial improvements in sampling efficiency while maintaining high molecular diversity and validity, which are essential for exploring chemical space.
- PDDL Planning: In generating action plans for complex tasks, CARS proved more efficient than other exact methods, sometimes making otherwise intractable sampling problems feasible.
In summary, CARS represents a significant advancement in constrained language model generation. By combining the fidelity of exact sampling with adaptive efficiency, it provides a powerful tool for applications where both validity and diversity of generated outputs are paramount, from software testing to drug discovery.