AI-Powered SQL Rewriting for Better Database Efficiency

TLDR: E3-Rewrite is a new framework that uses large language models (LLMs) to automatically rewrite SQL queries. Unlike traditional rule-based methods, E3-Rewrite learns to generate queries that are executable, semantically equivalent to the original, and more efficient. It does this by incorporating execution plan insights, using reinforcement learning with a staged training strategy, and drawing on a library of successful past rewrites. In experiments on several SQL benchmarks, it reduced query execution time by up to 25.6% compared to state-of-the-art methods and delivered up to 24.4% more successful rewrites.

In the world of databases, efficient query processing is paramount. SQL query rewriting is a technique used to transform a given SQL query into a more efficient form while ensuring it produces the exact same results. Traditionally, this has been done using predefined rules, but these rule-based methods often struggle with complex queries and new patterns, and they can’t capture all effective rewriting strategies.
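To make "the exact same results" concrete, here is a minimal sketch that compares an original query and a rewritten one on a small in-memory SQLite database. The tables, the sample queries, and the `results_match` helper are all illustrative assumptions, not taken from the paper. Matching results on one database instance is only evidence of equivalence, not proof, and whether the rewritten form is actually faster depends on the engine and the data.

```python
import sqlite3


def results_match(conn, original_sql, rewritten_sql):
    """Compare two queries as multisets of rows on one database instance.

    Hypothetical helper: rows are sorted so that row order is ignored.
    Sorting would need extra care if the results contained NULLs.
    """
    rows_a = sorted(conn.execute(original_sql).fetchall())
    rows_b = sorted(conn.execute(rewritten_sql).fetchall())
    return rows_a == rows_b


# Illustrative schema and data (assumptions made for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bo'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 20.0), (12, 3, 75.0);
""")

ORIGINAL = """
    SELECT name FROM customers
    WHERE id IN (SELECT customer_id FROM orders WHERE total > 30)
"""

# Same question expressed with EXISTS instead of IN.
REWRITTEN = """
    SELECT name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o
                  WHERE o.customer_id = c.id AND o.total > 30)
"""

# A tempting but wrong rewrite: the join drops the filter and
# repeats a customer once per matching order.
NOT_EQUIVALENT = """
    SELECT c.name FROM customers c
    JOIN orders o ON o.customer_id = c.id
"""

print(results_match(conn, ORIGINAL, REWRITTEN))
print(results_match(conn, ORIGINAL, NOT_EQUIVALENT))
```

The third query shows why equivalence has to be checked rather than assumed: it looks like a plausible simplification but returns different rows.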

The Challenge of SQL Rewriting

The limitations of rule-based systems are clear: fixed rules don’t adapt well, rule dependencies create a fragile search space, and many powerful rewriting strategies (like those involving Common Table Expressions or specific evaluation orders) fall outside their scope. While Large Language Models (LLMs) show promise for generating rewrites, directly applying them often leads to queries that are not optimal, don’t produce equivalent results, or even fail to execute due to a lack of understanding of how databases actually run queries.
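As a rough illustration of what "how databases actually run queries" means, the sketch below asks SQLite for a query plan and picks out steps that look like full table scans. It uses SQLite's `EXPLAIN QUERY PLAN` through Python's standard `sqlite3` module. The `plan_hint` helper, the table, and the index name are assumptions made for this example, and the plan text that SQLite prints is meant for humans and can vary between versions, so the string matching here is only a heuristic.

```python
import sqlite3


def plan_hint(conn, sql):
    """Summarize plan steps that look like full scans (hypothetical helper).

    EXPLAIN QUERY PLAN returns rows of (id, parent, notused, detail);
    the human-readable description is in the fourth column.
    """
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    details = [row[3] for row in rows]
    # Heuristic: SQLite describes a pass over a whole table as "SCAN ...",
    # and an indexed lookup as "SEARCH ...".
    scans = [d for d in details if d.startswith("SCAN")]
    if not scans:
        return "No full scans found in the plan."
    return "Possible bottlenecks: " + "; ".join(scans)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

QUERY = "SELECT total FROM orders WHERE customer_id = 7"

# Without an index, the filter on customer_id forces a scan of the table.
before = plan_hint(conn, QUERY)

# With an index on the filtered column, the plan switches to a search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan_hint(conn, QUERY)

print(before)
print(after)
```

A short text summary like this is the kind of information a model cannot infer from the SQL string alone.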

Introducing E3-Rewrite: A New Approach

To overcome these challenges, researchers from Soochow University, Hong Kong University of Science and Technology, Zhejiang Normal University, ByteDance Inc., Alibaba Group, Southeast University, and University of Electronic Science and Technology of China have proposed a framework called E3-Rewrite. This LLM-based system is designed to produce SQL queries that are Executable, Equivalent, and Efficient. The full research paper is titled "E3-Rewrite: Learning to Rewrite SQL for Executability, Equivalence, and Efficiency."

E3-Rewrite moves beyond fixed rules by training an LLM to directly generate optimized SQL rewrites. It tackles the core issues of LLMs lacking execution awareness and semantic grounding, and the instability of optimizing for multiple, sometimes conflicting, objectives like correctness and performance.

How E3-Rewrite Works

The framework integrates three core components:

Execution-Guided Context Construction: E3-Rewrite doesn’t just look at the SQL query. It first analyzes the query’s execution plan – essentially, how the database intends to run the query. This plan reveals inefficiencies like full table scans or unindexed joins. This ‘execution hint’ is then fed to the LLM, guiding it to identify and fix performance bottlenecks.
Reinforcement Learning Framework: Instead of relying on predefined rules, E3-Rewrite uses a reinforcement learning (RL) approach. The LLM generates multiple candidate rewrites, which are then evaluated based on three criteria: executability (does it run?), equivalence (does it produce the same result?), and efficiency (is it faster?). A reward function combines these factors, and the model learns to generate better rewrites through this feedback. To ensure stable learning, a two-stage curriculum is used: first, the model focuses on generating correct (executable and equivalent) queries, and then it gradually incorporates efficiency optimization.
Hybrid Demonstration Retrieval: To help the LLM generalize to new queries, E3-Rewrite maintains a pool of past successful rewrites. When a new query comes in, the system retrieves similar examples from this pool based on both their structural patterns (how the query is built) and their semantic meaning. If a new rewrite significantly improves performance, it’s added to this pool, allowing the system to continuously learn and improve.
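The reward and two-stage curriculum described above can be sketched as follows. This is not the paper's actual reward function, which the article does not spell out: the `Outcome` fields, the penalty values, the linear ramp, and the step counts are all assumptions chosen to show the idea that correctness is rewarded first and efficiency is phased in later.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Result of evaluating one candidate rewrite (hypothetical structure)."""
    executable: bool      # did the rewritten query run?
    equivalent: bool      # did it return the same result as the original?
    original_ms: float    # measured latency of the original query
    rewritten_ms: float   # measured latency of the rewritten query


def efficiency_weight(step, warmup_steps, ramp_steps):
    """Stage one: weight 0. Stage two: linear ramp from 0 up to 1."""
    if step < warmup_steps:
        return 0.0
    return min(1.0, (step - warmup_steps) / ramp_steps)


def reward(outcome, step, warmup_steps=1000, ramp_steps=1000):
    """Combine executability, equivalence, and efficiency into one score.

    All constants here are illustrative assumptions.
    """
    if not outcome.executable:
        return -1.0   # a query that fails to run is the worst case
    if not outcome.equivalent:
        return -0.5   # it runs but changes the result
    # Relative latency reduction, clipped to [-1, 1] to limit the effect
    # of outliers.
    speedup = (outcome.original_ms - outcome.rewritten_ms) / outcome.original_ms
    speedup = max(-1.0, min(1.0, speedup))
    weight = efficiency_weight(step, warmup_steps, ramp_steps)
    return 1.0 + weight * speedup


faster = Outcome(executable=True, equivalent=True,
                 original_ms=100.0, rewritten_ms=50.0)
wrong = Outcome(executable=True, equivalent=False,
                original_ms=100.0, rewritten_ms=10.0)

print(reward(faster, step=0))      # stage one: only correctness counts
print(reward(faster, step=1500))   # stage two: efficiency partly weighted
print(reward(wrong, step=5000))    # speed never compensates for a wrong result
```

Gating the efficiency term behind the correctness checks means a fast but incorrect rewrite can never outscore a correct one, which is one simple way to keep the two objectives from conflicting.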

Impressive Results

Extensive experiments on widely used SQL benchmarks like TPC-H, IMDB, and DSB demonstrate E3-Rewrite’s superior performance. It achieved up to a 25.6% reduction in query execution time compared to state-of-the-art methods. Furthermore, it delivered up to 24.4% more successful rewrites, expanding its coverage to complex queries that previous systems struggled with. The system also showed strong robustness across varying data scales, consistently maintaining low latencies even with larger datasets.

The ablation study, which tested the system with individual components removed, highlighted the critical role of each part: reinforcement learning ensures correctness and efficiency, execution plan hints provide crucial structural awareness, and demonstration retrieval enables better generalization and pattern reuse.

The Future of SQL Optimization

E3-Rewrite represents a notable step forward in SQL query rewriting. By combining execution-guided context, reinforcement learning with detailed rewards, and a dynamic demonstration retrieval system, it offers an end-to-end way to generate optimized SQL queries without relying on rigid rule sets. This integration of plan-based context and RL could lead to more robust and adaptable SQL rewriting systems for modern database environments.
