TLDR: A research paper introduces an agentic LLM pipeline that significantly improves natural-language-to-SQL (NL-to-SQL) systems for complex spatio-temporal queries. By orchestrating a Mistral-based ReAct agent with tools for schema inspection, SQL generation, execution, and visualization, the system achieves 91.4% accuracy compared to a naive baseline’s 28.6%, making database interaction more intuitive and providing richer, interpretable insights for non-experts.
Accessing and analyzing structured data often requires specialized knowledge of Structured Query Language (SQL), creating a barrier for many users. Natural-language-to-SQL (NL-to-SQL) systems aim to democratize this access by allowing users to query databases using everyday language. However, existing systems frequently struggle with complex real-world queries, especially those involving spatio-temporal data, which combine location and time information.
A recent research paper, titled “From Queries to Insights: Agentic LLM Pipelines for Spatio-Temporal Text-to-SQL,” addresses these limitations by introducing an innovative agentic pipeline. This system significantly enhances the accuracy and usability of NL-to-SQL for spatio-temporal queries, making database interactions more intuitive for users without SQL expertise or detailed schema knowledge.
The Challenges of Traditional NL-to-SQL
The paper highlights three main challenges faced by current NL-to-SQL systems when dealing with geospatial datasets:
- Semantic Mismatch: User queries often use colloquial phrasing (e.g., “laundromats”) that doesn’t directly match the category labels stored in the database (e.g., “Laundry Service”).
- Temporal Reasoning: Queries involving time-based patterns (e.g., hour-of-day trends, weekday/weekend splits) require careful handling of edge cases that most systems aren’t designed for.
- Spatial Semantics: Users might reference locations like neighborhoods or landmarks not directly encoded in database columns, requiring external knowledge or query decomposition.
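The temporal edge cases in the list above can be made concrete with a small sketch. The `checkins` table, its columns, and the sample rows are assumptions made for illustration, not the paper's actual schema. Using Python's built-in `sqlite3` module, it shows why timestamps should be shifted to local time before bucketing by hour of day or by weekday/weekend.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkins (venue_category TEXT, utc_time TEXT, tz_offset_min INTEGER)"
)
conn.executemany(
    "INSERT INTO checkins VALUES (?, ?, ?)",
    [
        ("Laundry Service", "2012-04-07 02:30:00", -240),  # Friday 22:30 local
        ("Laundry Service", "2012-04-07 15:00:00", -240),  # Saturday 11:00 local
        ("Coffee Shop", "2012-04-09 12:00:00", -240),      # Monday 08:00 local
    ],
)

# Shift each UTC timestamp by its offset, then bucket by day type and hour.
# In SQLite, strftime('%w', ...) returns '0' for Sunday and '6' for Saturday.
query = """
SELECT
    CASE WHEN strftime('%w', local_time) IN ('0', '6')
         THEN 'weekend' ELSE 'weekday' END AS day_type,
    CAST(strftime('%H', local_time) AS INTEGER) AS hour,
    COUNT(*) AS n
FROM (
    SELECT datetime(utc_time, tz_offset_min || ' minutes') AS local_time
    FROM checkins
)
GROUP BY day_type, hour
ORDER BY day_type, hour
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The first sample row falls on a Saturday in UTC but on a Friday evening in local time, so bucketing on the raw UTC value would count it as a weekend check-in.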
Beyond correctness, usability is another significant gap. Naive systems typically return raw tabular results, which can be difficult for non-experts to interpret or visualize.
Introducing the Agentic Pipeline
The core of this research is an agentic NL-to-SQL pipeline that extends a basic text-to-SQL model (defog/llama-3-sqlcoder-8b) with orchestration by a Mistral-based ReAct agent. Crucially, the naive SQL generator is embedded as a tool within the agent’s reasoning loop, allowing the researchers to isolate the value of orchestration and tool use.
The agent operates in a “plan–act–observe” loop, dynamically retrieving schema, generating and refining SQL, executing queries, and producing visualizations and summaries. It has access to six specialized tools:
- A tool to retrieve database schema and sample rows.
- A tool to generate SQL queries (wrapping the naive SQLCoder model).
- A tool to execute SQL and inspect results.
- A tool to read large result sets from files.
- A tool to create categorical or temporal plots.
- A tool to generate point maps or density heatmaps.
This design enables the agent to rephrase questions, chain multiple calls for complex requests, recover from execution errors, and deliver task-appropriate visualizations and summaries.
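The plan–act–observe loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the tool names, the `scripted_planner` function (a stand-in for the LLM's reasoning step), the fixed SQL string returned by `generate_sql`, and the `checkins` table are all hypothetical. Only three of the six tools are modeled.

```python
import sqlite3

def make_tools(conn):
    """Build a small tool registry; every tool maps a string to a string."""
    def get_schema(_):
        rows = conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
        return "\n".join(r[0] for r in rows)

    def generate_sql(question):
        # Stand-in for the wrapped text-to-SQL model (assumption):
        # a real system would send the question and schema to the model.
        return (
            "SELECT venue_category, COUNT(*) FROM checkins "
            "GROUP BY venue_category ORDER BY 2 DESC"
        )

    def execute_sql(sql):
        try:
            return repr(conn.execute(sql).fetchall())
        except sqlite3.Error as exc:
            # Errors are returned as observations so the planner can react.
            return f"ERROR: {exc}"

    return {
        "get_schema": get_schema,
        "generate_sql": generate_sql,
        "execute_sql": execute_sql,
    }

def scripted_planner(question, history):
    """Stand-in for the LLM: choose the next (tool, input) from past steps."""
    if not history:
        return ("get_schema", "")
    last_tool, last_observation = history[-1]
    if last_tool == "get_schema":
        return ("generate_sql", question)
    if last_tool == "generate_sql":
        return ("execute_sql", last_observation)
    return None  # nothing left to do

def run_agent(question, tools, planner, max_steps=6):
    """Plan, act, observe, and repeat until the planner stops or steps run out."""
    history = []
    for _ in range(max_steps):
        action = planner(question, history)
        if action is None:
            break
        tool_name, tool_input = action
        observation = tools[tool_name](tool_input)
        history.append((tool_name, observation))
    return history

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkins (venue_category TEXT)")
conn.executemany(
    "INSERT INTO checkins VALUES (?)",
    [("Coffee Shop",), ("Coffee Shop",), ("Laundry Service",)],
)
history = run_agent(
    "Which venue categories are most popular?", make_tools(conn), scripted_planner
)
for tool_name, observation in history:
    print(tool_name, "->", observation)
```

Keeping the SQL generator behind the same interface as the other tools mirrors the design choice described above: the orchestrating loop can be swapped or extended without changing the generator itself.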
Remarkable Performance Gains
The agentic pipeline was evaluated on 35 natural-language queries over the NYC and Tokyo check-in dataset, covering spatial, temporal, and multi-dataset reasoning. The results were striking: the agentic pipeline achieved 91.4% correctness (32 out of 35 queries), dramatically outperforming the naive baseline, which managed only 28.6% correctness (10 out of 35 queries).
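As a quick check of the arithmetic, the reported percentages follow directly from the query counts given above. The helper name `correctness` is ours, not the paper's.

```python
def correctness(correct: int, total: int) -> float:
    """Return the share of correct answers as a percentage, to one decimal."""
    return round(100 * correct / total, 1)

agentic = correctness(32, 35)
naive = correctness(10, 35)
print(agentic, naive)  # 91.4 28.6
```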
The agent showed consistently high accuracy across all query categories, including 100% on multi-step reasoning and near-perfect performance on aggregation/ranking (96.2%) and temporal reasoning (94.7%). The largest gains came in categories where the naive baseline scored 0%: spatial/geographic tasks (85.7%), queries requiring external knowledge (83.3%), and multi-table queries (80.0%).
Enhanced User Experience
Beyond accuracy, the agent significantly improved usability by providing automatic visualizations and structured natural-language summaries. Instead of just raw tables, users received maps, heatmaps, and plots that enabled direct perception of trends. The system also generated concise textual explanations, reducing the cognitive load for users by interpreting results rather than forcing them to infer insights from data alone.
For instance, when asked about check-in activity across different times of day, the agent not only generated a line plot but also provided a detailed narrative summarizing activity during late night, early morning, morning, midday, afternoon, and evening hours.
Practical Insights and Future Directions
The research offers practical guidance for designing agentic NL-to-SQL systems, emphasizing the importance of grounding user intent in schema, using errors as feedback for refinement, decomposing complex tasks, and balancing efficiency with reliability. While the agent increased SQL generation calls by about 50% compared to the naive baseline, this modest overhead is a small price for the significant gains in accuracy and usability.
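One of these guidelines, using errors as feedback for refinement, can be sketched as a retry loop. The example below is an illustration under assumptions rather than the paper's code: `execute_with_feedback` and `toy_revise` are hypothetical names, `toy_revise` stands in for asking the model to repair its query, and the `checkins` table and coordinates are made up. The first query calls a spatial function that a plain SQLite build does not provide, so the database error is passed back and a bounding-box query is tried instead.

```python
import sqlite3

def execute_with_feedback(conn, sql, revise, max_attempts=3):
    """Run sql; on a database error, pass the message to revise and retry."""
    attempts = []  # one (sql, error message or None) pair per try
    for _ in range(max_attempts):
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            attempts.append((sql, str(exc)))
            sql = revise(sql, str(exc))
        else:
            attempts.append((sql, None))
            return rows, attempts
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts")

def toy_revise(sql, error):
    # Stand-in for the model: fall back to a bounding box when the
    # database reports an unknown function.
    if "no such function" in error:
        return (
            "SELECT COUNT(*) FROM checkins "
            "WHERE lat BETWEEN 40.70 AND 40.80 AND lon BETWEEN -74.05 AND -73.90"
        )
    return sql

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkins (lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO checkins VALUES (?, ?)", [(40.75, -73.99), (35.68, 139.69)]
)

first_try = (
    "SELECT COUNT(*) FROM checkins "
    "WHERE ST_Distance(lat, lon, 40.75, -73.99) < 1000"
)
rows, attempts = execute_with_feedback(conn, first_try, toy_revise)
print(rows, len(attempts))
```

Capping the number of attempts is one simple way to balance reliability against the extra generation calls that each retry costs.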
Despite its strong performance, the agent did encounter some limitations, such as attempting to use unsupported geodesic functions, over-literal label matching, and occasional planning instability with complex cross-dataset comparisons. Future work aims to address these by scaling evaluations, quantifying the contribution of individual tools, incorporating interactive refinement, improving efficiency, and enhancing planning robustness.
This paper demonstrates that agentic orchestration, rather than simply more powerful SQL generators, is a promising foundation for building interactive geospatial assistants that are more accurate, efficient, reliable, and user-centered. Full details are in the paper, “From Queries to Insights: Agentic LLM Pipelines for Spatio-Temporal Text-to-SQL.”


