TLDR: TeQoDO is a new method that enables large language models (LLMs) to automatically build structured knowledge bases (ontologies) for task-oriented dialogue systems. It uses the LLM’s ability to generate SQL code, combined with dialogue theory, to create and update a database from raw conversations without human supervision. TeQoDO outperforms previous methods and creates ontologies useful for downstream AI tasks, representing a step towards more explainable and controllable LLMs.
Large language models, or LLMs, have become incredibly powerful tools, capable of handling a wide array of language tasks. However, their knowledge is often stored in a way that makes it difficult to understand how they arrive at their conclusions, limiting their explainability and trustworthiness. This is particularly relevant in task-oriented dialogue (TOD) systems, where clear, structured knowledge is crucial for reliable interactions.
Traditionally, building these structured knowledge bases, known as ontologies, has been a labor-intensive process, requiring significant manual effort or extensive supervised training. This challenge has limited the widespread application of ontologies, despite their potential to make AI systems more transparent and controllable.
Introducing TeQoDO: A New Approach to Ontology Construction
A groundbreaking new method called TeQoDO (Text-to-SQL Task-oriented Dialogue Ontology Construction) offers a solution to this problem. TeQoDO empowers an LLM to autonomously build a TOD ontology from the ground up, without any human supervision. It achieves this by leveraging the LLM’s inherent SQL programming capabilities, combined with principles from dialogue theory.
Imagine an LLM not just understanding language, but also being able to design and populate a database to organize the information it learns from conversations. That’s essentially what TeQoDO does. It treats the ontology as a relational database, where domains become tables, slots become columns, and values are the entries within those columns. User intents and system actions are also captured, defining how these structured pieces of information are used in a dialogue.
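This mapping can be sketched concretely. The snippet below is a minimal illustration (the "restaurant" domain and its slot names are assumptions for the example, not the paper's actual schema): a domain becomes a table, slots become columns, and observed values become rows.

```python
import sqlite3

# In-memory database standing in for the ontology store.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Domain -> table: a hypothetical "restaurant" domain.
# Slots -> columns of that table.
cur.execute("""
    CREATE TABLE restaurant (
        area TEXT,
        food TEXT,
        pricerange TEXT
    )
""")

# Values observed in dialogues become entries in those columns.
cur.execute(
    "INSERT INTO restaurant (area, food, pricerange) VALUES (?, ?, ?)",
    ("centre", "italian", "moderate"),
)
conn.commit()

# The slots of a domain are simply the table's columns.
slots = [row[1] for row in cur.execute("PRAGMA table_info(restaurant)")]
print(slots)  # ['area', 'food', 'pricerange']
```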
How TeQoDO Works Its Magic
TeQoDO operates through an iterative process, learning and updating the ontology as it processes dialogues. Here’s a simplified breakdown of its steps:
- Querying Existing Information: Before making any changes, the LLM first queries the current state of the database. This helps maintain consistency and ensures that new information is integrated correctly with what's already known.
- Dialogue State Tracking (DST): This crucial step allows the model to differentiate between information already present in the database and new information emerging from the current dialogue. It helps the LLM understand what needs to be added or updated.
- Database Update: Based on the dialogue and the identified new information, the LLM generates SQL queries. These queries can create new tables (for new domains), add columns to existing tables (for new slots), or insert/update values. A key aspect here is the inclusion of 'dialogue success' in the prompt, guiding the model to make updates that support the user's overall goal in the conversation.
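The three steps above can be sketched as a loop over dialogues. This is a toy illustration, not the paper's implementation: the function names, the example dialogue, and the hard-coded dialogue state (standing in for what an LLM would produce) are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE restaurant (area TEXT, food TEXT)")

def query_current_ontology(cur):
    """Step 1: inspect the database so updates stay consistent."""
    tables = [r[0] for r in cur.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    return {t: [c[1] for c in cur.execute(f"PRAGMA table_info({t})")]
            for t in tables}

def track_dialogue_state(dialogue, ontology):
    """Step 2 (stubbed): an LLM would extract the dialogue state here.
    We hard-code what it might return for this toy dialogue, then
    separate out slots the ontology does not yet contain."""
    state = {"restaurant": {"area": "north", "pricerange": "cheap"}}
    new = {d: {s: v for s, v in sv.items() if s not in ontology.get(d, [])}
           for d, sv in state.items()}
    return state, new

def update_database(cur, state, new):
    """Step 3: emit SQL to add new slots (columns) and insert values."""
    for domain, slot_values in new.items():
        for slot in slot_values:
            cur.execute(f"ALTER TABLE {domain} ADD COLUMN {slot} TEXT")
    for domain, slot_values in state.items():
        cols = ", ".join(slot_values)
        qs = ", ".join("?" for _ in slot_values)
        cur.execute(f"INSERT INTO {domain} ({cols}) VALUES ({qs})",
                    tuple(slot_values.values()))

dialogue = "User: I need a cheap restaurant in the north."
ontology = query_current_ontology(cur)
state, new = track_dialogue_state(dialogue, ontology)
update_database(cur, state, new)
print(query_current_ontology(cur))
```

After processing this single dialogue, the "pricerange" slot has been added as a new column and the observed values stored, illustrating how the ontology grows iteratively from conversations.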
This text-to-SQL approach is particularly effective because SQL’s structured format is familiar to LLMs from their pre-training on vast code corpora. It naturally aligns with the hierarchical nature of TOD ontologies, making the process intuitive for the AI.
Impressive Results and Broad Applicability
Experiments show that TeQoDO significantly outperforms existing transfer learning methods for ontology construction on widely used TOD datasets like MultiWOZ and Schema-Guided Dialogue (SGD). It particularly excels at predicting higher-level hierarchical concepts such as domains and slots. Ablation studies further highlight the critical role of dialogue theory in improving performance and reducing errors in SQL query generation.
Beyond task-oriented dialogues, TeQoDO also demonstrates its versatility by generalizing to larger, more general ontology datasets like Wikipedia and arXiv. While some adjustments are needed for these different structures, the method shows competitive results, especially in capturing the overall graph structure of the ontologies.
Perhaps most importantly, the ontologies constructed by TeQoDO are highly functional. When used in a downstream task like dialogue state tracking, the TeQoDO-induced ontology performs comparably to using a manually created, ground-truth ontology. This indicates that the automatically generated knowledge base is genuinely useful for practical AI applications.
A Step Towards More Explainable AI
The development of TeQoDO marks an important stride towards making large language models more explainable and controllable. By automatically distilling human-readable, structured knowledge from raw conversations, TeQoDO helps to open up the ‘black box’ of LLMs. This structured knowledge could potentially be used to detect AI hallucinations or to better understand the reasoning behind an LLM’s responses in complex tasks.
For more technical details, you can read the full research paper here: Text-to-SQL Task-oriented Dialogue Ontology Construction.