TL;DR: ST-Raptor is a framework that uses Large Language Models (LLMs) to answer natural language questions over complex semi-structured tables, such as Excel spreadsheets. It represents table layouts with a Hierarchical Orthogonal Tree (HO-Tree), decomposes questions into executable tree operations, and checks answers with a two-stage verification mechanism. On the SSTQA benchmark, it improves answer accuracy by up to 20% over nine baseline methods, making it easier to extract information from real-world, irregularly structured data.
Business and scientific documents often store information in semi-structured tables, which appear in everything from financial reports and medical records to transactional orders. Unlike simple, regularly organized tables, these often feature flexible layouts, hierarchical headers, and merged cells, which make them difficult for automated systems to understand and query. Human analysts have traditionally spent significant time interpreting such tables to answer natural language questions, a process that is both costly and inefficient.
To address this problem, researchers from Shanghai Jiao Tong University, Simon Fraser University, Tsinghua University, and Renmin University of China have introduced a framework called ST-Raptor, which uses large language models (LLMs) to automate question answering over these complex semi-structured tables.
The Problem with Semi-Structured Tables
Existing methods for table question answering face several hurdles. Converting semi-structured tables into a fully structured format often leads to a significant loss of crucial layout information. Other approaches, like those generating code or using multi-modal LLMs, struggle to grasp the intricate layouts of these tables, leading to inaccurate answers. The core difficulty lies in distinguishing headers from content, understanding nested relationships, and performing various analytical lookups (like top-down or bottom-up searches) that humans do intuitively.
Introducing ST-Raptor: A Tree-Based Solution
ST-Raptor tackles these issues with a novel, tree-based approach. Its core innovation is the **Hierarchical Orthogonal Tree (HO-Tree)**, a structural model designed to accurately represent the complex layouts of semi-structured tables. This HO-Tree captures headers, content values, and their often-implicit relationships, providing a clear, structured view of the table’s organization.
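To make the idea concrete, here is a minimal sketch of a header tree in Python. It is not the paper's HO-Tree definition, which this article does not spell out. The `Node` class, the `find` helper, and the sample table are all hypothetical, chosen only to show how nested headers and cell values can sit in one navigable structure that supports both top-down and bottom-up lookups.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node: a header label, an optional cell value, nested children."""
    label: str
    value: Optional[str] = None
    children: list["Node"] = field(default_factory=list)
    # Parent link is excluded from repr/compare to avoid infinite recursion.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child and record its parent so bottom-up walks are possible."""
        child.parent = self
        self.children.append(child)
        return child

    def path(self) -> list[str]:
        """Bottom-up walk: collect labels up to the root, returned root-first."""
        node, labels = self, []
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return labels[::-1]


def find(node: Node, label: str) -> Optional[Node]:
    """Top-down depth-first search for the first node with the given label."""
    if node.label == label:
        return node
    for child in node.children:
        hit = find(child, label)
        if hit is not None:
            return hit
    return None


# Made-up table fragment: a merged "Revenue" header spanning two quarters.
root = Node("Report")
revenue = root.add(Node("Revenue"))
revenue.add(Node("Q1", value="120"))
revenue.add(Node("Q2", value="135"))
root.add(Node("Headcount", value="42"))

q2 = find(root, "Q2")
print(q2.value)   # 135
print(q2.path())  # ['Report', 'Revenue', 'Q2']
```

Keeping a parent pointer on every node is one simple way to support the bottom-up direction: a matched cell can report which headers it falls under without a second search.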
The framework operates in several key stages:
- **Table2Tree**: This module converts any given semi-structured table into an HO-Tree. It uses a multi-modal LLM to identify meta-information (headers), applies heuristic rules to partition the table, and then employs a depth-first search algorithm to construct the tree, effectively mapping the table’s 2D structure into a navigable tree.
- **Question2Pipeline**: When a user asks a question, ST-Raptor doesn’t try to answer it in one go. Instead, it uses an LLM to break down complex questions into simpler, manageable sub-questions. For each sub-question, it generates a sequence of basic tree operations. These operations are designed to perform specific tasks like retrieving data (e.g., finding children or parent nodes), manipulating data (e.g., filtering, calculating, comparing), aligning parameters with table content, and performing semantic reasoning.
- **AnswerGenerator**: This module executes the generated operations on the HO-Tree. It can use both top-down and bottom-up retrieval strategies to locate relevant information, adapting to the specific needs of the question.
- **AnswerVerifier**: A crucial component, this module employs a two-stage verification process to ensure accuracy and reliability. **Forward validation** checks the correctness of each execution step and consistency with the question. If an operation yields inadequate results or misaligns with the table, it can be regenerated or the process halted. **Backward validation** assesses the answer’s reliability by generating alternative questions that should yield the same answer and comparing their underlying operation pipelines for similarity. This helps catch potential LLM ‘hallucinations’.
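The stages above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not ST-Raptor's actual implementation. The operation names (`children`, `values`, `max`), the nested-dict stand-in for a tree, and the use of `difflib.SequenceMatcher` as a pipeline similarity measure are all hypothetical. The LLM steps (question decomposition and alternative-question generation) are omitted, so the pipelines are written by hand.

```python
from difflib import SequenceMatcher

# Toy stand-in for a tree: nested dicts of header -> subtree or cell value.
TREE = {"Revenue": {"Q1": 120, "Q2": 135}, "Headcount": 42}


def children(node, header):
    """Retrieval: descend to the child under `header`."""
    return node[header]


def values(node):
    """Retrieval: collect the values directly under a node."""
    return list(node.values())


def maximum(items):
    """Manipulation: reduce a list to its largest element."""
    return max(items)


OPS = {"children": children, "values": values, "max": maximum}


def run_pipeline(tree, pipeline):
    """Execute (op_name, *args) steps in order, checking each intermediate result."""
    current = tree
    for name, *args in pipeline:
        try:
            current = OPS[name](current, *args)
        except (KeyError, TypeError, ValueError, AttributeError) as exc:
            # Forward-style check: stop when a step does not fit the table.
            raise RuntimeError(f"step {name}{tuple(args)} failed: {exc!r}") from exc
        if current in (None, [], {}):
            raise RuntimeError(f"step {name}{tuple(args)} returned nothing")
    return current


def pipeline_similarity(a, b):
    """Backward-style check: ratio of matching steps between two pipelines."""
    return SequenceMatcher(None, [str(s) for s in a], [str(s) for s in b]).ratio()


# "What was the highest quarterly revenue?" as a hand-written pipeline.
question = [("children", "Revenue"), ("values",), ("max",)]
# Pipeline for a paraphrased question that should reach the same answer.
paraphrase = [("children", "Revenue"), ("values",), ("max",)]

print(run_pipeline(TREE, question))               # 135
print(pipeline_similarity(question, paraphrase))  # 1.0
```

In this sketch a failed step raises an error. A fuller system could instead ask the LLM to regenerate the failing operation, as the forward validation described above allows.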
Enhanced Performance and Real-World Application
The researchers also introduced a new benchmark dataset, **SSTQA**, comprising 764 questions over 102 real-world semi-structured tables. This dataset is specifically designed to evaluate a model’s ability to handle the deep nesting and structural irregularities common in real-world scenarios.
Experiments show that ST-Raptor significantly outperforms nine state-of-the-art baseline methods across various benchmarks, achieving up to a 20% improvement in answer accuracy on the SSTQA dataset. This superior performance is attributed to its explicit structural modeling with the HO-Tree, the intelligent question decomposition, and the robust two-stage verification mechanism.
By answering questions directly over complex semi-structured tables, ST-Raptor marks a significant step forward in automating data analysis and could reduce manual effort in fields ranging from finance and healthcare to human resources. For more technical details, refer to the original research paper.


