Boosting Information Extraction: A New AI Workflow Combines Language Models with Logic

TLDR: A novel workflow integrates Large Language Models (LLMs) with Answer Set Programming (ASP) to enhance Joint Entity-Relation Extraction (JERE). This approach leverages LLMs for understanding unannotated text and uses ASP for consistency checking and incorporating domain-specific knowledge. Experiments show that this LLM + ASP workflow outperforms state-of-the-art JERE systems, particularly with limited training data, by effectively reducing false predictions and improving overall accuracy.

In the rapidly evolving field of Artificial Intelligence, extracting meaningful information from unstructured text remains a crucial challenge. This task, known as Joint Entity-Relation Extraction (JERE), involves simultaneously identifying entities (like people, organizations, or locations) and the relationships between them. Traditionally, building models for JERE has been a demanding process, requiring vast amounts of pre-annotated data and often struggling to incorporate specific domain knowledge easily.

A groundbreaking new approach proposes a generic workflow that combines the power of Large Language Models (LLMs) with the logical reasoning capabilities of Answer Set Programming (ASP). This innovative “LLM + ASP” workflow aims to overcome the limitations of traditional methods by working directly with unannotated text and seamlessly integrating domain-specific information.

The LLM + ASP Synergy

Large Language Models, such as GPT, are renowned for their ability to understand and generate human-like text, having been trained on massive datasets. This workflow harnesses their natural language understanding to process raw, unannotated text. However, LLMs can sometimes “hallucinate” or produce factually incorrect information. This is where Answer Set Programming comes in.

ASP is a form of logic programming that excels in knowledge representation and reasoning. In this workflow, ASP acts as a “consistency checker.” The predictions made by the LLM are fed into an ASP solver, along with any available domain-specific rules or “type specifications” (for example, a rule stating which entity types a given relation may connect). The solver checks the LLM’s output against these rules and filters out predictions that violate them, which improves accuracy. ASP is also “elaboration tolerant”: new domain knowledge can be added as extra rules without complex modifications to the core program.
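To make the consistency-checking idea concrete, here is a minimal Python sketch. It is a plain-Python stand-in for an ASP solver, not the authors' encoding: the relation names, entity types, and the `check_consistency` helper are all hypothetical, and it assumes that a relation with no type specification is left untouched.

```python
# Hypothetical type specifications: relation -> (head type, tail type).
# In ASP, such a rule might be written as an integrity constraint, e.g.
#   :- rel(work_for, H, T), not type(H, person).
TYPE_SPECS = {
    "work_for": ("person", "organization"),
    "located_in": ("location", "location"),
}


def check_consistency(entities, relations, type_specs=TYPE_SPECS):
    """Split predicted relations into (kept, rejected) using the type specs.

    entities:  dict mapping entity name -> predicted type
    relations: list of (relation, head, tail) tuples predicted by the LLM
    """
    kept, rejected = [], []
    for rel, head, tail in relations:
        spec = type_specs.get(rel)
        if spec is None:
            # Assumption: no rule constrains this relation, so it is kept.
            kept.append((rel, head, tail))
        elif (entities.get(head), entities.get(tail)) == spec:
            kept.append((rel, head, tail))
        else:
            rejected.append((rel, head, tail))
    return kept, rejected


# Illustrative LLM output (invented for this sketch).
entities = {"Alice": "person", "Acme Corp": "organization", "Paris": "location"}
relations = [
    ("work_for", "Alice", "Acme Corp"),    # consistent with the spec
    ("work_for", "Paris", "Acme Corp"),    # head is not a person
    ("located_in", "Acme Corp", "Alice"),  # neither argument is a location
]

kept, rejected = check_consistency(entities, relations)
print("kept:", kept)
print("rejected:", rejected)
```

Adding domain knowledge here means adding one more entry to `TYPE_SPECS`, which loosely mirrors the elaboration tolerance described above: the checking logic itself does not change.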

A Generic and Effective Workflow

The proposed workflow is designed to be generic, meaning it can be applied to JERE tasks across various domains without significant changes. It primarily consists of two components: a flexible prompt template for the LLM and the ASP-based consistency checker. The prompt template is modular, allowing for easy integration of domain-specific context, experience, and output format specifications, often using a “one-shot” example to guide the LLM.
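A modular prompt of this kind could be assembled roughly as follows. This is only a sketch: the section labels, the `build_prompt` helper, and the example wording are assumptions made for illustration, not the template used in the research.

```python
from string import Template

# Hypothetical modular template. The sections mirror the article's description
# (context, experience, output format, one-shot example); the wording is invented.
PROMPT_TEMPLATE = Template(
    "Domain context:\n$context\n\n"
    "Guidance:\n$experience\n\n"
    "Output format:\n$output_format\n\n"
    "Example:\n$one_shot\n\n"
    "Text to analyze:\n$text\n"
)


def build_prompt(text, context, experience, output_format, one_shot):
    """Fill each module of the template; swapping a module adapts the domain."""
    return PROMPT_TEMPLATE.substitute(
        context=context,
        experience=experience,
        output_format=output_format,
        one_shot=one_shot,
        text=text,
    )


prompt = build_prompt(
    text="Alice joined Acme Corp in Paris.",
    context="News articles mentioning people, organizations, and locations.",
    experience="Only report relations that are stated explicitly in the text.",
    output_format="One fact per line: entity(name, type) or relation(type, head, tail).",
    one_shot=(
        "Text: Bob works for Initech.\n"
        "entity(Bob, person)\n"
        "entity(Initech, organization)\n"
        "relation(work_for, Bob, Initech)"
    ),
)
print(prompt)
```

Because each section is a separate argument, moving to a new domain only means supplying a different context, guidance text, and one-shot example; the template itself stays fixed.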

Experiments were conducted on three well-known JERE benchmarks: CoNLL04 (news and journalism), SciERC (scientific abstracts), and ADE (health and drug). The LLM + ASP workflow outperformed state-of-the-art JERE systems while using only 10% of the usual training data. On the challenging SciERC corpus, for instance, the system scored 35% on the Relation Extraction task versus 15% for some existing methods, reported as a 2.5-fold improvement. The consistency checker proved especially effective at reducing false positive entity-relationships, particularly in datasets rich with type specifications such as SciERC.
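The effect of removing false positives can be illustrated with a small scoring sketch. It assumes the usual precision, recall, and F1 measures for extraction benchmarks, which the article does not spell out, and the triples below are toy data invented for the example, not results from the experiments.

```python
def precision_recall_f1(predicted, gold):
    """Score a set of predicted relation triples against the gold triples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): two gold relations, three raw LLM predictions.
gold = {
    ("work_for", "Alice", "Acme Corp"),
    ("located_in", "Acme Corp", "Paris"),
}
raw_predictions = {
    ("work_for", "Alice", "Acme Corp"),    # correct
    ("work_for", "Paris", "Acme Corp"),    # false positive
    ("located_in", "Acme Corp", "Alice"),  # false positive
}
# Suppose a consistency check discards the two ill-typed predictions.
filtered_predictions = {("work_for", "Alice", "Acme Corp")}

raw_scores = precision_recall_f1(raw_predictions, gold)
filtered_scores = precision_recall_f1(filtered_predictions, gold)
print("raw:      P=%.2f R=%.2f F1=%.2f" % raw_scores)
print("filtered: P=%.2f R=%.2f F1=%.2f" % filtered_scores)
```

In this toy case recall is unchanged, because the filter removes only wrong predictions, while precision and therefore F1 go up. A filter that also discarded correct predictions would trade recall for precision.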

Looking Ahead

This research highlights a promising direction for information extraction, combining the strengths of generative AI with symbolic reasoning. The approach offers greater flexibility and scalability, reducing the reliance on extensive annotated datasets. The authors plan to further explore this workflow for extracting knowledge graphs, which are structured representations of information consisting of entities and their relationships. More details are available in the full paper.
