TLDR: TextMine is an AI-powered system that uses Large Language Models and a specialized ontology to extract structured knowledge from unstructured humanitarian mine action reports. It improves extraction accuracy, reduces hallucinations, and creates a valuable knowledge base for demining operations, validated on Cambodian reports and adaptable globally.
Humanitarian Mine Action (HMA) faces a significant challenge: a vast amount of valuable best-practice knowledge is trapped in unstructured reports. This makes it difficult to share, access, and learn from crucial demining experiences, ultimately hindering decision-making and operational efficiency. In 2022 alone, landmines caused 4,710 casualties globally, with 85% being civilians, underscoring the urgent need for improved mine action strategies.
To address this, researchers have introduced TextMine, an innovative, ontology-guided pipeline that leverages Large Language Models (LLMs) to extract structured knowledge from HMA texts. TextMine aims to transform these unstructured reports into actionable insights, providing a foundation for a comprehensive demining knowledge base.
How TextMine Works
TextMine operates as a multi-stage pipeline. First, it employs layout-aware document chunking to break PDF reports into semantically coherent, paragraph-level segments, ensuring the LLMs receive manageable, context-rich inputs. Next, in the ontology-guided knowledge extraction phase, TextMine uses a newly constructed HMA ontology to guide the LLMs in extracting knowledge triples (subject, relation, object). This ontology, developed in collaboration with domain experts, systematically categorizes the operational entities and relationships relevant to humanitarian demining.
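To make the two stages concrete, here is a minimal sketch in Python: a triple data structure, a simple paragraph-level chunker, and an ontology check. The relation names and the chunking heuristic are illustrative assumptions, not the paper's actual ontology or algorithm.

```python
# Sketch of TextMine-style scaffolding: chunk a report into paragraph
# segments, then keep only triples whose relation is in the ontology.
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    object: str

# Hypothetical slice of an HMA ontology: the allowed relation types.
HMA_RELATIONS = {"clearedBy", "locatedIn", "usedTechnique", "foundItem"}

def chunk_paragraphs(text: str, max_chars: int = 800) -> list[str]:
    """Layout-aware chunking stand-in: split on blank lines, then
    greedily merge short paragraphs up to a character budget."""
    paras = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paras:
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(current)
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return chunks

def validate(t: Triple) -> bool:
    """Ontology guidance at its simplest: reject any triple whose
    relation is not defined in the ontology."""
    return t.relation in HMA_RELATIONS
```

In the real system the extraction itself is done by an LLM; the ontology check above illustrates only the post-hoc filtering side of "ontology-guided."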
A crucial aspect of TextMine is its use of domain-aware prompting. The research demonstrates that prompts enriched with ontology-aligned examples significantly boost extraction accuracy by up to 44.2%, reduce hallucinations (fabricated information) by 22.5%, and improve format conformance by 20.9% compared to baseline prompts. This highlights the power of providing LLMs with contextually relevant guidance.
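A domain-aware prompt of this kind can be sketched as a few-shot template whose examples use ontology relations. The example sentences, triples, and instruction wording below are illustrative assumptions, not the paper's actual prompts.

```python
# Sketch of a domain-aware prompt builder: few-shot examples aligned
# with the HMA ontology are prepended to the extraction request.
ONTOLOGY_EXAMPLES = [
    ("The field in Battambang was cleared by CMAC Unit 3.",
     "(field in Battambang; clearedBy; CMAC Unit 3)"),
    ("Manual clearance was used on the northern sector.",
     "(northern sector; usedTechnique; manual clearance)"),
]

def build_prompt(paragraph: str, examples=ONTOLOGY_EXAMPLES) -> str:
    """Assemble an extraction prompt: instruction, ontology-aligned
    examples, then the target paragraph."""
    lines = [
        "Extract (subject; relation; object) triples, using only "
        "relations defined in the HMA ontology.",
        "",
    ]
    for text, triple in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Triples: {triple}")
        lines.append("")
    lines.append(f"Text: {paragraph}")
    lines.append("Triples:")
    return "\n".join(lines)
```

The point of the design is that the examples, not just the instruction, carry the ontology: the model sees the exact relation vocabulary and output format it is expected to reproduce.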
Unique Contributions and Evaluation
TextMine stands out from previous approaches by reasoning over entire paragraphs, which enables better coreference resolution and multi-step inference. This is a significant advancement over prior sentence-level methods that often struggle with complex, domain-specific documents. Furthermore, TextMine utilizes a practical operational HMA ontology that is substantially larger than those used in existing benchmarks, making it more applicable to real-world scenarios.
The project also introduces the first dedicated HMA ontology and a curated dataset of real-world demining reports, filling a critical resource gap in the domain. For evaluation, TextMine employs a multi-perspective approach, combining reference-based metrics (comparing extracted triples against a human-annotated dataset) with a novel reference-free LLM-as-a-Judge framework. This LLM-as-a-Judge method helps assess the quality of extracted triples even when ground-truth data is scarce, and experiments show that a “Randomized Fair Judge Prompt” with GPT-4o significantly enhances ranking consistency.
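A randomized judging scheme of this sort can be sketched as a pairwise comparison whose presentation order is shuffled to counter position bias, with the verdict mapped back afterwards. Here `judge_fn` is a stand-in for the LLM call (e.g. to GPT-4o) and is an assumption of this sketch, not the paper's actual judge prompt.

```python
# Sketch of a randomized fair pairwise judge: shuffle which candidate
# appears first, ask the judge, then map the verdict back to A/B.
import random

def fair_judge(candidate_a: str, candidate_b: str, judge_fn, rng=None) -> str:
    """Return 'A' or 'B' for the better candidate. judge_fn sees the
    two candidates in randomized order and returns 'first' or 'second'."""
    rng = rng or random.Random()
    swapped = rng.random() < 0.5
    first, second = (candidate_b, candidate_a) if swapped else (candidate_a, candidate_b)
    verdict = judge_fn(first, second)
    if verdict == "first":
        return "B" if swapped else "A"
    return "A" if swapped else "B"
```

Because the order is re-randomized on every call, a judge that systematically favors the first-listed answer no longer biases aggregate rankings; a consistent judge should pick the same underlying candidate regardless of the coin flip.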
Impact and Adaptability
Validated on Cambodian reports in collaboration with the Cambodian Mine Action Centre (CMAC), TextMine has the potential to convert their technical reports into a structured knowledge base. This framework serves as a proof of concept for LLM-driven demining knowledge extraction, transforming unstructured reports into structured insights that can directly inform and optimize future clearance planning. While initially focused on Cambodia, TextMine is designed to be adaptable to global demining efforts and even other domains facing similar challenges with unstructured data.
The research paper, titled “TextMine: LLM-Powered Knowledge Extraction for Humanitarian Mine Action,” was authored by Chenyue Zhou, Gürkan Solmaz, Flavio Cirillo, Kiril Gashteovski, and Jonathan Fürst. You can read the full paper for more technical details and experimental results here: TextMine Research Paper.


