TLDR: CrisiText is the first large-scale dataset designed to train Large Language Models (LLMs) for generating expert-based warning messages in 13 types of crisis scenarios. It contains over 400,000 messages, including both ‘good’ messages following expert guidelines (Tone and Instructions) and ‘bad’ suboptimal messages for preference alignment. Experiments show that fine-tuning LLMs with CrisiText significantly improves their ability to generate accurate and contextually relevant warning messages, especially when provided with instruction guidelines and previous message history. The dataset also supports the development of effective post-editing tools for crisis communication.
In our increasingly complex world, shaped by rapidly evolving social and environmental phenomena, the ability to communicate effectively during crises is more critical than ever. Natural disasters, violent attacks, and other emergencies can impact thousands or even millions of people, making timely and accurate warning messages paramount for safeguarding those in danger. While Artificial Intelligence (AI) has increasingly assisted in crisis management, the use of Natural Language Processing (NLP) techniques has largely focused on classification tasks, overlooking the significant potential of generating timely warning messages.
Addressing this crucial gap, researchers have introduced CrisiText, the first large-scale dataset specifically designed for the generation of warning messages across 13 different types of crisis scenarios. This innovative dataset contains over 400,000 warning messages, spanning almost 18,000 crisis situations, all aimed at assisting civilians during and after such events. The creation of CrisiText marks a significant step towards specializing Large Language Models (LLMs) in expert-based crisis communication.
The development of CrisiText involved a meticulous pipeline. Scenario descriptions were extracted from two primary sources: the FEMA IPAWS Archived Alerts, which cover natural disasters, and the Global Terrorism Database (GTD), which focuses on violent attacks. Using an advanced LLM (GPT-4o-mini), these descriptions were transformed into sequences of chronological events, simulating the unfolding of each crisis. For each event, warning messages were then generated following expert-written guidelines.
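To make the event-expansion step concrete, here is a minimal sketch of how a scenario description might be turned into a prompt asking an LLM to unroll the crisis into a timeline. The function name, wording, and parameters are illustrative assumptions, not the authors' actual pipeline code.

```python
# Hypothetical sketch of the event-expansion step: a scenario description
# becomes a prompt asking an LLM (the paper uses GPT-4o-mini) to list the
# chronological events of the unfolding crisis.

def build_event_prompt(scenario: str, n_events: int = 5) -> str:
    """Format a prompt requesting a numbered timeline of crisis events."""
    return (
        "You are simulating the unfolding of a crisis.\n"
        f"Scenario: {scenario}\n"
        f"List {n_events} chronological events, one per line, numbered, "
        "describing how the situation evolves."
    )

prompt = build_event_prompt("Flash flood warning issued for the river valley.", 3)
```

The resulting string would then be sent to the LLM, and each returned event would seed its own warning-message generation step.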
These guidelines were structured around two key dimensions: Tone and Instructions. Tone guidelines, derived from a systematic review and expert panel, focused on increasing attention, comprehension, believability, clarity, and triggering protective action. This meant ensuring proper terminology, providing accurate information, avoiding panic, and clearly stating behaviors. Instruction guidelines, sourced from the official FEMA website, provided grounded suggestions on how to behave depending on the crisis type.
Beyond generating “Good Messages” that adhere to these expert guidelines, the dataset also includes three types of “Bad Messages.” These suboptimal versions were deliberately created to ignore or worsen essential aspects of a good warning message, such as poor tone, incorrect instructions, or flaws in both. This unique feature allows for the study of different Natural Language Generation (NLG) approaches, including preference alignment techniques where models learn by comparing chosen (good) and rejected (bad) outputs.
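A common way to package such data for preference alignment is as prompt/chosen/rejected triples. The sketch below assumes that layout (widely used by preference-tuning libraries) rather than CrisiText's actual schema; all names and strings are illustrative.

```python
# Illustrative record pairing a guideline-following "good" message with a
# deliberately degraded "bad" one, in the chosen/rejected format that
# preference-alignment methods (e.g. ORPO) typically consume.

def make_preference_pair(event: str, good_msg: str, bad_msg: str) -> dict:
    """Build one chosen/rejected pair for preference alignment."""
    return {
        "prompt": f"Write a warning message for this event: {event}",
        "chosen": good_msg,    # follows Tone + Instruction guidelines
        "rejected": bad_msg,   # suboptimal: poor tone and/or wrong instructions
    }

pair = make_preference_pair(
    "Wildfire approaching the northern suburbs.",
    "Wildfire nearing northern suburbs. Evacuate now via Route 9; avoid Hill Rd.",
    "Huge fire!!! Everyone panic and run!",
)
```

During training, the model learns to prefer the chosen completion over the rejected one for the same prompt, which is exactly what the paired "Good" and "Bad" messages in CrisiText enable.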
To assess the effectiveness of CrisiText, a series of experiments was conducted using Llama 3 models. These experiments explored various methodologies for warning message generation, including Supervised Fine-Tuning (SFT) and ORPO (a preference alignment technique), alongside zero-shot and few-shot baselines. Researchers also investigated the impact of providing additional context, such as previous messages from the same scenario or specific FEMA instruction guidelines, during the generation process.
A crucial aspect of the research involved Leave One Scenario Out (LOSO) experiments, which tested the models’ ability to generalize to crisis types not seen during training. This demonstrated the importance of explicitly including instruction guidelines for adapting to new emergency protocols. Furthermore, an automatic post-editor model was fine-tuned using the “Bad Messages,” showing promising results in improving the quality of poorly written warning messages.
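The LOSO protocol itself is simple to express in code: every example of one crisis type is held out for testing, and the model trains on the rest. The data layout below is an assumption for illustration, not the paper's actual code.

```python
# Minimal sketch of a Leave One Scenario Out (LOSO) split: hold out all
# examples of one crisis type to test generalization to unseen emergencies.

def loso_split(examples, held_out_type):
    """Split (crisis_type, example) pairs, holding out one crisis type."""
    train = [ex for t, ex in examples if t != held_out_type]
    test = [ex for t, ex in examples if t == held_out_type]
    return train, test

data = [("flood", "msg1"), ("wildfire", "msg2"), ("flood", "msg3")]
train, test = loso_split(data, "flood")
# train == ["msg2"], test == ["msg1", "msg3"]
```

Repeating this split once per crisis type gives a measure of how well a model handles emergency protocols it never saw during training.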
The evaluation of these experiments utilized both traditional overlap metrics like ROUGE and BLEU, and a sophisticated LLM-as-a-judge technique to approximate human evaluation. Results indicated that SFT generally achieved better performance in automatic metrics compared to ORPO, while LLM-as-a-judge evaluations showed comparable performance. The inclusion of previous messages significantly improved message consistency, and instruction guidelines proved fundamental for out-of-distribution scenarios.
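To give a feel for what an overlap metric measures, here is a toy unigram ROUGE-1 F1 between a generated message and a reference. This is a didactic sketch; real evaluations would rely on an established implementation such as the rouge-score library.

```python
# Toy ROUGE-1 F1: unigram overlap between candidate and reference,
# balancing precision (candidate side) and recall (reference side).
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # shared unigrams, with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("evacuate now via route 9", "evacuate immediately via route 9")
# score == 0.8 (4 of 5 unigrams shared on each side)
```

Such surface-overlap scores reward wording similarity, which is why the study complements them with an LLM-as-a-judge evaluation that can assess tone and instructional correctness.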
In conclusion, CrisiText represents a valuable resource for advancing AI-driven crisis communication. It enables the specialization of LLMs for generating expert-based warning messages, offering a robust foundation for future research and practical applications. While the dataset is synthetic and LLM-generated, the researchers emphasize that any products based on CrisiText should serve as tools to assist human experts, not replace them, especially in sensitive real-world situations. This work paves the way for more effective and timely communication during emergencies, ultimately contributing to public safety and crisis mitigation. You can find the full research paper here: CrisiText: A dataset of warning messages for LLM training in emergency communication.


