TL;DR: A new AI framework predicts flight delays, specifically post-terminal delays, by combining textual aviation information (flight details, weather reports, and aerodrome notices) with real-time aircraft trajectory data. It uses Large Language Models adapted to understand both language and trajectory patterns, achieving sub-minute prediction errors and supporting second-by-second updates for air traffic controllers. This approach provides a comprehensive, real-time understanding of airspace conditions, making it highly practical for air traffic management.
Flight delays are a significant challenge in air traffic management, leading to increased operating costs for airlines, disruptions for passengers, and ripple effects across the entire air travel network. Addressing this, a new research paper introduces a novel approach to predict flight delays, particularly those occurring after an aircraft enters the terminal area, a critical phase for air traffic controllers (ATCs).
The paper, titled "Flight Delay Prediction via Cross-Modality Adaptation of Large Language Models and Aircraft Trajectory Representation," presents a lightweight, multimodal framework that leverages the power of Large Language Models (LLMs) combined with detailed aircraft trajectory data. Authored by Thaweerath Phisannupawong, Joshua Julian Damanik, and Han-Lim Choi, this research aims to provide ATCs with accurate, real-time delay predictions, enhancing operational efficiency and safety.
A Multimodal Approach to Understanding Airspace
Traditional flight delay prediction models often rely on structured tabular data, historical delays, or airport network information. This new framework takes a more comprehensive approach by integrating diverse data types, or “modalities,” that ATCs themselves monitor. These include:
- General Flight Information: Details like airline name, flight identifier, departure and destination airports, scheduled arrival times, aircraft type, and wake turbulence category.
- Weather Reports: Real-time Meteorological Aerodrome Reports (METAR) and Terminal Aerodrome Forecasts (TAF), providing crucial weather conditions and predictions.
- Aerodrome Notices (NOTAMs): Special operational constraints or disruptions, such as runway closures or equipment failures, which can directly cause delays.
- Aircraft Trajectory Data: This is a key innovation. The model processes three types of trajectories: the “focusing trajectory” (the aircraft being monitored), “active trajectories” (other aircraft in the terminal area), and “prior trajectories” (completed flights that might influence current patterns). This data provides a real-time understanding of airspace conditions and congestion.
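To make the textual modalities concrete, here is a minimal sketch of how the flight information, weather reports, and NOTAMs might be assembled into a single text context for the LLM. The field names, formatting, and sample METAR/TAF/NOTAM strings are illustrative assumptions, not the authors' exact schema:

```python
# Hypothetical sketch: assembling the textual modalities into one context string.
# Field names and report contents are illustrative, not the paper's exact format.
def build_text_context(flight_info: dict, metar: str, taf: str, notams: list) -> str:
    """Concatenate general flight info, weather reports, and NOTAMs."""
    lines = [
        f"Flight: {flight_info['airline']} {flight_info['flight_id']}",
        f"Route: {flight_info['origin']} -> {flight_info['destination']}",
        f"Scheduled arrival: {flight_info['sta']}",
        f"Aircraft: {flight_info['aircraft_type']} (wake: {flight_info['wake_category']})",
        f"METAR: {metar}",
        f"TAF: {taf}",
        "NOTAMs: " + ("; ".join(notams) if notams else "none"),
    ]
    return "\n".join(lines)

context = build_text_context(
    {"airline": "KAL", "flight_id": "KE123", "origin": "RJTT", "destination": "RKSI",
     "sta": "2024-05-01T09:40Z", "aircraft_type": "B77W", "wake_category": "Heavy"},
    metar="RKSI 010900Z 27010KT 9999 FEW030 18/09 Q1015",
    taf="RKSI 010800Z 0109/0212 28012KT 9999 SCT035",
    notams=["RWY 15L/33R CLSD"],
)
print(context.splitlines()[0])  # Flight: KAL KE123
```

The trajectory modalities, by contrast, arrive as numeric time series and take the separate encoding path described below.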
The core of the methodology lies in “cross-modality adaptation.” Large Language Models are inherently designed to understand text. However, aircraft trajectories are complex time-series data. The framework uses a specialized trajectory representation model (ATSCC) to convert this time-series data into a format that LLMs can interpret. A lightweight adaptation network then bridges the gap, projecting these trajectory representations into the LLM’s language-compatible embedding space. This allows the LLM to process and reason over both textual and trajectory information simultaneously.
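The adaptation step can be pictured as a small projection network that maps trajectory-encoder outputs into the LLM's embedding space. The sketch below uses NumPy and made-up dimensions (a 256-d ATSCC embedding, a 2048-d LLM embedding space, a two-layer MLP with a tanh nonlinearity standing in for whatever activation the authors use); none of these specifics come from the paper:

```python
import numpy as np

# Minimal sketch of cross-modality adaptation: project trajectory tokens into
# the LLM's embedding space. All dimensions below are illustrative assumptions.
rng = np.random.default_rng(0)

TRAJ_DIM, HIDDEN_DIM, LLM_DIM = 256, 512, 2048
W1 = rng.standard_normal((TRAJ_DIM, HIDDEN_DIM)) * 0.02
W2 = rng.standard_normal((HIDDEN_DIM, LLM_DIM)) * 0.02

def adapt(traj_embeddings: np.ndarray) -> np.ndarray:
    """Two-layer MLP projector: trajectory embeddings -> LLM token embeddings."""
    h = np.tanh(traj_embeddings @ W1)   # nonlinearity (stand-in choice)
    return h @ W2                       # language-compatible embeddings

# One encoded trajectory represented as 10 tokens of size 256:
traj_tokens = rng.standard_normal((10, TRAJ_DIM))
llm_tokens = adapt(traj_tokens)
print(llm_tokens.shape)  # (10, 2048)
```

Once projected, these trajectory tokens can be interleaved with the text tokens, letting the frozen LLM attend over both modalities in a single sequence.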
How the Model Works
Instead of training a massive model from scratch, the framework smartly utilizes pre-trained components. A frozen LLM backbone (such as LLaMA-3.2-1B-Instruct or Pythia-1B) provides robust linguistic understanding. A pre-trained trajectory encoder (ATSCC) handles the complex trajectory data. Only a small cross-modality adaptation network and a regression head (for predicting delay duration) are trained. This makes the training process efficient and memory-friendly, allowing the model to run on consumer-grade GPUs.
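A back-of-envelope calculation shows why this setup is lightweight: only the adapter and regression head carry trainable parameters, while the ~1B-parameter backbone and the trajectory encoder stay frozen. The component sizes below are illustrative assumptions, not the paper's reported counts:

```python
# Illustrative parameter budget: frozen backbone vs. trainable adapter + head.
# All sizes are assumptions for the sake of the estimate.
def mlp_params(dims):
    """Parameter count of an MLP with the given layer widths (weights + biases)."""
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims, dims[1:]))

frozen = {
    "llm_backbone (~1B, e.g. LLaMA-3.2-1B-Instruct)": 1_000_000_000,
    "trajectory_encoder (ATSCC, illustrative size)": 10_000_000,
}
trainable = {
    "adaptation_network": mlp_params([256, 512, 2048]),
    "regression_head": mlp_params([2048, 512, 1]),
}
total_trainable = sum(trainable.values())
total = sum(frozen.values()) + total_trainable
print(f"trainable params: {total_trainable:,}")   # trainable params: 2,231,809
print(f"fraction trainable: {total_trainable / total:.4%}")
```

Under these assumptions, well under 1% of the parameters receive gradients, which is what makes training feasible on consumer-grade GPUs.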
The model focuses on predicting the "post-terminal duration": the time an aircraft spends in the airspace between entering the terminal area and its actual arrival. Given this predicted duration, together with the scheduled arrival time and the actual terminal-entry time, the total delay follows directly. This formulation aligns with the information available to ATCs, making the predictions highly relevant to their operational needs.
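The arithmetic behind this formulation is simple; a short worked example makes it explicit. The timestamps and the 18.5-minute predicted duration below are made-up illustrative values:

```python
from datetime import datetime, timedelta

# Delay from the post-terminal formulation:
#   estimated arrival = terminal entry time + predicted post-terminal duration
#   delay             = estimated arrival - scheduled arrival (negative = early)
def delay_minutes(entry_time: datetime, predicted_post_terminal_min: float,
                  scheduled_arrival: datetime) -> float:
    estimated_arrival = entry_time + timedelta(minutes=predicted_post_terminal_min)
    return (estimated_arrival - scheduled_arrival).total_seconds() / 60.0

entry = datetime(2024, 5, 1, 9, 30)   # aircraft enters the terminal area
sta = datetime(2024, 5, 1, 9, 40)     # scheduled arrival
print(delay_minutes(entry, 18.5, sta))  # 8.5
```

Here a predicted 18.5-minute post-terminal duration against a 10-minute scheduled gap yields an 8.5-minute delay.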
Promising Results and Real-time Capabilities
Experimental results demonstrate the framework’s effectiveness. When evaluated across various LLM backbones, the model consistently achieved sub-minute prediction errors. This level of accuracy is highly practical for air traffic management, where delays are typically recorded at the minute level. The research highlights that the LLM’s inherent linguistic understanding, combined with the rich contextual information from trajectory data, is crucial for these accurate predictions.
A significant advantage of this framework is its ability to provide second-by-second delay updates. Unlike previous models constrained by the infrequent updates of some data sources (like weather reports every 30 minutes), the integration of frequently transmitted surveillance broadcast signals allows the context to be updated in real-time. This continuous monitoring capability is invaluable for ATCs, enabling them to refine predictions as an aircraft progresses towards the airport.
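The update cycle this enables can be sketched as a loop in which each newly received surveillance point extends the focusing trajectory and triggers a fresh prediction. `predict_post_terminal` below is a hypothetical stand-in for the full LLM pipeline, and the coordinates are fabricated:

```python
# Sketch of the second-by-second update cycle: each surveillance broadcast
# point (lat, lon, alt) extends the trajectory and refreshes the estimate.
# predict_post_terminal is a toy stand-in for the actual multimodal model.
def predict_post_terminal(trajectory: list) -> float:
    """Toy model: shrink a nominal 20-min estimate as the aircraft progresses."""
    return max(0.0, 20.0 - 0.5 * len(trajectory))

trajectory = []
updates = []
for point in [(37.46, 126.44, 3000.0),
              (37.47, 126.45, 2900.0),
              (37.48, 126.46, 2800.0)]:
    trajectory.append(point)                            # new broadcast arrives
    updates.append(predict_post_terminal(trajectory))   # refreshed estimate
print(updates)  # [19.5, 19.0, 18.5]
```

In the real system the estimate would come from re-running the trajectory encoder and LLM over the updated context, but the control flow is the same: every incoming point yields a revised delay prediction.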
Impact and Future Directions
This research offers a practical and scalable solution for flight delay prediction. By focusing on a single airport and integrating a wide range of multimodal data, it overcomes limitations of prior works that often relied on historical delay data or complex airport network information. The framework’s robustness, even when certain contextual information is occasionally unavailable, further underscores its operational suitability.
Looking ahead, the researchers suggest extending this framework to predict delays in other flight phases and exploring its application to broader air traffic management tasks, such as trajectory prediction and conflict detection. The integration of even higher-quality operational data and additional modalities like images or audio could further enhance the capabilities of multimodal models in aviation.