TLDR: This study introduces LiTransMC, the first fine-tuned causal large language model (LLM) for predicting travel mode choice. It demonstrates that open-access, locally deployable LLMs, especially when fine-tuned with specific data, can outperform larger proprietary models and traditional methods in both prediction accuracy and understanding aggregate travel patterns. The research highlights the importance of targeted examples for training and the potential for privacy-preserving, cost-effective, and interpretable AI in transportation planning.
Understanding why people choose certain ways to travel, like driving, taking a train, or cycling, has traditionally relied on complex mathematical models. These models, while effective, often struggle to capture the subtle, real-world factors that influence our daily travel decisions. A sudden rain shower or a desire for comfort might sway your choice, yet traditional models find such nuances hard to account for.
However, recent advancements in Large Language Models (LLMs), the same technology behind popular AI chatbots, are opening up new possibilities. These powerful AI systems can process and understand natural language, allowing them to incorporate qualitative factors and a vast amount of domain knowledge that traditional numerical models might miss.
A groundbreaking new study, titled “Towards Locally Deployable Fine-Tuned Causal Large Language Models for Mode Choice Behaviour” by Tareq Alsaleh and Bilal Farooq, delves into this exciting area. This research is a significant step forward, investigating how open-access, locally deployable LLMs can be used to predict travel mode choices. The study introduces LiTransMC, the first LLM specifically fine-tuned for this task, marking a new era for transportation modeling.
A Comprehensive Evaluation
The researchers conducted an extensive evaluation, benchmarking eleven different LLMs, ranging in size from 1 billion to 12 billion parameters. They tested these models across three diverse datasets: Swissmetro, Brightwater SP, and London PMC. In total, they explored 396 different configurations, generating over 79,000 synthetic predictions of commuter choices. This systematic approach allowed them to understand not just predictive accuracy, but also how the models generated their reasoning.
The study looked at various learning strategies, including ‘zero-shot’ (where the model receives no examples), ‘random few-shot’ (where it gets a few random examples), and ‘targeted few-shot’ (where it receives carefully selected, similar examples). They also examined different prompting styles and temperature settings, which control the creativity and determinism of the AI’s responses.
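To make the ‘targeted few-shot’ idea concrete, here is a minimal sketch of how relevant examples might be selected and assembled into a prompt. The trip records, field names, and distance-based selection rule are illustrative assumptions, not details taken from the paper.

```python
import math

# Hypothetical labelled trips: (travel_time_min, cost_usd, chosen_mode).
# These records are illustrative only.
examples = [
    (45, 3.0, "train"),
    (20, 6.5, "car"),
    (15, 0.0, "bike"),
    (60, 4.0, "train"),
    (25, 7.0, "car"),
]

def nearest_examples(query, pool, k=2):
    """Targeted few-shot: keep the k labelled trips most similar to the
    query trip (here, plain Euclidean distance over the two features)."""
    t, c = query
    return sorted(pool, key=lambda e: math.hypot(e[0] - t, e[1] - c))[:k]

def build_prompt(query, pool):
    """Assemble a few-shot prompt from the selected examples."""
    lines = ["Predict the commuter's travel mode."]
    for t, c, mode in nearest_examples(query, pool):
        lines.append(f"Trip: time={t} min, cost=${c:.2f} -> Mode: {mode}")
    lines.append(f"Trip: time={query[0]} min, cost=${query[1]:.2f} -> Mode:")
    return "\n".join(lines)

print(build_prompt((50, 3.5), examples))
```

A ‘random few-shot’ variant would simply sample from the pool instead of sorting by similarity, and ‘zero-shot’ would skip the example lines entirely.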
Key Findings on Predictive Power
One of the most important discoveries was the impact of the learning strategy. Targeted few-shot prompting consistently led to significant improvements in both prediction accuracy and the stability of the models’ performance. This means that when the AI is given a few relevant examples, it learns much better how to make accurate and consistent travel predictions.
The research also revealed that the choice of the LLM itself and the learning strategy were the most crucial factors influencing performance, accounting for over 98% of the explained variation. Minor adjustments like prompt wording or temperature settings had only a marginal impact, suggesting that focusing on the right model and providing good examples is far more important than tweaking small details.
Different models showed different strengths. For instance, models like Gemma 3 12B and DeepSeek R1 Distill Llama 8B performed exceptionally well, especially with targeted examples. Smaller models, like those from the Stealth family, also showed remarkable adaptability and learning potential when given in-context examples.
Beyond Just Prediction: Understanding Why
A unique aspect of this study was its focus on the AI’s reasoning. The researchers developed a systematic framework to analyze the natural language explanations generated by the LLMs. They used techniques like the Explanation Strength Index (ESI) to quantify how well the explanations referenced key decision factors like time, cost, comfort, and convenience. They also used topic modeling to identify the main themes in the AI’s reasoning.
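The paper's exact ESI formula is not reproduced here, but the intuition can be sketched as a simple keyword-coverage score: the fraction of key decision factors that an explanation actually references. The keyword lists below are assumptions for illustration.

```python
# Illustrative ESI-style score: share of key decision factors mentioned
# in an explanation. Keyword lists are assumptions, not from the paper.
FACTORS = {
    "time": ["time", "faster", "slower", "duration"],
    "cost": ["cost", "price", "fare", "cheaper"],
    "comfort": ["comfort", "comfortable", "crowded"],
    "convenience": ["convenient", "transfer", "direct"],
}

def explanation_strength(text):
    """Return the fraction of factor categories referenced in the text."""
    text = text.lower()
    hits = sum(any(kw in text for kw in kws) for kws in FACTORS.values())
    return hits / len(FACTORS)

score = explanation_strength(
    "The train is cheaper and faster, though slightly more crowded."
)
print(score)  # references cost, time, and comfort -> 0.75
```

A higher score means the model's explanation touches more of the factors a human analyst would expect to matter.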
This analysis showed that models often focused tightly on time and cost trade-offs, especially when achieving high accuracy. However, models that also incorporated other factors, like ticket ownership or transfer burden, tended to produce more balanced predictions that better reflected the overall distribution of travel choices in the real world.
Introducing LiTransMC: A Game Changer
The crowning achievement of this research is LiTransMC. This model was created by fine-tuning the Gemma 3 12B model specifically for travel mode choice prediction using a technique called QLoRA, which makes the process memory-efficient. LiTransMC achieved a weighted F1 score of 0.6845, a measure of its predictive accuracy, and an exceptionally low Jensen-Shannon Divergence (0.000245), indicating its ability to accurately reproduce the overall distribution of travel choices.
These results are truly impressive. LiTransMC not only outperformed the best untuned local models but also surpassed larger proprietary systems like GPT-4o, as well as traditional discrete choice and machine learning models, on the same dataset. This demonstrates that a smaller, specialized AI model can be more effective than larger, general-purpose ones for specific tasks.
The Future of Transportation Planning
The implications of this research are profound. Locally hosted and fine-tuned LLMs like LiTransMC offer several practical advantages. They can run without relying on external APIs, significantly reducing operational costs. More importantly, they keep sensitive data on-premises, ensuring privacy and security. This makes high-quality behavioral modeling accessible to organizations without access to expensive proprietary models.
By combining accurate prediction with interpretable reasoning, these LLM-based approaches provide policymakers with powerful tools that are not only analytically strong but also explainable. This dual capability is crucial for making informed decisions in transportation planning.
This study provides a clear pathway for transforming general-purpose, open-source LLMs into specialized, powerful AI tools for public agencies. It establishes that conversational LLMs are not just viable alternatives but are poised to become superior instruments for understanding and predicting travel behavior. For more details, see the full research paper.


