StuckSolver: An LLM-Powered System for Autonomous Vehicle Recovery

TLDR: StuckSolver is a novel framework that uses Large Language Models (LLMs) to help autonomous vehicles (AVs) recover from immobilization in complex traffic scenarios. Designed as a plug-in add-on, it enables AVs to self-reason or follow passenger guidance to generate recovery plans, significantly improving reliability and performance. Evaluations show StuckSolver achieves near state-of-the-art results, enhancing AV resilience and accessibility without requiring extensive modifications to existing systems.

Autonomous vehicles (AVs) have made remarkable strides, moving from controlled environments to complex urban settings. Companies like Waymo and Baidu are deploying large fleets of robotaxis and logging millions of miles in public service. Despite these advances, AVs still struggle in certain traffic scenarios that human drivers handle with ease. In these situations the AV can become immobilized, disrupting traffic and inconveniencing passengers.

Current solutions for AV immobilization, such as remote intervention by engineers or manual takeover by a human driver, have notable limitations. Remote intervention is costly and inefficient, requiring substantial financial and human resources. Manual takeover works when a passenger can drive, but it excludes passengers who cannot, such as some elderly or disabled riders, which limits AV accessibility. These challenges highlight the need for more robust and inclusive recovery mechanisms.

A new research paper introduces StuckSolver, a novel framework designed to address AV immobilization using Large Language Models (LLMs). StuckSolver aims to enable AVs to resolve these challenging scenarios through self-reasoning or by incorporating passenger-guided decision-making. This innovative approach leverages the extensive knowledge and advanced reasoning capabilities of LLMs to understand complex traffic situations and generate logical driving decisions.

How StuckSolver Works

StuckSolver is designed as a plug-in add-on module that integrates seamlessly with an AV’s existing perception–planning–control stack, requiring no modifications to its internal architecture. It continuously monitors the vehicle’s operational status and surrounding traffic conditions. When immobilization is detected, StuckSolver intervenes by analyzing the environmental context and generating high-level recovery commands that the AV’s native planner can execute.

The system transforms a powerful LLM, like GPT-4o, into an intelligent agent by utilizing prompt engineering, Chain-of-Thought (CoT) reasoning, and OpenAI’s Function Calling API. This allows StuckSolver to process multimodal information, detect immobilization, and make multi-step decisions in a zero-shot mode, meaning it requires no task-specific fine-tuning or additional training.
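As a rough illustration of how function calling can constrain an LLM's output, the sketch below defines a tool in the JSON-schema shape that function-calling APIs accept and validates the arguments a model might return. The function name `submit_recovery_plan` and all field names are hypothetical; they mirror the plan contents the article describes (a replanning flag, a route start point, an action plan) but are not taken from the paper. No API call is made here.

```python
import json

# Hypothetical tool definition; name and fields are illustrative assumptions.
RECOVERY_PLAN_TOOL = {
    "type": "function",
    "function": {
        "name": "submit_recovery_plan",
        "description": "Return a recovery plan for an immobilized vehicle.",
        "parameters": {
            "type": "object",
            "properties": {
                "is_stuck": {"type": "boolean"},
                "replan_route": {"type": "boolean"},
                "route_start_point": {
                    "type": "array",
                    "items": {"type": "number"},
                    "minItems": 2,
                    "maxItems": 2,
                },
                "action_plan": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["is_stuck", "replan_route", "action_plan"],
        },
    },
}


def parse_tool_arguments(arguments_json: str) -> dict:
    """Parse a tool call's JSON argument string and check the required fields."""
    args = json.loads(arguments_json)
    required = RECOVERY_PLAN_TOOL["function"]["parameters"]["required"]
    missing = [key for key in required if key not in args]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return args


# Example: arguments as a model might return them for a blocked lane.
reply = '{"is_stuck": true, "replan_route": false, "action_plan": ["change lane left"]}'
plan = parse_tool_arguments(reply)
```

Forcing the model to answer through a schema like this gives the downstream planner a fixed structure to consume instead of free-form text, which is one plausible reason to pair function calling with CoT prompting.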

StuckSolver’s reasoning process involves three key steps:

Observation: It captures raw images from the vehicle’s front-view camera and extracts semantic information, including traffic control factors (e.g., traffic lights, signs, work zones) and traffic participants (e.g., vehicles, pedestrians, obstacles). It also associates these insights with object measurements, enriching them with quantitative attributes like distance and velocity.
Analysis: At this stage, StuckSolver determines if the vehicle is stuck and identifies the cause. It checks if the AV has stopped en route to its destination and if its speed is below a minimum threshold for a certain duration. It assesses traffic control elements and surrounding traffic participants to understand why the AV is immobilized. If passenger guidance is provided, StuckSolver interprets the passenger’s intent to formulate a behavior plan.
Decision-making: Based on its analysis, StuckSolver generates a recovery plan. This plan includes a route replanning flag, a designated route start point, and an action plan to help the vehicle resume normal operation. If a passenger-suggested plan is available, StuckSolver evaluates whether that plan requires route replanning. Otherwise, it autonomously generates a safe and traffic-compliant behavior plan.
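The speed-and-duration check from the analysis step can be sketched as a small timer. This is a minimal illustration, not the paper's implementation: the class name, the 0.1 m/s threshold, and the 10-second duration are all assumed values, since the article only says speed must stay below a minimum threshold for a certain duration while the AV is en route.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StuckDetector:
    """Flags a possible stuck state when speed stays low long enough while en route."""

    min_speed_mps: float = 0.1    # assumed threshold, not from the article
    min_duration_s: float = 10.0  # assumed duration, not from the article
    _slow_since: Optional[float] = None

    def update(self, t: float, speed_mps: float, en_route: bool) -> bool:
        """Feed one sample (time in seconds, speed in m/s); return True if flagged."""
        if not en_route or speed_mps >= self.min_speed_mps:
            # Moving again, or no active route: reset the timer.
            self._slow_since = None
            return False
        if self._slow_since is None:
            self._slow_since = t
        return (t - self._slow_since) >= self.min_duration_s


detector = StuckDetector()
flags = [detector.update(t, 0.0, True) for t in (0.0, 5.0, 10.0)]
```

A timer like this only nominates a candidate. A vehicle waiting at a red light would also trip it, which is why the analysis step described above goes on to assess traffic control elements and nearby participants before concluding that the AV is actually stuck.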

The integration of StuckSolver with an AV system is non-intrusive. It communicates with the AV’s primary modules via structured APIs and only intervenes when a stuck situation is detected, otherwise allowing normal AV operations to continue unaffected.
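One way to picture this non-intrusive arrangement is a thin wrapper that passes an optional high-level hint to the native planner and stays silent otherwise. The sketch below is an assumption about the shape of such an interface; `RecoveryAddOn`, its fields, and the string-based hint are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RecoveryAddOn:
    """Hypothetical wrapper: only supplies a recovery hint when the AV is stuck."""

    planner: Callable[[Optional[str]], str]  # native planner, takes an optional hint
    is_stuck: Callable[[], bool]             # stuck check (e.g. a speed/duration timer)
    propose_recovery: Callable[[], str]      # expensive LLM-backed reasoning step

    def step(self) -> str:
        # The reasoning step runs only when a stuck state is detected,
        # so normal operation goes straight to the native planner.
        hint = self.propose_recovery() if self.is_stuck() else None
        return self.planner(hint)


addon = RecoveryAddOn(
    planner=lambda hint: f"plan:{hint}",
    is_stuck=lambda: True,
    propose_recovery=lambda: "change lane left",
)
command = addon.step()
```

Keeping the native planner as the only component that issues driving commands matches the article's description: the add-on contributes a high-level recovery command, and the existing stack remains responsible for executing it.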

Evaluation and Results

The efficacy of StuckSolver was evaluated using the CARLA simulator on the Bench2Drive benchmark, which includes 220 challenging routes. The results demonstrate significant improvements across various metrics compared to a standard rule-based behavior agent. StuckSolver achieved notable gains in Driving Score (DS) and Success Rate (SR), ranking second only to state-of-the-art (SOTA) end-to-end methods.

For instance, a rule-based agent often gets stuck in scenarios like construction zones or parked obstacles, leading to low DS and SR. With StuckSolver, once the AV is identified as immobilized, the system initiates scene analysis and generates a recovery plan, enabling the vehicle to escape. The paper highlights that StuckSolver achieves near-SOTA performance through autonomous self-reasoning alone, and its performance is further enhanced when passenger guidance is incorporated.

Qualitative evaluations in scenarios such as a vehicle with an open door blocking a lane or a pedestrian crossing at a red light showed StuckSolver’s ability to make rational decisions. In the open door scenario, it correctly identified the obstruction and decided to perform a lane change. In the pedestrian crossing scenario, it accurately determined that the vehicle was not stuck but appropriately stopped due to traffic rules, thus choosing not to intervene.

This research underscores the potential of LLMs to enhance AV resilience and accessibility in complex traffic scenarios, offering a promising alternative to traditional recovery methods. The full research paper provides more in-depth information.

Future Directions

While StuckSolver shows great promise, future work will focus on improving its inference efficiency, as its current average inference time of 2.8 seconds per query can limit its application in time-sensitive scenarios. Plans include distilling a lightweight LLM for faster inference. Additionally, the current approach is designed for rule-based or modular AV systems, and future efforts will explore integration strategies to complement and enhance end-to-end autonomous driving frameworks.
