TLDR: A new method called SCLPlan combines large language models (LLMs) with traditional symbolic planning to make robots more reliable and adaptable in real-world tasks. It uses symbolic planning to verify LLM actions and solve simpler parts of tasks, leading to significantly higher success rates (e.g., 99% in simulation, 100% in real-world robot tasks) and better repeatability than either method alone. This hybrid approach addresses LLM hallucinations and the scalability issues of symbolic planners, offering a transparent and robust solution for embodied AI.
Achieving human-like intelligence in robots performing real-world tasks has long been a significant challenge. The unpredictable nature of real environments makes it difficult for robots to plan and execute actions reliably. Recent advancements in large language models (LLMs) have offered a promising avenue for task planning, allowing robots to understand and respond to complex natural language commands. However, LLMs come with their own set of limitations, such as generating incorrect or ‘hallucinated’ actions and requiring extensive, often opaque, prompt engineering to function effectively.
At the other end of the spectrum, traditional symbolic planning methods offer strong guarantees of reliability and repeatability. These methods rely on meticulously defined rules and logic to plan actions. Their weakness is scale: defining a complete set of rules for every possible scenario is often impossible, so they struggle with the complexity and ambiguity of real-world tasks.
Introducing SCLPlan: A Hybrid Approach
A new research paper, “Constrained Natural Language Action Planning for Resilient Embodied Systems,” introduces a robotic planning method called SCLPlan (Symbolically Constrained Language Planner). This approach combines the strengths of LLMs with the reliability of symbolic planning. The core idea is to augment LLM planners with symbolic planning oversight, creating a system that is both adaptable and highly dependable.
SCLPlan aims to overcome the limitations of pure LLM planning by providing a transparent way to define hard constraints, offering much clearer guidance than traditional prompt engineering. Crucially, this hybrid method preserves the powerful reasoning capabilities of LLMs and their ability to generalize in open-world environments, while significantly enhancing reliability and repeatability.
How SCLPlan Works
The SCLPlan architecture is designed to iteratively plan and execute tasks. It involves four main phases:
1. Goal State Generation: An LLM interprets a high-level natural language task and defines a goal state using a formal language (PDDL). This goal state, along with the current environment, is then used to create a problem file for the symbolic planner.
2. Invoke Formal Planner: A symbolic planner attempts to find an optimal plan to achieve the LLM-defined goal. If a plan is found and successfully executed, the task is complete. If not, or if an unexpected error occurs, the process moves to the next phase.
3. Invoke LLM Planner: When the symbolic planner cannot find a solution or a failure occurs, an LLM takes over. It uses its common-sense reasoning to predict the next action, leveraging the current environment state and the history of actions and failures.
4. Precondition Verification: This is a critical step for reliability. Before executing an LLM-predicted action, SCLPlan uses the formal rules defined for the symbolic planner to verify whether the action’s preconditions are met. If they are not, it can often formally plan a sub-sequence of actions to reach a valid state, effectively correcting LLM hallucinations and preventing erroneous actions.
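To make the precondition check in phase 4 concrete, here is a minimal sketch in Python. It is not the paper's implementation: the `Action` class, the toy fridge domain, and the helper names `plan_to` and `verify_and_repair` are all hypothetical, and a simple breadth-first search stands in for a real PDDL planner. It shows only the idea of checking a proposed action against formal preconditions and, when they fail, planning a short repair sequence first.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A STRIPS-style action (hypothetical stand-in for a PDDL action schema)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset

    def applicable(self, state):
        return self.preconditions <= state

    def apply(self, state):
        return (state - self.del_effects) | self.add_effects


def plan_to(state, goal, actions):
    """Breadth-first search for a shortest action list reaching a state that contains `goal`."""
    frontier = deque([(state, [])])
    seen = {state}
    while frontier:
        current, path = frontier.popleft()
        if goal <= current:
            return path
        for act in actions:
            if act.applicable(current):
                nxt = act.apply(current)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [act]))
    return None  # goal unreachable with the known actions


def verify_and_repair(state, proposed, actions):
    """Return the steps to execute for an LLM-proposed action, or None to reject it."""
    if proposed.applicable(state):
        return [proposed]
    repair = plan_to(state, proposed.preconditions, actions)
    return None if repair is None else repair + [proposed]


# Toy domain, assumed purely for illustration.
DOMAIN = [
    Action("goto_fridge", frozenset(), frozenset({"at_fridge"}), frozenset({"at_table"})),
    Action("open_fridge", frozenset({"at_fridge"}), frozenset({"fridge_open"}), frozenset()),
    Action("take_apple", frozenset({"at_fridge", "fridge_open"}),
           frozenset({"holding_apple"}), frozenset()),
]

# The "LLM" proposes take_apple while the robot is still at the table.
steps = verify_and_repair(frozenset({"at_table"}), DOMAIN[2], DOMAIN)
print([a.name for a in steps])  # ['goto_fridge', 'open_fridge', 'take_apple']
```

In this sketch the proposed action is never executed from an invalid state: it is either run directly, preceded by a formally planned repair sequence, or rejected.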
This dynamic interplay allows SCLPlan to leverage the LLM for complex reasoning and adaptability where symbolic rules are insufficient, while relying on the symbolic planner for robust, error-free execution of well-defined sub-tasks and for verifying LLM outputs.
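The four phases can be sketched as a single control loop. The following Python example is an illustrative toy under stated assumptions, not the authors' code: the `ACTIONS` table, the `ToyWorld` simulator, and the two `stub_*` functions (which replace real LLM calls with fixed answers) are hypothetical. In this toy world the apple's location is unknown until the fridge is opened, so the symbolic planner fails at first and the LLM stand-in has to suggest an exploratory action.

```python
from collections import deque

# Hypothetical toy domain: name -> (preconditions, add effects, delete effects).
ACTIONS = {
    "goto_fridge": (set(), {"at_fridge"}, {"at_table"}),
    "open_fridge": ({"at_fridge"}, {"fridge_open"}, set()),
    "take_apple": ({"at_fridge", "fridge_open", "apple_in_fridge"},
                   {"holding_apple"}, {"apple_in_fridge"}),
}


def symbolic_plan(state, goal):
    """Phase 2 stand-in: breadth-first search; returns None if the goal is unreachable."""
    start = frozenset(state)
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        current, path = frontier.popleft()
        if goal <= current:
            return path
        for name, (pre, add, delete) in ACTIONS.items():
            nxt = frozenset((current - delete) | add)
            if pre <= current and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None


class ToyWorld:
    """Simulated environment: the apple is only observed once the fridge is open."""
    def __init__(self):
        self.state = {"at_table"}

    def execute(self, name):
        pre, add, delete = ACTIONS[name]
        if not pre <= self.state:
            return False
        self.state = (self.state - delete) | add
        if name == "open_fridge":
            self.state.add("apple_in_fridge")  # newly observed fact
        return True


def stub_goal_llm(task):
    """Phase 1 stand-in: a real system would prompt an LLM to emit a PDDL goal."""
    return {"holding_apple"}


def stub_action_llm(state, history):
    """Phase 3 stand-in: a real system would prompt an LLM with state and history."""
    return "open_fridge"


def run(task, world, max_rounds=10):
    goal = stub_goal_llm(task)                                # phase 1
    history = []
    for _ in range(max_rounds):
        if goal <= world.state:
            return history
        plan = symbolic_plan(world.state, goal)               # phase 2
        if plan is None:
            proposed = stub_action_llm(world.state, history)  # phase 3
            repair = symbolic_plan(world.state, ACTIONS[proposed][0])  # phase 4
            if repair is None:
                history.append(("rejected", proposed))
                continue
            plan = repair + [proposed]
        for name in plan:
            ok = world.execute(name)
            history.append(("ok" if ok else "failed", name))
            if not ok:
                break
    return history if goal <= world.state else None


print(run("bring me an apple", ToyWorld()))
```

In the first round the symbolic planner cannot reach the goal, so the stubbed LLM proposes `open_fridge`; the precondition check prepends `goto_fridge`. Once the apple has been observed, the symbolic planner finishes the task on its own in the second round.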
Impressive Results in Diverse Environments
The researchers rigorously tested SCLPlan across various environments, including a text-based simulator (ALFWorld), a custom simulation environment (AI2Thor) with more complex tasks, and a real-world Boston Dynamics Spot quadruped robot with a manipulator arm. The results were compelling:
- On the ALFWorld planning benchmark, SCLPlan achieved a near-perfect 99% success rate, significantly outperforming current state-of-the-art methods.
- In real-world experiments with a quadruped robot performing pick-and-place tasks, SCLPlan achieved 100% task success, compared to 50% for pure LLM planners and 30% for pure symbolic planners.
- The approach also demonstrated improved repeatability in plan trajectories, token count, and environment steps, especially with more capable LLMs.
- SCLPlan showed an ability to adapt its planning strategy based on environmental complexity, relying more on the symbolic planner in simpler environments and increasing LLM usage for more ambiguous tasks.
These findings highlight SCLPlan’s potential to greatly improve the reliability and transparency of AI planning approaches, fostering greater trust between humans and intelligent embodied agents. The research paper can be found at arXiv:2510.06357.
A Step Towards Resilient Embodied Systems
The work demonstrates that a hybrid planning approach is highly beneficial for robotic performance across various environments. By providing a clear and logical process for defining hard constraints, SCLPlan circumvents many issues associated with prompt optimization in LLMs. This framework allows engineers to thoughtfully balance the effort of defining an environment with the desired level of system reliability, making it a practical and relevant solution for integrating LLMs into general-purpose robotic systems.


