
Automating Radiotherapy Planning with Zero-Shot Large Language Model Agents

TLDR: This research introduces a novel method for fully automating radiotherapy treatment planning using a large language model (LLM) agent in a zero-shot setting. The LLM agent interacts directly with a clinical treatment planning system, iteratively adjusting optimization parameters based on real-time feedback and clinical objectives. Tested on head-and-neck cancer cases, the LLM-generated plans achieved comparable organ sparing and improved target conformity and hot spot control compared to manual plans, demonstrating a significant step towards generalizable and efficient AI-driven planning without the need for prior training data.

Radiotherapy is a crucial treatment for many cancer patients, with millions receiving it globally each year. However, the process of creating a treatment plan is highly complex, requiring specialized expertise and many iterative adjustments. This manual approach is becoming increasingly unsustainable due to the rising number of cancer cases and existing workforce shortages, leading to calls for greater automation.

Current automated planning methods, such as knowledge-based planning, protocol-based planning, multi-criteria optimization, and reinforcement learning, each offer benefits but also come with limitations. These often include the need for large, high-quality datasets, a lack of flexibility for unusual anatomies, significant human engagement, or intensive computational requirements. As a result, a universally applicable automated solution has remained elusive.

A recent study introduces a groundbreaking approach that leverages large language model (LLM) agents for fully automated radiotherapy treatment planning, operating in a “zero-shot” setting. This means the LLM agent performs the task without any prior exposure to manually generated treatment plans, fine-tuning, or specific task training. This capability is particularly valuable in specialized fields like radiation therapy, where extensive expert-labeled data is scarce.

The proposed workflow involves an LLM agent directly interacting with a commercial clinical treatment planning system (TPS), specifically Eclipse™ by Varian Medical Systems. The agent iteratively extracts information about the plan’s current state, such as dose-volume histograms (DVHs) and objective function losses, and then proposes new constraint values to guide the inverse optimization process. Its decision-making is informed by current observations, previous optimization attempts, and evaluations, allowing it to dynamically refine its strategy.
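The observe-propose-optimize loop described above can be sketched in miniature. The toy TPS and rule-based constraint proposer below are illustrative stand-ins for Eclipse and the LLM agent; all names, step sizes, and dose values are assumptions for demonstration, not the paper's actual API or data.

```python
# Minimal sketch of the iterative planning loop, under toy assumptions:
# the "TPS" maps a requested OAR constraint to an achieved dose with a
# physical floor, and the "agent" tightens constraints coarsely at first,
# then finely, based on feedback from prior iterations.

class ToyTPS:
    """Toy optimizer: achieved mean dose tracks the requested constraint,
    bounded below by an assumed physically unavoidable floor."""
    FLOOR = 24.0  # Gy, illustrative minimum achievable dose

    def optimize(self, constraint_gy):
        return max(constraint_gy, self.FLOOR)

def propose_constraint(goal_gy, achieved_gy, history):
    """Stand-in for the LLM's decision step: larger steps early to
    explore sparing potential, smaller steps later for fine-tuning."""
    step = 4.0 if len(history) < 2 else 1.0
    return max(goal_gy, achieved_gy - step)

def plan(goal_gy=20.0, iterations=6):
    tps, history = ToyTPS(), []
    constraint = goal_gy + 10.0  # initialize above the clinical goal
    for _ in range(iterations):
        achieved = tps.optimize(constraint)      # inverse optimization pass
        history.append((constraint, achieved))   # feedback for next proposal
        constraint = propose_constraint(goal_gy, achieved, history)
    return history

final_constraint, final_dose = plan()[-1]
# The loop converges to the toy floor: final_dose == 24.0
```

In the real workflow the proposal step is a GPT-4.1 call conditioned on DVH metrics, objective losses, and the full history, rather than a fixed rule.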

To enable the LLM to perform effectively, the complex planning task was broken down into simpler, domain-agnostic subtasks. The agent was equipped with an arithmetic tool to quantify deviations from clinical goals and was provided with historical data to facilitate trend-based reasoning. Crucially, domain-specific information about the optimization system, including how constraints influence dose distribution, was encoded into the prompt. The use of chain-of-thought reasoning further enhanced the agent’s ability to make multi-step decisions, similar to a human planner, by explicitly articulating its thought process before proposing adjustments.
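The "arithmetic tool" role can be illustrated with a short sketch: compute each structure's deviation from its clinical goal so the agent reasons over precomputed numbers instead of doing arithmetic itself. The goal and achieved values below are made-up examples, not figures from the paper.

```python
# Hedged sketch of a deviation-quantification tool: positive values mean
# a clinical goal is violated, negative means it is met with margin.

def goal_deviations(achieved, goals):
    """Per-structure deviation (achieved - goal) in the goal's units."""
    return {name: round(achieved[name] - goal, 2)
            for name, goal in goals.items()}

goals = {"parotid_mean_gy": 26.0, "cord_max_gy": 45.0}    # assumed goals
achieved = {"parotid_mean_gy": 27.3, "cord_max_gy": 41.8}  # toy plan state

deviations = goal_deviations(achieved, goals)
# parotid mean exceeds its goal by 1.3 Gy; cord max has 3.2 Gy of margin
```

Feeding such a table into the prompt, alongside the history of prior attempts, is what enables the trend-based, chain-of-thought reasoning the authors describe.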

The feasibility of this LLM-driven workflow was tested on twenty head-and-neck cancer cases. The LLM-generated plans were compared against clinical manual plans, with key dosimetric endpoints analyzed. The study utilized two state-of-the-art LLMs, GPT-4.1 and GPT-4.1-mini, both with and without access to optimization priors (domain-specific knowledge about constraint ranges and their effects).

The results were highly promising. Plans generated by GPT-4.1 with optimization priors (GPT-4.1-WP) achieved clinically comparable quality to manual plans. They demonstrated similar organ-at-risk (OAR) sparing, while showing improved hot spot control and superior conformity for the planning target volumes (PTVs). For instance, the maximum dose (Dmax) was 106.5% for LLM plans versus 108.8% for clinical plans, and the conformity index for the boost PTV was 1.18 versus 1.39. The study highlighted that access to optimization priors was critical; without them, the LLM’s performance significantly deteriorated, leading to worse OAR sparing.
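For readers unfamiliar with these endpoints, one common conformity-index definition is the prescription isodose volume divided by the PTV volume, and Dmax is typically reported as a percentage of the prescription dose; the paper's exact definitions may differ, and the volumes and doses below are invented for illustration.

```python
# Hedged illustration of two common dosimetric endpoints (definitions
# assumed, input numbers made up):

def conformity_index(prescription_isodose_cc, ptv_cc):
    """Prescription isodose volume over PTV volume; 1.0 is ideal."""
    return prescription_isodose_cc / ptv_cc

def dmax_percent(max_dose_gy, prescription_gy):
    """Maximum point dose as a percentage of the prescription dose."""
    return 100.0 * max_dose_gy / prescription_gy

ci = conformity_index(prescription_isodose_cc=118.0, ptv_cc=100.0)
dmax = dmax_percent(max_dose_gy=74.55, prescription_gy=70.0)
```

A CI closer to 1.0 and a lower Dmax percentage both indicate a tighter, better-controlled dose distribution, which is why the reported 1.18 vs. 1.39 and 106.5% vs. 108.8% favor the LLM plans.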

A case study illustrated the agent’s reasoning process. The LLM initialized optimization constraints close to clinical goals, using larger step sizes early on to explore sparing potential and smaller steps for fine-tuning. When faced with difficult-to-achieve sparing objectives, such as for the mandible, the agent intelligently relaxed constraints to preserve target coverage, reasoning that further tightening would not yield significant dose reduction but would increase objective function loss. This adaptive behavior mirrors that of experienced human planners.
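The relaxation heuristic in this case study can be sketched as a simple rule: keep tightening a constraint while it still buys dose reduction, and relax it once tightening only inflates the objective loss. The thresholds and step sizes below are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of the adaptive tighten-or-relax decision described in
# the mandible case study. All thresholds are assumed for illustration.

def next_constraint(constraint_gy, dose_drop_gy, loss_increase):
    """Tighten while tightening helps; relax when it only adds loss."""
    if dose_drop_gy < 0.2 and loss_increase > 0.05:
        return constraint_gy + 2.0   # relax: preserve target coverage
    return constraint_gy - 1.0       # tighten: still gaining sparing

# e.g. last tightening gained only 0.1 Gy but raised the loss by 10%,
# so the constraint is relaxed from 30 Gy to 32 Gy
relaxed = next_constraint(30.0, dose_drop_gy=0.1, loss_increase=0.10)
```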

Beyond quality, the efficiency gains were substantial. The LLM-driven planning process completed in under 5 minutes on a standard workstation, a significant reduction compared to manual planning times. This research marks a significant step towards generalizable AI-driven planning, particularly for institutions with limited access to large, high-quality training datasets. By embedding the agent directly into a commercial TPS and constraining its actions to parameters human planners use, the approach maximizes clinical applicability and interpretability.

The study underscores that while LLMs possess strong general reasoning, their clinical utility in this domain heavily relies on the quality and interpretation of provided information. This includes understanding clinical constraints as flexible reference points rather than strict targets, and grasping the “hidden rules” of the optimization engine. This zero-shot, LLM-driven workflow offers a generalizable and clinically applicable solution that could reduce planning variability and support broader adoption of AI-based planning strategies in radiotherapy. You can read the full paper here.

Meera Iyer
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist at a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She is particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media. You can reach her at: [email protected]
