TLDR: A research paper identifies 25 tasks and 51 questions prompt programmers ask, revealing that current tools poorly support their needs. Key challenges include understanding prompt content and behavior, debugging, managing history, and identifying external code dependencies. The study highlights significant opportunities for new tools to better assist developers in this iterative process.
Prompting large language models (LLMs) and other foundation models (FMs) has become a cornerstone of modern AI-powered software development. Developers are now embedding these prompts directly into software, a practice known as prompt programming. While this has opened up new possibilities, the process is often iterative and challenging, with developers frequently modifying their prompts without clear guidance or adequate tool support.
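To make the practice concrete, here is a minimal sketch of what an embedded prompt can look like in application code. It is illustrative only: the prompt text, `summarize_ticket`, and the `call_llm` stub are hypothetical stand-ins, not examples from the paper.

```python
# A hypothetical example of prompt programming: a natural-language
# prompt embedded in ordinary application code.

SUMMARIZE_PROMPT = """You are a support assistant.
Summarize the following ticket in one sentence.

Ticket:
{ticket_text}

Summary:"""


def call_llm(prompt: str) -> str:
    """Stand-in for a real foundation-model API call; swap in your
    provider's client here."""
    raise NotImplementedError


def summarize_ticket(ticket_text: str) -> str:
    # The prompt is ordinary program data: built via string formatting,
    # sent to the model, and its raw text output consumed downstream.
    prompt = SUMMARIZE_PROMPT.format(ticket_text=ticket_text)
    return call_llm(prompt).strip()
```

Because the prompt is just a string, none of the usual compiler or IDE support applies to it, which is at the root of many of the challenges the paper documents.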
A recent research paper, titled “Understanding Prompt Programming Tasks and Questions,” delves into the specific challenges and information needs of prompt programmers. The study, conducted by Jenny T. Liang, Chenyang Yang, Agnia Sergeyuk, Travis D. Breaux, and Brad A. Myers, aims to shed light on the questions developers ask and the tasks they perform when working with prompts, ultimately identifying gaps in existing tools and outlining opportunities for future development.
Uncovering Developer Needs
The researchers employed a comprehensive mixed-method approach, involving interviews with 16 prompt programmers, observations of 8 developers making prompt changes, and a survey of 50 developers. This extensive data collection allowed them to develop a detailed taxonomy of 25 prompt programming tasks and 51 specific questions developers ask. They then measured the importance of each task and question and compared their findings against 48 existing research and commercial prompt programming tools.
The study revealed a significant finding: prompt programming is currently poorly supported. All 25 identified tasks are performed largely manually, and 16 of the 51 questions, including many of the most important ones, remain unanswered by current tools. This highlights a critical need for more sophisticated, developer-centric tools in this rapidly evolving field.
Key Challenges Faced by Prompt Programmers
The research identified several key areas where prompt programmers struggle and require better support:
- Understanding Prompt Content and Behavior: Developers need to grasp the high-level structure and specific text of their prompts, as well as how different components relate to each other. They also need to understand the prompt’s output and overall performance, often manually sifting through large amounts of generated text.
- Managing Inputs and Data: A crucial aspect is understanding what inputs to provide to a prompt and assessing how representative those examples are. Generating diverse and relevant test cases is a significant challenge.
- Debugging Unexpected Behavior: Debugging prompts is more complex than debugging traditional code. Developers need to localize faults not just within the prompt content, but also by comparing different prompt versions, reasoning about external artifacts like related code, and understanding the context provided to the model. Current tools offer very limited support for these scenarios.
- Tracking Changes and History: Unlike traditional software development with its robust version control, tracking changes to prompt content and understanding their impact on behavior is largely manual. Recalling why a specific change was made, or how behavior evolved across versions, is difficult and hinders iterative development (a minimal sketch of this kind of bookkeeping follows this list).
- Retrieving and Comparing Prompts: Developers often want to find past prompt versions based on their content, structure, or observed behavior. They also need to compare multiple versions to understand differences and progress, a task that is largely unsupported beyond basic side-by-side text comparison.
- Understanding External Dependencies: A challenge unique to prompt programming is the reliance on external code that prepares a prompt’s inputs or processes its output. The study found that identifying and understanding these code dependencies is a highly important but entirely unsupported task (the second sketch below illustrates this hidden coupling).
- Understanding Relationships Between Prompt Components: The most important question identified by the study was how different parts of a prompt (e.g., instructions, examples) logically relate to each other. Tracking these internal dependencies is crucial for maintaining consistency but is not supported by any existing tool.
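As a rough illustration of the history and comparison gaps above, the following sketch shows the kind of bookkeeping developers currently do by hand: recording each prompt version with a rationale, then re-running two versions on the same inputs to see where behavior diverges. All names (`PromptVersion`, `record`, `compare_versions`) are hypothetical, and `call_llm` is the same stub as in the first example.

```python
# A hand-rolled substitute for prompt version control -- roughly the
# manual bookkeeping described above, not tooling from the paper.
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class PromptVersion:
    text: str        # the full prompt template
    rationale: str   # why this change was made -- the detail that is
                     # hardest to recall later
    created: datetime = field(default_factory=datetime.now)


history: list[PromptVersion] = []


def record(text: str, rationale: str) -> PromptVersion:
    version = PromptVersion(text, rationale)
    history.append(version)
    return version


def compare_versions(old: PromptVersion, new: PromptVersion,
                     inputs: list[str]) -> None:
    # Re-run both versions on the same inputs and print where their
    # outputs diverge: a crude "behavior diff" between prompt versions.
    # Assumes each template has a single {input} placeholder.
    for text in inputs:
        out_old = call_llm(old.text.format(input=text))
        out_new = call_llm(new.text.format(input=text))
        if out_old != out_new:
            print(f"input={text!r}\n  old: {out_old}\n  new: {out_new}")
```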
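The second sketch illustrates the external-dependency problem: downstream parsing code that silently assumes the output format the prompt requests. The prompt and `parse_rating` below are invented for illustration; the coupling they show is the task the study found to be important yet entirely unsupported.

```python
# Hidden coupling between a prompt and external code (illustrative).
import json

RATING_PROMPT = """Rate the sentiment of this review from 1 to 5.
Respond only with JSON: {{"rating": <number>, "reason": "<short text>"}}

Review: {review}"""


def parse_rating(model_output: str) -> int:
    # This line silently depends on the "rating" key requested in
    # RATING_PROMPT. Rewording the prompt's format instructions can
    # break it, even though no code mentioning "rating" was edited.
    return int(json.loads(model_output)["rating"])
```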
Opportunities for Future Tools
Based on these findings, the paper outlines several critical opportunities for tool builders and researchers to improve the prompt programming experience. These include developing tools that can automatically link prompt components to external code, visualize and manage relationships between different parts of a prompt, and provide more sophisticated debugging capabilities that go beyond simple output inspection. There is also a strong need for tools that help assess the representativeness of datasets used for testing prompts and offer more advanced methods for retrieving and comparing prompt versions based on their semantic meaning or behavior, not just keywords.
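One direction the paper points toward, retrieving prompt versions by meaning rather than keywords, could look roughly like the following sketch, which ranks stored versions by embedding similarity to a natural-language query. The `embed` function is a hypothetical stand-in for any text-embedding API.

```python
# A rough sketch of semantic retrieval over stored prompt versions.
import math


def embed(text: str) -> list[float]:
    """Stand-in for a real text-embedding model call."""
    raise NotImplementedError


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_prompt(query: str, versions: list[str]) -> str:
    # Return the stored prompt version most similar in meaning to the
    # query, e.g. "the version that asked for JSON output".
    query_vec = embed(query)
    return max(versions, key=lambda v: cosine(embed(v), query_vec))
```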
This research provides a valuable roadmap for the future of prompt programming tools, emphasizing the need for solutions that address the nuanced and often manual challenges developers face. By focusing on these identified information needs, the AI community can build more effective and user-friendly environments for creating the next generation of AI-powered applications. For more details, see the full research paper, “Understanding Prompt Programming Tasks and Questions.”