LinkAnchor: An Autonomous Agent for Smarter Software Issue Tracking

TLDR: LinkAnchor is a new autonomous LLM-based agent designed to accurately link software issues to their resolving code commits. It overcomes limitations of previous methods by dynamically accessing relevant data (commit history, issue comments, code) without exceeding token limits, and by directly pinpointing the correct commit instead of evaluating all possible pairs. LinkAnchor requires no task-specific training, outperforms state-of-the-art approaches by a significant margin, and demonstrates strong generalizability and cost-effectiveness in real-world scenarios.

In software development, keeping track of changes and fixes is crucial for efficient project management and for maintaining software quality. One significant challenge is accurately linking reported issues, such as bugs or feature requests, to the specific code changes (commits) that resolve them. Recovering these links, a task known as Issue-to-Commit Link Recovery (ILR), is often difficult: studies show that fewer than half of issues are correctly linked to their corresponding commits on platforms like GitHub.

Traditional methods for ILR, many of which rely on artificial intelligence and machine learning, face several hurdles. A major limitation is their inability to process all available information, such as extensive commit histories, detailed issue comments, or large code repositories, due to constraints like limited context windows in models. Furthermore, many existing approaches evaluate issue-commit pairs individually, which becomes highly impractical for large software projects with thousands of commits.

To address these challenges, a new autonomous agent called LinkAnchor has been introduced. LinkAnchor is presented as the first autonomous LLM-based agent for the issue-to-commit link recovery problem. Its ‘lazy-access’ architecture lets the underlying large language model (LLM) dynamically retrieve only the contextual data it needs, such as commit details, issue discussions, and code files. As a result, the model can draw on a rich body of software development data without exceeding its token limit.
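The lazy-access idea can be pictured as a function that hands the model one small page of data at a time instead of the whole history. The following is a minimal sketch under stated assumptions: the `Commit` type, the `list_commits` function, and the page size are hypothetical names chosen for illustration and are not taken from LinkAnchor itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    author: str
    message: str


def list_commits(history, page, page_size=3):
    """Return one page of commit summaries and whether more pages exist.

    Hypothetical helper: the model asks for page 0, 1, 2, ... only as needed,
    so the full history never has to fit into a single prompt.
    """
    start = page * page_size
    chunk = history[start:start + page_size]
    summaries = [f"{c.sha} {c.author}: {c.message}" for c in chunk]
    has_more = start + page_size < len(history)
    return {"page": page, "items": summaries, "has_more": has_more}


# Toy history with five commits (illustrative data only).
history = [
    Commit("a1", "dana", "Fix null check in parser"),
    Commit("b2", "lee", "Update README"),
    Commit("c3", "dana", "Add retry to HTTP client"),
    Commit("d4", "sam", "Bump version"),
    Commit("e5", "lee", "Refactor tokenizer"),
]

first_page = list_commits(history, page=0)
second_page = list_commits(history, page=1)
```

Here the first page carries three summaries and signals that more are available, while the second page carries the remaining two. The model decides whether requesting another page is worth the tokens.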

LinkAnchor also stands out because it automatically pinpoints the target commit that resolves an issue, rather than exhaustively scoring every candidate commit. This makes it much more efficient on real-world repositories. The agent is designed to be a ready-to-use tool, initially tested on GitHub and Jira, and easily extendable to other platforms.

The core of LinkAnchor’s functionality lies in its ability to grant the LLM on-demand access to various project data sources through specialized function calls. These functions are categorized into Git functions (for commit history and details), Issue functions (for issue titles, descriptions, and comments), Codebase functions (for exploring code definitions and documentation), and Control functions (for managing the interaction flow). For instance, the LLM can ask for commits by a specific author or inspect lines of code at a particular point in time.
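A function-calling setup of this kind can be sketched as a small registry that maps function names to handlers, grouped by the four categories above. This is an illustrative sketch only: the function names (`git.commits_by_author`, `issue.comments`, `code.read_lines`, `control.finish`), the JSON call format, and the in-memory `REPO` structure are all assumptions, not LinkAnchor's actual interface.

```python
import json

# Toy project data standing in for a real repository and issue tracker.
REPO = {
    "commits": [
        {"sha": "a1", "author": "dana"},
        {"sha": "b2", "author": "lee"},
        {"sha": "c3", "author": "dana"},
    ],
    "issue": {"title": "Parser crash", "comments": ["Happens on empty input"]},
    "snapshots": {"a1": {"parser.py": "def parse(s):\n    return s.strip()\n"}},
}


def commits_by_author(repo, author):
    """Git function: SHAs of commits written by one author."""
    return [c["sha"] for c in repo["commits"] if c["author"] == author]


def issue_comments(repo):
    """Issue function: comments on the issue under investigation."""
    return repo["issue"]["comments"]


def read_lines(repo, sha, path, start, end):
    """Codebase function: lines start..end (1-based) of a file as of a commit."""
    lines = repo["snapshots"][sha][path].splitlines()
    return lines[start - 1:end]


def finish(repo, sha):
    """Control function: end the session with the chosen commit."""
    return {"answer": sha}


TOOLS = {
    "git.commits_by_author": commits_by_author,
    "issue.comments": issue_comments,
    "code.read_lines": read_lines,
    "control.finish": finish,
}


def dispatch(repo, call_json):
    """Run one function call requested by the model and return its result."""
    call = json.loads(call_json)
    handler = TOOLS.get(call["name"])
    if handler is None:
        return {"error": f"unknown function {call['name']}"}
    return handler(repo, **call.get("arguments", {}))


result = dispatch(
    REPO, '{"name": "git.commits_by_author", "arguments": {"author": "dana"}}'
)
```

Returning an error object for an unknown name, rather than raising, gives the model a chance to correct its request on the next turn.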

A notable advantage of LinkAnchor is that it requires no task-specific training, because it is built on a pre-trained, general-purpose LLM such as ChatGPT-4o-nano. As a result, it does not inherit the labelling errors often found in the manually built training datasets that other methods depend on. By framing ILR as a search problem, LinkAnchor also avoids evaluating every single commit, which significantly reduces computational overhead.
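The difference between searching and scoring every pair can be shown with a toy loop. In this sketch a scripted function stands in for the LLM, and all names (`scripted_model`, `search_commits`, `link_issue`, the step budget) are hypothetical. The point is only that the number of steps depends on the search, not on the size of the repository.

```python
def search_commits(commits, keyword):
    """Return SHAs whose commit message mentions the keyword."""
    return [sha for sha, msg in commits.items() if keyword in msg.lower()]


def scripted_model(transcript):
    """Stand-in for an LLM: search once, then answer with the first hit."""
    if not transcript:
        return ("search_commits", "parser")
    last_result = transcript[-1][1]
    return ("finish", last_result[0] if last_result else None)


def link_issue(commits, model, max_steps=5):
    """Let the model issue searches until it names a commit or runs out of steps."""
    transcript = []
    for _ in range(max_steps):
        action, arg = model(transcript)
        if action == "finish":
            return arg, len(transcript)
        transcript.append((action, search_commits(commits, arg)))
    return None, len(transcript)


commits = {
    "a1": "Update README",
    "b2": "Fix crash in parser on empty input",
    "c3": "Bump version",
}

answer, tool_calls = link_issue(commits, scripted_model)
pairwise_evaluations = len(commits)  # a pairwise scorer would rate every commit
```

In this toy run the loop settles on commit `b2` after a single search, whereas a pairwise approach would score all three commits, and the gap widens as the history grows.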

Evaluations show that LinkAnchor significantly outperforms state-of-the-art ILR approaches, improving Hit@1 scores (the share of issues for which the top-ranked prediction is the correct commit) by 60% to 262% across the case study projects. Even when compared against other models’ Hit@10 scores (meaning the correct commit is found within the top 10 predictions), LinkAnchor’s single prediction often performs better. This consistent performance across different projects highlights its adaptability to varying project contexts, unlike methods that rely on fixed feature sets.
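For readers unfamiliar with the metric, Hit@k and a relative improvement can be computed as follows. This is a generic sketch of the standard definitions; the issue IDs, commit IDs, and scores are made-up illustrative values, not results from the evaluation.

```python
def hit_at_k(ranked_predictions, true_commits, k):
    """Fraction of issues whose true commit appears in the top-k predictions."""
    hits = sum(
        1
        for issue, ranked in ranked_predictions.items()
        if true_commits[issue] in ranked[:k]
    )
    return hits / len(ranked_predictions)


def relative_improvement(baseline, new):
    """Improvement of `new` over `baseline`, as a percentage of the baseline."""
    return (new - baseline) / baseline * 100


# Illustrative data: four issues, each with a ranked list of predicted commits.
truth = {"I1": "a1", "I2": "b2", "I3": "c3", "I4": "d4"}
ranked = {
    "I1": ["a1", "x9"],
    "I2": ["x9", "b2"],
    "I3": ["x9", "y8"],
    "I4": ["d4"],
}

hit1 = hit_at_k(ranked, truth, k=1)  # 0.5: I1 and I4 are correct at rank 1
hit2 = hit_at_k(ranked, truth, k=2)  # 0.75: I2 is also found at rank 2

# Moving from a Hit@1 of 0.25 to 0.40 is a relative improvement of about 60%.
gain = relative_improvement(0.25, 0.40)
```

An agent that returns a single commit is effectively scored at k=1, which is why comparing it with another model's Hit@10 is a deliberately generous comparison for the other model.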

LinkAnchor’s generalizability was further demonstrated by testing it on new, unseen data: 120 randomly selected GitHub issues resolved after the LLM’s training cut-off date. It successfully linked 107 of these issues, an accuracy of 89%. This indicates strong real-world utility and adaptability across diverse codebases and programming languages such as Python and Go.

From a cost perspective, LinkAnchor is also efficient. The median time to link an issue to its resolving commit was found to be 23 seconds, consuming approximately 115,000 tokens, which translates to about 0.01 US dollars per issue. This is significantly faster and more practical than traditional methods that might require hours of training and lengthy inference times for large repositories.
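The reported figures allow a quick back-of-the-envelope projection. The sketch below uses only the three numbers quoted above. Note that scaling medians linearly is an assumption made for illustration (a median is not a mean), so the totals are rough estimates rather than measured values.

```python
MEDIAN_SECONDS_PER_ISSUE = 23
MEDIAN_TOKENS_PER_ISSUE = 115_000
COST_PER_ISSUE_USD = 0.01

# Price per million tokens implied by the reported cost and token count.
implied_usd_per_million_tokens = (
    COST_PER_ISSUE_USD / MEDIAN_TOKENS_PER_ISSUE * 1_000_000
)


def estimate_batch(n_issues):
    """Rough cost and sequential wall-clock time for linking n_issues."""
    return {
        "usd": n_issues * COST_PER_ISSUE_USD,
        "hours": n_issues * MEDIAN_SECONDS_PER_ISSUE / 3600,
    }


backlog = estimate_batch(1000)
print(f"Implied rate: ~{implied_usd_per_million_tokens:.3f} USD per million tokens")
print(f"1,000 issues: ~{backlog['usd']:.2f} USD, ~{backlog['hours']:.1f} hours sequentially")
```

Under these assumptions, a backlog of 1,000 issues would cost on the order of ten dollars and take a few hours if processed one at a time.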

The development of LinkAnchor offers valuable insights for future research on LLM-based agents. Its success underscores the importance of on-demand access to data, scalable context handling through pagination and feedback pruning, and a balance between deterministic functions and LLM autonomy. LinkAnchor is publicly available as a ready-to-use tool, together with a replication package.
