TLDR: ColaUntangle is a novel AI-powered framework that uses a multi-agent system driven by Large Language Models (LLMs) to untangle complex, multi-purpose code commits into atomic, single-concern changes. By modeling both explicit (direct) and implicit (semantic) dependencies among code modifications through a collaborative consultation process, ColaUntangle outperforms the best previous method, improving accuracy by 44% on a C# dataset and 100% on a Java dataset. This approach simulates human-like reasoning to enhance code review and maintenance.
In the world of software development, keeping code changes organized is crucial. Ideally, every change, known as a ‘commit,’ should be ‘atomic’ – meaning it addresses only one specific task, like fixing a bug or adding a new feature. This practice makes code easier to review, understand, and maintain.
However, developers often face real-world pressures that lead to ‘tangled commits.’ These are single commits that mix unrelated changes, making code reviews difficult and potentially introducing errors. Imagine trying to fix a bug, refactor some code, and update documentation all in one go – that’s a tangled commit. Studies report that between 11% and 40% of commits in software repositories are tangled.
Previous attempts to untangle these commits have used various methods, including rule-based systems, feature-based models, and graph-based clustering. While these approaches have made progress, they often fall short. They tend to rely on surface-level signals, act as ‘black boxes’ without explaining their decisions, and struggle to differentiate between explicit dependencies (like one piece of code directly affecting another) and implicit dependencies (like changes that are conceptually related but not directly linked in the code structure).
Introducing ColaUntangle: A Collaborative AI Approach
A new research paper, titled “LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning,” introduces a framework called ColaUntangle. Developed by Bo Hou, Xin Tan, Kai Zheng, Fang Liu, Yinghao Zhu, and Li Zhang, this model aims to overcome the limitations of prior methods by reasoning about both explicit and implicit dependencies among code changes.
ColaUntangle leverages the power of Large Language Models (LLMs) in a multi-agent architecture. Think of it as a team of specialized AI experts working together:
- Explicit Worker Agent: This agent focuses on direct relationships, such as how data flows or how control is passed between different parts of the code.
- Implicit Worker Agent: This agent looks for deeper, semantic connections, like conceptual similarities or shared development purposes, even if there’s no obvious structural link. For example, changes that collectively form a complete activity, like creating, using, and deleting an index, would be identified by this agent.
- Reviewer Agent: This agent acts as a supervisor, synthesizing the perspectives of the two worker agents and guiding an iterative consultation process until a consensus is reached.
To help these agents understand the code, ColaUntangle constructs ‘multi-version Program Dependency Graphs’ (δ-PDG). These graphs capture structural and contextual information about code changes, allowing the agents to reason with both symbolic and semantic depth.
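To make the idea of a multi-version dependency graph more concrete, here is a minimal sketch in Python. It is not the paper's data structure: the class name `DeltaPDG`, the version tags ("old", "new", "both"), and the one-hop `context_of` helper are all assumptions made for illustration. The sketch only shows how statements from two versions of a file can live in one graph, with typed edges that give each changed statement some surrounding context.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    node_id: str
    code: str
    version: str  # hypothetical tags: "old", "new", or "both"


@dataclass
class DeltaPDG:
    """Illustrative multi-version dependency graph (not the paper's API)."""
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    edges: list = field(default_factory=list)  # (src_id, dst_id, kind)

    def add_node(self, node_id, code, version):
        self.nodes[node_id] = Node(node_id, code, version)

    def add_edge(self, src, dst, kind):
        # kind is "data" or "control" in this sketch
        self.edges.append((src, dst, kind))

    def changed_nodes(self):
        # A statement present in only one version was added or removed.
        return [n for n in self.nodes.values() if n.version != "both"]

    def context_of(self, node_id):
        # One-hop neighbours of a node, with edge kind and direction.
        ctx = []
        for src, dst, kind in self.edges:
            if src == node_id:
                ctx.append((kind, "out", self.nodes[dst].code))
            elif dst == node_id:
                ctx.append((kind, "in", self.nodes[src].code))
        return ctx


graph = DeltaPDG()
graph.add_node("n1", "int total = 0;", "both")
graph.add_node("n2", "total += price;", "old")
graph.add_node("n3", "total += price * qty;", "new")
graph.add_node("n4", "return total;", "both")
graph.add_edge("n1", "n2", "data")
graph.add_edge("n1", "n3", "data")
graph.add_edge("n3", "n4", "data")

changed_ids = [n.node_id for n in graph.changed_nodes()]
context_n3 = graph.context_of("n3")
```

In this toy graph, the changed statements are `n2` and `n3`, and the context of `n3` is the unchanged declaration that feeds it and the return statement that consumes it. Context of this kind is what an agent could be given alongside the raw diff.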
How ColaUntangle Works
The process begins with the system extracting structured information from the commit, including the code differences and contextual details from the δ-PDG. Then, the two worker agents generate their initial untangling suggestions, each based on their specialized focus (explicit or implicit dependencies).
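As a rough illustration of the first step, the sketch below pulls the code differences out of two versions of a file using Python's standard `difflib` module. The function name `extract_changes` and the record layout (`id`, `kind`, `old`, `new`) are hypothetical, and the sketch covers only the diff portion; it does not attempt the δ-PDG context.

```python
import difflib


def extract_changes(old_src: str, new_src: str):
    """Return one record per changed hunk between two versions of a file."""
    old_lines = old_src.splitlines()
    new_lines = new_src.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged region, nothing to untangle
        changes.append({
            "id": f"c{len(changes) + 1}",  # hypothetical change identifier
            "kind": tag,                   # "replace", "delete", or "insert"
            "old": old_lines[i1:i2],
            "new": new_lines[j1:j2],
        })
    return changes


old_version = "int total = 0;\ntotal += price;\nreturn total;"
new_version = "int total = 0;\ntotal += price * qty;\nreturn total;\n// done"

changes = extract_changes(old_version, new_version)
```

Each record is a candidate unit for the agents to assign to a concern. Here the result is two records: a replaced line and an inserted comment.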
The reviewer agent then takes these initial results and synthesizes them. This kicks off a collaborative consultation process where the worker agents provide feedback on the reviewer’s synthesized result. The reviewer then revises its decision based on this feedback. This back-and-forth continues until all agents agree on the untangling outcome or a maximum number of rounds is reached.
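The consultation loop described above can be sketched with stub agents in place of real LLM calls. Everything here is an assumption for illustration: the class names, the `propose`, `critique`, `synthesize`, and `revise` methods, and the rule-based behaviour of the stubs. In the actual system each of these steps would be handled by a prompted model. The sketch only shows the control flow: propose, synthesize, collect feedback, revise, and stop on consensus or after a maximum number of rounds.

```python
from typing import Dict, List, Tuple

Grouping = Dict[str, int]  # change id -> concern label


class StubWorker:
    """Stand-in for an LLM-backed worker agent (hypothetical interface)."""

    def __init__(self, proposal: Grouping, must_join: List[Tuple[str, str]]):
        self.proposal = proposal
        self.must_join = must_join  # pairs this worker insists share a concern

    def propose(self, changes: List[str]) -> Grouping:
        return {c: self.proposal[c] for c in changes}

    def critique(self, grouping: Grouping) -> List[Tuple[str, str]]:
        # Pairs that are still split apart; an empty list means agreement.
        return [(a, b) for a, b in self.must_join if grouping[a] != grouping[b]]


class StubReviewer:
    """Stand-in for the reviewer agent (hypothetical interface)."""

    def synthesize(self, proposals: List[Grouping]) -> Grouping:
        return dict(proposals[0])  # start from the first worker's proposal

    def revise(self, grouping: Grouping, objections) -> Grouping:
        revised = dict(grouping)
        for a, b in objections:
            old, new = revised[b], revised[a]
            for change in list(revised):
                if revised[change] == old:
                    revised[change] = new  # merge b's group into a's group
        return revised


def consult(changes, workers, reviewer, max_rounds=3):
    proposals = [w.propose(changes) for w in workers]
    decision = reviewer.synthesize(proposals)
    rounds = 0
    while True:
        objections = [p for w in workers for p in w.critique(decision)]
        if not objections:
            return decision, rounds, True   # consensus reached
        if rounds == max_rounds:
            return decision, rounds, False  # round budget exhausted
        decision = reviewer.revise(decision, objections)
        rounds += 1


changes = ["create_index", "use_index", "drop_index", "fix_typo"]
explicit = StubWorker(
    {"create_index": 0, "use_index": 0, "drop_index": 1, "fix_typo": 2},
    must_join=[("create_index", "use_index")],
)
implicit = StubWorker(
    {"create_index": 0, "use_index": 0, "drop_index": 0, "fix_typo": 1},
    must_join=[("create_index", "drop_index")],
)

decision, rounds, consensus = consult(changes, [explicit, implicit], StubReviewer())
```

In this toy run, the explicit worker initially separates `drop_index` because nothing links it structurally to the other changes. The implicit worker objects that creating, using, and dropping the index form one activity, and the reviewer merges the groups after one round of feedback.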
Impressive Results and Key Insights
ColaUntangle was evaluated on two widely used datasets, one for C# (1,612 tangled commits) and one for Java (14,000 tangled commits). It outperformed the best existing baseline on both, with a 44% improvement in accuracy on the C# dataset and a 100% improvement on the Java dataset.
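The article does not spell out how accuracy is computed. One plausible definition, assumed here purely for illustration, is the fraction of changed statements placed in the correct group after finding the best one-to-one matching between predicted and ground-truth group labels. The function name `untangling_accuracy` and the brute-force matching are choices made for this sketch, not details from the paper.

```python
from itertools import permutations

_UNMATCHED = object()  # placeholder target for surplus predicted groups


def untangling_accuracy(predicted, truth):
    """Share of changes labeled correctly under the best relabeling.

    Assumed metric, not taken from the paper. Brute force over label
    matchings, so only suitable for a handful of groups.
    """
    if not truth:
        raise ValueError("truth must contain at least one change")
    pred_labels = sorted(set(predicted.values()))
    true_labels = sorted(set(truth.values()))
    # Pad so every predicted label maps to something, possibly a dummy.
    padding = max(0, len(pred_labels) - len(true_labels))
    targets = true_labels + [_UNMATCHED] * padding
    best = 0
    for perm in permutations(targets, len(pred_labels)):
        mapping = dict(zip(pred_labels, perm))
        correct = sum(1 for c in truth if mapping[predicted[c]] == truth[c])
        best = max(best, correct)
    return best / len(truth)


truth = {"a": 0, "b": 0, "c": 1, "d": 1}
predicted = {"a": 5, "b": 5, "c": 5, "d": 7}  # "c" lands in the wrong group

score = untangling_accuracy(predicted, truth)
```

The label values themselves do not matter, only which changes end up together. In this example three of the four changes are grouped correctly, giving a score of 0.75.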
An important finding from the study was the critical role of the collaborative consultation mechanism. While providing structured context information and having both explicit and implicit worker agents improved accuracy, the multi-agent collaboration itself contributed the most significant performance boost. This highlights the power of AI agents working together, simulating human-like consultation to resolve complex problems.
The research also explored how ColaUntangle performs with different LLMs, including GPT-4o, Claude-4-sonnet, and DeepSeek-V3. All models handled the task effectively, with DeepSeek-V3 offering the best balance of accuracy and efficiency because it needed fewer rounds of consultation to reach a consensus.
Although ColaUntangle is highly effective, the paper also discusses error cases, which often stem from the granularity of untangling (splitting a commit too finely or too coarsely compared to human expectations). These insights will guide future improvements, potentially involving human-in-the-loop interactions and more realistic dataset construction.
In conclusion, ColaUntangle represents a significant step forward in automated commit untangling. By combining LLM-driven agents with a collaborative consultation framework that understands both explicit and implicit code dependencies, it offers a more accurate, transparent, and practical solution for managing complex code changes in software development.


