TLDR: OmniLLP is a new framework that significantly improves LLM-based log level prediction by using context-aware retrieval. It clusters source code files by functional similarity (semantic clustering) and by shared developer contributions (ownership clustering). By combining these two signals through multiplex clustering, OmniLLP supplies more relevant in-context examples to LLMs than traditional random example selection, raising prediction accuracy to as high as 0.96 AUC and making logging more accurate and efficient.
In the world of software development, logging is a crucial activity. Developers insert logging statements into their code to capture important runtime information, which is essential for maintaining and debugging software systems. However, choosing the right “log level” – such as DEBUG, INFO, WARN, ERROR, or FATAL – is a tricky part of this process. The log level controls how much information is recorded, directly impacting system observability and performance. Too little logging can make debugging difficult, while too much can overwhelm developers and consume excessive resources.
Recent advancements have seen Large Language Models (LLMs) being used to predict appropriate log levels, showing promising results. However, a key limitation of these existing LLM-based log level predictors (LLPs) is their reliance on randomly selected examples for “in-context learning.” This approach often overlooks the unique structure and diverse logging practices that exist within different parts of a large software project. For instance, different teams or functional areas within the same project might have distinct logging conventions, which random example selection fails to account for.
To address this challenge, researchers have proposed a new framework called OmniLLP. This innovative system aims to significantly enhance LLM-based log level prediction by providing more relevant and context-aware examples to the LLMs. OmniLLP achieves this by intelligently clustering source code files based on two important factors: semantic similarity and developer ownership cohesion.
The first approach, semantic clustering, groups source code files that have similar functional purposes. The idea is that files performing similar tasks are likely to share similar logging behaviors. OmniLLP uses advanced embedding models to understand the “meaning” of the code and group related files together. The second approach, ownership clustering, groups files based on shared developer contributions. The intuition here is that files maintained by the same developers tend to follow consistent coding and logging conventions. By analyzing Git history, OmniLLP identifies which developers frequently modify which files and groups them accordingly.
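The two signals can be sketched as two independent partitions of the same file set. The snippet below is a simplified stand-in, not OmniLLP's actual pipeline: it uses TF-IDF and k-means from scikit-learn in place of the advanced embedding models the framework relies on, and hard-codes a toy file-by-developer commit matrix in place of real Git history. All file names, code snippets, and counts are hypothetical.

```python
# Toy sketch of the two clustering signals (TF-IDF + k-means as stand-ins
# for the framework's embedding models; commit counts invented).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

files = ["FileA.java", "FileB.java", "FileC.java", "FileD.java"]

# --- Semantic clustering: group files with similar functional content ---
code = [
    "read hdfs block replica storage",
    "write hdfs block replica storage",
    "rpc client connection retry timeout",
    "rpc server connection handler timeout",
]
sem_vecs = TfidfVectorizer().fit_transform(code)
sem_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(sem_vecs)

# --- Ownership clustering: group files touched by the same developers ---
# Rows = files, columns = developers, values = commit counts mined from git log.
ownership = np.array([
    [5, 0, 1],  # FileA: mostly developer 0
    [4, 1, 0],  # FileB: mostly developer 0
    [0, 6, 2],  # FileC: mostly developer 1
    [1, 5, 3],  # FileD: mostly developer 1
])
own_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(ownership)
```

On this toy data the semantic pass pairs the two HDFS-storage files and the two RPC files, while the ownership pass pairs the files dominated by the same developer, illustrating how the two views can cut the codebase differently.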
The most powerful aspect of OmniLLP is its multiplex clustering, which combines both semantic and ownership signals. This creates a unified view where files are clustered not only by what they do but also by who maintains them. When a developer needs a log level prediction for a new statement, OmniLLP identifies the relevant cluster for that file and retrieves the most contextually similar logging examples from within that cluster. These examples are then fed to the LLM, helping it make a more accurate prediction.
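The retrieval step above can be sketched as follows. This is a minimal interpretation, assuming the multiplex rule places two files in the same cluster only when they agree on both the semantic and the ownership partition; the paper's exact multiplex construction and example-ranking may differ, and all data here is hypothetical.

```python
# Sketch of multiplex clustering + in-cluster example retrieval,
# under the assumption "same cluster = agree on both partitions".
from collections import defaultdict

def multiplex_clusters(sem_labels, own_labels):
    """Fuse the two signals: files share a multiplex cluster only when
    their semantic AND ownership cluster labels both match."""
    clusters = defaultdict(list)
    for i, (s, o) in enumerate(zip(sem_labels, own_labels)):
        clusters[(s, o)].append(i)
    return clusters

def retrieve_examples(target_file, clusters, logging_examples, k=3):
    """Pick up to k in-context examples from the target file's own
    cluster, skipping the target file itself."""
    for members in clusters.values():
        if target_file in members:
            pool = [ex for f in members if f != target_file
                    for ex in logging_examples.get(f, [])]
            return pool[:k]
    return []

# Toy data: file indices 0-3 with known logging statements.
clusters = multiplex_clusters([0, 0, 1, 1], [0, 0, 1, 1])
examples = {0: ['log.warn("replica lost")'], 1: ['log.info("block written")']}
prompt_examples = retrieve_examples(0, clusters, examples)
```

The retrieved `prompt_examples` would then be placed in the LLM's prompt ahead of the new logging statement, so the model predicts a level conditioned on how this cluster's files actually log.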
The empirical evaluation of OmniLLP across four large open-source Java projects (Hadoop, HBase, Elasticsearch, and Cassandra) demonstrated impressive results. Both the semantic and the ownership-aware clustering approaches yielded statistically significant accuracy improvements for LLM-based LLPs over randomly selected examples. Semantic clustering alone improved AUC (Area Under the ROC Curve) by up to 8%, and ownership clustering also provided notable gains.
However, the most significant improvements were observed when OmniLLP leveraged the combined semantic and ownership signals through multiplex clustering. This approach achieved an AUC between 0.88 and 0.96 across the evaluated projects, a substantial increase over randomly selected examples. This highlights the value of integrating software engineering-specific context, such as code semantics and developer ownership, into LLM-based log level prediction. The framework is also computationally efficient, making it practical for real-time use.
OmniLLP offers developers a more accurate and contextually aware approach to logging, ultimately enhancing system maintainability and observability. The findings from this research paper, available at arXiv:2508.08545, pave the way for more intelligent logging automation tools that better align with real-world software development practices.


