Optimizing Video Call Quality: A Data-Driven Approach from Microsoft Teams

TLDR: A new data-driven framework for bandwidth estimation, developed by Microsoft, significantly improves the Quality of Experience (QoE) in real-time video communication. By training objective QoE reward models from subjective user evaluations and utilizing a novel distributional offline reinforcement learning algorithm on 1 million real-world Microsoft Teams call traces, the system reduced the subjective poor call ratio by 11.41% and enhanced video quality. This approach ensures safer deployment by learning from historical data and is now active in Microsoft Teams.

In today’s interconnected world, video conferencing has become an indispensable tool for work, education, and social interaction. However, the quality of these real-time video calls, often referred to as Quality of Experience (QoE), can be significantly impacted by how accurately the system estimates the available internet bandwidth between participants. This estimation is a complex challenge due to constantly changing network conditions, diverse device types, and the difficulty of truly understanding what makes a user’s experience good or bad.

Understanding the Challenge of Video Call Quality

When you’re on a video call, your device constantly tries to figure out how much data it can send without overwhelming the network. If it sends too much, you get congestion, leading to frustrating issues like video freezes, choppy audio, and dropped packets. Send too little, and you’re not using the network’s full potential, resulting in lower quality video and audio than what’s possible. The goal is to find that sweet spot for optimal QoE, which goes beyond simple technical metrics like speed and packet loss to truly capture user satisfaction.
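The trade-off described above can be illustrated with a deliberately simple toy model. It assumes a link that delivers at most its capacity and drops everything beyond it; the function name `link_outcome` and the numbers are hypothetical, and real networks add queuing, delay, and jitter that this sketch ignores.

```python
def link_outcome(send_kbps: float, capacity_kbps: float) -> dict:
    """Toy model (an assumption): the link delivers at most capacity_kbps
    and silently drops whatever is sent beyond that."""
    if send_kbps <= 0 or capacity_kbps <= 0:
        raise ValueError("rates must be positive")
    delivered = min(send_kbps, capacity_kbps)
    return {
        # Share of the available capacity actually used.
        "utilization": delivered / capacity_kbps,
        # Share of the sent data that never arrives.
        "loss_fraction": (send_kbps - delivered) / send_kbps,
    }


# Overestimating the bandwidth fills the link but loses packets.
over = link_outcome(send_kbps=1500, capacity_kbps=1000)
# Underestimating loses nothing but leaves half the link unused.
under = link_outcome(send_kbps=500, capacity_kbps=1000)
```

In this toy setting neither extreme is good: the first case causes loss, the second wastes capacity that could have carried higher-quality media.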

A New Approach to Bandwidth Estimation

Researchers at Microsoft have developed a sophisticated, data-driven framework designed to tackle these challenges. Their approach integrates human feedback into the system, using advanced machine learning to predict and optimize the quality of experience. This framework is already deployed in Microsoft Teams, serving millions of users daily.

Measuring User Experience: QoE Reward Models

A core part of this system involves creating objective models that can predict audio and video quality. This starts with extensive subjective user evaluations, where real people rate the quality of audio and video samples according to international standards (ITU-T P.808 and P.910). These human ratings are then used to train AI models that can measure audio and video quality in real time. To keep these models efficient and privacy-preserving when deployed on user devices, they are ‘distilled’ into simpler versions that rely on key media metrics rather than raw audio or video signals: audio receive rate, jitter, and packet loss concealment for audio, and resolution, frame rate, and freezes for video. The final QoE reward is a weighted combination of these predicted audio and video quality scores, so the system optimizes for what users actually perceive.
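As a minimal sketch of what a "weighted combination" of the two scores could look like, the snippet below mixes an audio score and a video score with a single weight. The function name `qoe_reward`, the default weight of 0.5, and the example scores are assumptions for illustration; the article does not state the weights or the exact formula used in production.

```python
def qoe_reward(audio_score: float, video_score: float,
               audio_weight: float = 0.5) -> float:
    """Convex combination of predicted audio and video quality scores.

    audio_weight is a hypothetical parameter; the real weighting is not
    given in the article.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    return audio_weight * audio_score + (1.0 - audio_weight) * video_score


# Equal weighting of a good audio score and a poor video score.
balanced = qoe_reward(audio_score=4.0, video_score=2.0, audio_weight=0.5)
# Weighting audio only ignores the video score entirely.
audio_only = qoe_reward(audio_score=4.0, video_score=2.0, audio_weight=1.0)
```

Keeping the weight between 0 and 1 means the reward always stays within the range of the two input scores, which makes it easy to interpret on the same rating scale.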

Learning from Real-World Data with Offline Reinforcement Learning

To train the bandwidth estimator, the team collected a large dataset: approximately 1 million network traces from actual Microsoft Teams calls. Each trace records network conditions along with the QoE rewards predicted by the newly developed models. Traditional online reinforcement learning can be risky in live production environments, because the agent may apply suboptimal actions to real calls while it is still learning. The team instead employed a novel distributional offline reinforcement learning (RL) algorithm. This ‘offline’ approach lets the AI learn a strategy from historical data without experimenting on live calls, which makes deployment much safer. The algorithm, called DIQL (Distributional Implicit Q-learning), is designed to handle the complex, partially observable nature of network conditions and to predict the full range of possible QoE outcomes, not just an average.
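The article does not spell out how DIQL works, so the following is not the authors' algorithm. It is a small sketch of two standard building blocks that the name suggests: the asymmetric expectile loss used by implicit Q-learning, and the pinball (quantile) loss commonly used to learn a distribution of outcomes rather than a mean. All function names and the sample values are illustrative assumptions.

```python
def expectile_loss(diff: float, tau: float = 0.7) -> float:
    """Asymmetric squared loss: positive errors are weighted by tau,
    negative errors by (1 - tau). With tau > 0.5 the fitted value is
    pulled toward the better outcomes seen in the data."""
    weight = tau if diff > 0 else 1.0 - tau
    return weight * diff * diff


def pinball_loss(diff: float, q: float) -> float:
    """Quantile-regression loss for diff = target - prediction."""
    return diff * (q - (1.0 if diff < 0 else 0.0))


def quantile_by_pinball(samples: list[float], q: float) -> float:
    """Return the sample value that minimizes the total pinball loss,
    which is an empirical q-quantile of the samples."""
    return min(samples,
               key=lambda c: sum(pinball_loss(s - c, q) for s in samples))


# Hypothetical QoE rewards observed for similar network states.
rewards = [1.0, 2.0, 3.0, 4.0, 5.0]
median = quantile_by_pinball(rewards, q=0.5)
upper = quantile_by_pinball(rewards, q=0.9)
```

Fitting several quantiles this way describes the spread of likely outcomes, which is the general idea behind predicting "the full range of possible QoE outcomes" instead of a single average.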

Real-World Impact: Microsoft Teams Deployment

The true test of this framework came with a large-scale A/B test conducted within Microsoft Teams. Over two weeks and more than 25 million calls globally, the new bandwidth estimator was compared against the existing baseline system. The results were highly encouraging: the proposed approach led to an 11.41% reduction in the subjective poor call ratio, meaning significantly fewer users reported a bad call experience. There were also statistically significant improvements in objective video quality scores, while audio quality remained consistently high.
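To make the headline number concrete, here is a small worked calculation. It assumes the 11.41% figure is a relative reduction (treatment compared with baseline), which the article does not state explicitly. The baseline and treatment ratios below are made-up values chosen only so the arithmetic is easy to follow; they are not figures from the study.

```python
def relative_reduction(baseline_ratio: float, treatment_ratio: float) -> float:
    """Fractional drop of the treatment ratio relative to the baseline."""
    if baseline_ratio <= 0:
        raise ValueError("baseline_ratio must be positive")
    return (baseline_ratio - treatment_ratio) / baseline_ratio


# Hypothetical example: if 10% of baseline calls were rated poor and
# 8.859% of treatment calls were rated poor, the relative reduction
# would be about 11.41%.
example = relative_reduction(baseline_ratio=0.10, treatment_ratio=0.08859)
```

Under this reading, the absolute change in the example is only about 1.1 percentage points, which is why it matters whether a reported reduction is relative or absolute.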

Beyond the Lab: Robustness and Generalization

Further evaluations in controlled testbed environments demonstrated the algorithm’s robust performance across a wide variety of network conditions, including fluctuating bandwidth and different types of packet loss. It consistently outperformed other state-of-the-art offline reinforcement learning methods. To prove its versatility, the DIQL algorithm was also benchmarked on standard continuous control tasks from the D4RL suite, showing competitive performance even outside the specific domain of bandwidth estimation.

This work represents a significant step forward in optimizing real-time video communication. By combining human-aligned QoE modeling with safe, data-driven offline reinforcement learning, Microsoft has successfully deployed a system that genuinely enhances user experience in a complex, latency-sensitive environment. For more in-depth technical details, you can read the full research paper available here.
