Rethinking Instruction Tuning: The Impact of Prompt and Response Token Weighting

TLDR: A new study introduces Weighted Instruction Tuning (WIT), demonstrating that assigning low-to-moderate weight to prompt tokens and moderate-to-high weight to response tokens during instruction tuning significantly improves language model generalization and robustness. This approach consistently outperforms conventional methods that solely focus on response tokens, highlighting the critical role of loss function design in developing more effective and reliable LLMs.

Large Language Models (LLMs) have become incredibly powerful, but getting them to reliably follow user instructions is a key challenge. This is where “instruction tuning” comes in, a crucial step after initial training that helps these models understand and respond to specific commands. However, a recent study delves into a fundamental, yet often overlooked, aspect of this process: the loss function used during instruction tuning.

Traditionally, when instruction tuning an LLM, the loss (a measure of how “wrong” the model’s predictions are) is calculated only on the response tokens, completely ignoring the prompt or instruction tokens. This conventional approach assumes that the model only needs to learn to generate the correct output, not necessarily to deeply understand the input instruction itself. But is this truly the most effective way?

Researchers from the Indian Institute of Technology Delhi and Adobe Inc. systematically investigated this question. They propose a new approach called Weighted Instruction Tuning (WIT), which allows for differential weighting of prompt and response tokens during the loss calculation. This means that instead of simply ignoring prompt tokens, WIT can assign them a specific weight (from zero to one), and similarly for response tokens, offering more granular control over what the model learns.
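The weighting idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `weighted_instruction_loss`, its parameters, and the choice to normalize by the sum of the applied weights are all assumptions made here, and the paper's exact normalization may differ. The function takes per-token log-probabilities for one sequence plus a flag marking which tokens belong to the response.

```python
from typing import Sequence


def weighted_instruction_loss(
    token_logprobs: Sequence[float],
    is_response: Sequence[bool],
    prompt_weight: float,
    response_weight: float,
) -> float:
    """Weighted negative log-likelihood for a single sequence.

    token_logprobs: log-probability the model assigned to each target token.
    is_response:    True for response tokens, False for prompt tokens.
    Normalizing by the total applied weight is an assumption of this sketch.
    """
    if len(token_logprobs) != len(is_response):
        raise ValueError("token_logprobs and is_response must have equal length")

    weighted_nll = 0.0
    weight_total = 0.0
    for logprob, in_response in zip(token_logprobs, is_response):
        weight = response_weight if in_response else prompt_weight
        weighted_nll += -weight * logprob
        weight_total += weight

    # With both weights at zero there is nothing to learn from.
    if weight_total == 0.0:
        return 0.0
    return weighted_nll / weight_total
```

Under this sketch, `prompt_weight=0.0` with `response_weight=1.0` recovers the conventional response-only loss, while `prompt_weight=1.0` with `response_weight=0.0` trains on prompt tokens alone.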

The study ran experiments across five language models of varying sizes and families, three fine-tuning datasets of different scales, and five evaluation benchmarks. The central finding is that the standard instruction tuning loss, where prompt tokens are ignored and response tokens are fully weighted, often leads to suboptimal performance and limited robustness to slight variations in input prompts.

What they discovered is that the best-performing models consistently emerged when a low-to-moderate weight (between 0 and 0.6) was assigned to prompt tokens, coupled with a moderate-to-high weight (between 0.4 and 1) for response tokens. This suggests that allowing the model to learn from the prompt tokens, even with a smaller emphasis, significantly improves its ability to generalize and understand instructions better. In some cases, this “weighted” approach led to an average relative gain of about 6.55% over the conventional method.
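In practice, the reported ranges narrow the search space for anyone tuning these two weights. The sketch below enumerates candidate pairs inside those ranges. The step size of 0.2 and the helper name `candidate_weight_pairs` are assumptions for illustration; the article gives only the ranges, not the granularity the authors swept.

```python
from itertools import product

# Assumed sweep granularity: 0.0, 0.2, ..., 1.0 (not stated in the article).
STEP_COUNT = 5
GRID = [i / STEP_COUNT for i in range(STEP_COUNT + 1)]


def candidate_weight_pairs(prompt_max=0.6, response_min=0.4):
    """Return (prompt_weight, response_weight) pairs inside the reported ranges."""
    return [
        (p, r)
        for p, r in product(GRID, GRID)
        if p <= prompt_max and r >= response_min
    ]


pairs = candidate_weight_pairs()
```

Note that the conventional setting, `(0.0, 1.0)`, sits at one corner of this region, so a sweep over these pairs always includes the baseline for comparison.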

Furthermore, the benefits of WIT extend beyond the initial instruction tuning phase. The models fine-tuned with WIT also served as better starting points for subsequent preference alignment training, such as Direct Preference Optimization (DPO). This indicates that the improved foundational understanding gained through WIT carries over, leading to even better performance after further alignment.

The research also highlighted an interesting trade-off: lower response weights tended to improve instruction adherence (how well the model follows specific commands), while higher response weights were preferred for conversational fluency. This implies that the optimal weighting might depend on the desired behavior of the instruction-tuned model.

Another observation was that tuning solely on prompt tokens (ignoring responses) could still improve on the base model’s capabilities, particularly in instruction following and most visibly with large, diverse datasets. This opens up the possibility of using prompts that lack annotated responses to improve instruction-following abilities.

The study also examined how WIT affects robustness to prompt variations. Models tuned with the conventional loss were often more sensitive to minor changes in prompts, whereas lower response weights consistently reduced that sensitivity. Taken together with the performance results, this suggests that a moderate response weight strikes a good balance between performance and robustness.
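The article does not say how sensitivity was measured. One simple proxy, assumed here purely for illustration, is the spread of a model's benchmark score across several paraphrases of the same prompts: the smaller the spread, the more robust the model. The function name `prompt_sensitivity` is hypothetical.

```python
from statistics import pstdev
from typing import Sequence


def prompt_sensitivity(scores_per_variant: Sequence[float]) -> float:
    """Population standard deviation of scores across prompt variants.

    Each entry is the score obtained with one rewording of the prompts.
    This is an assumed proxy, not the metric used in the study.
    """
    if len(scores_per_variant) < 2:
        raise ValueError("need scores for at least two prompt variants")
    return pstdev(scores_per_variant)
```

Comparing this value for models tuned with different response weights would give a rough picture of the trade-off described above.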

In essence, this research challenges the long-standing practice in instruction tuning and proposes a more nuanced approach to loss function design. By differentially weighting prompt and response tokens, Weighted Instruction Tuning (WIT) offers a path toward developing more robust, generalizable, and instruction-adherent language models. The authors have open-sourced the code for this research.
