TLDR: This paper introduces ICL-GradSel, a novel algorithm for efficiently selecting demonstration examples for in-context learning (ICL) in large language models. It uses a first-order approximation based on gradients of the model output to estimate performance on demonstration subsets with less than 1% error. This approach results in a linear-time algorithm, achieving up to 37.7x speed-up and outperforming existing selection methods by 11% on average, while significantly reducing computational cost. The method enhances both the efficiency and effectiveness of ICL.
In the rapidly evolving landscape of artificial intelligence, large language models (LLMs) have demonstrated remarkable capabilities, particularly through a technique known as in-context learning (ICL). This allows models to adapt to new tasks by simply conditioning on a few examples provided within the prompt, rather than undergoing extensive fine-tuning. However, the effectiveness of ICL is highly sensitive to the quality and relevance of these demonstration examples. The challenge lies in efficiently selecting the best examples from a potentially vast pool, a problem that has significant implications for areas like prompt tuning and chain-of-thought reasoning.
Traditional methods for demonstration selection often fall into two categories: those based on the similarity of input embeddings and those that directly evaluate model losses. Similarity-based approaches, while identifying relevant examples, can overlook how the model’s output is conditioned on demonstration labels and treat examples independently, ignoring crucial interactions between them. On the other hand, methods that evaluate model losses directly, such as forward selection or random ensemble selection, can be computationally prohibitive, especially when dealing with a large number of demonstrations or very large models.
A recent research paper, titled “Linear-Time Demonstration Selection for In-Context Learning via Gradient Estimation,” introduces an algorithm that addresses this efficiency bottleneck. Authored by Ziniu Zhang, Zhenshuo Zhang, Dongyue Li, Lu Wang, Jennifer Dy, and Hongyang R. Zhang, the work leverages the gradients of the model output with respect to the input embeddings. The core idea is to use a first-order Taylor expansion to accurately estimate model outputs for various demonstration subsets without repeatedly running full model inference.
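To make the core idea concrete, here is a minimal sketch of a first-order Taylor estimate. The function `model_output` below is a hypothetical, smooth stand-in for an LLM's output as a function of an input embedding (not the authors' model); the point is only that a gradient computed once at a base point lets you estimate the output at a nearby, perturbed point without re-evaluating the function in full.

```python
import numpy as np

# Toy "model output": a smooth scalar function of an embedding vector.
# Hypothetical stand-in for an LLM output logit; the real method
# differentiates the model's output w.r.t. the input embeddings.
def model_output(x):
    return np.tanh(x).sum() + 0.1 * (x ** 2).sum()

def grad_model_output(x):
    # Analytic gradient of the toy function above.
    return (1 - np.tanh(x) ** 2) + 0.2 * x

rng = np.random.default_rng(0)
x0 = rng.normal(size=16)            # base embedding (e.g., bare prompt)
delta = 0.05 * rng.normal(size=16)  # small shift from adding a demonstration

exact = model_output(x0 + delta)
# First-order Taylor estimate: f(x0) + grad(x0) . delta
approx = model_output(x0) + grad_model_output(x0) @ delta

print(f"absolute error of the estimate: {abs(exact - approx):.6f}")
```

Because the perturbation is small, the estimation error is second-order in the size of the shift, which is why the paper can report sub-1% approximation error while skipping full inference for each candidate subset.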
How the New Approach Works
The algorithm, referred to as ICL-GradSel, operates in three main stages:
1. Pre-computing Gradients: Initially, the model’s functional outputs and gradients (with respect to the embedding vector) are computed once on the entire training set. This is a one-time cost that sets up the estimation process.
2. Gradient Estimation: Using the pre-computed gradients and a first-order approximation, the algorithm estimates the model’s outputs for multiple randomly sampled subsets of demonstrations. This stage avoids costly full inference for each subset, significantly speeding up the evaluation process. The researchers empirically validated that this gradient estimation yields approximations with less than 1% error across various LLMs and datasets, even for models with up to 34 billion parameters.
3. Demonstration Selection: Finally, an influence score is calculated for each demonstration example based on the aggregated estimated outcomes from the sampled subsets. The ‘k’ most relevant examples (those with the lowest scores, indicating better performance) are then selected to form the prompt for in-context learning.
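The three stages above can be sketched end-to-end in a few lines. This is a simplified illustration under toy assumptions, not the authors' implementation: each demonstration is represented by a small embedding shift it contributes to the prompt, and `loss` is a hypothetical smooth surrogate for the validation loss of the model output.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup (hypothetical stand-ins; the real method uses LLM outputs).
d, n, k, n_subsets = 8, 20, 4, 200
base = rng.normal(size=d)               # embedding of the bare prompt
demos = 0.05 * rng.normal(size=(n, d))  # per-demonstration embedding shifts

def loss(x):
    # Smooth surrogate for a validation loss on the model output.
    return float(np.log1p((x ** 2).sum()))

def grad_loss(x):
    return 2 * x / (1 + (x ** 2).sum())

# Stage 1: one-time pre-computation of output and gradient at the base point.
f0, g0 = loss(base), grad_loss(base)

# Stage 2: estimate the loss of random subsets with the first-order
# approximation -- no extra "model inference" per subset.
scores = np.zeros(n)
counts = np.zeros(n)
for _ in range(n_subsets):
    subset = rng.choice(n, size=k, replace=False)
    shift = demos[subset].sum(axis=0)
    est = f0 + g0 @ shift            # estimated loss for this subset
    scores[subset] += est
    counts[subset] += 1

# Stage 3: influence score = mean estimated loss over subsets containing
# each example; pick the k examples with the lowest scores.
influence = scores / np.maximum(counts, 1)
selected = np.argsort(influence)[:k]
print("selected demonstrations:", selected)
```

The expensive part (computing outputs and gradients) happens exactly once in Stage 1; every subset evaluation afterward is a cheap dot product, which is where the linear-time behavior comes from.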
Efficiency and Performance Gains
This gradient-based estimation procedure results in a linear-time algorithm relative to model and training set sizes. This is a significant improvement over existing methods, which can incur much higher computational costs. The paper demonstrates that ICL-GradSel achieves up to a 37.7x speed-up compared to full-model inference methods like forward selection and random ensemble selection, all while maintaining high accuracy.
Beyond efficiency, the selected demonstration sets also lead to superior in-context learning performance. Experiments across six diverse datasets (including sentiment classification and math reasoning tasks) show that ICL-GradSel outperforms strong baselines based on input embeddings by an average of 11%, using up to 49% less computation. In long-context scenarios, the method can match the performance of existing baselines with significantly shorter context lengths, demonstrating its ability to select highly impactful examples.
The research highlights that this gradient estimation framework is flexible and can be instantiated to accelerate various subset selection methods, such as gradient-based random ensemble (ICL-GradRE) and gradient-based forward selection (ICL-GradFS). The code to replicate these findings is available on GitHub, underscoring the practical applicability of this work. For more technical details, see the full paper.
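As an illustration of one such instantiation, greedy forward selection accelerated with the same estimator might look like the following sketch. This is an assumption-laden toy version, not ICL-GradFS itself: `loss` and the per-demonstration embedding shifts are hypothetical, and only the pattern of "precompute the gradient once, then score candidates with dot products" mirrors the paper's idea.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n, k = 8, 20, 4
base = rng.normal(size=d)               # embedding of the bare prompt
demos = 0.05 * rng.normal(size=(n, d))  # per-demonstration embedding shifts

def loss(x):
    # Hypothetical smooth surrogate for a validation loss.
    return float(np.log1p((x ** 2).sum()))

def grad_loss(x):
    return 2 * x / (1 + (x ** 2).sum())

# Precompute once, then greedily grow the subset using estimated losses only.
f0, g0 = loss(base), grad_loss(base)
chosen, shift = [], np.zeros(d)
for _ in range(k):
    best_i, best_est = None, float("inf")
    for i in range(n):
        if i in chosen:
            continue
        est = f0 + g0 @ (shift + demos[i])  # first-order loss estimate
        if est < best_est:
            best_i, best_est = i, est
    chosen.append(best_i)
    shift += demos[best_i]
print("forward-selected demonstrations:", chosen)
```

Plain forward selection would call the model n times per greedy step; here each step costs n dot products against the precomputed gradient, which is the source of the reported speed-ups.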
This work represents a crucial step forward in making in-context learning more efficient and effective, especially as LLMs continue to grow in size and complexity. By providing a scalable method for demonstration selection, it paves the way for broader applications of ICL in real-world scenarios.