
Efficiently Combining Tasks for AI on Your Phone

TLDR: This research introduces ‘Learnable Calibration’, a new method enabling Large Language Models (LLMs) on devices like smartphones to perform multiple tasks simultaneously (e.g., summarizing and translating text) in a single step. It achieves high performance with minimal additional storage by calibrating existing task-specific adapters, addressing the limitations of current inefficient or ineffective approaches for on-device compositional multi-tasking.

Large Language Models (LLMs) have transformed how we interact with technology, excelling in tasks from answering questions to summarizing text. While powerful, these models often require substantial computational resources, making their deployment on everyday devices like smartphones a significant challenge. A common approach to adapt LLMs for specific tasks is through Parameter-Efficient Fine-Tuning (PEFT), particularly using Low-Rank Adapters (LoRAs). These adapters allow models to learn new skills with minimal additional parameters, making them suitable for on-device use.
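To make the LoRA idea concrete, here is a minimal PyTorch sketch of a low-rank adapter attached to a frozen linear layer. The class name and hyperparameters are illustrative choices, not code from the paper:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a low-rank adapter (illustrative)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        d_out, d_in = base.weight.shape
        # Only these two small matrices are trained: rank * (d_in + d_out)
        # parameters, a tiny fraction of the full d_out * d_in weight.
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank update: W x + scale * B A x
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Because only A and B are trained and stored per task, a device can keep a small library of such adapters (one for summarization, one for translation, and so on) on top of a single shared base model.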

Traditionally, LLMs are fine-tuned for one task at a time. For instance, you might have one adapter for summarization and another for translation. When a user needs to perform multiple tasks, like summarizing a text and then translating that summary, current methods often fall short. Existing ‘model merging’ techniques, which combine multiple task-specific models into one, are typically designed for scenarios where only one task is performed at a time. They struggle when a single input requires the simultaneous execution of multiple tasks, a concept the researchers call ‘compositional multi-tasking’. Imagine needing a translated summary of a long document – this requires both summarization and translation to happen concurrently and efficiently.

The core problem addressed by this research is enabling this ‘compositional multi-tasking’ on resource-constrained devices. Simple, multi-step pipelines (where one task is done, then its output is fed to another task) are inefficient, requiring multiple processing passes and longer times. Training a completely new adapter for every possible combination of tasks is also impractical due to storage limitations on devices.
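The sketch below contrasts the two execution patterns; `run_llm` and the adapter names are hypothetical placeholders, each standing in for one full on-device inference pass:

```python
# Hypothetical helpers for illustration; run_llm() stands in for a single
# full inference pass of the on-device LLM with the named adapter active.

def translated_summary_two_pass(document: str) -> str:
    """Naive pipeline: two sequential passes, roughly double the latency."""
    summary = run_llm(document, adapter="summarization")   # pass 1
    return run_llm(summary, adapter="translation_en_es")   # pass 2

def translated_summary_one_pass(document: str) -> str:
    """Compositional multi-tasking: one pass with a combined adapter."""
    return run_llm(document, adapter="summarize_and_translate")
```

The single-pass version is what Learnable Calibration aims to enable, without having to train and store a dedicated combined adapter for every task pair.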

Introducing Learnable Calibration

To overcome these limitations, researchers from Samsung R&D Institute UK and Samsung Research propose a novel method called Learnable Calibration. The key idea is to leverage the existing task-specific LoRAs already present on a device and then ‘calibrate’ them with a very small number of additional, learnable parameters. This calibration process allows the merged adapters to handle multiple tasks simultaneously in a single inference pass, achieving high performance without significant computational or storage overhead.

The method works by taking linearly merged single-task LoRAs as a starting point. A small set of ‘calibration parameters’ is then learned on compositional task data. These parameters are shared across different layers of the model, further enhancing efficiency. Two variations were explored: a smaller version that learns column-wise biases (Learnable Calibration) and a larger, more expressive version that introduces two low-rank matrices (Learnable Calibration++), as sketched below.
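The paper’s exact parameterization is not reproduced here, so the following is a minimal PyTorch sketch of the idea for a single linear layer, assuming the calibration is added to the merged LoRA update; the class name, shapes, and initialization are illustrative assumptions (and a full implementation would reuse one set of calibration parameters across layers rather than allocate them per layer):

```python
import torch
import torch.nn as nn

class CalibratedMergedLoRA(nn.Module):
    """Sketch: linearly merged task LoRAs plus tiny learnable calibration."""

    def __init__(self, base: nn.Linear, task_loras, calib_rank: int = 4,
                 plus_plus: bool = False):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        # Starting point: a simple linear merge (average) of the frozen
        # single-task LoRA updates B @ A, each of shape (d_out, d_in).
        delta = torch.stack([B @ A for A, B in task_loras]).mean(dim=0)
        self.register_buffer("delta", delta)  # fixed, not trained
        self.plus_plus = plus_plus
        if plus_plus:
            # Learnable Calibration++: two small low-rank matrices.
            self.C = nn.Parameter(torch.zeros(d_out, calib_rank))
            self.D = nn.Parameter(torch.randn(calib_rank, d_in) * 0.01)
        else:
            # Learnable Calibration: a column-wise bias on the merged update.
            self.col_bias = nn.Parameter(torch.zeros(d_in))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.plus_plus:
            update = self.delta + self.C @ self.D
        else:
            update = self.delta + self.col_bias  # broadcasts over rows
        return self.base(x) + x @ update.T
```

Only the calibration parameters (the bias, or C and D) are trained on compositional data; the base model and the merged single-task adapters stay frozen, which is what keeps the storage and training cost so low.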

A New Benchmark for On-Device Multi-tasking

To properly evaluate compositional multi-tasking, the researchers developed a new benchmark. This benchmark includes four practical compositional tasks:

  • Summarization combined with translation (English to Spanish, French, or German)
  • Summarization combined with tone adjustment (professional, casual, witty, paraphrase)
  • Reply suggestion combined with translation
  • Reply suggestion combined with tone adjustment

This comprehensive benchmark allows for a thorough assessment of how different approaches handle the complexities of simultaneous task execution on devices with limited resources.


Key Findings and Impact

The experimental results, using models suitable for on-device deployment (like LLaMA 3.2 1B, Qwen2.5 1.5B, and StableLM2 1.6B), demonstrate the effectiveness of Learnable Calibration. Traditional merging strategies and zero-shot approaches performed poorly for compositional multi-tasking. While inefficient baselines (like multi-step LoRA usage or training a dedicated ‘joint-expert’ LoRA for each compositional task) showed good performance, they were resource-intensive.

Learnable Calibration, especially its ‘++’ variation, achieved comparable or even superior performance to these inefficient baselines, but with significantly less overhead. It requires only a minimal number of additional parameters (approximately 0.08–0.56% of a joint-expert LoRA’s parameters), translating to less than 0.5 MB of additional storage. This makes it highly suitable for on-device applications.
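As a rough sanity check of that ratio, here is some back-of-the-envelope arithmetic with assumed dimensions (these numbers are hypothetical, chosen only to show the scale, not taken from the paper):

```python
# Hypothetical dimensions for a ~1B-parameter model (not from the paper):
d_in = d_out = 2048          # hidden size of the adapted linear layers
rank = 16                    # LoRA rank of a dedicated joint-expert adapter
n_layers = 16                # adapted layers; calibration is shared across them

joint_expert = n_layers * rank * (d_in + d_out)   # 1,048,576 params
calibration = d_in                                # one shared column-wise bias
print(f"joint-expert LoRA: {joint_expert:,} params")
print(f"shared calibration: {calibration:,} params "
      f"({100 * calibration / joint_expert:.3f}% of the joint expert)")
# A few thousand extra parameters is kilobytes of storage, far below 0.5 MB.
```

Under these assumptions the calibration comes to roughly 0.2% of the joint-expert adapter, the same order of magnitude as the 0.08–0.56% range reported in the paper.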

The research also showed that Learnable Calibration scales well with different model sizes and can even handle combinations involving three tasks (e.g., summarization, tone adjustment, and translation). The qualitative analysis revealed that while other methods often fail to perform one or both tasks in a compositional setting, Learnable Calibration consistently succeeds in executing all required tasks. This work lays crucial groundwork for advancing the capabilities of LLMs in real-world, resource-constrained multi-tasking scenarios. For more details, you can read the full paper here.

