Advancing AutoML with In-Context Learning for Diverse ML Workflows

TLDR: This research introduces CASH+, an extension of AutoML for optimizing complex machine learning pipelines that include fine-tuning and ensembling, not just hyperparameter optimization. It proposes PS-PFN, a new method that uses prior-data fitted networks (PFNs) and in-context learning to efficiently select and adapt these diverse pipelines by modeling their varied performance and cost characteristics. Experimental results show PS-PFN outperforms existing methods, offering a more flexible and effective approach to modern AutoML challenges.

Automated Machine Learning, or AutoML, automates much of the work involved in building machine learning models. Traditionally, a key challenge in AutoML has been Combined Algorithm Selection and Hyperparameter Optimization (CASH), which involves picking the best machine learning algorithm and tuning its hyperparameters for a specific task. However, with the rise of advanced pre-trained models, modern machine learning workflows are becoming much more complex. They often involve not just hyperparameter tuning, but also fine-tuning, combining multiple models (ensembling), and other specialized adaptation techniques.
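For readers who prefer a formula, the CASH problem is commonly written as a joint minimization over algorithms and their hyperparameters. The notation below is the conventional one from the AutoML literature and is an assumption here, not taken from this article: $\mathcal{A}$ is the set of candidate algorithms, $\Lambda^{(j)}$ is the hyperparameter space of algorithm $A^{(j)}$, and $\mathcal{L}$ is the validation loss.

```latex
A^{*}_{\lambda^{*}} \in
  \operatorname*{arg\,min}_{A^{(j)} \in \mathcal{A},\; \lambda \in \Lambda^{(j)}}
  \mathcal{L}\!\left(A^{(j)}_{\lambda},\, D_{\text{train}},\, D_{\text{valid}}\right)
```

Read this way, CASH+ can be understood as widening the search space so that each candidate may be a whole pipeline (fine-tuning, ensembling, and so on) rather than a single algorithm with its hyperparameters.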

The core problem remains: how do we find the best-performing model for a given task? But the increasing variety and complexity of these ML pipelines demand new approaches to AutoML. This is where a new framework called CASH+ comes in. It extends the traditional CASH framework to handle the selection and adaptation of these modern, diverse ML pipelines.

The researchers propose a new method called PS-PFN, which stands for Posterior Sampling using Prior-Data Fitted Networks. This method is designed to efficiently explore and optimize these complex ML pipelines. It frames the task as a “Max K-armed Bandit” problem. Picture several slot machines, each representing an ML pipeline. In a classic bandit problem the goal is the highest total payout over time; here the goal is to find the single highest payout within a limited budget, even when payouts vary widely between machines and shift over time.
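To see why the "max" objective differs from the usual bandit objective, here is a minimal Python sketch. Both arms and their reward numbers are hypothetical and chosen only for illustration: one arm returns steady scores, the other usually scores lower but occasionally produces a much better result.

```python
import random

# Hypothetical arms: each call stands for one optimization step of a pipeline
# and returns a validation score. The numbers are made up for illustration.
def steady(rng):
    return rng.uniform(0.69, 0.71)

def heavy_tailed(rng):
    # Usually mediocre, but one step in ten finds a much better configuration.
    return 0.90 if rng.random() < 0.10 else 0.50

def pull_only(arm, budget, seed=0):
    """Spend the whole budget on one arm; report its mean and best reward."""
    rng = random.Random(seed)
    rewards = [arm(rng) for _ in range(budget)]
    return {"mean": sum(rewards) / budget, "best": max(rewards)}

steady_stats = pull_only(steady, budget=200)
tailed_stats = pull_only(heavy_tailed, budget=200)

print("steady:      ", steady_stats)
print("heavy-tailed:", tailed_stats)
```

A strategy that maximizes average reward would favor the steady arm, while a strategy that targets the best single result would favor the heavy-tailed one. In AutoML only the best model found is deployed, which is why the max objective is the natural fit.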

PS-PFN builds on “prior-data fitted networks” (PFNs). Think of PFNs as neural networks pre-trained on synthetic data that can take a small number of observed results as context and estimate the potential performance of each ML pipeline, without being retrained. This “in-context learning” allows PS-PFN to make quick and informed decisions about which pipeline to try next, even when the performance patterns are unusual or change over time. Unlike older methods that assume all pipelines behave similarly, PS-PFN is flexible enough to handle pipelines with very different performance characteristics and even varying costs (like how long it takes to run a particular optimization step).

The paper highlights three main challenges that PS-PFN addresses: heterogeneous reward distributions (different pipelines perform very differently), changes in reward over time (a pipeline might improve or its performance pattern might shift), and varying costs associated with running different pipelines. By using PFNs, PS-PFN can model these complex behaviors effectively.
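The selection loop described above can be sketched roughly as follows. This is not the authors' implementation: the PFN is replaced by a crude Gaussian stand-in fitted to each arm's observed rewards, the arm names, rewards, and costs are hypothetical, and changes in reward over time are not modeled. The sketch only shows the general shape of the idea: for each pipeline, sample a plausible best future reward given how many steps the remaining budget can pay for, then run the pipeline with the highest sample.

```python
import random
import statistics

def sample_future_best(observed, n_future, rng):
    """Sample one plausible best reward for an arm over its affordable pulls.

    ASSUMPTION: a simple Gaussian stand-in replaces the PFN used in the paper.
    """
    best_so_far = max(observed)
    if n_future <= 0:
        return best_so_far
    mean = statistics.fmean(observed)
    std = statistics.stdev(observed) if len(observed) > 1 else 0.1
    std = max(std, 1e-3)  # avoid a degenerate zero-width distribution
    # Sample a plausible mean, then simulate the affordable future pulls.
    mu = rng.gauss(mean, std / len(observed) ** 0.5)
    future = max(rng.gauss(mu, std) for _ in range(n_future))
    return max(best_so_far, future)

def select_pipelines(arms, costs, budget, seed=0):
    """Cost-aware sampling loop; assumes the budget covers the warm start."""
    rng = random.Random(seed)
    history = {name: [] for name in arms}
    spent = 0.0
    for name in arms:  # warm start: two pulls per arm
        for _ in range(2):
            history[name].append(arms[name](rng))
            spent += costs[name]
    while True:
        remaining = budget - spent
        affordable = [n for n in arms if costs[n] <= remaining]
        if not affordable:
            break
        samples = {
            n: sample_future_best(history[n], int(remaining // costs[n]), rng)
            for n in affordable
        }
        chosen = max(samples, key=samples.get)
        history[chosen].append(arms[chosen](rng))
        spent += costs[chosen]
    best = max(r for rewards in history.values() for r in rewards)
    return best, history, spent

# Hypothetical pipelines: a cheap, steady one and a costly, more variable one.
def hpo_small_model(rng):
    return rng.gauss(0.70, 0.02)

def finetune_large_model(rng):
    return rng.gauss(0.75, 0.05)

arms = {"hpo_small_model": hpo_small_model,
        "finetune_large_model": finetune_large_model}
costs = {"hpo_small_model": 1, "finetune_large_model": 3}

best, history, spent = select_pipelines(arms, costs, budget=60)
print("best score:", round(best, 3), "| budget spent:", spent)
print({name: len(rewards) for name, rewards in history.items()})
```

Because the number of simulated future pulls depends on each arm's cost, an expensive pipeline has to promise noticeably better results to be chosen over a cheap one that can be tried many more times. That is one simple way to account for varying costs; the paper's approach may differ in its details.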

Experimental results on both new and existing benchmark tasks show that PS-PFN performs better than other common bandit and AutoML strategies. This indicates a significant step forward in making AutoML systems more adaptable and efficient for the increasingly diverse world of machine learning. The code and data for this research are openly available for others to use and build upon. You can find more details in the full research paper: In-Context Decision Making for Optimizing Complex AutoML Pipelines.

While PS-PFN offers superior performance, the authors acknowledge some limitations. It can be computationally more expensive than simpler methods, especially for very fast optimization steps. Also, its theoretical analysis is complex due to the use of synthetic data generation and machine learning models to approximate distributions. However, for typical AutoML tasks where individual optimization steps are already time-consuming, the benefits of PS-PFN often outweigh these costs.
