TLDR: A new AI pipeline, featuring AnalyzerVLM and EvaluatorVLM, automates model discovery by using multi-step reasoning and visual analysis. AnalyzerVLM proposes models by generating and executing code for in-depth data analysis, while EvaluatorVLM evaluates them visually for fitness and generalizability using a novel Visual Information Criterion. This approach leads to models that accurately capture data details and generalize effectively to unseen data, outperforming existing methods and reducing reliance on human experts.
In the rapidly evolving world of data science, finding the perfect model to represent a dataset is a crucial but often challenging task. Traditionally, this process, known as model discovery, has relied heavily on human experts. However, as datasets grow in size and complexity, manual model discovery becomes increasingly difficult and time-consuming. This is where automated model discovery steps in, aiming to accelerate scientific progress by reducing the need for human intervention and efficiently exploring vast model spaces.
A recent research paper introduces an innovative approach to automated model discovery: a multi-modal, multi-step pipeline. The system leverages two modules built on vision-language models (VLMs) to propose and evaluate models in a way that mimics human reasoning and perception. You can read the full paper here: Automated Model Discovery via Multi-modal & Multi-step Pipeline.
How the Pipeline Works: AnalyzerVLM and EvaluatorVLM
The core of this new pipeline consists of two intelligent AI agents: AnalyzerVLM and EvaluatorVLM. These agents work together in an iterative, four-stage process: model proposal, model fitting, model evaluation, and model selection.
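The four-stage loop can be sketched as follows. This is a hedged illustration only: the function names (`analyzer_vlm_propose`, `evaluator_vlm_score`, etc.) and the stub behaviors are assumptions for readability, not the paper's actual API.

```python
# Hypothetical stand-ins for the two VLM agents. In the real pipeline these
# would call vision-language models; here they are deterministic stubs so the
# control flow of the four stages is visible.
def analyzer_vlm_propose(data, current_best):
    # Stage 1: propose candidate model specifications.
    return [{"kind": "linear"}, {"kind": "periodic"}]

def fit_model(spec, data):
    # Stage 2: fit each candidate's parameters to the data.
    fitted = dict(spec)
    fitted["fitted"] = True
    return fitted

def evaluator_vlm_score(model, data):
    # Stage 3: score fitness/generalizability (higher is better).
    return 1.0 if model["kind"] == "periodic" else 0.5

def discover_model(data, n_rounds=3):
    """Iterate proposal -> fitting -> evaluation -> selection."""
    best = None
    for _ in range(n_rounds):
        candidates = analyzer_vlm_propose(data, best)            # 1. proposal
        fitted = [fit_model(c, data) for c in candidates]        # 2. fitting
        scores = [evaluator_vlm_score(m, data) for m in fitted]  # 3. evaluation
        best = fitted[scores.index(max(scores))]                 # 4. selection
    return best
```

The key design point is the feedback loop: the current best model is passed back to the proposer, so later rounds can refine rather than restart.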
AnalyzerVLM: The Model Proposer
AnalyzerVLM acts like a seasoned data scientist, autonomously planning and executing multi-step analyses to suggest effective candidate models. It doesn’t just guess; it actively generates and runs Python code to analyze data and existing models. This multi-step reasoning allows it to delve deep into the data, identifying patterns and characteristics that might be missed by simpler approaches. For instance, it can visualize data, check residuals, and analyze periodicity, much like a human expert would, using libraries like NumPy and Matplotlib. This iterative analysis helps AnalyzerVLM propose model structures and even suggest initial parameters, leading to better-fitting models.
EvaluatorVLM: The Visual Judge
Once AnalyzerVLM proposes candidate models, EvaluatorVLM steps in to assess them. Unlike traditional methods that rely solely on numerical metrics, EvaluatorVLM evaluates models both quantitatively and perceptually. It uses a novel approach called the Visual Information Criterion (VIC), which incorporates how humans perceive a model’s fit and generalizability. By looking at visualizations of model predictions against the actual data, EvaluatorVLM scores models based on two key aspects:
- Visual Fitness: How well the model’s predictions visually match the data’s trends and details. It also accounts for predictive uncertainty, penalizing models whose confidence regions are overly wide or widen abruptly.
- Visual Generalizability: Whether the model’s predictions maintain their structural consistency in extrapolated regions beyond the training data. This helps identify models that generalize well to unseen data, rather than just overfitting to the training set.
By combining these visual scores with traditional metrics, VIC helps select models that are not only accurate but also intuitively plausible and robust.
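One way to picture such a combination is shown below. The weighting and functional form here are illustrative assumptions, not the paper's VIC formula: a classical criterion (BIC) is adjusted by the two visual scores, so a model that fits the numbers but looks implausible is penalized.

```python
import math

def bic(n, k, rss):
    """Bayesian Information Criterion for a Gaussian likelihood
    (n data points, k parameters, residual sum of squares rss)."""
    return n * math.log(rss / n) + k * math.log(n)

def visual_information_criterion(n, k, rss, visual_fitness, visual_gen, weight=1.0):
    """Hypothetical VIC-style score (lower is better): a classical
    criterion discounted by visual fitness and visual generalizability,
    each assumed to lie in [0, 1]."""
    return bic(n, k, rss) - weight * (visual_fitness + visual_gen)
```

Under this sketch, two models with identical numerical fit can still be ranked apart by how plausible their predictions look, which is the behavior the paper's visual evaluation is designed to achieve.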
Why This Approach Matters
The research demonstrates that this multi-modal and multi-step pipeline effectively discovers models that capture fine details and ensure strong generalizability. Experiments show that it consistently achieves lower prediction errors compared to several existing methods, including traditional forecasting techniques and other AI-based model discovery approaches. The ablation studies further highlight that both the multi-modality (using both visual and text information) and the multi-step reasoning are crucial for the pipeline’s success.
The pipeline’s ability to interpret visual plots and dynamically adapt its analysis, much like a human expert, makes it a powerful tool. It can even be applied to other areas like symbolic regression, where it helps discover underlying mathematical functions that best describe data.
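In a symbolic-regression setting, the proposer might emit a candidate expression such as `a*sin(x) + b*x + c`; fitting its coefficients is then a standard least-squares step. The example below is a hedged illustration of that step, not the paper's actual fitting code.

```python
import numpy as np

# Synthetic data generated from the "unknown" function 2*sin(x) + 0.3*x + 1.
x = np.linspace(0, 10, 200)
y = 2.0 * np.sin(x) + 0.3 * x + 1.0

# The candidate form is linear in its coefficients, so ordinary least
# squares recovers them directly from the design matrix.
A = np.column_stack([np.sin(x), x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
# coef ≈ [2.0, 0.3, 1.0]
```

The discovery problem is then choosing *which* functional form to fit, which is exactly where the pipeline's multi-step visual analysis comes in.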
In essence, this work represents a significant step towards creating more intelligent and autonomous AI systems for scientific discovery, making the complex process of model discovery more efficient, accurate, and interpretable.