ELATE: Automating Feature Creation for Better Time-Series Predictions

TLDR: ELATE is a framework that uses large language models (LLMs) within an evolutionary optimization process to automate feature engineering for time-series data. Compared with models trained without any feature engineering, it reduces RMSE by 8.4% on average, and it is more time and memory efficient than exhaustive expand-and-reduce methods while also returning interpretable feature code. The LLM proposes contextually relevant transformations, which are then evaluated and pruned iteratively.

Time-series prediction, which involves forecasting future values from historical data, is a critical task across many industries, from predicting stock prices to understanding disease progression. While machine learning models have become increasingly popular for these tasks, a significant challenge remains: feature engineering. This process, where existing data features are transformed into new, more informative ones, is crucial for model performance but is often manual, time-consuming, and requires deep domain expertise.

Traditional attempts to automate feature engineering often rely on exhaustive enumeration, which can be computationally expensive and lacks the nuanced understanding that a human data scientist brings to the table. These methods might miss valuable transformations or struggle with the sheer volume of possibilities, especially in complex time-series data where temporal relationships are key.

Introducing ELATE: A New Approach to Automated Time-Series Feature Engineering

Researchers Andrew Murray, Danial Dervovic, and Michael Cashmore from JP Morgan AI Research have introduced ELATE, which stands for Evolutionary Language model for Automated Time-series Engineering. The framework combines large language models (LLMs) with an evolutionary optimization process to automate the creation of features for time-series data.

ELATE addresses the limitations of previous automation efforts by leveraging the extensive domain knowledge embedded within LLMs. Instead of blindly trying every possible transformation, the language model proposes new, contextually relevant feature transformations. This is a significant departure from older methods, as LLMs can understand the ‘why’ behind a feature, such as recognizing that Body Mass Index (BMI) is a useful indicator for diabetes prediction, even if it requires multiple intermediate calculation steps.
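To make the BMI example concrete, here is a minimal Python sketch of a feature that needs an intermediate step (converting height to metres) before the final ratio. The column names weight_kg and height_cm and the sample values are assumptions for illustration, not taken from the paper.

```python
def add_bmi(rows):
    """Return copies of the rows with a derived 'bmi' column.

    Assumes each row is a dict with the hypothetical keys
    'weight_kg' and 'height_cm'.
    """
    enriched = []
    for row in rows:
        height_m = row["height_cm"] / 100.0        # intermediate step: cm -> m
        bmi = row["weight_kg"] / (height_m ** 2)   # BMI = kg / m^2
        enriched.append({**row, "bmi": bmi})
    return enriched


# Made-up sample records, for illustration only.
patients = [
    {"weight_kg": 70.0, "height_cm": 175.0},
    {"weight_kg": 95.0, "height_cm": 160.0},
]
patients_with_bmi = add_bmi(patients)
```

An exhaustive search over single arithmetic operations would need to chain a unit conversion, a square, and a division to reach this feature, whereas a language model can propose it directly from the dataset description.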

How ELATE Works

The ELATE system operates by maintaining a dynamic collection of features. It starts with an initial set, and then, in an iterative process, the LLM is prompted to generate new features. This prompt includes a description of the dataset, examples of existing features, and information about previously generated features and their performance scores. This feedback loop helps the LLM learn and propose increasingly effective transformations.
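The loop described above can be sketched roughly as follows. This is an illustrative skeleton, not the authors' implementation: stub_llm stands in for a real LLM call, placeholder_score stands in for the statistical evaluation, and every name here (Feature, build_prompt, evolve, pool_size) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    code: str           # Python source that would compute the feature (not executed here)
    score: float = 0.0  # filled in after evaluation


def build_prompt(dataset_description, pool, history):
    """Combine the dataset description, current features, and past scores."""
    lines = ["Dataset: " + dataset_description, "Existing features:"]
    lines += [f"- {f.name}" for f in pool]
    lines.append("Previously generated features and scores:")
    lines += [f"- {f.name}: {f.score:.3f}" for f in history]
    lines.append("Propose one new feature as Python code.")
    return "\n".join(lines)


def stub_llm(prompt, generation):
    """Placeholder for a real LLM call; returns a canned rolling-mean proposal."""
    window = 2 + generation
    return Feature(name=f"rolling_mean_{window}",
                   code=f"df['y'].rolling({window}).mean()")


def placeholder_score(generation):
    """Arbitrary stand-in for the evaluation step; not a real metric."""
    return 0.2 + 0.1 * generation


def evolve(dataset_description, initial_pool, generations=3, pool_size=3):
    pool, history = list(initial_pool), []
    for generation in range(generations):
        prompt = build_prompt(dataset_description, pool, history)
        candidate = stub_llm(prompt, generation)
        candidate.score = placeholder_score(generation)
        history.append(candidate)
        # Keep only the best-scoring features so the pool stays compact.
        pool = sorted(pool + [candidate], key=lambda f: f.score, reverse=True)[:pool_size]
    return pool, history


final_pool, history = evolve(
    "daily store sales",
    [Feature("lag_1", "df['y'].shift(1)", score=0.5)],
)
```

The history list is what carries feedback between generations: each new prompt shows the model which earlier proposals scored well, so later proposals can build on them.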

Once a new feature is proposed (in the form of Python code), it undergoes a validation process to ensure the code is correct and safe to execute. If valid, the feature is then evaluated using specific time-series statistical measures, namely Granger causality and mutual information. These measures help quantify the predictive power of the new feature on the target variable, capturing both linear and non-linear relationships.
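As a rough sketch of these two steps, the snippet below runs a static pre-check on generated code and then scores a feature with a simple histogram estimate of mutual information. Both are simplifying assumptions: the article does not say how ELATE validates code or estimates mutual information, the check shown is not a full sandbox, and Granger causality is omitted. The names is_probably_safe, FORBIDDEN_NAMES, and mutual_information are hypothetical.

```python
import ast
import math
from collections import Counter

# Hypothetical deny-list; a real system would need much stronger isolation.
FORBIDDEN_NAMES = {"eval", "exec", "open", "__import__", "compile"}


def is_probably_safe(source):
    """Static pre-check: the code must parse and avoid a few risky constructs."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        if isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            return False
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            return False
    return True


def mutual_information(xs, ys, bins=4):
    """Histogram estimate of mutual information (in nats) between two series."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant series fall into one bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in joint.items()
    )


target = [1, 2, 3, 4, 5, 6, 7, 8]
informative = list(target)   # perfectly tracks the target
constant = [5] * len(target)  # carries no information about the target

score_informative = mutual_information(informative, target)
score_constant = mutual_information(constant, target)
```

In this toy data the informative feature receives a strictly higher score than the constant one, which is the kind of ordering the evaluation step needs in order to rank candidates.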

To manage the growing number of features, ELATE employs a SHAP (SHapley Additive exPlanations) filter. This filter intelligently prunes low-scoring or redundant features, ensuring that the system maintains a compact set of high-quality, impactful features. This evolutionary cycle of generation, evaluation, and selection allows ELATE to continuously refine and improve its feature set over multiple generations.
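The pruning idea can be illustrated with a small sketch. It assumes a linear model with independent features, for which the SHAP value of feature i at a point x is w_i * (x_i - mean_i); the article does not state which model or SHAP estimator ELATE uses, so the weights, feature names, and the keep parameter below are hypothetical. The sketch ranks by importance only and does not address redundancy between features.

```python
def linear_shap_values(weights, rows):
    """Per-row SHAP values for f(x) = b + sum_i w_i * x_i.

    Under a feature-independence assumption, the attribution of
    feature i at x is w_i * (x_i - mean_i).
    """
    n = len(rows)
    means = {name: sum(r[name] for r in rows) / n for name in weights}
    return [
        {name: weights[name] * (r[name] - means[name]) for name in weights}
        for r in rows
    ]


def shap_filter(weights, rows, keep):
    """Rank features by mean absolute SHAP value and keep the top `keep`."""
    shap_rows = linear_shap_values(weights, rows)
    importance = {
        name: sum(abs(s[name]) for s in shap_rows) / len(shap_rows)
        for name in weights
    }
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:keep], importance


# Made-up model weights and feature values, for illustration only.
weights = {"lag_1": 0.8, "rolling_mean_7": 0.3, "noise": 0.01}
rows = [
    {"lag_1": 1.0, "rolling_mean_7": 2.0, "noise": 5.0},
    {"lag_1": 2.0, "rolling_mean_7": 2.0, "noise": 1.0},
    {"lag_1": 3.0, "rolling_mean_7": 4.0, "noise": 3.0},
    {"lag_1": 4.0, "rolling_mean_7": 4.0, "noise": 7.0},
]
kept, importance = shap_filter(weights, rows, keep=2)
```

Here the noise column varies the most in raw value but contributes almost nothing to the model output, so it is the one pruned.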

Demonstrated Performance and Efficiency

The researchers conducted extensive experiments across seven diverse time-series prediction tasks, including forecasting influenza cases, store sales, electricity transformer temperature, and energy demand. ELATE consistently outperformed eight baseline methods, including traditional feature engineering packages like VEST and TSFRESH, as well as LSTM neural networks.

On average, ELATE reduced Root Mean Squared Error (RMSE) by 8.4% and Mean Absolute Error (MAE) by 9.6% compared to models without any feature engineering. ELATE also proved significantly more time and memory efficient than exhaustive expand-and-reduce approaches like TSFRESH, which often exceeded memory limits on larger datasets. ELATE was able to engineer features for problems with nearly 180,000 rows in a matter of hours, a task that would typically take data scientists days.
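For readers unfamiliar with how such percentages are derived, the sketch below computes RMSE, MAE, and the relative error reduction against a baseline. The numbers are toy values chosen for easy arithmetic, not results from the paper, and the function names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def percent_reduction(baseline_error, new_error):
    """Relative error reduction, in percent, versus the baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy values, for illustration only.
actual = [10.0, 12.0, 14.0, 16.0]
baseline_pred = [12.0, 10.0, 16.0, 14.0]    # model trained on base features
engineered_pred = [11.0, 11.0, 15.0, 15.0]  # model trained with added features

rmse_gain = percent_reduction(rmse(actual, baseline_pred), rmse(actual, engineered_pred))
mae_gain = percent_reduction(mae(actual, baseline_pred), mae(actual, engineered_pred))
```

A reported figure such as an 8.4% RMSE reduction is this kind of ratio, averaged over the benchmark tasks.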

The study also explored the cost-effectiveness of using different LLMs. While GPT-4o yielded slightly better results, the more cost-effective GPT-3.5 Turbo still provided significant improvements over base features at a fraction of the cost, making ELATE a viable option for various budgets.

Interpretability and Future Directions

Unlike complex deep neural networks that learn features internally and are often difficult to interpret, ELATE explicitly returns the Python code used to generate each feature, along with a description of its potential utility. This interpretability is a major advantage, especially in high-stakes applications like healthcare and finance, where understanding the model’s decisions is crucial.

While ELATE represents a significant leap forward, the authors acknowledge areas for future improvement, such as optimizing LLM querying costs and exploring more advanced prompting strategies. They also emphasize that ELATE is designed to augment, not replace, human data scientists, as human oversight is still valuable to ensure the generated features make practical sense for the given task.

For more in-depth details, you can read the full research paper: ELATE: Evolutionary Language model for Automated Time-series Engineering.
