Balancing Accuracy and Robustness: A Deep Dive into Demand Forecasting Evaluation Functions

TLDR: This research paper compares two evaluation functions for demand forecasting models: FMAE and HEF. FMAE focuses on minimizing mean absolute errors and is computationally efficient, while HEF, a hierarchical multi-metric function, prioritizes explanatory power, global accuracy, and robustness against large errors. Experiments show HEF consistently outperforms FMAE in global metrics, making it suitable for long-term strategic planning, whereas FMAE is more efficient for short-term operational applications due to its focus on average error and faster execution.

Demand forecasting is a crucial element for businesses to plan effectively, manage resources, and adapt to market changes. However, predicting future demand, especially for multiple products over time, is complex due to fluctuating data, inherent uncertainties, and sudden market shifts. Traditional methods often rely on single evaluation metrics, which can sometimes lead to biased results and limit how well a model performs in real-world situations.

A recent research paper, Hierarchical Evaluation Function (HEF): A Multi-Metric Approach for Optimizing Demand Forecasting Models, by Adolfo González and Víctor Parada, delves into this challenge by comparing two specialized evaluation functions: FMAE (Focused Mean Absolute Error) and HEF (Hierarchical Evaluation Function). The study aims to find better ways to optimize demand forecasting models, ensuring they are more accurate and robust.

Understanding the Evaluation Functions

The paper highlights that relying on a single metric like Mean Absolute Error (MAE) or Root Mean Squared Error (RMSE) can be limiting. MAE reports the average size of the errors and is less affected by extreme values, whereas RMSE squares each error before averaging and therefore penalizes large errors heavily. Neither, however, provides a complete picture of a model’s performance, especially when comparing across different types of data or business goals.
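To make the difference between the two metrics concrete, here is a minimal sketch in plain Python. The demand figures are made up for illustration and do not come from the paper. Both forecasts have the same total absolute error, but one spreads it evenly while the other concentrates it in a single week.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the average squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Illustrative (made-up) weekly demand for one product.
actual = [100, 100, 100, 100]

# forecast_even spreads the error evenly; forecast_spike concentrates it in one week.
forecast_even = [110, 90, 110, 90]      # four errors of 10 units
forecast_spike = [100, 100, 100, 140]   # one error of 40 units

mae_even, rmse_even = mae(actual, forecast_even), rmse(actual, forecast_even)
mae_spike, rmse_spike = mae(actual, forecast_spike), rmse(actual, forecast_spike)

print(f"even : MAE={mae_even:.1f}  RMSE={rmse_even:.1f}")    # MAE=10.0  RMSE=10.0
print(f"spike: MAE={mae_spike:.1f}  RMSE={rmse_spike:.1f}")  # MAE=10.0  RMSE=20.0
```

MAE cannot tell the two forecasts apart, while RMSE doubles for the one with the single large miss. That is the gap a multi-metric function tries to close.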

To address this, the researchers propose and evaluate two functions:

  • FMAE (Focused Mean Absolute Error): This function is straightforward, focusing on minimizing the average absolute difference between predicted and actual values. It’s computationally simple and effective when the primary goal is to control the average error.

  • HEF (Hierarchical Evaluation Function): This is a more sophisticated approach. HEF combines three key metrics: R² (Coefficient of Determination), MAE, and RMSE. R² measures how well the model explains the variability in the data, while MAE and RMSE focus on error magnitude. HEF also includes a system of progressive penalties for large errors or illogical predictions (like negative demand forecasts), making it more robust. It even adapts its tolerance thresholds based on the variability of the data, meaning it’s stricter for stable demand patterns and more lenient for highly volatile ones.
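The article does not reproduce the paper's formulas, so the sketch below is only a rough illustration of the ideas described above, not the authors' implementation. The function name `hef_like_score`, the weights, the quadratic penalty, the use of the coefficient of variation to scale the tolerance, and the size of the negative-forecast penalty are all assumptions made for this example. It also assumes demand with a positive mean and at least some variation.

```python
import math
import statistics

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    mean_actual = statistics.fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot  # assumes `actual` is not constant

def fmae_score(actual, predicted):
    """Single-metric objective: just the mean absolute error (lower is better)."""
    return mae(actual, predicted)

def hef_like_score(actual, predicted, weights=(0.4, 0.3, 0.3),
                   base_tolerance=0.10, negative_penalty=10.0):
    """Hypothetical multi-metric objective (lower is better).

    All constants here are illustrative assumptions, not values from the paper.
    """
    mean_actual = statistics.fmean(actual)
    # Volatility proxy (assumption): coefficient of variation of the actuals.
    cv = statistics.pstdev(actual) / mean_actual
    # Tolerance grows with volatility: stricter for stable series, looser for volatile ones.
    tolerance = base_tolerance * (1.0 + cv) * mean_actual

    w_r2, w_mae, w_rmse = weights
    score = (w_r2 * (1.0 - r_squared(actual, predicted))
             + w_mae * mae(actual, predicted) / mean_actual
             + w_rmse * rmse(actual, predicted) / mean_actual)

    for a, p in zip(actual, predicted):
        excess = abs(a - p) - tolerance
        if excess > 0:
            score += (excess / mean_actual) ** 2  # penalty grows faster than the error
        if p < 0:
            score += negative_penalty             # negative demand is not a valid forecast
    return score

# Made-up demand values for illustration only.
actual = [120, 80, 100, 140, 60]
plausible = [115, 85, 100, 130, 65]
implausible = [115, 85, 100, 130, -5]   # last forecast is negative

score_plausible = hef_like_score(actual, plausible)
score_implausible = hef_like_score(actual, implausible)
print(f"FMAE: {fmae_score(actual, plausible):.2f} vs {fmae_score(actual, implausible):.2f}")
print(f"HEF-like: {score_plausible:.3f} vs {score_implausible:.3f}")
```

Under these assumptions, the negative forecast raises the FMAE only through its absolute error, while the HEF-like score also adds the large-error penalty and the fixed penalty for an illogical prediction.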

The Experiment and Key Findings

The researchers conducted extensive experiments using various demand forecasting models, from traditional statistical methods like ARIMA to modern machine learning techniques such as XGBoost and deep neural networks like LSTM. They tested these models across different datasets (Walmart, M3, M4, M5) and with various data splits for training and testing (91:9, 80:20, and 70:30). To optimize the models, they used three different hyperparameter optimizers: Grid Search, Particle Swarm Optimization (PSO), and Optuna (based on Bayesian optimization).
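As a rough sketch of how such a setup fits together, the example below splits a series chronologically at the three ratios mentioned and runs a small grid search in which the evaluation function is a pluggable argument. Everything specific here is an assumption for illustration: the synthetic series, the simple exponential smoothing model, the candidate `alpha` values, and the helper names. The paper's actual models and search spaces are not described in this article.

```python
import math

def chronological_split(series, train_fraction):
    """Split a time series in order: earliest part for training, the rest held out."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]

def ses_forecast(train, alpha, horizon):
    """Simple exponential smoothing: a flat forecast at the last smoothed level."""
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def grid_search(train, holdout, alphas, evaluate):
    """Try every candidate alpha and keep the one with the lowest evaluation score."""
    best_alpha, best_score = None, math.inf
    for alpha in alphas:
        forecast = ses_forecast(train, alpha, len(holdout))
        score = evaluate(holdout, forecast)
        if score < best_score:
            best_alpha, best_score = alpha, score
    return best_alpha, best_score

# Synthetic series (trend plus a smooth cycle), purely for illustration.
series = [100 + 0.5 * t + 10 * math.sin(t / 5) for t in range(100)]
alphas = [0.1, 0.3, 0.5, 0.7, 0.9]

results = {}
for train_fraction in (0.91, 0.80, 0.70):
    train, holdout = chronological_split(series, train_fraction)
    best_alpha, best_score = grid_search(train, holdout, alphas, evaluate=mae)
    results[train_fraction] = (len(train), len(holdout), best_alpha, best_score)
    print(f"{len(train)}:{len(holdout)} split -> best alpha={best_alpha}, score={best_score:.2f}")
```

Because `evaluate` is just a callable taking `(actual, predicted)`, swapping the objective does not require touching the search loop. This is the sense in which an evaluation function can be compared independently of the optimizer.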

The results showed a clear and consistent pattern, regardless of the data split or the optimizer used:

  • HEF’s Strengths: HEF consistently outperformed FMAE in global metrics: R² (indicating better explanatory power), Global Relative Accuracy (measuring overall cumulative accuracy), and RMSE and RMSSE (both of which penalize large errors heavily). Models optimized with HEF were therefore better at explaining the overall trends in demand and more robust against significant forecasting errors.

  • FMAE’s Strengths: FMAE maintained advantages in local metrics like MAE and MASE (Mean Absolute Scaled Error), which focus on average absolute errors. It also generally resulted in shorter execution times, making it more computationally efficient.
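For readers unfamiliar with the scaled metrics named above, here is a minimal sketch of MASE and RMSSE in their common non-seasonal form, where the forecast error is divided by the in-sample error of a one-step naive forecast. The numbers are made up for illustration, and the paper may use a seasonal variant or a different scaling window.

```python
import math

def mase(train, actual, predicted):
    """Mean Absolute Scaled Error, scaled by the in-sample one-step naive MAE."""
    naive_mae = sum(abs(train[t] - train[t - 1]) for t in range(1, len(train))) / (len(train) - 1)
    forecast_mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return forecast_mae / naive_mae

def rmsse(train, actual, predicted):
    """Root Mean Squared Scaled Error, scaled by the in-sample one-step naive MSE."""
    naive_mse = sum((train[t] - train[t - 1]) ** 2 for t in range(1, len(train))) / (len(train) - 1)
    forecast_mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(forecast_mse / naive_mse)

# Made-up values for illustration only.
train = [10, 12, 11, 13, 12]   # history used for scaling
actual = [14, 13]              # held-out demand
forecast = [13.5, 13.5]        # candidate forecast

mase_value = mase(train, actual, forecast)
rmsse_value = rmsse(train, actual, forecast)
print(f"MASE={mase_value:.3f}  RMSSE={rmsse_value:.3f}")
```

Values below 1 mean the forecast errs less than the naive method did on the training history. MASE inherits MAE's focus on average error, while RMSSE inherits RMSE's sensitivity to large misses.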

A crucial finding was that the improvements observed with HEF were attributable to the design of the evaluation function itself, not to the specific optimization method employed: the same pattern held under Grid Search, PSO, and Optuna. Statistical tests found these differences to be highly significant, making chance an unlikely explanation.
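The article does not say which statistical tests the authors used. As one hedged illustration of how a paired comparison of this kind can be checked, the sketch below runs an exact sign-flip permutation test on per-series differences. The test choice, the function name `sign_flip_p_value`, and the numbers are all assumptions. The differences are placeholders, not results from the paper.

```python
from itertools import product

def sign_flip_p_value(differences):
    """Exact two-sided sign-flip permutation test for paired differences.

    Under the null hypothesis that neither method is better, each difference
    is equally likely to be positive or negative, so every sign pattern is
    enumerated and compared with the observed total.
    """
    observed = abs(sum(differences))
    as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(differences)):
        total += 1
        statistic = abs(sum(s * d for s, d in zip(signs, differences)))
        if statistic >= observed - 1e-12:   # small tolerance for float rounding
            as_extreme += 1
    return as_extreme / total

# Placeholder values: (RMSE under objective A) - (RMSE under objective B) on 8 series.
# These are NOT figures from the paper.
differences = [0.8, 1.1, 0.5, 0.9, 1.3, 0.7, 0.6, 1.0]

p_value = sign_flip_p_value(differences)
print(f"p-value = {p_value:.4f}")
```

When every difference points the same way, as in this placeholder data, only the all-positive and all-negative sign patterns are as extreme as the observation, so the p-value is 2 out of 2^8. A small p-value makes chance an unlikely explanation, though it does not by itself measure how large or useful the improvement is.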

Choosing the Right Tool for the Job

The study concludes that there’s a clear trade-off between the two evaluation functions. HEF is the more robust choice for strategic business planning and long-term forecasting, where understanding the overall explanatory power of the model and minimizing the impact of large, potentially costly errors is paramount. Its ability to adapt to data volatility and penalize undesirable predictions makes it ideal for complex, uncertain environments.

On the other hand, FMAE is more efficient for short-term operational applications or in situations where computational simplicity and strict control over average errors are the main priorities. It’s a practical option for environments with limited resources or where quick, consistent average error reduction is key.

Ultimately, the research provides a flexible framework for optimizing predictive models in dynamic settings, emphasizing that the choice of evaluation function should align directly with the specific objectives and context of the demand forecasting task.

Nikhil Patel
https://blogs.edgentiq.com
Nikhil Patel is a tech analyst and AI news reporter who brings a practitioner's perspective to every article. With prior experience working at an AI startup, he decodes the business mechanics behind product innovations, funding trends, and partnerships in the GenAI space. Nikhil's insights are sharp, forward-looking, and trusted by insiders and newcomers alike. You can reach him at: [email protected]
