To properly evaluate the accuracy of a demand forecasting model, it is important to use reliable and standard evaluation metrics, incorporate multiple time horizons into the analysis, compare the model’s forecasts to naive benchmarks, test the model on both training and holdout validation datasets, and continuously refine the model based on accuracy results over time.
Key evaluation metrics include mean absolute percentage error (MAPE), mean absolute deviation (MAD), and root mean squared error (RMSE). MAD reports the average size of the error in demand units, RMSE penalizes large misses more heavily because errors are squared before averaging, and MAPE expresses error as an easy-to-understand percentage of actual demand. MAPE is undefined for periods with zero actual demand and becomes unstable when demand is close to zero, so it should be read alongside MAD or RMSE for low-volume items. Forecast accuracy should be calculated at multiple time horizons, such as weekly, monthly, and quarterly, to confirm that the model predicts demand accurately over different forecast windows.
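The three metrics can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names and the sample numbers are made up for the example, and skipping zero-demand periods in MAPE is one assumed convention among several.

```python
import math


def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent.

    Assumption: periods with zero actual demand are skipped, because the
    percentage error is undefined there.
    """
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def mad(actuals, forecasts):
    """Mean absolute deviation: average absolute error in demand units."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def rmse(actuals, forecasts):
    """Root mean squared error: squares errors first, so large misses weigh more."""
    squared = [(a - f) ** 2 for a, f in zip(actuals, forecasts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical demand and forecast values for four periods.
actuals = [100, 120, 80, 100]
forecasts = [110, 110, 90, 100]

mape_value = mape(actuals, forecasts)
mad_value = mad(actuals, forecasts)
rmse_value = rmse(actuals, forecasts)
```

To evaluate several horizons, the same functions can be applied separately to the forecasts made one week, one month, and one quarter ahead.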
It is also important to compare the model’s forecast accuracy to simple benchmark models to establish whether the proposed model actually outperforms simple alternatives. Common benchmarks include naive models that assume demand will remain flat relative to the previous period, seasonal naïve models that repeat the value observed in the same season of the previous cycle, and drift models that extend the average historical change into the future. If the proposed model does not clearly outperform these basic approaches, its added complexity may not be justified by any real improvement in the demand forecasts.
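Benchmark forecasts are cheap to produce, as the sketch below suggests. It uses the usual textbook definitions: the naive forecast repeats the last observation, the seasonal naive forecast repeats the last full season, and the drift forecast extends the average historical change. The function names, the sample history, and the season length of 4 are assumptions for illustration.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every future period."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the most recent full season, cycling if the horizon is longer."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


def drift_forecast(history, horizon):
    """Extend the line drawn between the first and last observations."""
    if len(history) < 2:
        raise ValueError("drift needs at least two observations")
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def mean_abs_error(actuals, forecasts):
    """Average absolute error, used here to rank the benchmarks."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


# Hypothetical quarterly demand with a repeating seasonal pattern.
history = [10, 20, 30, 40, 12, 22, 32, 42]
actual_next = [14, 24, 34, 44]  # made-up outcomes for the next four quarters

naive_mad = mean_abs_error(actual_next, naive_forecast(history, 4))
seasonal_mad = mean_abs_error(actual_next, seasonal_naive_forecast(history, 4, 4))
drift_mad = mean_abs_error(actual_next, drift_forecast(history, 4))
```

A candidate model's error on the same periods can then be compared with the best of these values; on strongly seasonal data like this sample, the seasonal naive benchmark is the hardest of the three to beat.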
Model evaluation should cover forecasts made on both the data used to train the model and holdout test data not involved in the training process. Because demand data is time-ordered, the holdout set should come from periods after the training window. Comparing performance on the training data versus later holdout periods indicates whether the model has overfit to past patterns or can generalize to new time periods. Significant degradation in holdout accuracy may suggest the need for additional training data, a different model specification, or increased regularization.
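The comparison between training and holdout accuracy might look like the following sketch. Simple exponential smoothing stands in for the demand model purely as an assumption, with its smoothing weight `alpha` chosen on the training period only; the helper names, the candidate `alpha` grid, and the sample series are all hypothetical.

```python
def chronological_split(series, holdout_size):
    """Keep the last `holdout_size` periods out of training."""
    if not 0 < holdout_size < len(series) - 1:
        raise ValueError("holdout_size must leave at least two training points")
    return series[:-holdout_size], series[-holdout_size:]


def one_step_errors(series, alpha, start_level):
    """Absolute one-step-ahead errors of simple exponential smoothing.

    The forecast for each period is the current level; the level is then
    updated with the observed value. Returns the errors and the final level.
    """
    level = start_level
    errors = []
    for actual in series:
        errors.append(abs(actual - level))
        level = alpha * actual + (1 - alpha) * level
    return errors, level


def evaluate(series, holdout_size, alphas=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Fit alpha on the training period, then score the holdout period."""
    train, holdout = chronological_split(series, holdout_size)
    best = None
    for alpha in alphas:
        errors, level = one_step_errors(train[1:], alpha, train[0])
        train_mad = sum(errors) / len(errors)
        if best is None or train_mad < best[0]:
            best = (train_mad, alpha, level)
    train_mad, alpha, level = best
    holdout_errors, _ = one_step_errors(holdout, alpha, level)
    holdout_mad = sum(holdout_errors) / len(holdout_errors)
    # A ratio well above 1 hints at overfitting or a shift in demand.
    ratio = holdout_mad / train_mad if train_mad > 0 else None
    return {"alpha": alpha, "train_mad": train_mad,
            "holdout_mad": holdout_mad, "ratio": ratio}


# Hypothetical monthly demand; the last four months form the holdout set.
demand = [100, 104, 98, 110, 107, 115, 111, 120, 118, 125, 140, 138]
result = evaluate(demand, holdout_size=4)
```

Both error figures here are one-step-ahead errors, so they are directly comparable; mixing in-sample one-step errors with multi-step holdout errors would overstate the degradation.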
Forecast accuracy tracking should be an ongoing process as new demand data becomes available over time. Regular re-evaluation allows refinement of the model based on accuracy results, helping to continually improve performance. Key areas that could be adapted based on ongoing accuracy reviews include variables included in the model, algorithm tuning parameters, data preprocessing techniques, and overall model design.
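One simple way to track accuracy as new data arrives is a rolling error window, sketched below. The window length, the choice of the first window as the baseline, and the tolerance factor of 1.5 are illustrative assumptions, not recommended settings.

```python
def rolling_mad(actuals, forecasts, window):
    """Mean absolute deviation over each consecutive window of periods."""
    abs_errors = [abs(a - f) for a, f in zip(actuals, forecasts)]
    if window < 1 or window > len(abs_errors):
        raise ValueError("window must be between 1 and the number of periods")
    return [
        sum(abs_errors[end - window:end]) / window
        for end in range(window, len(abs_errors) + 1)
    ]


def is_degrading(rolling_values, tolerance=1.5):
    """Flag when the latest window's error exceeds the first window's error
    by more than the (assumed) tolerance factor."""
    return rolling_values[-1] > tolerance * rolling_values[0]


# Hypothetical history in which forecasts drift away from actual demand.
actuals = [100, 100, 100, 100, 100, 100]
forecasts = [102, 98, 101, 110, 88, 115]

trend = rolling_mad(actuals, forecasts, window=3)
needs_review = is_degrading(trend)
```

A flag like `needs_review` is only a prompt to look closer; whether the cause is the model, the inputs, or a genuine shift in demand still has to be diagnosed.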
Other useful diagnostics include analysis of directional errors to determine whether the model tends to over- or under-forecast on average, tracking of accuracy over time to identify degrading performance, calculation of error distribution descriptors such as skewness and kurtosis, and decomposition of total error into systematic versus irregular components. Graphical analysis through forecast error plots and scatter plots of forecasts against actuals is also a useful way to diagnose sources of inaccuracy visually.
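Several of these diagnostics can be computed together, as in the sketch below. It assumes the sign convention error = forecast minus actual, so a positive mean error indicates over-forecasting, and it reads the split of mean squared error into squared bias plus error variance as one possible interpretation of systematic versus irregular error. The function name and sample data are hypothetical.

```python
def error_summary(actuals, forecasts):
    """Directional and distributional descriptors of forecast errors.

    Assumed convention: error = forecast - actual (positive = over-forecast).
    """
    errors = [f - a for a, f in zip(actuals, forecasts)]
    n = len(errors)
    mean_error = sum(errors) / n  # bias
    m2 = sum((e - mean_error) ** 2 for e in errors) / n  # population variance
    m3 = sum((e - mean_error) ** 3 for e in errors) / n
    m4 = sum((e - mean_error) ** 4 for e in errors) / n
    return {
        "mean_error": mean_error,
        "over_forecast_share": sum(1 for e in errors if e > 0) / n,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "excess_kurtosis": m4 / m2 ** 2 - 3 if m2 > 0 else 0.0,
        # MSE = bias^2 + variance, an exact identity for population moments.
        "systematic_part": mean_error ** 2,
        "irregular_part": m2,
        "mse": sum(e ** 2 for e in errors) / n,
    }


# Hypothetical values: the forecasts run high in three of four periods.
actuals = [100, 100, 100, 100]
forecasts = [110, 105, 95, 110]

summary = error_summary(actuals, forecasts)
```

In this sample the mean error is positive and most periods are over-forecast, which points to a bias that an adjustment to the model could remove, while the irregular part reflects scatter that a bias correction alone would not fix.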
A robust forecast accuracy monitoring process as described helps ensure that the demand model improves prediction quality reliably over time. The true potential of a forecasting approach can be determined only through detailed, ongoing evaluation using multiple standard metrics, benchmark comparisons, and refinements informed by accuracy results. Proper evaluation also supports the decisions that depend on these forecasts by giving organizations a more accurate view of demand.
In summary, evaluating a demand forecasting model adequately means measuring standard error metrics over multiple time horizons on both training and holdout data. The model should consistently outperform naive benchmarks, and its accuracy should be tracked and improved through ongoing refinements informed by performance reviews. A methodical evaluation approach as outlined here is needed to determine a model’s real-world forecasting capability and to keep improving prediction accuracy.