Assessing the performance of deep learning models for multivariate probabilistic energy forecasting

Aleksei Mashlakov, Toni Kuronen, Lasse Lensu, Arto Kaarna, Samuli Honkapuro
2021 Applied Energy  
ABSTRACT
Deep learning models have the potential to advance the short-term decision-making of electricity market participants and system operators by capturing the complex dependences and uncertainties of power system operation. Currently, however, the adoption of global deep learning models for multivariate energy forecasting in power systems lags far behind developments in the deep learning research field. In this context, the objectives of this study are to review recent ... in the field of probabilistic, multivariate, and multihorizon time series forecasting and to empirically evaluate the performance of novel global deep learning models for forecasting wind and solar generation, electricity load, and wholesale electricity price for intraday and day-ahead time horizons. Two forecast types, deterministic and probabilistic, are studied. The evaluation data consist of real-world datasets with hourly resolution at the levels of an individual customer and of regional and national electricity market bidding zones. The model evaluation criteria include achievable levels of forecasting accuracy and uncertainty risks, hyperparameter sensitivity, the effect of exogenous variables and fieldwise dataset split, and run-time efficiency factors such as memory utilization, simulation time, electricity consumption, and convergence rate. We conclude that the global models are most beneficial for intraday forecasts of heterogeneous datasets with nonuniform time series patterns, but that their performance can be affected by hyperparameter sensitivity and by hardware limitations as dataset dimensionality grows. The results can serve as a reference point for the quantitative evaluation of deep learning models for probabilistic multivariate energy forecasting in power systems.
There is a strong trend in academia and interest in industry toward the transition from deterministic forecasts to probabilistic methods with uncertainty quantification [2]. In contrast to a pointwise deterministic forecast, a probabilistic forecast provides a possible range of forecasting errors with their respective probabilities. Moreover, adopting a forecasting tool or product can reduce operating costs if it is appropriately applied to risk-constrained decision-making problems [3].
Risk management is especially important under volatile spot and reserve market prices, which are caused by the large uncertainties of varying electricity load and weather-dependent renewable generation from wind and solar radiation. Therefore, the quantification of uncertainty, as a vital part of risk management, is critical for truly optimal decision-making in power systems. Furthermore, a substantial body of research in the probabilistic energy forecasting literature focuses on local (univariate) forecasting problems [4], where marginal predictive densities are estimated per individual time series, assuming (conditional) independence of time series in high-dimensional settings. However, these local approaches exclude the important effects of complex temporal, spatial, and cross-lagged correlations in power systems [5]. Examples of such correlations are the dependence between successive lead times of the electricity market price, the dependence of renewable generation parameters (e.g., solar radiation, wind speed) between power plant locations, and the lagged effect of weather parameters on a load profile. Disregarding these dependences by using a marginal description in operational power system problems with multiple power plants or optimization time periods leads to suboptimal decisions, and hence is often an insufficient forecasting approach [6]. In contrast, the ability to extract and leverage time-invariant patterns by simultaneously considering several variables can potentially provide more accurate predictions and lower costs [7]. As a result, these factors have provoked an interest in probabilistic forecasting problems with multivariate predictive distributions that can leverage spatio-temporal and cross-lagged correlations with a single global (multivariate) model [8].
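The lagged effect of a weather parameter on load, mentioned above, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic temperature and load series, the 3-hour delay, and the helper names `pearson` and `lagged_correlation` are hypothetical and do not come from the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_correlation(driver, target, lag):
    """Correlation between driver[t - lag] and target[t] for lag >= 0."""
    if lag == 0:
        return pearson(driver, target)
    return pearson(driver[:-lag], target[lag:])


# Synthetic hourly data (hypothetical): load reacts to temperature 3 hours later.
rng = random.Random(0)
hours = 24 * 14
true_lag = 3
temperature = [
    10 + 5 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 0.5)
    for t in range(hours)
]
load = [
    50 - 2.0 * temperature[max(t - true_lag, 0)] + rng.gauss(0, 0.5)
    for t in range(hours)
]

# Scan candidate lags and keep the one with the strongest absolute correlation.
correlations = {lag: lagged_correlation(temperature, load, lag) for lag in range(7)}
best_lag = max(correlations, key=lambda k: abs(correlations[k]))
print(best_lag, round(correlations[best_lag], 3))
```

A local model fitted to the load series alone never sees this relationship, whereas a model that takes both series as input can in principle exploit it.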
The key challenges for accurate and efficient probabilistic multivariate forecasting include the following: (a) recognizing short-term and long-term dynamics and noise characteristics of individual time series; (b) discovering nonlinear covariate and latent relationships between the exogenous (i.e., field-independent series or outside influences) and endogenous (i.e., field-dependent) series; (c) sharply (i.e., with minimal variance) and reliably (i.e., with minimal bias) estimating the uncertainty of model predictions; (d) mitigating the effects of a varying time series scale; (e) making predictions under data sparsity and "cold start" conditions, i.e., for new variables or after system changes; and (f) scaling to a large number of time series [9] [10] [11]. Traditionally, statistical multivariate forecasting techniques such as vector autoregression (VAR) and vector autoregressive integrated moving average (VARIMA) [4], linear support vector regression (LSVR) [12], multivariate generalized autoregressive conditional heteroskedasticity (MGARCH) models [13], linear ridge (LRidge) regression [14], and Gaussian processes (GP) [15] have been used for such problems, but they have several limitations related to nonlinearity and scalability [16]. Recently, several novel global deep learning (DL) architectures for multivariate time series forecasting that are capable of tackling these challenges have been proposed: autoregressive recurrent networks (DeepAR) [9], deep factor models with random effects (DFM-RF) [10], long- and short-term time series network (LSTNet) [16], temporal pattern attention (TPA) [17], deep temporal convolutional network (DeepTCN) [18], and dual self-attention network (DSANet) [19], to name a few. These models have demonstrated superior accuracy.
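Challenge (c) can be made concrete with a small sketch that scores central prediction intervals by empirical coverage (a proxy for reliability) and mean width (a proxy for sharpness). The two metrics, the synthetic Gaussian data, and the function names `empirical_quantile` and `interval_scores` are illustrative assumptions, not the evaluation procedure of the study.

```python
import random


def empirical_quantile(samples, q):
    """Linear-interpolation quantile of a list of numbers, 0 <= q <= 1."""
    s = sorted(samples)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def interval_scores(sample_paths, observations, nominal=0.9):
    """Return (empirical coverage, mean width) of central prediction intervals.

    sample_paths[t] holds the forecast samples for time step t.
    """
    alpha = 1.0 - nominal
    hits, widths = 0, []
    for samples, y in zip(sample_paths, observations):
        lower = empirical_quantile(samples, alpha / 2)
        upper = empirical_quantile(samples, 1 - alpha / 2)
        hits += lower <= y <= upper
        widths.append(upper - lower)
    return hits / len(observations), sum(widths) / len(widths)


# Hypothetical data: observations are standard normal draws.
rng = random.Random(1)
steps, n_samples = 500, 200
observations = [rng.gauss(0.0, 1.0) for _ in range(steps)]
# One forecaster samples from the true spread, the other from a spread that is too narrow.
calibrated = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(steps)]
overconfident = [[rng.gauss(0.0, 0.3) for _ in range(n_samples)] for _ in range(steps)]

cov_cal, width_cal = interval_scores(calibrated, observations)
cov_over, width_over = interval_scores(overconfident, observations)
print(f"calibrated:    coverage={cov_cal:.2f}, width={width_cal:.2f}")
print(f"overconfident: coverage={cov_over:.2f}, width={width_over:.2f}")
```

The overconfident forecaster produces narrower intervals but covers far fewer observations than the nominal 90%, which is why sharpness is only meaningful when judged together with reliability.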
It is worth noting that any DL method initially designed solely for point forecasts can be extended to provide an estimate of the related uncertainty by applying a variational approximation through Monte Carlo (MC) dropout [20]. Furthermore, progress has been made in the explainability and interpretability of DL models, also in the context of time series [21] (e.g., through an attention mechanism's weights [22] or saliency maps [23] over time dimensions and features), which is gradually changing the general perception of them as fully "black box" models and bringing them closer to industry acceptance. These factors, along with the wealth of data continuously collected in power systems and the rapid increase in computing capabilities, make DL models a potentially promising method for probabilistic multivariate energy forecasting.
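The MC dropout idea can be sketched in a few lines: dropout is kept active at prediction time, the same input is passed through the network many times, and the spread of the outputs serves as an uncertainty estimate. The network below is a hypothetical one-hidden-layer model with random, untrained weights, and the function names are assumptions made for this illustration; it is not one of the models evaluated in the study.

```python
import math
import random
import statistics


def forward(x, w_hidden, w_out, drop_prob, rng):
    """One stochastic forward pass of a one-hidden-layer tanh network.

    Dropout stays active at prediction time, which is the core of MC dropout.
    """
    keep = 1.0 - drop_prob
    output = 0.0
    for w_h, w_o in zip(w_hidden, w_out):
        h = math.tanh(w_h * x)
        # Inverted dropout: keep the unit with probability `keep` and rescale it.
        if rng.random() < keep:
            output += w_o * h / keep
    return output


def mc_dropout_predict(x, w_hidden, w_out, drop_prob=0.2, passes=500, seed=0):
    """Mean and standard deviation over repeated stochastic forward passes."""
    rng = random.Random(seed)
    draws = [forward(x, w_hidden, w_out, drop_prob, rng) for _ in range(passes)]
    return statistics.mean(draws), statistics.stdev(draws)


# Hypothetical, untrained weights used only to make the sketch runnable.
weight_rng = random.Random(42)
w_hidden = [weight_rng.uniform(-1, 1) for _ in range(32)]
w_out = [weight_rng.uniform(-1, 1) for _ in range(32)]

mean, std = mc_dropout_predict(0.5, w_hidden, w_out)
# Reference output of the same network with dropout switched off.
deterministic = sum(w_o * math.tanh(w_h * 0.5) for w_h, w_o in zip(w_hidden, w_out))
print(f"MC dropout mean={mean:.3f}, std={std:.3f}, deterministic={deterministic:.3f}")
```

In a real model the weights would be trained with dropout enabled, and the dropout rate would affect how wide the resulting predictive distribution is.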
doi:10.1016/j.apenergy.2020.116405 fatcat:rtotx52vgzcqljij2lysllazcq