Impact Statement
We confront the challenge of designing efficient and robust algorithms for grid operation with the paradigm of learning-augmented algorithms, which leverage the excellent performance of machine learning while provably guaranteeing worst-case performance bounds and providing significant computational speedups over standard algorithms. Beyond the setting of generation dispatch, we propose that learning-augmented algorithms hold promise for safely utilizing the power of machine learning in a broader array of energy and sustainability-related control and optimization tasks.
1. Introduction
The need to reduce greenhouse gas emissions to mitigate the impacts of anthropogenic climate change is driving an energy transition characterized by large amounts of renewable generation resources being added to the grid. The variability of solar and wind energy and a relative scarcity of large-scale energy storage require the operation of dispatchable resources, such as fossil fuel generation, to balance out the fluctuations in renewable energy production and maintain reliable grid operation. However, conventional fossil fuel generators incur significant additional costs from the frequent cycling and ramping they must perform under high penetration of renewables, due both to decreased fuel efficiency and to the increased maintenance required when operating in this regime (Troy et al., 2010). Moreover, most dispatchable resources are limited in their ramp rate, and thus under high penetrations of renewables, such resources must be operated in a manner that anticipates system ramp needs, taking into account the high costs of frequent ramping while ensuring adequate supply to meet demand.
Operating generation optimally requires minimizing fuel costs while accounting for the intertemporal coupling of decisions, including both ramp costs and ramp limits. A natural approach for this problem is model predictive control (MPC), an algorithm that utilizes near-term forecasts of demand and other conditions to choose decisions that minimize aggregate cost over a fixed lookahead horizon (Morari & Lee, 1999). In addition to theoretical work confirming its good performance (e.g., Lin et al., 2021), MPC performs well in practice and has been deployed in a number of energy and sustainability-related domains, including control of wind turbines and solar photovoltaics (Lio et al., 2014; Sultana et al., 2017), smart buildings (Carli et al., 2020; Killian & Kozek, 2016), energy storage (Meng et al., 2015; Morstyn et al., 2018), and electric vehicle charging (Lee et al., 2018). Moreover, several regional power system operators in the US use MPC to settle the real-time electricity market (Ela & O’Malley, 2016), and it is widely understood that such lookahead algorithms will play an increasingly important role in enabling power systems to reliably absorb renewable energy volatility (Hua et al., 2019; Zhao et al., 2020).
However, MPC suffers from computational complexity that grows exponentially with the lookahead horizon if the system model or costs are nonconvex, as in the case of combined cycle generators (Bjelogrlic, 2000). In practice, it is thus typically necessary to use heuristic solvers that operate on a faster timescale but generally produce suboptimal decisions (Cao et al., 2017; Deng et al., 2015; Kelman et al., 2013). One promising avenue for overcoming the computational complexity of nonconvex MPC is the development of machine learning (ML) models that imitate its behavior, bypassing the need to solve an independent nonconvex optimization problem to generate each decision. This approach of “learning to control/optimize” has seen significant recent interest in the ML, control, and power systems communities (Fioretto et al., 2020; Kotary et al., 2021; Nellikkath & Chatzivasileiadis, 2022; Pan et al., 2022; Pon Kumar et al., 2018; Spielberg et al., 2017; Zhang et al., 2023). While several works develop approaches to guarantee constraint satisfaction for ML controllers (e.g., Zhang et al., 2023), none of these learning-based approaches comes with a priori guarantees on its cost under distribution shift or on out-of-sample problem instances, jeopardizing performance at deployment time. To counter this potential for poor performance and enable the reliable deployment of learned proxies for MPC in real-world settings, this work proposes an algorithm to robustify the behavior of such an ML proxy. We follow the paradigm of the emerging literature on learning-augmented algorithms (e.g., Lee et al., 2021; Lykouris & Vassilvitskii, 2018; Purohit et al., 2018), specifically building upon the line of work designing algorithms for online optimization with switching costs, a generalization of the dispatch problem, that can exploit the performance of an ML algorithm while providing worst-case guarantees on cost (Antoniadis et al., 2020; Christianson, Handina, et al., 2022; Christianson et al., 2023; Li et al., 2022; Rutten et al., 2023).
1.1. Contributions
Our contributions are twofold. First, we propose a learning-augmented algorithm RobustML for online optimization with switching costs that achieves the best deterministic performance bound in the setting of general nonconvex cost functions. Specifically, when provided with an ML algorithm for the problem as well as a heuristic baseline algorithm, for any desired $\varepsilon, \delta > 0$, our algorithm achieves a cost at most $\left(1 + \varepsilon + \delta\right)$ times the cost incurred by the ML algorithm while maintaining a worst-case cost bound of $\mathcal{O}\left(\frac{\mathrm{C}}{\delta} + \frac{D}{\varepsilon}\right)$, where $\mathrm{C}$ is the cost of the heuristic baseline and $D$ is the diameter of the decision space. In other words, RobustML achieves performance close to that of ML, yet robust relative to the chosen baseline algorithm. This is the best-known tradeoff for deterministic algorithms, as all prior deterministic learning-augmented algorithms with bounded worst-case robustness paid at least three times the cost of the ML algorithm (Antoniadis et al., 2020).

Second, we empirically evaluate the performance of RobustML on a realistic model of a combined cycle cogeneration power plant under increasing penetration of renewable energy. We find that an ML-based approach can improve computation time over MPC by three orders of magnitude, and while the performance of a pure ML approach suffers under distribution shift, our algorithm RobustML takes advantage of the good performance of ML in the nominal case (improving over the baseline by 10–15%) while preserving robustness to distribution shift. Moreover, RobustML can achieve better performance than both the pure ML approach and the baseline approach when renewable penetration increases.
Our work has the potential for both direct and downstream impact on the problem of climate change. Our results suggest that using RobustML for real-world grid operation could yield modest but tangible efficiency improvements, leading to reduced emissions. Moreover, RobustML’s robustness guarantees and use of lookahead could enable greater penetration of renewables while maintaining grid reliability. More generally, we see great promise in using learning-augmented algorithms like RobustML to achieve efficiency improvements without sacrificing robustness in other energy and sustainability-related domains where MPC is widely used (Carli et al., 2020; Killian & Kozek, 2016; Lee et al., 2018; Lio et al., 2014; Meng et al., 2015; Morstyn et al., 2018; Sultana et al., 2017).
2. Model and Preliminaries
We consider the problem of operating a combined cycle cogeneration power plant to meet both electricity and steam demand in the presence of exogenous variable renewable generation; see Figure 1 for a schematic. Specifically, we consider a plant with three gas turbines and a single steam turbine; we index the gas and steam turbines as $\left\{1, 2, 3\right\}$ and $\left\{4\right\}$, respectively. At each time $t \in \left\{1, \dots, T\right\}$, the plant operator observes an electricity demand (net of renewables) $d_t \in \mathbb{R}_{+}$ and a steam demand $q_t \in \mathbb{R}_{+}$. In response, they choose energy dispatch setpoints $p_t^{(i)} \in \left[\underline{p}_i, \overline{p}_i\right]$ and steam dispatch setpoints $s_t^{(i)} \in \left[\underline{s}_i, \overline{s}_i\right]$ for all the turbines $i = 1, \dots, 4$, where $\underline{p}_i, \overline{p}_i$ are the lower and upper bounds, respectively, for the energy dispatch, and $\underline{s}_i, \overline{s}_i$ are the lower and upper bounds, respectively, for the steam dispatch. Note that while the gas turbines $i = 1, 2, 3$ produce steam, and thus have positive steam dispatches, the steam turbine $i = 4$ consumes steam, so $s_t^{(4)} < 0$. The plant operator’s goal is to minimize the total cost of its dispatch decisions and its cost for ramping electricity generation while producing sufficient electricity and steam to meet demand. Formally, the plant operator faces the following constrained minimization problem:
$$\min_{\left\{\mathbf{p}_t, \mathbf{s}_t\right\}_{t=1}^{T}} \; \sum_{t=1}^{T} \left[ f_t\left(\mathbf{p}_t, \mathbf{s}_t\right) + \alpha \left\| \mathbf{p}_t - \mathbf{p}_{t-1} \right\| \right] \qquad (2.1)$$

$$\text{subject to} \quad \sum_{i=1}^{4} p_t^{(i)} \ge d_t \quad \text{for all } t, \qquad (2.2)$$

$$\phantom{\text{subject to}} \quad \sum_{i=1}^{4} s_t^{(i)} \ge q_t \quad \text{for all } t, \qquad (2.3)$$

$$\phantom{\text{subject to}} \quad p_t^{(i)} \in \left[\underline{p}_i, \overline{p}_i\right], \; s_t^{(i)} \in \left[\underline{s}_i, \overline{s}_i\right] \quad \text{for all } i, t,$$
where $\mathbf{p}_t = \left(p_t^{(i)}\right)_{i=1}^{4}$, $\mathbf{s}_t = \left(s_t^{(i)}\right)_{i=1}^{4}$, $f_t$ is a per-round fuel cost function that may depend on ambient conditions such as temperature, pressure, and humidity, and $\alpha$ is a parameter determining the extent to which ramping energy generation is penalized. Importantly, the demands $d_t, q_t$ and the cost functions $f_t$ are not all known in advance, so the plant operator cannot solve problem (2.1) all at once. Instead, the plant operator knows only the current timestep’s problem parameters $f_t, d_t$, and $q_t$ exactly, though they may have (possibly noisy) access to predictions of these parameters in a short lookahead window.
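To make the model concrete, the sketch below encodes one timestep’s cost and feasibility check implied by (2.1)–(2.3). It is a minimal illustration only: the bounds, the ramp weight `ALPHA`, and the use of the 1-norm for the ramp term are hypothetical stand-ins for plant-specific quantities, and `fuel_cost` is a placeholder for the fuel cost model $f_t$.

```python
import numpy as np

# Hypothetical per-turbine dispatch bounds; real values are plant-specific.
# Turbines 1-3 are gas turbines, turbine 4 is the steam turbine (s[3] < 0).
P_MIN, P_MAX = np.array([20.0, 20.0, 20.0, 10.0]), np.array([80.0, 80.0, 80.0, 50.0])
S_MIN, S_MAX = np.array([5.0, 5.0, 5.0, -60.0]), np.array([40.0, 40.0, 40.0, -5.0])
ALPHA = 10.0  # ramp-cost weight (hypothetical)

def step_cost(p, s, p_prev, fuel_cost):
    """One term of objective (2.1): fuel cost plus ramp penalty on energy dispatch."""
    return fuel_cost(p, s) + ALPHA * np.linalg.norm(p - p_prev, ord=1)

def feasible(p, s, d, q):
    """Capacity limits and the supply-demand constraints (2.2)-(2.3)."""
    return (
        np.all((P_MIN <= p) & (p <= P_MAX))
        and np.all((S_MIN <= s) & (s <= S_MAX))
        and p.sum() >= d      # energy balance (2.2)
        and s.sum() >= q      # steam balance (2.3); s[3] < 0 is steam consumed
    )
```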

Figure 1. Schematic of the cogeneration power plant model. The plant operator chooses how much electricity (yellow arrow) and steam (blue arrow) the three gas turbines (left cooling tower) produce, as well as how much steam is directed to the steam turbine (right cooling tower) to produce additional electricity. At each time $t$, ambient conditions (e.g., temperature) together with electricity and steam demand are represented by the vector $\boldsymbol{\theta}_t$, and electricity and steam dispatch decisions are represented by the vector $\mathbf{x}_t$.
The cogeneration dispatch problem we consider can be formulated as a specific instance of the more general problem of online optimization with switching costs (Chen et al., 2016). Abstractly, online optimization with switching costs can be considered as a game in which at each time $t \in \left\{1, \dots, T\right\}$, a decision-maker receives a vector $\boldsymbol{\theta}_t \in \mathbb{R}^n$ parametrizing a cost function $f\left(\cdot\,; \boldsymbol{\theta}_t\right) : \mathbb{R}^d \to \mathbb{R}_{+}$. In response, the decision-maker must choose a decision $\mathbf{x}_t \in \mathbb{R}^d$ and incurs the hitting cost $f\left(\mathbf{x}_t; \boldsymbol{\theta}_t\right)$ as well as the switching cost $\left\| \mathbf{x}_t - \mathbf{x}_{t-1} \right\|$ resulting from that decision, where $\left\| \cdot \right\|$ is some norm. We assume that the decision $\mathbf{x}_t$ does not impact future parameters $\boldsymbol{\theta}_\tau$ for $\tau > t$. In the context of the cogeneration dispatch application, $\boldsymbol{\theta}_t$ is a vector containing all ambient conditions such as temperature, pressure, and humidity, together with the electricity and steam demands $d_t, q_t$ at time $t$; $\mathbf{x}_t$ is a vector including the plant operator’s energy and steam dispatch decisions at time $t$; and the function $f$ maps ambient conditions and generator dispatches to a fuel cost while penalizing violation of constraints including capacity limits and supply–demand balance. The switching term $\left\| \mathbf{x}_t - \mathbf{x}_{t-1} \right\|$ represents the ramping cost. The problem is online, meaning that the decision-maker has access only to the parameters $\boldsymbol{\theta}_1, \dots, \boldsymbol{\theta}_t$ that have been revealed by time $t$ when making the decision $\mathbf{x}_t$. However, the decision-maker may have access to (possibly inaccurate) forecasts $\hat{\boldsymbol{\theta}}_{t+1 \mid t}, \dots, \hat{\boldsymbol{\theta}}_{t+w \mid t}$ of parameters within a lookahead window of length $w \in \mathbb{N}$, which can help them anticipate and reduce future switching costs. Such forecasts could be obtained using standard ML methods for predicting near-term weather or energy demand.
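As a minimal illustration of this online protocol (using the cost notation $\mathrm{C}_{\mathrm{ALG}}$ defined at the end of this section), the loop below plays an arbitrary decision rule against a revealed parameter sequence and totals its hitting and switching costs; `policy` and `hitting_cost` are hypothetical placeholders, and the 1-norm is assumed for the switching cost.

```python
import numpy as np

def total_cost(policy, hitting_cost, thetas, x0):
    """Play the online game: at time t the policy sees only theta_1..theta_t
    and returns x_t; the return value is C_ALG(1, T), the sum of hitting
    costs f(x_t; theta_t) and switching costs ||x_t - x_{t-1}||."""
    x_prev, cost = np.asarray(x0, dtype=float), 0.0
    for t, theta in enumerate(thetas):
        x = policy(thetas[: t + 1], x_prev)        # online: no future parameters
        cost += hitting_cost(x, theta)             # hitting cost
        cost += np.linalg.norm(x - x_prev, ord=1)  # switching (ramping) cost
        x_prev = x
    return cost
```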
We consider two standard algorithms for the problem of online optimization with switching costs. The first, Greedy, is a myopic algorithm that simply chooses the decision $\mathbf{x}_t$ that minimizes $f\left(\cdot\,; \boldsymbol{\theta}_t\right)$ at each time $t$. This algorithm has worst-case cost guarantees under mild assumptions on the structure of the cost function $f$ (Zhang et al., 2021), and it resembles the single-stage dispatch algorithm widely used by power system operators. Its behavior is characterized formally as follows:
$$\mathrm{GREEDY}\left(\boldsymbol{\theta}_t\right) := \operatorname*{arg\,min}_{\mathbf{x} \in \mathbb{R}^d} f\left(\mathbf{x}; \boldsymbol{\theta}_t\right).$$
That is, Greedy can be viewed as a function that, when provided with parameter vector $\boldsymbol{\theta}_t \in \mathbb{R}^n$, returns the minimizer of $f\left(\cdot\,; \boldsymbol{\theta}_t\right)$ as a dispatch decision. Note that, since we assume that $f\left(\cdot\,; \boldsymbol{\theta}_t\right)$ penalizes all constraints from the original cogeneration problem formulation, Greedy will necessarily satisfy these constraints.
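A sketch of Greedy under these assumptions is below; it uses SciPy’s SLSQP solver (the solver the experiments in Section 4 use for Greedy and MPC), warm-started at the previous decision, with `f` a placeholder for the penalized hitting cost.

```python
import numpy as np
from scipy.optimize import minimize

def greedy(theta, x_prev, f):
    """Myopic dispatch: minimize only the current hitting cost f(x; theta).
    The warm start at x_prev helps the local solver converge but does not
    change the objective, which ignores switching costs entirely."""
    res = minimize(lambda x: f(x, theta),
                   x0=np.asarray(x_prev, dtype=float),
                   method="SLSQP")
    return res.x
```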
The second algorithm we consider is model predictive control (MPC). MPC solves a lookahead optimization problem using near-term predictions of parameters to choose a decision. Formally, at time $t$, given the perfectly known current parameter vector $\boldsymbol{\theta}_t$ and forecasts $\hat{\boldsymbol{\theta}}_{t+1 \mid t}, \dots, \hat{\boldsymbol{\theta}}_{t+w \mid t}$ of parameters over the next $w$ timesteps, MPC chooses its decision as the optimal solution of the following optimization problem:
$$\operatorname*{arg\,min}_{\mathbf{x}, \mathbf{y}_1, \dots, \mathbf{y}_w \in \mathbb{R}^d} \; f\left(\mathbf{x}; \boldsymbol{\theta}_t\right) + \left\| \mathbf{x} - \mathbf{x}_{t-1} \right\| + \sum_{\tau = 1}^{w} \left[ f\left(\mathbf{y}_\tau; \hat{\boldsymbol{\theta}}_{t+\tau \mid t}\right) + \left\| \mathbf{y}_\tau - \mathbf{y}_{\tau - 1} \right\| \right], \qquad (2.4)$$
where $\mathbf{y}_0 := \mathbf{x}$ and we have concatenated the parameter vectors $\left(\boldsymbol{\theta}_t, \hat{\boldsymbol{\theta}}_{t+1 \mid t}, \dots, \hat{\boldsymbol{\theta}}_{t+w \mid t}\right)$ into a single entity $\Theta \in \left(\mathbb{R}^n\right)^{w+1}$ for brevity. Note that only the minimizer $\mathbf{x}$ corresponding to the decision made for time $t$ is binding, i.e., the chosen decision $\mathbf{x}_t$ is simply the optimal $\mathbf{x}$ in the above optimization; all of the other variables $\mathbf{y}_1, \dots, \mathbf{y}_w$ are discarded after the solution is obtained. Note that, similar to Greedy, MPC produces decisions that satisfy the constraints of the original cogeneration problem.
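The following sketch mirrors this receding-horizon scheme under the same assumptions as the Greedy sketch above: it solves (a local approximation of) problem (2.4) over the stacked variables $\left(\mathbf{x}, \mathbf{y}_1, \dots, \mathbf{y}_w\right)$ and keeps only the first block.

```python
import numpy as np
from scipy.optimize import minimize

def mpc(Theta, x_prev, f, w, dim):
    """Solve the w-step lookahead problem (2.4) and return only the binding
    first decision x; the trailing y_1..y_w are discarded. Theta stacks the
    current parameters theta_t and the w forecasts, one per row."""
    x_prev = np.asarray(x_prev, dtype=float)

    def lookahead_cost(z):
        xs = z.reshape(w + 1, dim)  # rows are [x, y_1, ..., y_w]
        prev, cost = x_prev, 0.0
        for tau in range(w + 1):
            cost += f(xs[tau], Theta[tau])                 # hitting cost
            cost += np.linalg.norm(xs[tau] - prev, ord=1)  # switching cost
            prev = xs[tau]
        return cost

    z0 = np.tile(x_prev, w + 1)  # warm start: hold the previous decision
    res = minimize(lookahead_cost, z0, method="SLSQP")
    return res.x.reshape(w + 1, dim)[0]
```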
As previously discussed, MPC can be computationally prohibitive if $f\left(\cdot\,; \boldsymbol{\theta}\right)$ is nonconvex. However, there has been a great deal of recent work on learning proxies for power systems-related constrained optimization problems that operate more quickly than traditional optimization solvers (Donti et al., 2020; Park & Van Hentenryck, 2023; Zhang et al., 2023). Following this approach, in our work, we train an ML model to approximate the input–output behavior of MPC in an unsupervised fashion. That is, given a dataset $\mathcal{D} = \left\{\Theta_i\right\}_{i=1}^{N}$ of cost function parameters, we train a neural network $\mathrm{ML} : \left(\mathbb{R}^n\right)^{w+1} \to \left(\mathbb{R}^d\right)^{w+1}$ to minimize MPC’s lookahead objective (2.4) on average over this dataset. We employ the method of Zhang et al. (2023) and include a gauge map in the output layer of the network to ensure that the network always produces decisions that satisfy the capacity limits and supply–demand balance constraints for both energy (2.2) and steam (2.3).
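A minimal sketch of this unsupervised training loop is below, assuming PyTorch, a generic feedforward architecture, and a differentiable `lookahead_cost` implementing objective (2.4); the gauge-map output layer of Zhang et al. (2023), which enforces constraints (2.2)–(2.3), is elided and marked where it would appear.

```python
import torch
import torch.nn as nn

def train_proxy(lookahead_cost, dataset, n, d, w, epochs=50, lr=1e-3):
    """Unsupervised imitation of MPC: the network maps stacked parameters
    Theta in R^{(w+1)n} to decisions in R^{(w+1)d} and is trained to
    minimize the lookahead objective (2.4) on average, with no labels."""
    net = nn.Sequential(
        nn.Linear((w + 1) * n, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, (w + 1) * d),  # a gauge-map layer would go here
    )
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        for Theta in dataset:  # Theta: tensor of shape ((w + 1) * n,)
            decisions = net(Theta).reshape(w + 1, d)
            loss = lookahead_cost(decisions, Theta)  # differentiable cost
            opt.zero_grad()
            loss.backward()
            opt.step()
    return net
```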
While this unsupervised training approach may yield low empirical error on the training set, and the gauge map ensures constraint satisfaction, neither property guarantees that the ML algorithm will perform well on out-of-sample instances or under distribution shift. In the next section, we introduce an algorithm that addresses this challenge and provides provable cost guarantees for the learned proxy ML.
We conclude this section with some notation. For an algorithm Alg producing decisions $\mathbf{x}_1, \dots, \mathbf{x}_T$, define $\mathrm{ALG}_t := \mathbf{x}_t$ as Alg’s decision at time $t$, and define $\mathrm{C}_{\mathrm{ALG}}\left(s, t\right) := \sum_{\tau = s}^{t} \left[ f\left(\mathbf{x}_\tau; \boldsymbol{\theta}_\tau\right) + \left\| \mathbf{x}_\tau - \mathbf{x}_{\tau - 1} \right\| \right]$ as Alg’s cost from time $s$ through $t$. For brevity, we write the total cost as $\mathrm{C}_{\mathrm{ALG}} := \mathrm{C}_{\mathrm{ALG}}\left(1, T\right)$.
3. Algorithm
We propose in Algorithm 1 a novel algorithm, RobustML, that “robustifies” the algorithm ML, giving a sequence of dispatch decisions with cost that can be bounded in terms of the costs incurred by both ML and Greedy. RobustML behaves as follows: it begins by following ML’s decisions, but if ML surpasses a cost threshold and Greedy is performing relatively well, RobustML will switch to following Greedy’s decisions (line 8). However, if Greedy begins to perform worse relative to ML, then RobustML will switch back to following ML (line 15). The specific thresholds for switching are determined by the parameters $\varepsilon, \delta > 0$, which reflect the algorithm’s confidence in ML. That is, when $\varepsilon$ and $\delta$ are small (close to $0$), RobustML will spend more time following the decisions of ML; on the other hand, when $\varepsilon$ and $\delta$ are large, RobustML will spend more of its time following the decisions of Greedy. Tuning $\varepsilon$ and $\delta$ thus enables the plant operator to choose the particular tradeoff they wish to achieve between exploiting the (typically excellent) performance of ML and obtaining worst-case performance guarantees comparable to those of Greedy, even in cases of distribution shift.
Algorithm 1: The algorithm $\mathrm{RobustML}\left(\varepsilon, \delta\right)$. Note that all algorithms are assumed to begin in the same initial state $\mathbf{x}_0$, so $\mathbf{a}_0 = \mathbf{r}_0 = \mathbf{x}_0$ (where $\mathbf{a}_0 = \mathrm{ML}_0$ and $\mathbf{r}_0 = \mathrm{GREEDY}_0$).
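Since the pseudocode of Algorithm 1 is not reproduced here, the sketch below reconstructs its switching logic from the description above and from the conditions cited in the proof of Theorem 3.1 (lines 6, 8, 12, and 14–15 of Algorithm 1); the exact ordering of the checks is our assumption. Both reference trajectories are simulated in parallel, so each jump between them incurs a switching cost of at most the diameter $D$.

```python
import numpy as np

def robust_ml(ml_policy, greedy_policy, hitting_cost, thetas, x0, D, eps, delta):
    """Hedged reconstruction of RobustML(eps, delta). Follow ML until the
    current ML phase's cost C_ML(s, t) reaches 2D/eps while Greedy is doing
    relatively well (C_GREEDY < delta * C_ML); then follow Greedy until
    C_GREEDY >= delta * C_ML, and switch back."""
    a_prev = r_prev = np.asarray(x0, dtype=float)
    c_ml = c_gr = phase_ml = 0.0  # cumulative costs and current ML-phase cost
    following_ml = True
    decisions = []
    for t, theta in enumerate(thetas):
        a = ml_policy(thetas[: t + 1], a_prev)      # ML's decision a_t
        r = greedy_policy(thetas[: t + 1], r_prev)  # Greedy's decision r_t
        step_ml = hitting_cost(a, theta) + np.linalg.norm(a - a_prev, ord=1)
        step_gr = hitting_cost(r, theta) + np.linalg.norm(r - r_prev, ord=1)
        c_ml, c_gr, phase_ml = c_ml + step_ml, c_gr + step_gr, phase_ml + step_ml
        if following_ml and phase_ml >= 2 * D / eps and c_gr < delta * c_ml:
            following_ml = False                 # switch ML -> Greedy (line 8)
        elif not following_ml and c_gr >= delta * c_ml:
            following_ml, phase_ml = True, 0.0   # switch Greedy -> ML (line 15)
        decisions.append(a if following_ml else r)
        a_prev, r_prev = a, r
    return decisions
```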
Our main analytic result on the performance of RobustML is the following bound on its cost. Note that we assume that the decision space has diameter $D$, so $\left\| \mathrm{ML}_t - \mathrm{GREEDY}_t \right\| \le D$ for all $t \in \left\{1, \dots, T\right\}$.
Theorem 3.1. The algorithm RobustML (Algorithm 1) achieves cost bounded as
$$\mathrm{C}_{\mathrm{ROBUSTML}} \le \min\left\{ \left(1 + \varepsilon + \delta\right) \mathrm{C}_{\mathrm{ML}}, \; \left(1 + \frac{1 + \varepsilon}{\delta}\right) \mathrm{C}_{\mathrm{GREEDY}} + \left(1 + \frac{2}{\varepsilon}\right) D \right\}.$$
We present a proof of Theorem 3.1 in Section 6. In particular, the theorem tells us that by selecting $\varepsilon, \delta$ arbitrarily small, RobustML can achieve performance arbitrarily close to that achieved by ML, at the cost of possibly worse performance relative to Greedy. However, by selecting moderate $\varepsilon, \delta$, it is possible to trade off the exploitation of ML against robustness with respect to Greedy’s performance. For instance, taking $\varepsilon = \delta = 0.1$ guarantees cost at most $1.2\, \mathrm{C}_{\mathrm{ML}}$, but only a weak worst-case bound of $12\, \mathrm{C}_{\mathrm{GREEDY}} + 21 D$. Moreover, rather than selecting fixed $\varepsilon, \delta$ a priori to achieve a particular, fixed tradeoff, it is possible to apply techniques from online learning to optimally tune these parameters in an adaptive fashion (see, e.g., Khodak et al., 2022; Li et al., 2022; Zeynali et al., 2021).
It is also important to note that the above result does not depend in any crucial way on the choice of the algorithm Greedy. That is, the plant operator can choose any strategy as their baseline “robust” algorithm, including heuristic methods that are currently employed in practice, and the performance bound given by Theorem 3.1 on Algorithm 1 will hold, with $\mathrm{C}_{\mathrm{GREEDY}}$ replaced by the cost of the operator’s chosen strategy. Thus, Algorithm 1 gives a method for power plant operators to take advantage of machine learning’s good performance while maintaining worst-case guarantees relative to their existing strategies, whatever they may be.
4. Experimental Results and Discussion
We investigate the performance of our algorithm for cogeneration dispatch on a modified form of the CogenEnv environment from SustainGym, a reinforcement learning benchmark suite (Yeh et al., 2023); all of the code, models, and data employed in conducting the experiments can be found at https://doi.org/10.5281/zenodo.13623809. The cogeneration plant’s behavior is modeled in a black-box fashion via neural networks, and thus the fuel cost is differentiable as a function of the dispatch decisions. We train an ML model to replicate the behavior of MPC with lookahead $w = 6$ in an unsupervised fashion as described in the previous section. The model is trained on a dataset of 100 days of ambient conditions (temperature, pressure, humidity) and municipal demands for energy and steam on a 15-minute basis. To solve for the Greedy and MPC decisions, we use sequential least-squares programming (SLSQP) in SciPy (Virtanen et al., 2020).
We begin by comparing the runtimes of Greedy, MPC, and ML to produce a day’s worth of dispatch decisions on an AWS EC2 instance with 4 virtual CPU cores and an NVIDIA T4 GPU (Figure 2). We find that while Greedy can produce a day of dispatches in about 30 seconds, MPC with lookahead $w = 6$ takes nearly 100 times as long ($\sim$45 minutes). On the other hand, ML is three orders of magnitude faster than MPC, returning a day’s worth of decisions in under two seconds on average. This is unsurprising: while Greedy and MPC require solving optimization problems to produce dispatch decisions, ML requires only a neural network’s forward pass.

Figure 2. Number of seconds (mean and standard deviation) for each algorithm to produce a day’s worth of dispatch decisions.
We next examine the performance of ML, Greedy, and RobustML on the baseline system (no renewables) when the lookahead predictions are unreliable, as in the case of distribution shift. That is, we compare the setting of perfect predictions (i.e., $\hat{\boldsymbol{\theta}}_{t+1 \mid t} = \boldsymbol{\theta}_{t+1}, \dots, \hat{\boldsymbol{\theta}}_{t+w \mid t} = \boldsymbol{\theta}_{t+w}$) to settings with i.i.d. Gaussian noise of increasing standard deviation $\sigma$ on the predictions (i.e., $\hat{\boldsymbol{\theta}}_{t+1 \mid t} = \boldsymbol{\theta}_{t+1} + \mathbf{z}_{t+1 \mid t}, \dots, \hat{\boldsymbol{\theta}}_{t+w \mid t} = \boldsymbol{\theta}_{t+w} + \mathbf{z}_{t+w \mid t}$, with $\mathbf{z}_{\tau \mid t} \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}\left(\mathbf{0}, \sigma \mathbf{I}\right)$). We show the results in Figure 3 for several choices of the RobustML parameters $\varepsilon, \delta$. In particular, we observe that while ML performs better than Greedy when predictions are good ($\sigma \le 30$), its performance degrades significantly as the noise grows ($\sigma \to 100$). Nonetheless, for the larger choices of $\varepsilon$ and $\delta$ (specifically, $\varepsilon = \delta = 0.7$ and $\varepsilon = \delta = 1.0$), RobustML gracefully transitions from the good performance of ML for small $\sigma$ to nearly matching or beating the performance of Greedy in the large-$\sigma$ regime. Thus, even though the quality of predictions is unknown a priori, RobustML exploits the good performance of ML when predictions are good while preserving robustness in the worst case. Notably, however, this is not the case when $\varepsilon = \delta$ is small, as can be seen in the top row of Figure 3; in this case, RobustML stays too close to the ML algorithm, and its performance suffers as a result.
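For reference, a minimal sketch of the perturbation used to generate the noisy forecasts in this experiment, assuming the true future parameters are stacked row-wise in an array:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility

def noisy_forecasts(theta_future, sigma):
    """Add i.i.d. Gaussian noise with standard deviation sigma to the true
    parameters theta_{t+1}, ..., theta_{t+w} (rows of theta_future)."""
    return theta_future + rng.normal(0.0, sigma, size=theta_future.shape)
```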

Figure 3. Cost of each algorithm under increasing noise $\sigma$ on the lookahead predictions, normalized by Greedy’s cost. Curves indicate mean cost and shaded regions cover $\pm$ one standard deviation. Several choices of parameters are shown for RobustML: $\varepsilon = \delta = 0.1$ (top left), $\varepsilon = \delta = 0.4$ (top right), $\varepsilon = \delta = 0.7$ (bottom left), and $\varepsilon = \delta = 1.0$ (bottom right).
We further examine the performance of the algorithms under increasing penetration of wind energy, displaying the results in Figure 4. We find that the efficiency improvement of ML over Greedy widens for moderate wind penetration (up to 200 MW), highlighting the value of using lookahead alongside ML to increase efficiency when renewable generation leads to more frequent ramp events. Notably, however, ML’s performance gets closer to that of Greedy when wind penetration nears 400 MW, possibly reflecting the challenge of adequately anticipating future ramp needs in a high-renewables environment even when equipped with moderate lookahead. Nonetheless, we find that when $\varepsilon = \delta = 1.0$, RobustML yields uniformly improved performance over both ML and Greedy and thus enables better efficiency even in challenging high-renewables scenarios. While surprising, this improvement highlights that adaptive switching between the ML and Greedy algorithms enables RobustML to exploit each algorithm when it performs well, leading to better average performance than either algorithm alone.

Figure 4. Cost of each algorithm under increasing wind penetration, normalized by Greedy’s cost. Curves indicate mean cost and shaded regions cover $\pm$ one standard deviation. Several choices of parameters are shown for RobustML: $\varepsilon = \delta = 0.1$ (top left), $\varepsilon = \delta = 0.4$ (top right), $\varepsilon = \delta = 0.7$ (bottom left), and $\varepsilon = \delta = 1.0$ (bottom right).
5. Conclusions
In this work, we proposed a learning-augmented algorithm, RobustML, for the operation of cogeneration plants with ramp costs, finding that it effectively balances exploitation of the good performance of machine-learned algorithms with worst-case robustness in cases of distribution shift. In particular, RobustML allows for taking advantage of lookahead predictions without suffering the computational complexity of MPC, and depending on how its parameters are tuned, it can yield performance that uniformly improves on both a pure ML approach and existing baseline algorithms used in practice.
Several interesting avenues remain open for future work. First, it would be interesting to apply RobustML to larger energy systems with more generators to examine how its performance gains scale. Second, while RobustML accounts for ramping costs in its performance guarantee, it does not ensure satisfaction of strict ramp limits, which constrain the operation of many dispatchable generators in the real world. Extending the algorithm to allow for such strict limits would thus be a valuable step toward real-world deployment. Finally, while in this work we trained the ML model to mimic MPC, there are numerous variants of MPC explicitly designed for greater robustness, e.g., Bemporad and Morari (1999), Christianson, Werner, et al. (2022), Langson et al. (2004), and Schwarm and Nikolaou (1999). It would be interesting to examine how such robustified MPC variants perform when used as input to a learning-augmented algorithm such as RobustML, and whether they can improve both average and worst-case performance.
6. Proof of Theorem 3.1
We begin by showing $\mathrm{C}_{\mathrm{ROBUSTML}} \le \left(1 + \varepsilon + \delta\right) \mathrm{C}_{\mathrm{ML}}$. Note that the algorithm consists of phases in which RobustML first coincides with ML, then switches to following Greedy, before switching back to ML, and so on. We will assume that RobustML ends the instance coinciding with ML, so $\mathbf{x}_T = \mathbf{a}_T$; the case in which RobustML ends at $\mathbf{r}_T$ is similar. Let $t_i$ denote the timestep in which RobustML switches from Greedy back to ML for the $i$th time, with $t_0 := 1$ since RobustML always begins by following ML. Similarly, let $m_i$ denote the timestep in which RobustML switches from ML to Greedy for the $i$th time. Clearly, we have $1 = t_0 < m_1 < t_1 < \cdots < m_k < t_k \le T$ for some $k \in \mathbb{N}$. Even though RobustML ends at ML, define $m_{k+1} := T + 1$ for notational simplicity. Then the cost of RobustML may be written as
$$
\begin{aligned}
\mathrm{C}_{\mathrm{ROBUSTML}} ={}& \sum_{i=1}^{k+1} \mathrm{C}_{\mathrm{ML}}\left(t_{i-1}, m_i - 1\right) + \sum_{i=1}^{k} \mathrm{C}_{\mathrm{GREEDY}}\left(m_i, t_i - 1\right) \\
& + \sum_{i=1}^{k} \left( \left\| \mathbf{r}_{m_i} - \mathbf{a}_{m_i - 1} \right\| - \left\| \mathbf{r}_{m_i} - \mathbf{r}_{m_i - 1} \right\| + \left\| \mathbf{a}_{t_i} - \mathbf{r}_{t_i - 1} \right\| - \left\| \mathbf{a}_{t_i} - \mathbf{a}_{t_i - 1} \right\| \right) \\
\le{}& \sum_{i=1}^{k+1} \mathrm{C}_{\mathrm{ML}}\left(t_{i-1}, m_i - 1\right) + \sum_{i=1}^{k} \mathrm{C}_{\mathrm{GREEDY}}\left(m_i, t_i - 1\right) + \sum_{i=1}^{k} \left( \left\| \mathbf{r}_{m_i - 1} - \mathbf{a}_{m_i - 1} \right\| + \left\| \mathbf{a}_{t_i - 1} - \mathbf{r}_{t_i - 1} \right\| \right) && (6.1) \\
\le{}& \sum_{i=1}^{k+1} \mathrm{C}_{\mathrm{ML}}\left(t_{i-1}, m_i - 1\right) + \sum_{i=1}^{k} \mathrm{C}_{\mathrm{GREEDY}}\left(m_i, t_i - 1\right) + 2kD && (6.2) \\
\le{}& \left(1 + \varepsilon\right) \mathrm{C}_{\mathrm{ML}} + \sum_{i=1}^{k} \mathrm{C}_{\mathrm{GREEDY}}\left(m_i, t_i - 1\right) && (6.3) \\
\le{}& \left(1 + \varepsilon + \delta\right) \mathrm{C}_{\mathrm{ML}}, && (6.4)
\end{aligned}
$$
where (6.1) follows from the triangle inequality applied to $\left\| \mathbf{r}_{m_i} - \mathbf{a}_{m_i - 1} \right\|$ and $\left\| \mathbf{a}_{t_i} - \mathbf{r}_{t_i - 1} \right\|$, and (6.2) follows by the diameter bound. The inequality (6.3) follows by line 6 of the algorithm, which states that the algorithm will switch from following ML to following Greedy at time $t$ only if $\mathrm{C}_{\mathrm{ML}}\left(s, t\right) \ge \frac{2D}{\varepsilon}$. Noting that at the start of any timestep $t$, $s$ is exactly
$$s = 1 + \max\left\{ m_i : 0 \le i \le k, \; m_i < t \right\}$$
(with $m_0 := 0$ for notational convenience), it follows that for each $i \in \left[k\right]$, $\mathrm{C}_{\mathrm{ML}}\left(m_{i-1} + 1, m_i\right) \ge \frac{2D}{\varepsilon}$. Thus
$$2kD \le \varepsilon \sum_{i=1}^{k} \mathrm{C}_{\mathrm{ML}}\left(m_{i-1} + 1, m_i\right) \le \varepsilon\, \mathrm{C}_{\mathrm{ML}}.$$
Finally, (6.4) follows from
$$\sum_{i=1}^{k} \mathrm{C}_{\mathrm{GREEDY}}\left(m_i, t_i - 1\right) \le \mathrm{C}_{\mathrm{GREEDY}}\left(1, t_k - 1\right) < \delta\, \mathrm{C}_{\mathrm{ML}}\left(1, t_k - 1\right) \le \delta\, \mathrm{C}_{\mathrm{ML}},$$
since by definition, $\mathbf{x}_{t_k - 1} = \mathbf{r}_{t_k - 1}$, which by line 12 of the algorithm means that $\mathrm{C}_{\mathrm{GREEDY}}\left(1, t_k - 1\right) < \delta \cdot \mathrm{C}_{\mathrm{ML}}\left(1, t_k - 1\right)$. Thus, we have proved the desired bound $\mathrm{C}_{\mathrm{ROBUSTML}} \le \left(1 + \varepsilon + \delta\right) \mathrm{C}_{\mathrm{ML}}$.
We now turn to showing $\mathrm{C}_{\mathrm{ROBUSTML}} \le \left(1 + \frac{1 + \varepsilon}{\delta}\right) \mathrm{C}_{\mathrm{GREEDY}} + \left(1 + \frac{2}{\varepsilon}\right) D$. First, suppose RobustML finishes the instance coinciding with ML, so $\mathbf{x}_T = \mathbf{a}_T$. Let $\tau \in \left\{0, \dots, T - 1\right\}$ denote the last time at which RobustML coincided with Greedy, i.e., the largest $\tau$ such that $\mathbf{x}_\tau = \mathbf{r}_\tau$. Then the cost can be bounded as
$$
\begin{aligned}
\mathrm{C}_{\mathrm{ROBUSTML}} &\le \mathrm{C}_{\mathrm{ROBUSTML}}\left(1, \tau\right) + D + \mathrm{C}_{\mathrm{ML}}\left(\tau + 1, T\right) \\
&\le \left(1 + \varepsilon + \delta\right) \mathrm{C}_{\mathrm{ML}}\left(1, \tau\right) + D + \mathrm{C}_{\mathrm{ML}}\left(\tau + 1, T\right) && (6.5) \\
&\le \frac{1 + \varepsilon + \delta}{\delta}\, \mathrm{C}_{\mathrm{GREEDY}} + D + \frac{2D}{\varepsilon} && (6.6) \\
&= \left(1 + \frac{1 + \varepsilon}{\delta}\right) \mathrm{C}_{\mathrm{GREEDY}} + \left(1 + \frac{2}{\varepsilon}\right) D, && (6.7)
\end{aligned}
$$
where (6.5) follows via the previously proved inequality $\mathrm{C}_{\mathrm{ROBUSTML}} \le \left(1 + \varepsilon + \delta\right) \mathrm{C}_{\mathrm{ML}}$, and (6.6) follows by the fact (according to line 14 of the algorithm) that RobustML switching from Greedy to ML at time $\tau + 1$ means that $\mathrm{C}_{\mathrm{GREEDY}} \ge \delta \cdot \mathrm{C}_{\mathrm{ML}}\left(1, \tau + 1\right)$, as well as from the following observation: since RobustML coincides with ML between times $\tau + 1$ and $T$, line 6 of the algorithm tells us that either $\mathrm{C}_{\mathrm{ML}}\left(\tau + 2, T\right) < \frac{2D}{\varepsilon}$ or $\mathrm{C}_{\mathrm{GREEDY}} \ge \delta \cdot \mathrm{C}_{\mathrm{ML}}$.
Finally, suppose RobustML finishes the instance coinciding with Greedy, so $\mathbf{x}_T = \mathbf{r}_T$. Let $\sigma \in \left\{0, \dots, T - 1\right\}$ denote the last time at which RobustML coincided with ML, i.e., the largest $\sigma$ such that $\mathbf{x}_\sigma = \mathbf{a}_\sigma$. By the previous case’s inequality (6.7), we have
$$\mathrm{C}_{\mathrm{ROBUSTML}} \le \left(1 + \frac{1 + \varepsilon}{\delta}\right) \mathrm{C}_{\mathrm{GREEDY}} + \left(1 + \frac{2}{\varepsilon}\right) D,$$

completing the proof of Theorem 3.1.
Acknowledgments
The authors thank James Chen for helpful discussions.
Author contribution
Conceptualization: all authors; Methodology: N.C., C.Y.; Data curation: N.C., M.H., M.T.R., A.G.; Formal analysis: N.C.; Data visualization: N.C.; Writing—original draft: N.C.; Writing—review and editing: all authors.
Competing interest
During the completion of this work, M.H., M.T.R., and A.G. are or were employed at Beyond Limits, which funded a portion of this work.
Data availability statement
The code, models, and data used in conducting the experiments presented in this work are publicly available at https://doi.org/10.5281/zenodo.13623809.
Ethical standard
The research meets all ethical guidelines, including adherence to the legal requirements of the study country.
Funding statement
This research was supported by NSF grants CNS-2146814, CPS-2136197, CNS-2106403, and NGSDI-2105648, Amazon AWS, the Caltech Resnick Sustainability Institute, and Beyond Limits. N.C. was supported by an NSF Graduate Research Fellowship (DGE-1745301). Tongxin Li was supported by the start-up funding UDF01002773 of CUHK-Shenzhen. The NSF, Amazon AWS, and the Caltech Resnick Sustainability Institute had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Beyond Limits had a role in the study design, data collection, and preparation of the final manuscript.