114:00 — Duality, upper bounds, and policy quality in Multistage Stochastic Programming

The standard SDDP algorithm produces lower estimates of the value functions at each stage, which in turn induce a policy. In the risk-neutral setting, the cost of such a policy can be estimated by Monte-Carlo, which also yields an upper bound for the optimal value of the problem. In the risk-averse case, due to the variable weighting of scenarios, such elementary estimates are not easily available. A natural alternative is to construct upper bounds for the value functions themselves. This can be performed either with inner approximations or through the dual problem, which both require an estimate of the Lipschitz constant of the value functions. We show that the dual method yields policies with guaranteed upper bounds on the risk-adjusted costs, similarly to policies induced by inner approximations. This contrasts to the policies constructed from lower bounds, which could have poor quality even for a good lower bound. We further show that an appropriate relaxation of the dual method is equivalent to the inner approximations, which provides an interesting link between the upper bound algorithms.

214:30 — Data-Driven Stochastic Dual Dynamic Programming: Performance Guarantees and Regularization Schemes

We propose a data-driven scheme for multistage stochastic linear programming with Markovian random parameters by extending the stochastic dual dynamic programming (SDDP) algorithm. In our data-driven setting, only a finite number of historical trajectories are available. The proposed SDDP scheme evaluates the cost-to-go functions only at the observed sample points, where the conditional expectations are estimated empirically using kernel regression. The scheme thus avoids the construction of scenario trees, which may incur exponential time complexity during the backward induction step. However, if the training data is sparse, the resulting SDDP algorithm exhibits a high optimistic bias that gives rise to poor out-of-sample performances. To mitigate the small sample effects, we adopt ideas from the distributionally robust optimization (DRO), which replaces the empirical conditional expectation in the cost-to-go function with a worst-case conditional expectation over a polyhedral ambiguity set. We derive the theoretical out-of-sample performance guarantee of the data-driven SDDP scheme and show that the dependence of its sample complexity on the number of time stages is merely polynomial. Finally, we validate the effectiveness of the proposed algorithm and demonstrate its superiority over existing data-driven schemes through extensive numerical experiments.

315:00 — Multistage stochastic inventory optimization with contracting

We consider a multistage stochastic production planning problem over a 12 month planning horizon faced by New Zealand's largest dairy products manufacturer. The sales price for each product follows a random process. In each month, the company decides a mix of products to make, what products to sell immediately, and what products to store for later sales. In the absence of contracting we present a separation result that can be used to determine a production plan independent of a sales/inventory policy. When the company arranges forward sales contracts, the optimal production and sales policies become coupled. We present some numerical results for policies computed using SDDP and compare them with policies derived from model predictive control.

415:30 — A Multicut Approach to Compute Upper Bounds for Risk-Averse SDDP

Stochastic Dual Dynamic Programming (SDDP) is a widely used and fundamental algorithm for solving multistage stochastic optimization problems. Although SDDP has been frequently applied to solve risk-averse models with the Conditional Value-at-Risk (CVaR), it is known that the estimation of upper bounds is a methodological challenge, and many methods are computationally intensive. In practice, this leaves most SDDP implementations without a practical and clear stopping criterion. In this paper, we propose using the information already contained in a multicut formulation of SDDP to solve this problem with a simple and computationally efficient methodology.

The multicut version of SDDP, in contrast with the typical average cut, preserves the information about which scenarios give rise to the worst costs, thus contributing to the CVaR value. We use this fact to modify the standard sampling method on the forward step so the average of multiple paths approximates the nested CVaR cost. We highlight that minimal changes are required in the SDDP algorithm and there is no additional computational burden for a fixed number of iterations.

We present multiple case studies to empirically demonstrate the effectiveness of the method. First, we use a small hydrothermal dispatch test case, in which we can write the deterministic equivalent of the entire scenario tree to show that the method perfectly computes the correct objective values. Then, we present results using a standard approximation of the Brazilian operation problem and a real hydrothermal dispatch case based on data from Colombia. Our numerical experiments showed that this method consistently calculates upper bounds higher than lower bounds for those risk-averse problems and that lower bounds are improved thanks to the better exploration of the scenarios tree.