ISMP 2024

1 — 08:30 — Reinforcement Learning for Portfolio Optimization in Model with Regime-Switching

David Saunders, University of Waterloo

We study a reinforcement learning approach to portfolio selection when the underlying asset price process follows a regime-switching diffusion model. We derive an analytical solution to the HJB equation for the exploratory mean-variance model under regime switching and discuss its practical implementation with a deep neural network. Empirical results on robustness to model misspecification will also be presented.

2 — 09:00 — Optimal trade execution with learning

Andrew Lim, National University of Singapore
Galvin Ng, Ecole Polytechnique

A trader would like to maximize expected revenue from selling an inventory of stocks by a specified terminal time when expected returns are linear in the trading rate but with unknown intercept (drift) and slope (price impact). Since the trading rate affects the data, this is intrinsically a problem of exploration and exploitation. To sidestep the challenges of solving the high-dimensional dynamic program, we consider a high-noise asymptotic regime and use a Taylor series expansion to extract the exploration component of the optimal policy. Exploration depends, in a complex but intuitive way, on the time horizon, inventory level, and parameter uncertainty.

3 — 09:30 — On Generalization and Regularization via Wasserstein Distributionally Robust Optimization

Qinyu Wu, University of Waterloo
Jonathan Li, University of Ottawa
Tiantian Mao, University of Science and Technology of China

Wasserstein distributionally robust optimization (DRO) has found success in operations research and machine learning applications as a powerful means to obtain solutions with favorable out-of-sample performance. Two compelling explanations for this success are the generalization bounds derived from Wasserstein DRO and the equivalency between Wasserstein DRO and the regularization scheme commonly applied in machine learning. Existing results on generalization bounds and the equivalency to regularization are largely limited to the setting where the Wasserstein ball is of a certain type and the decision criterion takes certain forms of an expected function. In this paper, we show that by focusing on Wasserstein DRO problems with affine decision rules, it is possible to obtain generalization bounds and the equivalency to regularization in a significantly broader setting where the Wasserstein ball can be of a general type and the decision criterion can be a general measure of risk, i.e., nonlinear in distributions. This not only accommodates important applications in classification, regression, and risk minimization, but also extends to general decision-theoretical frameworks previously unaddressed using Wasserstein DRO. Our results are strong in that the generalization bounds do not suffer from the curse of dimensionality and the equivalency to regularization is exact. As a byproduct, our regularization results considerably broaden the class of Wasserstein DRO models that can be solved efficiently via regularization formulations. These general assurances should serve as a strong basis for expanding the application of Wasserstein DRO across diverse domains of data-driven decision problems.
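As background for the regularization equivalence mentioned above, the best-known special case can be written out explicitly. The notation here is assumed for illustration and is not taken from the talk: $\widehat{P}_N$ is the empirical distribution of the data $\xi \in \mathbb{R}^d$, $\varepsilon$ is the radius of a type-1 Wasserstein ball with ground norm $\|\cdot\|$ (dual norm $\|\cdot\|_*$), $\ell$ is a convex Lipschitz loss with constant $\mathrm{Lip}(\ell)$, and $\beta$ is the coefficient vector of the affine decision rule. Under these assumptions, and with the support of $\xi$ unrestricted, the worst-case expected loss reduces to the empirical loss plus a norm penalty:

```latex
\sup_{Q \,:\, W_1(Q,\, \widehat{P}_N) \le \varepsilon}
  \mathbb{E}_{Q}\!\left[ \ell(\beta^\top \xi) \right]
= \mathbb{E}_{\widehat{P}_N}\!\left[ \ell(\beta^\top \xi) \right]
  + \varepsilon \, \mathrm{Lip}(\ell) \, \|\beta\|_*
```

The abstract describes results that go beyond this case, to general Wasserstein ball types and to decision criteria that are nonlinear in the distribution.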

4 — 10:00 — Model Aggregation for Risk Evaluation and Robust Optimization

Tiantian Mao, University of Science and Technology of China

We introduce a new approach for prudent risk evaluation based on stochastic dominance, which will be called the model aggregation (MA) approach. In contrast to the classic worst-case risk (WR) approach, the MA approach produces not only a robust value of risk evaluation but also a robust distributional model, independent of any specific risk measure. The MA risk evaluation can be computed through explicit formulas in the lattice theory of stochastic dominance, and under some standard assumptions, the MA robust optimization admits a convex-program reformulation. The MA approach for Wasserstein and mean-variance uncertainty sets admits explicit formulas for the obtained robust models. Via an equivalence property between the MA and the WR approaches, new axiomatic characterizations are obtained for the Value-at-Risk (VaR) and the Expected Shortfall (ES, also known as CVaR). The new approach is illustrated with various risk measures and examples from portfolio optimization.
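For readers less familiar with the two risk measures named above, the following is a minimal sketch of their standard empirical estimators. It is not the MA approach from the talk. The function name `empirical_var_es` and the convention that larger losses are worse are assumptions made for illustration. The Expected Shortfall is computed with the Rockafellar-Uryasev representation evaluated at the empirical VaR.

```python
import math


def empirical_var_es(losses, alpha):
    """Return (VaR, ES) at level alpha for a sample of losses.

    Hypothetical helper for illustration: losses are treated as
    equally likely, and larger values mean worse outcomes.
    """
    xs = sorted(float(x) for x in losses)
    n = len(xs)
    if n == 0 or not 0.0 < alpha < 1.0:
        raise ValueError("need a non-empty sample and 0 < alpha < 1")

    # VaR: smallest sample point whose empirical CDF is at least alpha.
    # The small tolerance guards against floating-point error in alpha * n.
    k = max(math.ceil(alpha * n - 1e-12) - 1, 0)
    var = xs[k]

    # ES via the Rockafellar-Uryasev form: VaR + E[(X - VaR)^+] / (1 - alpha).
    mean_excess = sum(max(x - var, 0.0) for x in xs) / n
    es = var + mean_excess / (1.0 - alpha)
    return var, es


if __name__ == "__main__":
    var, es = empirical_var_es([1, 2, 3, 4], 0.75)
    print(var, es)
```

With four equally likely losses and level 0.75, the VaR is the third-smallest loss and the ES is the average of the worst 25% of outcomes, which is the single largest loss.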

Schedule

Friday, July 26, 2024 08:30 - 10:30 — Parallel Session

FA247 — Latest Developments in Stochastic Optimization
