1. 14:00 — Optimal hedging of the interest rate swap book

Using an optimization model, interest rate curves are estimated with increased accuracy from Overnight Index Swaps. Principal Component Analysis then identifies the significant risk factors in interest rate markets. Building on these factors, a Stochastic Programming model is formulated to determine the optimal hedge of the Overnight Index Swap book, yielding significant improvements relative to traditional delta hedging.
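As a hedged illustration of the risk-factor step only (not the authors' model), the sketch below extracts principal components from daily changes of a zero-coupon curve; the synthetic curve data, the tenor grid, and the choice of three factors are assumptions.

```python
import numpy as np

# Minimal sketch: PCA on daily changes of a zero-coupon curve.
# The curve changes here are synthetic placeholders; in practice they
# would come from curves bootstrapped from Overnight Index Swap quotes.
rng = np.random.default_rng(0)
tenors = np.array([0.25, 0.5, 1, 2, 5, 10, 30])  # years (assumed grid)
n_days = 500
curve_changes = rng.normal(scale=0.0005, size=(n_days, len(tenors)))

# Center the data and compute principal components via SVD.
X = curve_changes - curve_changes.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)

explained = s**2 / np.sum(s**2)
factors = Vt[:3]            # first three loadings (level/slope/curvature on real data)
scores = X @ factors.T      # daily factor scores used as risk factors

print("variance explained by first three factors:", explained[:3].round(3))
```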

2. 14:30 — Predicting Limit Order Book Prices: A Machine Learning Approach

In high-frequency trading, the dynamics of a double auction market
emerge, with "buy" orders and "sell" orders queued and managed using a
limit order book (LOB). Within this framework, accurate prediction of
price movements becomes crucial for optimizing the performance of
high-frequency algorithms. This talk explores machine learning models
for forecasting the LOB mid-price, comparing Convolutional Neural
Networks (CNN) and Long Short-Term Memory (LSTM). First, we will
explore various methods for modeling and predicting LOB mid-prices.
This forecasting involves diverse methodologies, including stochastic
differential equations, which capture market dynamics such as
mean-reversion, momentum, and volatility asymmetry. Second, we will
discuss the practical implementation across various scenarios. The
models will yield both regression and classification outputs,
incorporating time-defined and event-defined intervals, the latter
offering advantages in mitigating intraday seasonal effects. We will
also explore the impact of excluding the first 30 minutes and the last
30 minutes of data on performance, as price volatility and market
activity during these intervals tend to be greater than the rest of
the day. Finally, we will evaluate performance using machine learning metrics such as the F1-score and R-squared.
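As a hedged illustration (not the speakers' implementation), the sketch below labels mid-price movements over event-defined intervals of k order-book updates and scores a naive baseline with the F1 metric; the horizon k, the price threshold, and the synthetic bid/ask series are assumptions.

```python
import numpy as np
from sklearn.metrics import f1_score

def label_mid_price_moves(best_bid, best_ask, k=10, threshold=0.01):
    """Label each LOB event as down (0), stationary (1), or up (2)
    based on the mid-price k events ahead (event-defined interval)."""
    mid = (np.asarray(best_bid) + np.asarray(best_ask)) / 2.0
    change = mid[k:] - mid[:-k]
    labels = np.full(change.shape, 1, dtype=int)   # stationary by default
    labels[change > threshold] = 2                 # up
    labels[change < -threshold] = 0                # down
    return mid[:-k], labels

# Synthetic example with a random-walk mid-price (placeholder data).
rng = np.random.default_rng(1)
bid = 100 + np.cumsum(rng.normal(scale=0.02, size=5000))
ask = bid + 0.02
features, labels = label_mid_price_moves(bid, ask, k=10, threshold=0.01)

# A trivial "always predict stationary" baseline, scored with macro F1.
baseline_pred = np.ones_like(labels)
print("macro F1 of naive baseline:",
      f1_score(labels, baseline_pred, average="macro").round(3))
```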

Numerical experiments will utilize a full year of high-frequency data
from the S&P/TSX 60, comprising 60 major companies spanning ten
distinct sectors traded on the Toronto Stock Exchange. Machine
learning and deep learning methods present challenges in finance due to data volatility and underlying complexity. CNNs, originally designed for image analysis, excel at capturing spatial patterns, while LSTMs effectively handle complex time series, even with noisy input sequences.
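For concreteness, a minimal PyTorch sketch of an LSTM classifier of the kind compared in the talk; the input dimension of 40 LOB features, the window of 100 events, and the three-class output are assumptions rather than the speakers' architecture.

```python
import torch
import torch.nn as nn

class LOBLSTMClassifier(nn.Module):
    """Minimal LSTM mapping a window of LOB snapshots to a
    3-class prediction (down / stationary / up) for the mid-price."""
    def __init__(self, n_features=40, hidden_size=64, n_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x):                 # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # classify from the last hidden state

# Forward pass on random data (placeholder for real LOB windows).
model = LOBLSTMClassifier()
x = torch.randn(32, 100, 40)              # 32 windows of 100 events, 40 features
logits = model(x)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 3, (32,)))
print(logits.shape, float(loss))
```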

3. 15:00 — Contextual Portfolio Optimization: Some Recent Results

'Smart Predict and Optimize' is an emerging paradigm where the parameters and optimal solutions of an optimization problem are determined jointly. We present some recent results using this paradigm for mean-variance optimization. We show that the integrated approach can result in parameter estimates and optimal solutions that perform better out of sample than those obtained from the 'predict or estimate first, then optimize' approach. Computational challenges will be discussed; these motivate the development of decomposition methods based on the alternating direction method of multipliers (ADMM) for the optimization layer in the neural network that represents the integrated (joint) approach.
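As a hedged sketch of the kind of optimization layer involved, the code below applies a generic ADMM split to a long-only mean-variance allocation, alternating a quadratic step with a projection onto the simplex; it is not the authors' decomposition, and the risk-aversion parameter, penalty, and synthetic inputs are assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto {z : z >= 0, sum(z) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - 1) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def admm_mean_variance(mu, Sigma, gamma=5.0, rho=1.0, iters=200):
    """ADMM for  min_w  gamma * w'Sigma w - mu'w  s.t.  w in the simplex,
    split as f(w) + g(z) with the consensus constraint w = z."""
    n = len(mu)
    w = np.ones(n) / n
    z = w.copy()
    u = np.zeros(n)
    A = 2 * gamma * Sigma + rho * np.eye(n)          # fixed matrix for the w-update
    for _ in range(iters):
        w = np.linalg.solve(A, mu + rho * (z - u))   # quadratic step
        z = project_simplex(w + u)                   # projection step
        u = u + w - z                                # dual update
    return z

# Small synthetic example (placeholder estimates of mu and Sigma).
rng = np.random.default_rng(2)
n = 5
mu = rng.normal(0.05, 0.02, n)
B = rng.normal(size=(n, n))
Sigma = B @ B.T / n + 0.01 * np.eye(n)
w = admm_mean_variance(mu, Sigma)
print("weights:", w.round(3), "sum:", w.sum().round(3))
```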

4. 15:30 — Fair Mixed Effects Support Vector Machine

When using machine learning for automated prediction, it is important to account for fairness in the predictions. Fairness in machine learning aims to ensure that biases in the data and model inaccuracies do not lead to discriminatory decisions. For example, predictions from fair machine learning models should not discriminate on the basis of sensitive variables such as sexual orientation or ethnicity.
A fundamental assumption in machine learning is the independence of observations. However, this assumption often does not hold for data describing social phenomena, where data points are frequently clustered. Hence, if machine learning models do not account for the cluster correlations, the results may be biased. The bias is especially pronounced when the cluster assignment is correlated with the variable of interest.
We present a fair mixed effects support vector machine algorithm that handles both problems simultaneously. In a reproducible simulation study, we demonstrate the impact of clustered data on the quality of fair machine learning predictions.
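To make the fairness ingredient concrete, here is a minimal sketch of a fairness-constrained linear SVM using a covariance bound between the sensitive attribute and the decision scores; it is not the authors' mixed-effects algorithm, and the synthetic data, the sensitive attribute s, and the bound of 0.01 are assumptions.

```python
import numpy as np
import cvxpy as cp

# Synthetic data: features X, labels y in {-1, +1}, binary sensitive attribute s.
rng = np.random.default_rng(3)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.5 * rng.normal(size=n))
s = (X[:, 1] + 0.3 * rng.normal(size=n) > 0).astype(float)

w, b = cp.Variable(d), cp.Variable()
scores = X @ w + b
hinge = cp.pos(1 - cp.multiply(y, scores))          # hinge loss per observation

# Fairness constraint: bound the covariance between the sensitive attribute
# and the signed distance to the decision boundary.
fairness = cp.abs((s - s.mean()) @ scores / n) <= 0.01

problem = cp.Problem(
    cp.Minimize(0.5 * cp.sum_squares(w) + 1.0 * cp.sum(hinge)),
    [fairness],
)
problem.solve()
print("weights:", np.round(w.value, 3))
```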