1. 08:30 — ODTlearn: A Python Package for Learning Optimal Decision Trees for Prediction and Prescription

ODTlearn is an open-source Python package that provides methods for learning optimal decision trees for high-stakes predictive and prescriptive tasks, based on the mixed-integer optimization (MIO) framework proposed by Aghaei et al. (2019) and several of its extensions. The current version of the package provides implementations for learning optimal classification trees, optimal fair classification trees, optimal classification trees robust to distribution shifts, and optimal prescriptive trees from observational data. We have designed the package to be easy to maintain and extend as new optimal decision tree problem classes, reformulation strategies, and solution algorithms are introduced. To this end, the package follows object-oriented design principles and supports both commercial (Gurobi) and open-source (COIN-OR branch-and-cut) solvers. The package documentation and an extensive user guide can be found at https://d3m-research-group.github.io/odtlearn/. Additionally, users can view the package source code and submit feature requests and bug reports by visiting https://github.com/D3M-Research-Group/odtlearn.
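To give a flavor of what "optimal" means here, the sketch below finds the accuracy-maximizing depth-1 classification tree (a stump) by exhaustive search. This is purely illustrative and does not use the ODTlearn API: the package solves the analogous problem at greater depths via MIO, where exhaustive search is intractable. All names below are ours, not the package's.

```python
# Illustrative only: globally optimal depth-1 tree by brute force.
# ODTlearn attacks deeper trees with mixed-integer optimization instead.

def best_stump(X, y):
    """Return ((feature, threshold, left_label, right_label), accuracy)
    for the exact accuracy-maximizing stump on (X, y)."""
    n = len(y)
    best, best_correct = None, -1
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            for ll in set(y):
                for rl in set(y):
                    correct = left.count(ll) + right.count(rl)
                    if correct > best_correct:
                        best_correct = correct
                        best = (j, t, ll, rl)
    return best, best_correct / n

X = [[0, 0], [1, 0], [0, 1], [1, 1]]
y = [0, 1, 0, 1]  # label equals the first feature
stump, acc = best_stump(X, y)  # splits on feature 0, perfect accuracy
```

Unlike greedy CART-style induction, which commits to splits one at a time, this search (and the MIO formulations it stands in for) optimizes the whole tree jointly.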

2. 09:00 — Finding Regions of Counterfactual Explanations via Robust Optimization

Counterfactual explanations play an important role in detecting bias and improving the explainability of data-driven classification models. A counterfactual explanation (CE) is a minimally perturbed data point for which the model's decision changes. Most existing methods can provide only one CE, which may not be achievable for the user. In this work we derive an iterative method to calculate robust CEs, i.e., CEs that remain valid even after the features are slightly perturbed. To this end, our method provides a whole region of CEs, allowing the user to choose a suitable recourse to obtain a desired outcome. We use algorithmic ideas from robust optimization and prove convergence results for the most common machine learning methods, including decision trees, tree ensembles, and neural networks. Our experiments show that our method can efficiently generate globally optimal robust CEs for a variety of common data sets and classification models.
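The robustness condition has a closed form in the simplest case of a linear classifier, which is useful for building intuition even though the talk's method targets trees, ensembles, and neural networks via robust optimization. The sketch below (our own illustration, not the paper's algorithm) finds the minimal L2 shift of a point so that sign(w·x + b) stays positive under every feature perturbation of Euclidean norm at most eps, i.e., so that w·x' + b ≥ eps·‖w‖.

```python
import math

def robust_ce(x, w, b, eps):
    """Minimal L2 shift of x so that w.x' + b >= eps * ||w||,
    i.e., the positive label survives any perturbation of norm <= eps."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    slack = eps * norm_w - score
    if slack <= 0:
        return list(x)  # already robustly on the positive side
    step = slack / (norm_w ** 2)  # move along w, the steepest direction
    return [xi + step * wi for xi, wi in zip(x, w)]

x = [0.0, 0.0]                 # currently classified negative (score = -1)
w, b, eps = [1.0, 0.0], -1.0, 0.1
x_ce = robust_ce(x, w, b, eps)  # lands exactly on the robust boundary
```

A plain (non-robust) CE would stop at the decision boundary itself (score 0); the extra eps·‖w‖ margin is what keeps the explanation valid when features are slightly perturbed.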

3. 09:30 — Learning Optimal Prescriptive Trees from Observational Data

We consider the problem of learning an optimal prescriptive tree (i.e., an interpretable treatment assignment policy in the form of a binary tree) of moderate depth from observational data. This problem arises in numerous socially important domains such as public health and personalized medicine, where interpretable and data-driven interventions are sought based on data gathered passively in deployment rather than from randomized trials. We propose a method for learning optimal prescriptive trees using mixed-integer optimization (MIO) technology. We show that, under mild conditions, our method is asymptotically exact in the sense that it converges to an optimal out-of-sample treatment assignment policy as the number of historical data samples tends to infinity. Contrary to the existing literature, our approach: 1) does not require the data to be randomized, 2) does not impose stringent assumptions on the learned trees, and 3) can model domain-specific constraints. Through extensive computational experiments, we demonstrate that our asymptotic guarantees translate to significant performance improvements in finite samples, and we showcase our uniquely flexible modeling power by incorporating budget and fairness constraints.
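Why does non-randomized data need special care? Because the logging policy over-represents some treatment choices, so naively averaging observed outcomes is biased. The sketch below shows the standard inverse-propensity-weighting (IPW) estimator for the value of a candidate policy; this is a textbook building block for off-policy evaluation, not the talk's MIO formulation, and all names and the toy data are our own.

```python
# Standard IPW estimator for evaluating a treatment policy on
# observational (non-randomized) data. Illustrative only.

def ipw_value(data, policy):
    """Estimate E[outcome if everyone were treated per `policy`].

    data: list of (features, treatment, outcome, propensity), where
    propensity = P(observed treatment | features) under the logging policy.
    """
    total = 0.0
    for x, t, y, p in data:
        if policy(x) == t:   # keep samples whose treatment matches the policy
            total += y / p   # reweight by the inverse propensity
    return total / len(data)

# Toy log: the logging policy treated patients with x >= 0.5 with prob. 0.8.
data = [
    ((0.2,), 0, 1.0, 0.8),
    ((0.9,), 1, 2.0, 0.8),
    ((0.7,), 0, 0.5, 0.2),
    ((0.6,), 1, 2.0, 0.8),
]
treat_high = lambda x: 1 if x[0] >= 0.5 else 0
value = ipw_value(data, treat_high)
```

Estimators of this kind give the objective that a prescriptive tree search, whether greedy or exact MIO, must optimize when only observational data is available.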

4. 10:00 — Counterfactual Explanations for Linear Optimization

In recent years, there has been a rising demand for transparent and explainable machine learning (ML) models. A large stream of works focuses on algorithmic methods to derive so-called counterfactual explanations (CEs). Although significant progress has been made in generating CEs for ML models, the topic has received minimal attention in the Operations Research (OR) community. Yet algorithmic decisions in OR are made by complex algorithms that cannot be considered explainable or transparent. In this work we argue that there are many OR applications where counterfactual explanations are needed. We translate the concept of CEs into the world of linear optimization problems and define three classes of CEs: strong, weak, and relative counterfactual explanations. For all three types we derive problem formulations and analyze their structure. We show that the weak and strong CE formulations have some undesirable properties, while relative CEs can be derived by solving a convex optimization problem. We test all concepts on a real-world diet problem, and we show that relative CEs can be calculated efficiently on NETLIB instances.
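To illustrate what a counterfactual question looks like for a linear program min c·x s.t. Ax ≤ b, x ≥ 0, consider one elementary instance: "how little must the right-hand side b change so that a desired plan x* becomes feasible?" Relaxing each constraint independently gives the closed-form answer delta_i = max(0, (A x*)_i − b_i). This is only a taste of the flavor; the talk's strong, weak, and relative CEs are richer notions, and the tiny diet-style instance below is our own.

```python
# Elementary LP counterfactual: minimal componentwise increase of the
# right-hand side b that makes a desired plan x_star feasible.

def rhs_counterfactual(A, b, x_star):
    """Return delta with delta_i = max(0, (A x_star)_i - b_i)."""
    delta = []
    for row, bi in zip(A, b):
        lhs = sum(a * x for a, x in zip(row, x_star))
        delta.append(max(0.0, lhs - bi))
    return delta

# Tiny diet-style instance: two foods, two resource constraints.
A = [[2.0, 1.0],   # budget usage per unit of each food
     [1.0, 3.0]]   # prep-time usage per unit of each food
b = [4.0, 6.0]
x_star = [3.0, 1.0]                      # desired but currently infeasible plan
delta = rhs_counterfactual(A, b, x_star) # only the budget constraint must move
```

Questions about changing the cost vector c so that a different solution becomes optimal lead to the convex formulations behind relative CEs, which is where the real work in the talk lies.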