1 — 08:30 — ODTlearn: A Python Package for Learning Optimal Decision Trees for Prediction and Prescription
ODTLearn is an open-source Python package that provides methods for learning optimal decision trees for high-stakes predictive and prescriptive tasks based on the mixed-integer optimization (MIO) framework proposed in (Aghaei et al., 2019) and several of its extensions. The current version of the package provides implementations for learning optimal classification trees, optimal fair classification trees, optimal classification trees robust to distribution shifts, and optimal prescriptive trees from observational data. We have designed the package to be easy to maintain and extend as new optimal decision tree problem classes, reformulation strategies, and solution algorithms are introduced. To this end, the package follows object-oriented design principles and supports both commercial (Gurobi) and open source (COIN-OR branch and cut) solvers. The package documentation and an extensive user guide can be found at https://d3m-research-group.github.io/odtlearn/. Additionally, users can view the package source code and submit feature requests and bug
reports by visiting https://github.com/D3M-Research-Group/odtlearn.
2 — 09:00 — Finding Regions of Counterfactual Explanations via Robust Optimization
Counterfactual explanations play an important role in detecting bias and improving the explainability of data-driven classification models. A counterfactual explanation (CE) is a minimal perturbed data point for which the decision of the model changes. Most of the existing methods can only provide one CE, which may not be achievable for the user. In this work we derive an iterative method to calculate robust CEs, i.e., CEs that remain valid even after the features are slightly perturbed. To this end, our method provides a whole region of CEs allowing the user to choose a suitable recourse to obtain a desired outcome. We use algorithmic ideas from robust optimization and prove convergence results for the most common machine learning methods including decision trees, tree ensembles, and neural networks. Our experiments show that our method can efficiently generate globally optimal robust CEs for a variety of common data sets and classification models.
3 — 09:30 — Learning Optimal Prescriptive Trees from Observational Data
We consider the problem of learning an optimal prescriptive tree (i.e., an interpretable treatment assignment policy in the form of a binary tree) of moderate depth, from observational data. This problem arises in numerous socially important domains such as public health and personalized medicine, where interpretable and data-driven interventions are sought based on data gathered in deployment -- through passive collection of data -- rather than from randomized trials. We propose a method for learning optimal prescriptive trees using mixed-integer optimization (MIO) technology. We show that under mild conditions our method is asymptotically exact in the sense that it converges to an optimal out-of-sample treatment assignment policy as the number of historical data samples tends to infinity. Contrary to existing literature, our approach: 1) does not require data to be randomized, 2) does not impose stringent assumptions on the learned trees, and 3) has the ability to model domain specific constraints. Through extensive computational experiments, we demonstrate that our asymptotic guarantees translate to significant performance improvements in finite samples, as well as showcase our uniquely flexible modeling power by incorporating budget and fairness constraints.
4 — 10:00 — Counterfactual Explanations for Linear Optimization
In recent years, there has been a rising demand for transparent and explainable machine learning (ML) models. A large stream of works focuses on algorithmic methods to derive so called counterfactual explanations (CE). Although significant progress has been made in generating CEs for ML models, this topic has received minimal attention in the Operations Research (OR) community. However, algorithmic decisions in OR are made by complex algorithms which cannot be considered to be explainable or transparent. In this work we argue that there exist many OR applications where counterfactual explanations are needed. We translate the concept of CEs into the world of linear optimization problems and define three different classes of CEs: strong, weak and relative counterfactual explanations. For all three types we derive problem formulations and analyze the structure of it. We show that the weak and strong CE formulations have some undesirable properties while relative CEs can be derived by solving a convex optimization problem. We test all concepts on a real-world diet problem and we show that relative CEs can be calculated efficiently on NETLIB instances.