1 — 08:30 — A Framework for Data-driven Explainability in Mathematical Optimization
Advancements in mathematical programming have made it possible to efficiently tackle large-scale real-world problems that were deemed intractable just a few decades ago. However, provably optimal solutions may not be accepted because optimization software is often perceived as a black box: although its methods are well understood by scientists, they remain hard to access for practitioners. We therefore advocate introducing the explainability of a solution as an additional evaluation criterion, next to its objective value, which enables us to find trade-off solutions between the two criteria. Explainability is attained by comparing against (not necessarily optimal) solutions that were implemented in similar situations in the past; solutions that exhibit similar features are thus preferred. Although we prove that the explainable model is NP-hard already in simple cases, we characterize relevant polynomially solvable cases such as the explainable shortest path problem. Our numerical experiments on both artificial and real-world road networks show the resulting Pareto front. It turns out that the cost of enforcing explainability can be very small.
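As a rough illustration of the kind of trade-off the framework targets, the following sketch (ours, not the authors' code; the graph, edge weights, and historical reference path are invented assumptions) sweeps a weighted-sum scalarization between path cost and deviation from a previously implemented route, tracing out supported points of the Pareto front for an explainable shortest path:

    # Minimal sketch: trade off path cost against similarity to a past route.
    import networkx as nx

    G = nx.Graph()
    G.add_weighted_edges_from([
        ("s", "a", 1.0), ("a", "t", 1.0),   # cheaper detour
        ("s", "b", 1.2), ("b", "t", 1.2),   # historically used route
    ])
    ref_edges = {frozenset(("s", "b")), frozenset(("b", "t"))}  # past solution

    def scalarized_path(lam):
        """Shortest path under cost(e) + lam * [e not on the reference path]."""
        H = G.copy()
        for u, v, d in H.edges(data=True):
            if frozenset((u, v)) not in ref_edges:
                d["weight"] += lam
        return nx.shortest_path(H, "s", "t", weight="weight")

    for lam in [0.0, 0.1, 0.3, 1.0]:
        p = scalarized_path(lam)
        cost = sum(G[u][v]["weight"] for u, v in zip(p, p[1:]))
        deviation = sum(frozenset((u, v)) not in ref_edges for u, v in zip(p, p[1:]))
        print(f"lambda={lam:.1f}  path={p}  cost={cost:.1f}  edges off reference={deviation}")

Small values of the scalarization weight recover the cost-optimal detour, while larger values pull the solution back onto the historically used route, making the price of explainability visible point by point.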
2 — 09:00 — Get Off The Hype Train: Why Using Neural Networks Can Be Very Risky
Adversarial manipulations show that many state-of-the-art neural network models suffer from a catastrophic lack of robustness. In this talk, we investigate how a capable adversary can exploit this weakness to disrupt autonomous acoustic sensors. We take the perspective of the adversary, formulating a partial differential equation (PDE) constrained optimization problem that constructs an acoustic signal which optimally disrupts a defender’s classifier. The primary novelty in our formulation is the PDE constraint, which connects the acoustic signal emitted by the adversary to the signal received by the classifier. By incorporating this PDE constraint, our formulation models an adversary which can make changes that impact the ambient acoustic environment but cannot directly manipulate inputs into the classifier. After discussing the problem formulation, we introduce new methods to compute solutions at scale. We conclude by summarizing the implications of our results on applications relevant to the US Navy.
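One plausible reading of such a formulation, in our own notation rather than necessarily the authors', is

\[
\max_{u}\;\; \ell\big(f_\theta(\mathcal{O}(p)),\, y\big)
\quad \text{s.t.} \quad
\frac{1}{c^2}\,\partial_{tt} p - \Delta p = u \;\text{ in } \Omega \times (0,T),
\qquad \|u\| \le \varepsilon,
\]

with suitable initial and boundary conditions, where $u$ is the adversary's acoustic source, $p$ the pressure field governed by the wave equation, $\mathcal{O}$ the observation operator producing the classifier input, $f_\theta$ the defender's classifier with loss $\ell$ against the true label $y$, and $\varepsilon$ an emission budget. The role of the PDE constraint is exactly as described above: the adversary never touches the classifier input directly, only the ambient field that propagates to it.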
3 — 09:30 — OR with a White Hat: Evidencing Privacy Vulnerabilities in ML Models
We present an optimization-based reconstruction attack that can effectively recover a dataset used to train a random forest model. Our approach utilizes information readily provided by random forests trained with scikit-learn. We frame the dataset reconstruction problem as a combinatorial optimization problem with a maximum likelihood objective and rely on constraint programming to solve it. Our extensive experiments reveal that random forests trained with feature randomization but without bootstrap aggregation are highly vulnerable to complete dataset reconstruction, even with very few trees. When bootstrap aggregation is used, most of the data can still be reconstructed. These findings underscore a critical vulnerability inherent in widely adopted ensemble methods, warranting attention and mitigation.
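To convey the flavor of such an attack, here is a deliberately tiny sketch (our illustration, not the authors' attack; the binary features, dataset size, and the observed_counts data are invented) that treats per-node sample counts exposed by the trees as combinatorial constraints on an unknown training matrix and lets a constraint solver enumerate consistent datasets:

    # Toy reconstruction via CP-SAT from counts exposed by tree splits.
    from ortools.sat.python import cp_model

    n_examples, n_features = 4, 3
    # Assumed white-box information: for each split feature, how many
    # training points fell on the side where that feature equals 1.
    observed_counts = {0: 3, 1: 1, 2: 2}   # feature index -> count of ones

    model = cp_model.CpModel()
    x = [[model.NewBoolVar(f"x_{i}_{j}") for j in range(n_features)]
         for i in range(n_examples)]

    for j, count in observed_counts.items():
        model.Add(sum(x[i][j] for i in range(n_examples)) == count)

    solver = cp_model.CpSolver()
    if solver.Solve(model) in (cp_model.OPTIMAL, cp_model.FEASIBLE):
        for i in range(n_examples):
            print([solver.Value(x[i][j]) for j in range(n_features)])

In this toy the solution is far from unique; the actual attack aggregates many such constraints across all nodes of all trees and adds a maximum-likelihood objective to single out the most plausible reconstruction.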
4 — 10:00 — SMLE: Safe Machine Learning via Embedded Overapproximation
Over the last decade, Machine Learning systems, and Neural Networks in particular, have had a tremendous impact across a wide range of domains and become a pervasive technology in our society. Despite their success, however, they still lack formal guarantees on their behavior, which are a crucial requirement for their adoption in safety-critical scenarios such as autonomous driving or aircraft collision avoidance. This is why a wide range of frameworks have been proposed to formally verify safety properties of state-of-the-art AI systems. These methods essentially rely either on Optimization and Search or on Reachability Analysis. The former perform exact verification in a declarative fashion, which also suits the certification of hybrid symbolic and subsymbolic systems, but usually fail to scale to real-world use cases. The latter adopt algorithmic approaches that offer increased tractability, but often at the price of completeness.
In this work, we contribute to bridging the gap between these two paradigms by introducing a novel framework called SMLE (Safe Machine Learning via Embedded overapproximation), which consists in equipping the learning model with an independent, trainable overapproximator. The overapproximator can be much simpler than the main model, since its complexity is fully customizable by the user. This auxiliary model can therefore serve as a surrogate that facilitates the verification of safety properties, either post hoc or directly at training time via Projected Gradient Descent or Penalty Methods, hence enabling the construction of safe-by-design systems.
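A minimal sketch of how we read the training-time, penalty-based variant (our assumptions: a toy regression task, a linear overapproximator, and a simple output-bound safety property; this is not the official SMLE implementation) pairs the main network with a simpler trainable overapproximator and penalizes both any gap where the surrogate drops below the main model and any violation of the safety bound by the surrogate:

    # Toy penalty-method training of a model plus a simpler overapproximator.
    import torch
    import torch.nn as nn

    f = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))  # main model
    g = nn.Linear(2, 1)                  # much simpler trainable overapproximator
    safety_bound = 1.0                   # assumed property: output <= 1 on the domain
    rho = 10.0                           # penalty weight

    opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)
    X = torch.rand(256, 2)               # illustrative inputs
    y = X.sum(dim=1, keepdim=True)       # illustrative regression targets

    for _ in range(200):
        opt.zero_grad()
        task_loss = nn.functional.mse_loss(f(X), y)
        gap = torch.relu(f(X) - g(X)).mean()             # violated when g < f
        safety = torch.relu(g(X) - safety_bound).mean()  # violated when g > bound
        loss = task_loss + rho * (gap + safety)
        loss.backward()
        opt.step()

Verifying the safety property then only needs to reason about g, which in this sketch is a single affine map and hence trivial to bound over a box-shaped input domain, while f inherits the guarantee wherever g indeed overapproximates it.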