1 — 16:20 — Improving the Geometry of (Conic) Linear Optimization Problems for the Primal-Dual Hybrid Gradient Method (PDHG)
The primal-dual hybrid gradient method (PDHG) (with restarts and heuristic enhancements) has shown significant success in solving huge-scale LP problems, and our recent research has shown that this success depends on certain geometric condition measures of the LP related to error ratios and sharpness. However, depending on the problem instance, these condition measures may take on extreme values and lead to poor performance both in theory and in practice. In such instances the LP problem is perhaps better understood as a special case of a more general conic (nonlinear) optimization problem. This raises two questions: (1) can the geometric condition measures of a given LP instance be improved by applying suitable transformations to it, and (2) to what extent can the methods, analysis, and computational performance be generalized from LP to conic optimization, in both theory and practice? Regarding these questions, we show how column rescaling (for LP) and barrier-based rescaling (for conic problems) can improve the condition measures and thereby improve the performance of PDHG in both theory and practice. Our theoretical development leads to guidelines for the practical implementation of rescaling based on ideas from analytic centers. Also, our experiments on LP relaxations of the MIPLIB 2017 dataset demonstrate the impact of rescaling on the actual performance of PDHG.
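As a rough illustration of the ideas above (and not the specific scheme from the talk), the following Python sketch runs a textbook PDHG iteration on a small equality-form LP, with and without a simple diagonal column rescaling. The column-norm choice of the diagonal, the step sizes, and the iteration budget are illustrative assumptions only.

```python
import numpy as np

def pdhg_lp(A, b, c, iters=5000):
    """Plain PDHG on  min c'x  s.t.  Ax = b, x >= 0  (no restarts or enhancements)."""
    m, n = A.shape
    x, y = np.zeros(n), np.zeros(m)
    eta = 0.9 / np.linalg.norm(A, 2)        # tau = sigma = eta, so tau * sigma * ||A||^2 < 1
    for _ in range(iters):
        x_new = np.maximum(0.0, x - eta * (c - A.T @ y))   # projected primal step
        y = y + eta * (b - A @ (2 * x_new - x))            # dual step with extrapolation
        x = x_new
    return x, y

def column_rescaled_pdhg(A, b, c, iters=5000):
    """Run PDHG on the column-rescaled instance (A D, D c) and map back via x = D x_tilde.
    Here D merely normalizes the columns of A; the analytic-center-based rescaling
    discussed in the talk is a more refined choice."""
    d = 1.0 / np.maximum(np.linalg.norm(A, axis=0), 1e-12)
    x_t, y = pdhg_lp(A * d, b, c * d, iters)
    return d * x_t, y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(5, 12)) * rng.uniform(0.01, 100.0, size=12)  # badly scaled columns
    b = A @ rng.uniform(size=12)                                      # feasible right-hand side
    c = rng.uniform(size=12)
    for solver in (pdhg_lp, column_rescaled_pdhg):
        x, _ = solver(A, b, c)
        print(f"{solver.__name__:>22s}  primal residual = {np.linalg.norm(A @ x - b):.2e}")
```

Comparing the primal residuals of the two runs on an instance with badly scaled columns is only meant to show the kind of effect that rescaling-based improvements of the condition measures aim at.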
2 — 16:50 — The role of sparsity in differentially-private optimization
With the increasing use of personally sensitive data in machine learning applications, privacy has become a central concern. In this context, differential privacy (DP) offers rigorous and quantifiable control of the privacy risk of a machine learning model. One of the main problems of interest in differential privacy is stochastic optimization, where the goal is to compute a model that approximately minimizes the empirical or population excess risk while satisfying differential privacy with respect to the data used to train the model.
In this talk, we will present various settings where one can obtain nearly dimension-independent accuracy rates in differentially private (stochastic) optimization and saddle-point problems. We start with results on stochastic optimization under polyhedral constraints, which are popular in sparsity-oriented machine learning; these results are based on the variance-reduced stochastic Frank-Wolfe method. Next, we move to stochastic saddle-point problems, where we study the use of stochastic mirror-descent methods and vertex sampling; these results are applied to problems including DP and group-fair machine learning, and DP synthetic data generation. Finally, we present results on the "dual" counterpart of the above problems: stochastic optimization with sparse gradients, a setting of high relevance in large embedding models. Here, we provide new matching upper and lower bounds in both the convex and nonconvex settings.
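As a loose illustration of why polyhedral constraint sets such as the l1 ball interact well with sparsity, the Python sketch below runs a plain Frank-Wolfe method whose gradient estimates are perturbed by Gaussian noise before the linear minimization step. The noise scale, step schedule, and toy objective are illustrative assumptions; the variance-reduced method and the privacy accounting from the talk are not reproduced here.

```python
import numpy as np

def noisy_frank_wolfe_l1(grad, x0, radius=1.0, sigma=0.1, steps=200, seed=0):
    """Frank-Wolfe over the l1 ball of the given radius, with a Gaussian perturbation
    of the gradient standing in for a privacy-preserving noise mechanism."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for t in range(steps):
        g = grad(x) + sigma * rng.normal(size=x.shape)  # perturbed gradient estimate
        i = int(np.argmax(np.abs(g)))                   # linear minimization over the l1 ball
        s = np.zeros_like(x)                            # returns a single vertex, so every
        s[i] = -radius * np.sign(g[i])                  # update direction is 1-sparse
        gamma = 2.0 / (t + 2.0)                         # standard Frank-Wolfe step size
        x = (1.0 - gamma) * x + gamma * s
    return x

if __name__ == "__main__":
    # toy least-squares loss 0.5 * ||Ax - b||^2 with a sparse ground truth
    rng = np.random.default_rng(1)
    A = rng.normal(size=(100, 50))
    x_true = np.zeros(50); x_true[:3] = [1.0, -0.5, 0.25]
    b = A @ x_true
    x_hat = noisy_frank_wolfe_l1(lambda x: A.T @ (A @ x - b), np.zeros(50), radius=2.0)
    print("nonzero coordinates in the final iterate:", int(np.count_nonzero(x_hat)))
```

Because each step adds at most one new vertex of the l1 ball, the support of the iterate grows slowly, which is the kind of structural sparsity the polyhedral-constraint results exploit.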
3 — 17:20 — Density Separation with Tensor Factorization
A nonnegative tensor factorization model is proposed for separating mixtures of probability densities. The approach uses kernel density estimation to transform raw samples into compressed and discretized densities. The factorization is computed using a specialized block-coordinate descent algorithm that enforces the required simplex constraints and is guaranteed to converge to Nash equilibria. Numerical experiments on real geological and spatial transcriptomics data, based on our open-source Julia implementation, illustrate the model's effectiveness at separating the density mixtures.
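As a simplified, hypothetical two-way (matrix) analogue of such a model, the Python sketch below factors a set of discretized densities as W H with simplex constraints on the columns of both factors, using projected block-coordinate (alternating gradient) steps. It is not the talk's tensor model, its Nash-equilibrium-convergent algorithm, or its Julia implementation.

```python
import numpy as np

def project_columns_to_simplex(M):
    """Euclidean projection of each column of M onto the probability simplex."""
    out = np.empty_like(M)
    for j in range(M.shape[1]):
        u = np.sort(M[:, j])[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - 1.0)[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        out[:, j] = np.maximum(M[:, j] - theta, 0.0)
    return out

def separate_densities(X, r, iters=500, seed=0):
    """Alternating projected gradient on ||X - W H||_F^2: the columns of W play the
    role of source densities and the columns of H are mixing weights, so every
    column of W H is itself a discretized density."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = project_columns_to_simplex(rng.random((m, r)))
    H = project_columns_to_simplex(rng.random((r, n)))
    for _ in range(iters):
        R = W @ H - X
        W = project_columns_to_simplex(W - (R @ H.T) / (np.linalg.norm(H @ H.T, 2) + 1e-12))
        R = W @ H - X
        H = project_columns_to_simplex(H - (W.T @ R) / (np.linalg.norm(W.T @ W, 2) + 1e-12))
    return W, H

if __name__ == "__main__":
    # two triangular "densities" on a 50-point grid, observed in three different mixtures
    grid = np.linspace(0.0, 1.0, 50)
    s1 = np.maximum(0.0, 1.0 - np.abs(grid - 0.3) / 0.2); s1 /= s1.sum()
    s2 = np.maximum(0.0, 1.0 - np.abs(grid - 0.7) / 0.2); s2 /= s2.sum()
    X = np.column_stack([s1, s2]) @ np.array([[0.8, 0.5, 0.2], [0.2, 0.5, 0.8]])
    W, H = separate_densities(X, r=2)
    print("reconstruction error:", np.linalg.norm(X - W @ H))
```

The simplex projection keeps every factor column nonnegative and summing to one, which is the matrix counterpart of the simplex constraints enforced by the block-coordinate descent algorithm described in the abstract.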