08:30 — Variational theory and algorithms for a class of asymptotically approachable nonconvex problems

  • Ying Cui, University of California Berkeley

In this talk, motivated by nonlinear optimization problems based on inverse optimal values, we introduce a class of composite nonconvex functions in which the outer function is a sum of univariate extended-real-valued convex functions and the inner function is the limit of difference-of-convex functions. A notable feature of this class is that the inner function may be merely lower semicontinuous rather than continuously differentiable. It covers a range of important yet challenging applications, including composite value functions of nonlinear programs and value-at-risk constraints for continuous random vectors. We propose an asymptotic decomposition of the composite function that guarantees epi-convergence to the original function, leading to necessary optimality conditions for the corresponding minimization problem. The proposed decomposition also enables us to design a numerical algorithm for the composite minimization problem such that any accumulation point of the generated sequence, if one exists, satisfies the newly introduced optimality conditions. Our results expand on the study of so-called amenable functions introduced by Poliquin and Rockafellar in 1992, which are compositions of convex functions with smooth maps, and on the prox-linear methods for their minimization. This talk is based on joint work with Hanyang Li at UC Berkeley.
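For context, an amenable function in the sense of Poliquin and Rockafellar composes a convex outer function with a smooth inner map; a schematic comparison with the class described above (the notation is illustrative, not the talk's) is:

```latex
\underbrace{f(x) = g\bigl(F(x)\bigr)}_{\text{amenable: } g \text{ convex, } F \in \mathcal{C}^1}
\qquad \text{vs.} \qquad
f(x) = \sum_{i} g_i\bigl(h_i(x)\bigr), \quad h_i = \lim_{k \to \infty} \bigl(u_i^k - v_i^k\bigr),
```

where each $g_i$ is a univariate extended-real-valued convex function, each $u_i^k - v_i^k$ is a difference of convex functions, and the limit $h_i$ may be merely lower semicontinuous.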

09:00 — Approximations of Rockafellians, Lagrangians, and Dual Functions

Optimization problems are prone to instability, with seemingly minor approximations causing disproportionate changes in minimizers, minimum values, and even level sets. Fortunately, every optimization problem can be associated with many substitute problems that potentially have more desirable properties. Focusing on the nonconvex case, we develop sufficient conditions under which Rockafellian relaxations, Lagrangian relaxations, and dual problems are well behaved, in a precise and quantifiable sense, when subject to approximations. Examples from composite and stochastic optimization illustrate the results.
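For readers new to the terminology: a Rockafellian embeds the original objective into a family of perturbed objectives, from which a Lagrangian and a dual function follow. Under one common sign convention (conventions vary across references), the relationships can be sketched as:

```latex
\varphi(x) = f(x, 0), \qquad
l(x, y) = \inf_{u} \bigl\{\, f(x, u) - \langle y, u \rangle \,\bigr\}, \qquad
\psi(y) = \inf_{x}\, l(x, y),
```

where $f$ is the Rockafellian, $\varphi$ the original objective, $l$ the associated Lagrangian, and $\psi$ the dual function to be maximized.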

09:30 — Heaviside Composite Optimization: Continuous meets Discrete

A Heaviside function is the indicator function of a semi-infinite interval. A Heaviside composite function is a Heaviside function composed with a multivariate function that may be nonsmooth. This talk offers a glimpse of this novel class of discontinuous optimization problems that sits at the border between continuous and discrete optimization. Among the rich applications of such problems, we highlight one pertaining to classification and treatment with constraints, solvable by an elementary yet novel progressive integer programming (PIP) method that applies broadly to many other problems, including indefinite quadratic programs and linear programs with linear complementarity constraints. Computational results demonstrate the excellent performance of the PIP idea and its superiority over a full integer programming approach.
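To fix ideas, a Heaviside function and a Heaviside composite objective can be sketched as follows (the weights $c_i$ and inner functions $g_i$ are illustrative, not the talk's notation):

```latex
H(t) = \begin{cases} 1, & t \ge 0, \\ 0, & t < 0, \end{cases}
\qquad
\theta(x) = \sum_{i=1}^{m} c_i \, H\bigl(g_i(x)\bigr),
```

where each $g_i$ may be nonsmooth; objectives that count constraint violations or misclassified samples are typical instances.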

10:00 — Computing Wasserstein Barycenter via operator splitting: the method of averaged marginals

The Wasserstein barycenter (WB) is an important tool for summarizing sets of probability measures, with applications in applied probability, clustering, image processing, and beyond. When the measures' supports are finite, a (balanced) WB can be computed by solving a linear optimization problem whose dimensions generally exceed standard solvers' capabilities. In the more general setting where the measures have different total masses, we propose a convex nonsmooth optimization formulation of the so-called unbalanced WB problem. Due to the colossal dimensions of these problems, we introduce a decomposition scheme based on the Douglas-Rachford splitting method that applies to both the balanced and unbalanced WB variants. Our algorithm, which has the appealing interpretation of averaging marginals, performs a series of simple (and exact) projections that can be parallelized and even randomized, making it suitable for large-scale datasets. Numerical comparisons against state-of-the-art methods on several datasets from the literature illustrate the method's performance.
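The Douglas-Rachford iteration underlying such decomposition schemes can be illustrated on a toy two-set feasibility problem (a minimal sketch with hypothetical projection operators; the talk applies the splitting to the WB linear program, not to this toy instance):

```python
import numpy as np

# Douglas-Rachford splitting on a toy feasibility problem: find a point
# in the intersection of a Euclidean ball and a halfspace. Each step is
# a simple exact projection, mirroring the structure exploited at scale.

def proj_ball(x, center, radius):
    """Project x onto the ball {v : ||v - center|| <= radius}."""
    d = x - center
    n = np.linalg.norm(d)
    return x if n <= radius else center + radius * d / n

def proj_halfspace(x, a, b):
    """Project x onto the halfspace {v : a @ v <= b}."""
    viol = a @ x - b
    return x if viol <= 0 else x - viol * a / (a @ a)

def douglas_rachford(z, prox_f, prox_g, iters=500):
    """Standard DR iteration z <- z + prox_g(2 prox_f(z) - z) - prox_f(z)."""
    for _ in range(iters):
        x = prox_f(z)          # first projection
        y = prox_g(2 * x - z)  # projection of the reflected point
        z = z + y - x          # governing-sequence update
    return prox_f(z)           # "shadow" point, feasible for the first set

center, radius = np.zeros(2), 1.0
a, b = np.array([1.0, 1.0]), 1.0
x_star = douglas_rachford(
    np.array([3.0, -2.0]),
    lambda v: proj_ball(v, center, radius),
    lambda v: proj_halfspace(v, a, b),
)
```

The two projections are independent given the governing point, which is what makes this kind of scheme amenable to parallel and randomized variants.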