1. 14:00 — A Bestiary of Counterexamples in Smooth Convex Optimization

Counterexamples to some longstanding optimization problems in the smooth convex coercive setting are provided. Block-coordinate descent, steepest descent with exact line search, and Bregman descent methods do not generally converge. Failures of other desirable features are also established: directional convergence of Cauchy's gradient curves, convergence of Newton's flow, finite length of the Tikhonov path, convergence of central paths, and the smooth Kurdyka-Łojasiewicz inequality. All examples are planar.
These examples rely on a new convex interpolation result (and some variants): given a decreasing sequence of positively curved, k-times differentiable smooth convex compact sets in the plane, we provide a k-times differentiable convex function having these sets as level sets. The integer k can be taken arbitrarily large, and, in addition, the Hessian is positive definite outside the solution set.
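
To fix ideas, here is a minimal sketch of the steepest descent with exact line search iteration that the abstract refers to. The quadratic test function and the scipy-based line search are our own illustration of the scheme on a well-behaved problem, not one of the talk's planar counterexamples:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent_exact(f, grad, x0, n_iter=100):
    """Steepest descent with exact line search: at each step, move along
    the negative gradient to the exact minimizer of f on that ray."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        d = -grad(x)
        if np.linalg.norm(d) < 1e-12:
            break
        # Exact search: t* = argmin_{t >= 0} f(x + t d).
        t = minimize_scalar(lambda t: f(x + t * d),
                            bounds=(0.0, 10.0), method="bounded").x
        x = x + t * d
    return x

# Converges here; the talk's counterexamples show that on some smooth
# convex coercive planar functions the iterates need not converge.
f = lambda x: 0.5 * (x[0] ** 2 + 10 * x[1] ** 2)
grad = lambda x: np.array([x[0], 10 * x[1]])
print(steepest_descent_exact(f, grad, [1.0, 1.0]))  # ~ [0, 0]
```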

2. 14:30 — First Order Methods for Convex Bi-Level Optimization Problems

Bilevel optimization problems have recently received growing attention within the optimization community because of their numerous important applications in machine learning and data science. We will discuss some recent developments in algorithms for bilevel optimization problems and their theoretical convergence/complexity analysis, as well as some of the attendant difficulties and open challenges.
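
As one concrete, classical point in this algorithmic landscape, here is a hedged sketch of an iterative-regularization scheme: gradient steps on the inner objective f plus a vanishing multiple of the outer objective g, so that g breaks ties among inner minimizers. The weight schedule, step size, and test problem are illustrative assumptions, not necessarily the methods covered in the talk:

```python
import numpy as np

def iterative_regularization(grad_f, grad_g, x0, n_iter=5000):
    """Sketch of a classical first-order bilevel scheme: gradient steps
    on f + sigma_k * g with a vanishing weight sigma_k, driving iterates
    toward the inner solution set while the outer objective g selects
    among the inner minimizers."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, n_iter + 1):
        sigma = 1.0 / k   # vanishing outer weight (assumed schedule)
        step = 0.01       # fixed step size (assumed)
        x -= step * (grad_f(x) + sigma * grad_g(x))
    return x

# Inner problem: minimize f(x) = (x1 + x2 - 1)^2, whose solution set is a
# line; outer: g(x) = ||x||^2 selects the minimum-norm inner solution.
grad_f = lambda x: 2 * (x[0] + x[1] - 1) * np.ones(2)
grad_g = lambda x: 2 * x
print(iterative_regularization(grad_f, grad_g, np.zeros(2)))  # ~ (0.5, 0.5)
```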

3. 15:00 — Survey Descent and Its Initialization

For strongly convex objectives that are smooth, the classical theory of gradient descent ensures linear convergence relative to the number of gradient evaluations. An analogous nonsmooth theory is challenging: even when the objective is smooth at every iterate, the corresponding local models are unstable, and the number of cutting planes invoked by traditional remedies is difficult to bound, leading to convergence guarantees that are sublinear relative to the cumulative number of gradient evaluations. We instead propose a multipoint generalization of the gradient descent iteration for local optimization called Survey Descent. In this method, one first leverages a one-time initialization procedure to gather a "survey" of points. Then, during each iteration of the method, the survey points are updated in parallel using a simple, four-line procedure inspired by gradient descent. The survey points then collectively move towards the objective minimizer. While designed with general objectives in mind, we are motivated by a "max-of-smooth" model that captures the subdifferential dimension at optimality. We prove linear convergence when the objective is itself max-of-smooth, and experiments suggest a more general phenomenon. I will review the Survey Descent method and what constitutes a good survey initialization, and discuss a potential heuristic for this initialization.
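
As a tiny illustration of the max-of-smooth model and the local-model instability mentioned above (this is not the Survey Descent update itself; the pieces and evaluation points are our own toy choices):

```python
import numpy as np

# Max-of-smooth model: f(x) = max(f_1(x), f_2(x)). Near the minimizer the
# active smooth piece can switch between nearby points, so the single
# gradient-based local model used by vanilla gradient descent is unstable.
f1 = lambda x: x[0] ** 2 + (x[1] - 1.0) ** 2
f2 = lambda x: x[0] ** 2 + (x[1] + 1.0) ** 2

def f_and_active(x):
    vals = (f1(x), f2(x))
    return max(vals), int(np.argmax(vals))  # value, index of active piece

# Points straddling the kink y = 0: the piece driving the gradient flips.
for y in (0.1, -0.1, 0.05, -0.05):
    val, active = f_and_active(np.array([0.0, y]))
    print(f"y={y:+.2f}  f={val:.4f}  active piece: f_{active + 1}")
```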

4. 15:30 — A Unified Framework for Convergence Analysis of Proximal Gradient Methods in Nonsmooth Nonconvex Minimax Problems

Nonconvex minimax problems are prevalent in modern applications. We focus on nonsmooth nonconvex minimax problems, thus departing from the more common weakly convex/concave and smooth models assumed in the recent literature. We present a unified framework for the convergence analysis of proximal gradient methods. In contrast to the current literature, which addresses only the complexity of obtaining nearly approximate stationary solutions, here we derive pointwise global convergence and refined complexity results. Furthermore, our approach allows us to expand the scope of minimax problems that can be addressed through non-Euclidean proximal gradient schemes. Based on joint work with E. Cohen.
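
For reference, here is a minimal sketch of the Euclidean proximal gradient building block on a composite model min_x f(x) + lam*||x||_1; the objective, step size, and lam are illustrative choices, and the talk's minimax and non-Euclidean schemes build on Bregman analogues of this basic step:

```python
import numpy as np

def soft_threshold(v, tau):
    """Prox of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_gradient(grad_f, lam, x0, step=0.1, n_iter=500):
    """Proximal gradient for min_x f(x) + lam*||x||_1: a forward
    gradient step on the smooth part f, then the prox of the
    nonsmooth part."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = soft_threshold(x - step * grad_f(x), step * lam)
    return x

# Tiny least-squares example: f(x) = 0.5 * ||Ax - b||^2.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([1.0, 0.1])
grad_f = lambda x: A.T @ (A @ x - b)
print(prox_gradient(grad_f, lam=0.5, x0=np.zeros(2)))  # ~ [0.5, 0.]
```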