1 — 08:30 — Asymptotic Normality and Optimality in Nonsmooth Stochastic Optimization
Polyak and Juditsky famously showed that the stochastic gradient method for minimizing smooth and strongly convex functions enjoys a central limit theorem: the error between the running average of the iterates and the minimizer, normalized by the square root of the iteration counter, converges to a normal random vector. Moreover, the covariance matrix of the limiting distribution is, in a precise sense, "optimal" among all estimation procedures. A long-standing open question is whether similar guarantees – asymptotic normality and optimality – exist for nonsmooth optimization and, more generally, for equilibrium problems.
In this talk, we obtain such guarantees under mild conditions that hold both in concrete circumstances (e.g., nonlinear programming) and under generic linear perturbations. The key insight is that, for a variety of constrained and nonsmooth problems, there exists an active manifold (a.k.a. the active set) that can be exploited in algorithmically useful ways. Based on this, we prove that the shadow sequence of the (stochastic) subgradient method along the active manifold is precisely an inexact Riemannian gradient method with an implicit retraction. As a consequence, we can leverage the "partial smoothness" of nonsmooth problems and extend results from the smooth case.
This talk is based on joint work with Damek Davis and Dmitriy Drusvyatskiy.
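The Polyak–Juditsky averaging scheme underlying the first talk can be sketched in a few lines. Below is a minimal illustration on a strongly convex quadratic with additive gradient noise; the test function, step-size exponent, and noise level are illustrative choices, not taken from the talk:

```python
import numpy as np

def averaged_sgd(grad_oracle, x0, n_iters, alpha=0.75, seed=0):
    # SGD with Polyak-Ruppert averaging: step sizes gamma_k = k^(-alpha),
    # alpha in (1/2, 1), plus a running average of the iterates.
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    x_avg = x.copy()
    for k in range(1, n_iters + 1):
        x = x - k ** (-alpha) * grad_oracle(x, rng)
        x_avg += (x - x_avg) / k  # incremental running mean of the iterates
    return x, x_avg

# f(x) = 0.5 * ||x - b||^2, so grad f(x) = x - b; add Gaussian noise.
b = np.array([1.0, -2.0, 3.0])
noisy_grad = lambda x, rng: (x - b) + 0.1 * rng.standard_normal(b.shape)

x_last, x_avg = averaged_sgd(noisy_grad, np.zeros_like(b), 20000)
```

By the central limit theorem of Polyak and Juditsky, it is the averaged iterate `x_avg` (not `x_last`) whose rescaled error is asymptotically normal with optimal covariance.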
2 — 09:00 — A simple uniformly optimal method without line search for convex optimization
Line search (or backtracking) procedures have been widely employed in first-order methods for solving convex optimization problems, especially those with unknown problem parameters (e.g., the Lipschitz constant). In this paper, we show that line search is superfluous for attaining the optimal rate of convergence when solving a convex optimization problem whose parameters are not given a priori. In particular, we present a novel accelerated gradient descent type algorithm called the auto-conditioned fast gradient method (AC-FGM) that achieves an optimal rate of convergence for smooth convex optimization without requiring an estimate of a global Lipschitz constant or the use of line search procedures. We then extend AC-FGM to solve convex optimization problems with Hölder continuous gradients and show that it automatically achieves the optimal rates of convergence uniformly over all problem classes, with the desired accuracy of the solution as the only input. Finally, we report some encouraging numerical results that demonstrate the advantages of AC-FGM over previously developed parameter-free methods for convex optimization.
3 — 09:30 — Universal First-Order Methods for Convex Optimization Problems
We present a generic algorithmic framework for convex optimization that does not require knowing the levels of convexity and smoothness of the problem. In particular, the framework includes an adaptive subgradient method and an adaptive proximal bundle method. One advantage of the framework is that it does not perform any line search on the strong convexity parameter. To the best of our knowledge, this is the first universal method that does not use a backtracking procedure on the strong convexity parameter. We present complexity results in terms of both the optimality gap and stationarity, and the resulting bounds are as good as those obtained when the convexity and smoothness parameters are known.
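For context, the classic (non-adaptive) subgradient method that such universal frameworks refine can be sketched as follows; the hand-tuned constant `R` in the 1/√k step size is exactly the kind of parameter the talk's adaptive methods estimate online (the test problem below is illustrative):

```python
import numpy as np

def subgradient_method(f, subgrad, x0, n_iters, R=1.0):
    # Classic subgradient method with step size R / sqrt(k+1) and
    # best-iterate tracking; the function value need not decrease
    # monotonically, so we keep the best point seen so far.
    x = np.array(x0, dtype=float)
    x_best, f_best = x.copy(), f(x)
    for k in range(n_iters):
        x = x - (R / np.sqrt(k + 1)) * subgrad(x)
        if f(x) < f_best:
            x_best, f_best = x.copy(), f(x)
    return x_best, f_best

# Nonsmooth test problem: f(x) = ||x - b||_1 with subgradient sign(x - b).
b = np.array([1.3, -0.7])
f = lambda x: np.abs(x - b).sum()
sg = lambda x: np.sign(x - b)
x_best, f_best = subgradient_method(f, sg, np.zeros(2), 20000)
```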
4 — 10:00 — Optimal and parameter-free gradient minimization methods for convex and nonconvex optimization
We propose novel optimal and parameter-free algorithms for computing an approximate solution with small (projected) gradient norm. Specifically, for computing an approximate solution such that the norm of its (projected) gradient does not exceed ε, our complexity results match the lower complexity bounds of the convex and strongly convex cases up to a universal constant, and achieve the best-known complexity bound for the nonconvex case for the first time in the literature. Moreover, for all the convex, strongly convex, and nonconvex cases, we propose parameter-free algorithms that do not require the input of any problem parameters. To the best of our knowledge, no such parameter-free methods existed previously, especially for the strongly convex and nonconvex cases. Since most regularity conditions (e.g., strong convexity and lower curvature) are imposed over a global scope, the corresponding problem parameters are notoriously difficult to estimate. However, gradient norm minimization equips us with a convenient tool to monitor the progress of algorithms and thus the ability to estimate such parameters in situ.
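The last point, that the gradient norm is an observable surrogate for progress, can be illustrated with a minimal sketch. The quadratic test problem and the fixed step size are illustrative choices, not the talk's parameter-free method:

```python
import numpy as np

def gd_small_gradient(grad, x0, eps, step, max_iters=100000):
    # Gradient descent stopped by the observable criterion ||grad f(x)|| <= eps.
    # Unlike the optimality gap f(x) - f*, which requires the unknown f*,
    # the gradient norm can be measured at every iterate, so it can drive
    # stopping rules and in-situ parameter estimation.
    x = np.array(x0, dtype=float)
    for k in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            return x, k
        x = x - step * g
    return x, max_iters

A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])
x_out, iters = gd_small_gradient(lambda x: A @ x - b, np.zeros(2), eps=1e-6, step=0.1)
```

The returned iteration count is itself an in-situ observable: how fast the gradient norm shrinks carries information about the problem's curvature.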