1 — 14:00 — Recent advances in the COPT conic optimizer
The COPT conic optimizer is a state-of-the-art solver for linear conic problems. This problem is well studied for the special case of linear conic optimization over symmetric cones (that is, linear optimization, second-order cone optimization and semidefinite optimization), and many high-performance implementations are available in both commercial and non-commercial packages. In this talk we give an overview of the COPT conic optimizer, and we discuss recent extensions to non-symmetric cones, in particular the exponential cone.
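For reference, the exponential cone mentioned above is, in one common convention, the three-dimensional non-symmetric cone
\[
K_{\exp} = \operatorname{cl}\,\{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1 \ge x_2\, e^{x_3/x_2},\ x_2 > 0\}.
\]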
Whereas the last decade has seen abundant research into first-order methods, where non-symmetric cones pose little additional difficulty over symmetric cones (provided first-order derivatives are readily available), the extension of second-order methods to non-symmetric cones has received considerably less attention, both in academia and industry. A standard and very successful second-order method for symmetric cones is based on Nesterov-Todd primal-dual scalings, which map primal and dual iterates to the same point, and similarly for the derivatives of the barrier functions. As Nesterov-Todd scalings are defined exclusively for symmetric cones, in the COPT conic optimizer we consider different primal-dual scalings proposed in 2001 by Tunçel. These scalings also map primal and dual iterates to the same point, and similarly for the derivatives of the barrier functions, which results in an algorithm with many similarities to its symmetric counterpart based on Nesterov-Todd scalings.
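As a concrete illustration of the "same point" property (a textbook example, not specific to the talk): on the nonnegative orthant with barrier $F(x) = -\sum_i \log x_i$, the Nesterov-Todd scaling point $w$ is defined by $\nabla^2 F(w)\,x = s$, which gives $w_i = \sqrt{x_i/s_i}$; with $W = \nabla^2 F(w)^{1/2}$ one checks
\[
W x = W^{-1} s = v, \qquad v_i = \sqrt{x_i s_i},
\]
so the primal iterate $x$ and the dual iterate $s$ are indeed mapped to the same point $v$.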
We present numerical results of the algorithm for a range of different problems, including both symmetric cones and exponential cones. We also compare the quality and efficiency of the exponential cone solver with standard algorithms for non-linear optimization, for a range of applications, including logistic regression and geometric programming problems.
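To illustrate the logistic-regression application mentioned above, here is a minimal sketch in CVXPY with synthetic data: the logistic loss $\log(1 + e^z)$ is exponential-cone representable, so the whole problem reduces to conic form. The solver name assumes the COPT interface is installed; any exponential-cone-capable solver can be substituted.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 10
X = rng.standard_normal((m, n))
y = np.sign(X @ rng.standard_normal(n) + 0.1 * rng.standard_normal(m))

beta = cp.Variable(n)
margin = cp.multiply(y, X @ beta)          # y_i * x_i^T beta
loss = cp.sum(cp.logistic(-margin)) / m    # logistic(z) = log(1 + e^z), exp-cone representable
prob = cp.Problem(cp.Minimize(loss + 0.01 * cp.sum_squares(beta)))
prob.solve(solver="COPT")                  # assumes coptpy is installed; e.g. "ECOS" also works
print(prob.value)
```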
2 — 14:30 — ** CANCELLED ** Stochastic Structured Nonconvex Optimization in the Absence of Smoothness
In this talk, we will present new algorithmic developments for structured nonconvex stochastic optimization problems that do not satisfy the standard smoothness conditions. Specifically, we will focus on scenarios where the objective function is weakly convex but nonsmooth. Through a model-based approach, we will analyze how stochastic optimization can still benefit from minibatching and distributed computing, two key techniques traditionally associated with smooth optimization. In the minibatch setting, we will demonstrate that many stochastic algorithms achieve a linear speedup in the batch size, even for nonsmooth weakly convex problems. Our analysis is based on a novel sensitivity analysis of the proximal mapping used in each algorithm iteration. In a centralized distributed environment, we will develop an asynchronous stochastic prox-linear method and show that the delays only affect the high-order term in the complexity rate and are therefore negligible in the long run.
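The simplest instance of a model-based method is the stochastic subgradient method; a minimal minibatch sketch on the weakly convex robust phase-retrieval loss $f(x) = \mathbb{E}\,|\langle a, x\rangle^2 - b|$, a standard example in this literature (data sizes and stepsizes here are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, batch = 20, 5000, 64
x_true = rng.standard_normal(d)
A = rng.standard_normal((N, d))
b = (A @ x_true) ** 2                      # noiseless quadratic measurements

x = rng.standard_normal(d)
for t in range(2000):
    idx = rng.integers(0, N, size=batch)   # sample a minibatch
    r = (A[idx] @ x) ** 2 - b[idx]
    # subgradient of |r_i| composed with the smooth map (a_i^T x)^2
    g = (2.0 / batch) * A[idx].T @ (np.sign(r) * (A[idx] @ x))
    x -= 0.1 / np.sqrt(t + 1) * g          # diminishing stepsize
```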
Furthermore, we will extend our study to more general nonsmooth problems without the standard Lipschitz continuity assumption. Based on new adaptive regularization (stepsize) strategies, we will demonstrate that it is still possible to preserve the standard convergence rate with a constant failure rate. Our analyses rest on rather weak assumptions: the Lipschitz parameter can either be bounded by a general growth function of $\|x\|$ or be estimated locally through independent random samples.
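The abstract does not spell out the adaptive strategies; one standard device in the absence of a global Lipschitz bound is subgradient normalization, which caps the step length no matter how fast the Lipschitz parameter grows with $\|x\|$. A sketch:

```python
import numpy as np

def normalized_step(x, g, eta, eps=1e-12):
    # Normalizing the (sub)gradient caps the step length at eta,
    # so no global Lipschitz bound is needed to control the iterates.
    return x - eta * g / max(np.linalg.norm(g), eps)
```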
3 — 15:00 — New complexity results on nonconvex nonsmooth optimization
In this talk, we present several new results on minimizing a nonsmooth and nonconvex function under a Lipschitz condition. Recent work shows that while the classical notion of Clarke stationarity is computationally intractable up to some sufficiently small constant tolerance, randomized first-order algorithms can find an approximate Goldstein stationary point with a finite-time convergence guarantee that is independent of the problem dimension. First, we present several motivating examples and explain why the notion of Goldstein stationarity is important for finite-time convergence guarantees. Second, we show that randomization is necessary to obtain a dimension-independent guarantee, by proving an $\Omega(d)$ lower bound for any deterministic algorithm with access to both first-order and zeroth-order oracles. We also show that the zeroth-order oracle is essential for a finite-time convergence guarantee: any deterministic algorithm with only a first-order oracle cannot find an approximate Goldstein stationary point within a finite number of iterations, up to some sufficiently small constant parameter and tolerance. Finally, we establish the relationship between Goldstein stationarity and uniform smoothing, and propose several new zeroth-order methods with solid finite-time convergence guarantees.
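To make the uniform-smoothing connection concrete: the smoothed function $f_\delta(x) = \mathbb{E}_{u \sim \mathrm{Ball}} f(x + \delta u)$ is differentiable, and its gradient admits a two-point estimator using only a zeroth-order oracle. A minimal sketch of the resulting gradient-free method (constants and the test function are illustrative, not the talk's methods):

```python
import numpy as np

def gfm(f, x0, delta=0.1, eta=0.01, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    x, d = x0.copy(), x0.size
    for _ in range(iters):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)          # uniform direction on the unit sphere
        # two-point estimator of the gradient of the smoothed function f_delta
        g = d / (2 * delta) * (f(x + delta * u) - f(x - delta * u)) * u
        x -= eta * g                    # step on the smoothed surrogate
    return x

# Example: a nonsmooth nonconvex function with minimizers near (+-1, 0)
x = gfm(lambda z: np.abs(np.abs(z[0]) - 1.0) + np.abs(z[1]), np.array([2.0, -3.0]))
```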
4 — 15:30 — GPU-Centric Algorithm for Non-Linear Programming Problems
Over the past year, a large number of first-order and higher-order algorithms built on GPU and CUDA architectures have emerged for solving large-scale linear and nonlinear programming problems. On certain large-scale problems, many of these algorithms have significant advantages over traditional CPU-based algorithms. In this talk, we will give a brief summary of this trend, including new developments from hardware vendors such as Nvidia and in the construction of library functions. We will also present several new algorithms for linear programming, quadratic programming, and semidefinite programming, and demonstrate their significant computational advantages through the COPT (Cardinal Optimizer) solver on suitable GPU hardware.
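The abstract does not name the specific algorithms; as a representative of the trend, here is a sketch of plain PDHG for standard-form LP, the iteration at the core of recent GPU LP solvers such as cuPDLP. Its only nontrivial operations are matrix-vector products, which map directly onto GPU hardware.

```python
import numpy as np

def pdhg_lp(c, A, b, iters=10000):
    # min c^T x  s.t.  Ax = b, x >= 0, via the primal-dual hybrid gradient method
    m, n = A.shape
    tau = sigma = 0.9 / np.linalg.norm(A, 2)             # tau * sigma * ||A||^2 < 1
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        x_new = np.maximum(x - tau * (c - A.T @ y), 0.0)  # projected primal step
        y += sigma * (b - A @ (2 * x_new - x))            # extrapolated dual step
        x = x_new
    return x, y
```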