1 — 16:20 — Connections between Gradient Descent and the Power Method
Applying Gradient Descent (GD) with a fixed step size to minimize a quadratic function is equivalent to running the Power Method (PM) on the gradients. This establishes a connection between fixed-step GD and the PM, both with and without fixed momentum. Consequently, valuable eigen-information about the Hessian is available as a by-product of running GD. By analyzing the dynamics of the gradients and iterates, we propose new step size strategies that improve the practical performance of GD.
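To make the equivalence concrete, here is a minimal numerical sketch (our illustration, not the speaker's code): for $f(x) = \frac{1}{2}x^\top A x - b^\top x$, fixed-step GD satisfies $g_{k+1} = (I - \alpha A)\,g_k$, so the gradients perform power iteration on $I - \alpha A$, and the ratio of successive gradient norms reveals an extreme eigenvalue of the Hessian. The test matrix, step size, and iteration count below are arbitrary choices.

```python
import numpy as np

# Minimal sketch (illustration only): fixed-step GD on
# f(x) = 0.5 x^T A x - b^T x has gradient recursion
#   g_{k+1} = (I - alpha A) g_k,
# i.e. the gradients run the power method on I - alpha A, so the ratio
# ||g_{k+1}|| / ||g_k|| converges to its dominant eigenvalue, exposing
# an extreme eigenvalue of the Hessian A as a by-product of GD.

rng = np.random.default_rng(0)
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
spectrum = np.linspace(1.0, 10.0, n)        # chosen eigenvalues of A
A = Q @ np.diag(spectrum) @ Q.T
b = rng.standard_normal(n)

alpha = 0.05                                 # fixed step, alpha < 2 / lambda_max
x = rng.standard_normal(n)
g = A @ x - b
for _ in range(500):
    x = x - alpha * g                        # gradient descent step
    g_new = A @ x - b
    ratio = np.linalg.norm(g_new) / np.linalg.norm(g)
    g = g_new

# here the dominant eigenvalue of I - alpha A is 1 - alpha * lambda_min(A)
print(f"estimated lambda_min(A): {(1.0 - ratio) / alpha:.4f}")
print(f"true      lambda_min(A): {spectrum.min():.4f}")
```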
2 — 16:50 — A Discretization Approach for Low-Dimensional Non-Convex Bilevel Optimization
Bilevel optimization has become an invaluable tool in areas such as hyper-parameter optimization, meta-learning, and reinforcement learning. However, most modern bilevel algorithms are developed for problems with strongly convex and/or unconstrained lower-level tasks. In contrast, our work deals with problems whose lower-level task is non-convex and constrained. To solve such a challenging problem, we propose a novel value-function-based approach in which the value function and the corresponding problem reformulation are discretized. Under this approach, the value function's optimization problem is transformed into an easier convex optimization task. We tackle the discretized problem using a penalty approach and establish the relation between the KKT points and the local and global minima of the two problems. To solve the penalty-reformulated problem, we develop a gradient-descent-based method and derive its convergence rate. Our experimental results confirm the effectiveness of the proposed method.
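As a generic, hedged illustration of this style of approach (not the authors' algorithm), the sketch below discretizes the lower-level feasible set of a toy one-dimensional non-convex bilevel problem, approximates the value function on that grid, and runs gradient descent on a penalty reformulation. The objectives $F$ and $f$, the grid, the penalty parameter, and the step size are all hypothetical choices.

```python
import numpy as np

# Generic illustration only (not the authors' algorithm): a value-function
# penalty reformulation of a toy 1-D non-convex bilevel problem,
#   min_{x,y} F(x, y)  s.t.  y in argmin_{y in Y} f(x, y),
# where the lower-level value function v(x) = min_{y in Y} f(x, y) is
# approximated by discretizing Y on a grid.  F, f, Y, rho, and the step
# size are all hypothetical choices for this sketch.

def F(x, y):                               # upper-level objective
    return (x - 1.0) ** 2 + (y - 1.0) ** 2

def f(x, y):                               # non-convex lower-level objective
    return np.cos(3.0 * y) + 0.5 * (y - x) ** 2

Y = np.linspace(-2.0, 2.0, 2001)           # discretized lower-level set

def v(x):                                  # discretized value function
    return f(x, Y).min()

def penalty(z, rho=10.0):                  # penalized value-function reformulation
    x, y = z
    return F(x, y) + rho * max(0.0, f(x, y) - v(x))

# crude finite-difference gradient descent on the penalty function
z = np.array([0.0, 0.0])
h, step = 1e-5, 1e-2
for _ in range(2000):
    grad = np.array([
        (penalty(z + [h, 0.0]) - penalty(z - [h, 0.0])) / (2.0 * h),
        (penalty(z + [0.0, h]) - penalty(z - [0.0, h])) / (2.0 * h),
    ])
    z = z - step * grad
    z[1] = np.clip(z[1], Y[0], Y[-1])      # keep y inside the discretized set

print("x, y =", z, "  penalty value =", penalty(z))
```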
3 — 17:20 — ** CANCELLED ** D-stationary points for optimization with least constraint violation and the convergence of an SQP method
We consider nonlinear optimization problems with the least $\ell_1$-norm measure of constraint violation and introduce the concepts of D-stationary points with the help of an exact penalty function. If these stationary points are feasible, they correspond to Fritz-John stationary points, KKT stationary points, and singular stationary points, respectively. We propose a new exact penalty SQP method with inner and outer iterations that converges to a D-stationary point and rapidly detects infeasibility without driving the penalty parameter to zero. It substantiates the commentary given in [SIAM J. Optim., 20 (2010), 2281--2299] and can be viewed as a supplement to the theory of nonlinear optimization concerning rapid infeasibility detection. Preliminary numerical results demonstrate that the proposed method is robust and efficient in solving infeasible nonlinear problems and a degenerate problem without the LICQ from the literature.
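As a rough illustration of the least-constraint-violation setting (a toy sketch, not the proposed SQP method), the following minimizes the $\ell_1$ exact penalty function $\phi(x;\rho) = f(x) + \rho\,\|c(x)^-\|_1$ for an infeasible problem using a simple subgradient scheme; the problem data, the penalty parameter $\rho$, and the step rule are arbitrary choices. In this example, the penalty minimizer attains the least $\ell_1$ violation while also minimizing $f$ among all least-violating points, and a fixed finite $\rho$ suffices.

```python
# Toy sketch (not the proposed SQP method): minimizing the l1 exact
# penalty phi(x; rho) = f(x) + rho * (l1 constraint violation) for the
# infeasible problem
#   min  x^2   s.t.  x >= 1  and  x <= -1.
# Every x in [-1, 1] attains the least l1 violation (= 2); among these,
# x = 0 also minimizes f, and in this toy example it minimizes phi for
# any fixed rho > 0, i.e. without driving rho to infinity.

def f(x):
    return x ** 2

def violation(x):                     # l1 measure of constraint violation
    return max(0.0, 1.0 - x) + max(0.0, x + 1.0)

def subgrad_phi(x, rho):              # one subgradient of phi at x
    g = 2.0 * x
    if x < 1.0:
        g -= rho                      # from the active max(0, 1 - x) term
    if x > -1.0:
        g += rho                      # from the active max(0, x + 1) term
    return g

rho, x = 10.0, 3.0                    # fixed, finite penalty parameter
for k in range(1, 5001):
    x -= (1.0 / k) * subgrad_phi(x, rho)   # diminishing-step subgradient method

print(f"x = {x:.4f}, l1 violation = {violation(x):.4f}, f(x) = {f(x):.4f}")
```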