1 — 14:00 — Projection onto a Capped Rotated Second-Order Cone with Applications to Sparse Regression Relaxations
Motivated by solving the continuous relaxation of mixed-integer nonlinear programs (MINLPs) with binary indicator variables, we develop a closed-form expression for the projection onto a capped rotated second-order cone. The rapid computation of this projection enables effective projection-based methods for solving the SOCP perspective relaxation of sparse optimization problems. As a proof of concept, we combine our projection with an accelerated projected gradient method (FISTA) and with ADMM to efficiently solve the perspective relaxation of a sparse regression problem. We also generalize the basic sparse regression formulation and solution method to group sparsity. Finally, we develop an ADMM algorithm that solves a nonsmooth unconstrained formulation known to be equivalent to the sparse regression perspective relaxation, and we extend this method to the group sparsity setting.
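A minimal sketch of the projection-based solver this abstract describes, assuming access to the closed-form projection as an oracle: the accelerated projected gradient (FISTA) loop below is generic, and `project_capped_rsoc` is a hypothetical placeholder for the projection derived in the talk, which is not reproduced here.

```python
# Accelerated projected gradient (FISTA) over a convex set C, given a
# projection oracle P_C. Here C would be the capped rotated second-order
# cone; the closed-form projection itself is a placeholder.
import numpy as np

def project_capped_rsoc(v):
    """Placeholder for the talk's closed-form projection onto the
    capped rotated second-order cone."""
    raise NotImplementedError

def fista(grad_f, L, x0, project=project_capped_rsoc, max_iter=500):
    """x_{k+1} = P_C(y_k - grad_f(y_k)/L) with Nesterov momentum."""
    x_prev = x0.copy()
    y = x0.copy()
    t = 1.0
    for _ in range(max_iter):
        x = project(y - grad_f(y) / L)                 # projected gradient step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)    # momentum extrapolation
        x_prev, t = x, t_next
    return x_prev
```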
2 — 14:30 — Inexact Stochastic Proximal Methods for Sparse Group Structure Identification
Sparse learning is widely used in machine learning and data analysis, as it enhances feature interpretability and eases model deployment. This paper studies the regularized stochastic sparse learning (RSSL) problem when the underlying group structures are known. Existing optimization methods for RSSL require periodically evaluating exact gradients or storing a history of stochastic gradients. Moreover, few existing results ensure consistent support identification across all iterations beyond a sufficiently large index; that is, the optimal group structures might be identified in some iterations but not in others, even once the index is sufficiently large. To address these challenges, we propose an inexact stochastic proximal algorithm, IS-PStorm, for RSSL that requires neither periodic evaluation of exact gradients nor extensive storage of history. Our algorithm ensures near-linear point convergence with an $O(1/k)$ variance reduction property, where $k$ denotes the number of iterations. Importantly, we establish that IS-PStorm identifies the group sparse structure in every iteration beyond a fixed threshold index with high probability. Numerical experiments on regularized logistic loss problems demonstrate that IS-PStorm outperforms existing methods on metrics measuring how efficiently an algorithm optimizes the objective and how robustly it identifies the optimal support.
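For illustration only, a sketch of the two ingredients the abstract names: a STORM-style momentum-based variance-reduced gradient estimate and a proximal step with a group-lasso regularizer. This is a generic stochastic proximal update under those assumptions, not the authors' IS-PStorm; `stoch_grad(x, batch)` is an assumed sampling oracle.

```python
import numpy as np

def prox_group_l2(v, groups, lam):
    """Blockwise soft-thresholding: prox of lam * sum_g ||v_g||_2."""
    out = v.copy()
    for g in groups:                       # g is an index array for one group
        norm = np.linalg.norm(v[g])
        out[g] = 0.0 if norm <= lam else (1.0 - lam / norm) * v[g]
    return out

def storm_prox_step(x, x_old, d_old, stoch_grad, batch, groups, lam, eta, beta):
    """d_k = g(x_k) + (1 - beta)(d_{k-1} - g(x_{k-1})) on the same minibatch,
    then x_{k+1} = prox_{eta*lam*||.||_{2,1}}(x_k - eta * d_k)."""
    d = stoch_grad(x, batch) + (1.0 - beta) * (d_old - stoch_grad(x_old, batch))
    x_new = prox_group_l2(x - eta * d, groups, eta * lam)
    return x_new, d
```

The group prox sets an entire block to zero when its norm falls below the threshold, which is what makes consistent group-support identification a meaningful property of the iterates.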
3 — 15:00 — On the power of greedy local search for sparse statistical problems
In this work, we analyze greedy local search dynamics for a series of sparse maximum likelihood estimation (MLE) problems, including sparse PCA and sparse regression. These MLE problems can naturally be modeled as integer programs. One interesting conclusion of our work is the provable suboptimality of greedy local search compared with other time-efficient methods, such as convex relaxations.
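To make the object of study concrete, here is a generic sketch of greedy local search for k-sparse least squares: keep a support of size k and repeatedly apply the best improving single swap until none exists. This is an illustration of the kind of dynamics analyzed, not necessarily the exact variant in the talk.

```python
import numpy as np

def rss(X, y, support):
    """Residual sum of squares after least-squares fit restricted to `support`."""
    beta, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
    r = y - X[:, support] @ beta
    return float(r @ r)

def greedy_local_search(X, y, k, max_rounds=100):
    p = X.shape[1]
    support = list(range(k))                 # arbitrary initial support of size k
    best = rss(X, y, support)
    for _ in range(max_rounds):
        improved = False
        for i in list(support):              # feature to remove
            for j in range(p):               # feature to add
                if j in support:
                    continue
                cand = [j if s == i else s for s in support]
                val = rss(X, y, cand)
                if val < best - 1e-12:
                    support, best, improved = cand, val, True
        if not improved:
            break                            # local optimum: no improving swap
    return sorted(support), best
```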
4 — 15:30 — Logic Rules and Chordal Graphs for Sparse Learning
We uncover a new relationship between sparse machine learning and chordal graphs, and develop a fast clique generation method to improve the performance of optimization algorithms for l0-regularized and l0-constrained machine learning models. We present novel logic rules that imply the simultaneous inclusion, exclusion, or ranking of pairs of features in any optimal solution. We describe the applicable logic rules, prove their validity, and give an algorithm that generates all maximal clique inequalities on the chordal conflict graph in O(p log p) time.
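A small sketch of the last step under stated assumptions: given pairwise conflicts between features (an edge meaning two features cannot both be selected, one form of the logic rules above), and assuming the resulting conflict graph is chordal, its maximal cliques yield inequalities of the form sum over the clique of z_i <= 1. How conflicts are detected and the O(p log p) generation scheme are specific to the paper; the sketch simply uses networkx's chordal-graph clique enumeration.

```python
import networkx as nx

def clique_inequalities(conflict_edges, num_features):
    """Return one inequality per maximal clique C: sum_{i in C} z_i <= 1."""
    G = nx.Graph()
    G.add_nodes_from(range(num_features))
    G.add_edges_from(conflict_edges)        # edge (i, j): features i, j conflict
    if not nx.is_chordal(G):
        raise ValueError("conflict graph is expected to be chordal")
    # For chordal graphs, all maximal cliques can be enumerated efficiently.
    return [sorted(c) for c in nx.chordal_graph_cliques(G) if len(c) > 1]

# Example: features 0, 1, 2 are mutually conflicting, giving z0 + z1 + z2 <= 1.
print(clique_inequalities([(0, 1), (1, 2), (0, 2)], 4))
```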