1 — 14:00 — A Recursive Multilevel Algorithm for Deep Learning
As the use cases for neural networks become increasingly complex, the networks themselves must become deeper and more intricate to keep up, which calls for more efficient learning algorithms. Multilevel methods, traditionally used to solve differential equations on a hierarchy of discretizations, offer the potential to reduce this computational effort.
In this talk, we combine both concepts and introduce a multilevel stochastic gradient descent algorithm that accelerates learning by exploiting a hierarchy of coarser levels. A gradient correction term is needed to establish first-order consistency between the levels.
We discuss convergence of the method in the case of a deterministic gradient correction, as well as in the case of a stochastic gradient correction under additional conditions, including step size regularization and an angle condition.
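As an illustration of the two-level mechanics, the following sketch applies a gradient correction for first-order consistency together with an angle-condition safeguard to a generic fine/coarse model pair; the names grad_fine, grad_coarse, R, P and all step-size choices are assumptions made for this example, not the authors' implementation.

    import numpy as np

    def multilevel_sgd_step(w_fine, grad_fine, grad_coarse, R, P,
                            lr=0.1, coarse_steps=5, angle_eta=0.1):
        """One illustrative two-level step (a sketch, not the authors' code).
        grad_fine / grad_coarse return (stochastic) gradients on the fine and
        coarse level, R and P are restriction / prolongation matrices."""
        g_fine = grad_fine(w_fine)
        w_c0 = R @ w_fine
        # first-order consistency: the correction nu makes the corrected coarse
        # gradient match the restricted fine gradient at the restricted iterate
        nu = R @ g_fine - grad_coarse(w_c0)
        w_c = w_c0.copy()
        for _ in range(coarse_steps):
            w_c -= lr * (grad_coarse(w_c) + nu)
        d = P @ (w_c - w_c0)                   # prolongated coarse correction
        # angle condition: accept the coarse direction only if it is a
        # sufficiently good descent direction for the fine-level gradient
        if -d @ g_fine >= angle_eta * np.linalg.norm(d) * np.linalg.norm(g_fine):
            return w_fine + d
        return w_fine - lr * g_fine            # fall back to a plain SGD step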
To demonstrate the usefulness of our approach, we apply it to residual neural networks for image classification. The resolution of the images is used to generate data sets of varying complexity, which in turn define a hierarchy of neural networks with a decreasing number of variables; we also construct the corresponding prolongation and restriction operators. Numerical results are presented.
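A resolution-based data hierarchy of this kind can be illustrated with average-pooling restriction and nearest-neighbour prolongation on the images; this is a minimal sketch under the assumption of square images with sides divisible by the factor, and the paper's actual operators act on the network variables and may differ.

    import numpy as np

    def restrict_images(images, factor=2):
        """Average-pool a batch of square images (N, H, W) by an integer
        factor to obtain a coarser-resolution data set (illustrative only)."""
        n, h, w = images.shape
        return images.reshape(n, h // factor, factor,
                              w // factor, factor).mean(axis=(2, 4))

    def prolong_images(images, factor=2):
        """Nearest-neighbour prolongation back to the finer resolution."""
        return images.repeat(factor, axis=1).repeat(factor, axis=2)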
2 — 14:30 — Improved variance reduction extragradient method with line search for stochastic variational inequalities
In this paper, we investigate numerical methods for solving stochastic variational inequalities. Using a line search scheme, we propose an improved variance-based stochastic extragradient method with different step sizes in the prediction and correction steps. The range of correction step sizes that guarantees convergence is also given. For the initial line search step size of each iteration, an adaptive method is adopted: rather than reducing the step size by the same factor each time, a proportional reduction related to the problem is used to satisfy the line search criterion. Under the assumptions of a Lipschitz continuous, pseudo-monotone operator and independent and identically distributed sampling, the iteration complexity and the oracle complexity are obtained. When estimating the upper bound of the second-order moment of the martingale difference sequence, we give a more convenient and comprehensible proof that avoids the Burkholder-Davis-Gundy inequality. The proposed algorithm is applied to fractional programming problems and the $l_2$ regularized logistic regression problem, and the numerical results demonstrate its superiority.
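The following sketch shows a generic extragradient loop of the kind described above, with a backtracking line search that shrinks the prediction step size by a proportional (measured) factor and a correction step using a different step size; F, proj and all constants are placeholders for this illustration, not the paper's notation or exact scheme.

    import numpy as np

    def stochastic_extragradient(F, proj, x0, tau0=1.0, sigma_ratio=0.5,
                                 theta=0.9, max_iter=100, tau_min=1e-8):
        """Illustrative extragradient method for a variational inequality with
        a stochastic operator estimate F(x) and a projection proj onto the
        feasible set."""
        x, tau = np.asarray(x0, dtype=float), tau0
        for _ in range(max_iter):
            g = F(x)
            y = proj(x - tau * g)                        # prediction step
            gy = F(y)
            # backtracking line search on the prediction step size tau
            while (tau * np.linalg.norm(gy - g) > theta * np.linalg.norm(y - x)
                   and tau > tau_min):
                # proportional reduction: shrink tau by the measured violation
                ratio = (theta * np.linalg.norm(y - x)
                         / (tau * np.linalg.norm(gy - g) + 1e-12))
                tau *= max(0.1, ratio)
                y = proj(x - tau * g)
                gy = F(y)
            x = proj(x - sigma_ratio * tau * gy)         # correction step, different step size
            tau = min(tau0, tau / theta)                 # adaptive initial step for next iteration
        return x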
3 — 15:00 — Differentially Private Federated Learning via Inexact ADMM with Multiple Local Updates
Differential privacy is a data privacy protection technique that can be applied to a federated learning model; it is designed to maximize the protection of individual privacy when data are analyzed and processed. However, while ensuring strong data privacy, differential privacy hinders achieving higher learning performance. We propose an inexact alternating direction method of multipliers (ADMM) algorithm with multiple local updates for federated learning. Random noise drawn from a Gaussian distribution with time-varying variance, or from a Laplace distribution, is added to the outputs of the subproblems at each iteration. We show that our algorithm provides (ϵ, δ)-differential privacy and ϵ-differential privacy at each iteration, respectively, where ϵ is the privacy budget controlled by the user and the privacy parameter δ is the probability of failing to meet the approximate differential privacy definition. The convergence properties and convergence rates are analyzed under the conditions that the objective function is smooth convex, nonsmooth convex, or strongly convex. Using the MNIST data set for image classification, we show that our algorithm achieves higher accuracy and a faster convergence rate than existing DP algorithms at the same data privacy level. The tradeoff between classification accuracy and privacy protection is demonstrated numerically.
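A minimal sketch of this kind of noisy local update is given below, assuming a client-side gradient oracle grad_local and an augmented-Lagrangian penalty rho; the names, the noise schedules, and the omission of clipping or sensitivity calibration are simplifications for illustration, not the paper's algorithm.

    import numpy as np

    def private_local_update(w_local, grad_local, w_global, dual, rho, lr,
                             local_steps, t, mechanism="gaussian",
                             sigma0=1.0, decay=0.95, laplace_scale=1.0,
                             rng=np.random.default_rng()):
        """Illustrative private local step of an inexact-ADMM-style federated
        update.  The client runs a few gradient steps on its augmented-
        Lagrangian subproblem, then perturbs the output before communication:
        Gaussian noise with time-decaying variance (for (eps, delta)-DP) or
        Laplace noise (for eps-DP)."""
        w = w_local.copy()
        for _ in range(local_steps):                 # multiple inexact local updates
            g = grad_local(w) + dual + rho * (w - w_global)
            w -= lr * g
        if mechanism == "gaussian":
            noise = rng.normal(0.0, sigma0 * decay**t, size=w.shape)
        else:
            noise = rng.laplace(0.0, laplace_scale, size=w.shape)
        return w + noise                             # noisy subproblem output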