108:30 — Stochastic gradient methods with adaptive learning rates

It is known that the standard stochastic gradient descent (SGD) method, but also more sophisticated adaptive stochastic optimization methods such as the Adam optimizer can in general not converge if the learning rates do not tend to zero. In practice, one often deals with this problem by employing a deterministic decreasing learning rate schedule. Nevertheless, the optimal learning rates highly depend on the considered optimization problem and are generally difficult to find. In this talk we propose a learning-rate-adaptive approach that updates the learning rate dynamically during the training process depending on the past behavior of the loss function. For a particular class of strongly convex target functions, using results from martingale and Markov chain theory, we prove convergence of the SGD method for such adaptive learning rate sequences. Additionally, we illustrate the performance of the proposed approach for several optimization problems arising in scientific machine learning such as data-driven solutions of partial differential equations.

209:00 — A Tractable Online Learning Algorithm for the Multinomial Logit Contextual Bandit

The talk will be primarily on the contextual variant of the MNL-Bandit problem. More specifically, we consider a dynamic set optimization problem, where in every round, a decision maker offers a subset (assortment) of products to a consumer and observes their response. Consumers purchase products to maximize their utility. We assume that a set of attributes describes the products, and the mean utility of a product is linear in the values of these attributes. We model consumer choice behavior using the widely used Multinomial Logit (MNL) model and consider the decision maker’s problem of dynamically learning the model parameters while optimizing cumulative revenue over the selling horizon $T$. Though this problem has recently attracted considerable attention, many existing methods often involve solving an intractable non-convex optimization problem, and their theoretical performance guarantees depend on a problem-dependent parameter, which could be prohibitively large. In particular, existing algorithms for this problem have regret bounded by $O(\sqrt{\kappa d T})$, where $\kappa$ is a problem-dependent constant that can have exponential dependency on the number of attributes. In this paper, we propose an optimistic algorithm and show that the regret is bounded by $O(\sqrt{dT} + \kappa)$, significantly improving the performance over existing methods. Further, we propose a convex relaxation of the optimization step, allowing for tractable decision-making while retaining the favorable regret guarantee.

309:30 — Smoothness-Adaptive Dynamic Pricing with Nonparametric Demand Learning

We study the dynamic pricing problem where the demand function is nonparametric and H\"older smooth, and we focus on adaptivity to the unknown H\"older smoothness parameter $\beta$. Traditional algorithm relies on the knowledge of $\beta$ to achieve a minimax optimal regret of $\widetilde{O}(T^{\frac{\beta+1}{2\beta+1}})$. We firstly prove no policy can adaptively achieve the optimal regret without knowledge of $\beta$. Motivated by the impossibility result, we propose a self-similarity condition to enable adaptivity, which does not compromise the problem's inherent complexity. Furthermore, we develop a smoothness-adaptive algorithm and prove that the algorithm achieves this optimal regret bound without the knowledge $\beta$.

410:00 — A Benders Decomposition Algorithm for the Overland Search and Rescue Problem Considering Search Area Selection

  • Saeid Abbasiparizi, Ph.D. Student At Department of Operations And Decision Systems, Université Laval, Québec, Canada
  • Michael Morin, Université Laval
  • Irène Abi Zeid, Full Professor At Department of Operations And Decision Systems, Université Laval

Search and rescue incidents occur frequently in Canada, some requiring the deployment of major search operations. In such situations, search and rescue decision support systems are vital. In this regard, an overland search and rescue model considering search area selection is proposed to maximize the probability of finding the search object, constrained by the search effort available. A solution to our problem includes the optimal set of search areas (rectangles where to position search patterns), the path between them, and the search pattern within each rectangle. We developed a Benders decomposition to solve the proposed model and compare its performance, for different instance sizes, to that of a MILP model in CPLEX. The obtained results confirm the efficiency of our Benders decomposition.