1 — 08:30 — Active Learning via Density-based Space Transformation
In this work, we study active covering, a variant of active learning in which the learner must find all of the positively labeled examples. We present an algorithm, the Density-Adjusted Adaptive (DAA) learner, which queries labels according to a similarity function based on a density-adjusted metric. Under mild assumptions, we prove that our algorithm discovers all of the positive labels while querying only a truly sublinear number of examples outside the support of the positive examples. Our result broadens the space of acceptable inputs by relaxing the assumptions made in previous work. Our experiments complement the theoretical result, showing that our algorithm significantly improves on prior work across a wide range of standard benchmark datasets. For example, when performance is measured by AUC, our algorithm is the best in $19$ out of $20$ experiments.
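As a rough illustration of what a density-adjusted query rule can look like (a hypothetical sketch, not the DAA learner itself; the `knn_density` estimate and the $1/\sqrt{\rho_i \rho_j}$ scaling are invented choices for this example), consider querying the unlabeled point whose density-adjusted distance to a known positive is smallest:

```python
import numpy as np

def knn_density(X, k=3):
    # Local density estimate: inverse of the mean distance to the
    # k nearest neighbors (hypothetical choice for illustration).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    knn = np.sort(d, axis=1)[:, :k]
    return 1.0 / (knn.mean(axis=1) + 1e-12)

def density_adjusted_query(X, labeled_pos, unlabeled, k=3):
    """Pick the unlabeled index whose density-adjusted distance to the
    nearest known positive is smallest (toy query rule)."""
    rho = knn_density(X, k)
    best, best_score = None, np.inf
    for i in unlabeled:
        for j in labeled_pos:
            # Distances shrink in dense regions: scale the Euclidean
            # distance down when both endpoints have high density.
            dist = np.linalg.norm(X[i] - X[j]) / np.sqrt(rho[i] * rho[j])
            if dist < best_score:
                best, best_score = i, dist
    return best
```

On a toy dataset with a positive cluster and a distant cluster, the rule prefers the unlabeled point sitting next to the known positive, which is the behavior one would want when hunting for all positives.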
2 — 09:00 — An exponential penalization, network decomposition algorithm for large LPs
Large linear programs, on the order of hundreds of millions or more variables and constraints, with large network-flow substructures, arise naturally in applications such as mine planning, joint network design and routing, and pricing, among many other contexts. Most of these instances are beyond the reach of standard commercial LP solvers, but specialized algorithms are making headway on this class of problems.
In this presentation we motivate the importance of this class of problems; give a short literature review of the area, covering both the theory and previous implementations and alternative algorithms; and present our current results on an implementation of an exponential-penalization, network-substructure-aware parallel algorithm, comparing it against both simplex and barrier methods on MIPLIB linear relaxations and on a set of large instances collected from different applications.
3 — 09:30 — Capacity management via network flows with nonlinear transformations at the nodes
We define and study a class of directed network-flow problems in which nonlinear transformations are applied at the nodes of the network. We discuss how, given an appropriately selected set of nonlinearities, the problem can be modeled with mixed-integer programming, and we speculate on alternative algorithmic approaches that could bear fruit. We motivate this class of problems through its connection to how Google manages its compute resources, showing how the abstraction serves as a useful substrate for modeling complex capacity management problems.
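To make the abstraction concrete, here is a toy evaluation of flow through a DAG whose nodes transform their inflow before forwarding it (the node functions, splitting fractions, and graph are invented for illustration, not the formulation from the talk):

```python
def propagate(order, transform, edges, external):
    """Evaluate flow through a DAG whose nodes apply a nonlinear
    transformation to their total inflow before splitting the result
    among outgoing edges.

    order:     node names in topological order
    transform: {node: f(inflow) -> outflow}, the nonlinearity at the node
    edges:     {node: [(successor, fraction), ...]}
    external:  {node: exogenous inflow}
    """
    inflow = {u: external.get(u, 0.0) for u in order}
    outflow = {}
    for u in order:
        outflow[u] = transform[u](inflow[u])
        for v, frac in edges.get(u, []):
            inflow[v] += frac * outflow[u]
    return outflow

# Example: node "a" applies a capacity cap (min with 5.0), so an
# inflow of 8.0 at the source is truncated before reaching the sink.
caps = {"s": lambda x: x, "a": lambda x: min(x, 5.0), "t": lambda x: x}
flows = propagate(["s", "a", "t"], caps,
                  {"s": [("a", 1.0)], "a": [("t", 1.0)]}, {"s": 8.0})
```

In a MIP model, piecewise-linear nonlinearities such as this cap would typically be encoded with auxiliary binary variables, and smooth nonlinearities would first be approximated piecewise-linearly.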
4 — 10:00 — ** CANCELLED ** Network-aware scheduling of large ML workloads
Given the scale of state-of-the-art machine-learning models, they are often trained in parallel across multiple hardware accelerators (GPUs or TPUs) that must communicate with one another. While smaller models can rely on dedicated links, larger models may need to route part of their traffic over the broader data center network, where congestion can stall training and waste resources. To reduce the chance of network congestion, we can optimize the assignment of accelerator resources with network conditions in mind, which can be formulated as a generalized quadratic assignment problem with capacity constraints and other requirements. We discuss a MIP-based approach and specialized heuristics for solving this problem from a practical perspective.
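A minimal sketch of the underlying assignment objective (a hypothetical toy, not the production formulation): jobs $i, j$ with pairwise traffic `comm[i][j]` are placed at locations whose network distance is `dist[p][q]`, subject to per-location capacity, and the quadratic cost sums traffic times distance over all job pairs:

```python
import itertools

def gqap_cost(assign, comm, dist):
    """Quadratic cost: total traffic between each pair of jobs, weighted
    by the network distance between their assigned locations."""
    n = len(assign)
    return sum(comm[i][j] * dist[assign[i]][assign[j]]
               for i in range(n) for j in range(n))

def feasible(assign, size, cap):
    """Capacity check: total size of jobs at each location within its cap."""
    used = [0] * len(cap)
    for i, loc in enumerate(assign):
        used[loc] += size[i]
    return all(u <= c for u, c in zip(used, cap))

def brute_force(n_jobs, n_locs, comm, dist, size, cap):
    """Exact solver by enumeration; only viable for tiny instances."""
    best, best_cost = None, float("inf")
    for assign in itertools.product(range(n_locs), repeat=n_jobs):
        if feasible(assign, size, cap):
            c = gqap_cost(assign, comm, dist)
            if c < best_cost:
                best, best_cost = assign, c
    return best, best_cost
```

With two heavily communicating jobs, generous capacity lets both sit at one location at zero cost, while tight capacity forces them apart and incurs the cross-location traffic; a real instance would of course go to a MIP solver or specialized heuristics rather than enumeration.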