All Hands Meetings on Big Data Optimization - Semester 1, 2018-2019
Venue: Al Khwarizmi Building, KAUST, ROOM: 2107 (2nd floor). Time: Sundays 12:00 - 13:30 (lunch provided)
Date | Speaker | Paper |
December 16, 2018 | | |
December 9, 2018 | No meeting (exams) | |
December 2, 2018 | No meeting (NIPS) | |
November 25, 2018 | | |
November 18, 2018 | | |
November 11, 2018 | | |
November 4, 2018 | | |
October 28, 2018 | | |
October 21, 2018 | | |
October 14, 2018 | | |
October 7, 2018 | | |
September 30, 2018 | Dmitry Kovalev | Update on ongoing research work |
September 23, 2018 | No meeting (Saudi National Day) | |
September 16, 2018 | | |
September 9, 2018 | Alibek Sailanbayev | Optimization of composition of functions |
September 6, 2018 | Samuel Horváth | Stochastic nested variance reduction for nonconvex optimization (Zhou, Xu, Gu - 6/2018) |
August 30, 2018 | Sarah Sachs | Generalizations of Jacobian sketching (summary of research work done during 6 months of internship at KAUST) |
Organizers: Filip Hanzely, Aritra Dutta and Peter Richtárik
All Hands Meetings on Big Data Optimization - Semester 2, 2017-2018
Venue: Al Khwarizmi Building, KAUST, ROOM: 2107 (2nd floor). Time: Tuesdays 12:00 - 13:30 (lunch provided)
Date | Speaker | Paper |
May 27, 2018 | El Houcine Bergou | A line search algorithm inspired by the adaptive cubic regularization framework and complexity analysis (Bergou, Diouane, Gratton - 5/2018) |
May 6, 2018 | Matthias Mueller | Optimization for deep learning |
April 29, 2018 | Samuel Horváth | Second order stochastic optimization for machine learning in linear time (Agarwal, Bullins, Hazan - JMLR 2017) |
April 15, 2018 | Filip Hanzely | On the convergence of Adam and beyond (Reddi, Kale, Kumar - ICLR 2018) |
April 8, 2018 | Konstantin Mishchenko | A simple practical accelerated method for finite sums (Defazio - NIPS 2016) |
March 25, 2018 | Adel Bibi | Analytic expressions for probabilistic moments of PL-DNN with Gaussian input |
March 18, 2018 | Konstantin Mishchenko | Penalty formulation for constrained optimization |
March 11, 2018 | Alibek Sailanbayev | SignSGD: Compressed optimization for non-convex problems (Bernstein, Wang, Azizzadenesheli, Anandkumar - ICML 2018) |
March 4, 2018 | Samuel Horváth | Fast incremental method for nonconvex optimization (Reddi, Sra, Poczos, Smola - 3/2016) |
February 27, 2018 | El Houcine Bergou | Random direct search method for unconstrained minimization |
February 20, 2018 | Filip Hanzely | The implicit bias of gradient descent on separable data (Soudry, Hoffer, Nacson, Gunasekar, Srebro - 10/2017) |
February 13, 2018 | Nicolas Loizou | Random inexact projection methods |
Organizers: Filip Hanzely, Aritra Dutta and Peter Richtárik
All Hands Meetings on Big Data Optimization - Semester 1, 2017-2018
Venue: Al Khwarizmi Building, KAUST, ROOM: 2107 (2nd floor). Time: Tuesdays 12:00 - 13:30 (lunch provided)
Date | Speaker | Paper |
December 5, 2017 | Konstantin Mishchenko | SARAH: A novel method for machine learning problems using stochastic recursive gradient (Nguyen, Liu, Scheinberg, Takac - ICML 2017) |
November 28, 2017 | Filip Hanzely | Relative continuity for non-Lipschitz non-smooth convex optimization using stochastic (or deterministic) mirror descent (Lu - 10/2017) |
November 21, 2017 | Robert Gower | SAGA is a variant of stochastic gradient: new view and new proof |
November 14, 2017 | Nicolas Loizou | First-order adaptive sample size methods to reduce complexity of empirical risk minimization (Mokhtari, Ribeiro - 9/2017) |
November 7, 2017 | Konstantin Mishchenko | Proximal-proximal-gradient method (Ryu, Yin - 8/2017) |
October 31, 2017 | Nikita Doikov | Regularized Newton methods for minimizing functions with Hölder continuous Hessians (Grapiglia, Nesterov - SIOPT 2017); Cubic regularization of Newton method and its global performance (Nesterov, Polyak - MAPR 2006) |
October 24, 2017 | Viktor Lukáček | Dykstra's algorithm with Bregman projections: a convergence proof (Bauschke, Lewis - Optimization 1998) |
October 17, 2017 | Sebastian Stich | Approximate steepest coordinate descent (Stich, Raj, Jaggi - ICML 2017) |
October 10, 2017 | Alibek Sailanbayev | Breaking locality accelerates block Gauss-Seidel (Tu, Venkataraman, Wilson, Gittens, Jordan, Recht - ICML 2017) |
October 3, 2017 | Konstantin Mishchenko | An asynchronous distributed prox-grad algorithm (Mishchenko, Iutzeler, Malick - 2017) |
September 26, 2017 | Konstantin Mishchenko | An asynchronous distributed prox-grad algorithm (Mishchenko, Iutzeler, Malick - 2017) |
September 19, 2017 | Aritra Dutta | Self-occlusion and disocclusion in causal video object segmentation (ICCV 2015) |
September 12, 2017 | Filip Hanzely | Randomized methods for relative smooth optimization (Hanzely, Richtarik - 2017) |
August 29, 2017 | Filip Hanzely | Relatively-smooth convex optimization by first-order methods, and applications (Lu, Freund and Nesterov - 10/2016) |
August 22, 2017 | Aritra Dutta | A Batch-Incremental Video Background Estimation Model using Weighted Low-Rank Approximation of Matrices (Dutta, Li and Richtárik - 7/2017) |
Organizers: Filip Hanzely, Aritra Dutta and Peter Richtárik
All Hands Meetings on Big Data Optimization - Semester 2, 2016-2017
Venue: James Clerk Maxwell Building, ROOM: JCMB 5323 (5th floor). Time: Tuesdays 12:15 - 13:30 (lunch provided)
We thankfully acknowledge support from the Head of School of Mathematics and the Center for Doctoral Training in Data Science
Date | Speaker | Paper |
March 14, 2017 | Filip Hanzely | Finding Approximate Local Minima for Nonconvex Optimization in Linear Time (Agarwal, Allen-Zhu, Bullins, Hazan, and Ma - 11/2016) |
March 7, 2017 | Marcelo Pereyra | Efficient Bayesian computation by proximal Markov chain Monte Carlo: when Langevin meets Moreau (Durmus, Moulines and Pereyra - 12/2016) |
February 28, 2017 | Jakub Konečný | QSGD: Randomized quantization for communication-optimal stochastic gradient descent (Alistarh, Li, Tomioka and Vojnovic - 10/2016) |
February 21, 2017 | Nicolas Loizou | Global convergence of the Heavy-ball method for convex optimization (Ghadimi, Feyzmahdavian and Johansson - 12/2014) |
February 14, 2017 | Kostas Zygalakis | A differential equation for modeling Nesterov's accelerated gradient method: theory and insights (Su, Boyd and Candes - NIPS 2014) |
February 7, 2017 | László A. Végh (LSE) | Rescaled first-order methods for linear programming (Dadush, Végh and Zambelli - 11/2016) |
January 31, 2017 | Filip Hanzely | Relatively smooth convex optimization by first-order methods, and applications (Lu, Freund and Nesterov - 10/2016) |
January 24, 2017 | Ion Necoara (Bucharest) | Linear convergence of first order methods for non-strongly convex optimization (Necoara, Nesterov and Glineur - 4/2015) |
January 17, 2017 | Armin Eftekhari (The Alan Turing Institute) | The alternating descent conditional gradient method for sparse inverse problems (Boyd, Schiebinger and Recht - 7/2015) |
Organizers: Nicolas Loizou and Peter Richtárik
All Hands Meetings on Big Data Optimization - Semester 1, 2016-2017
Venue: James Clerk Maxwell Building, ROOM: JCMB 6207 (6th floor). Time: Tuesdays 12:15 - 13:30 (lunch provided: thanks to the support of the Head of School)
Date | Speaker | Paper |
December 13, 2016 | Panos Parpas | Using variational techniques to understand accelerated methods (Wibisono, Wilson and Jordan - 3/2016) |
December 6, 2016 | No meeting (NIPS) | |
November 29, 2016 | Iain Murray | Fitting real-valued conditional distributions. Abstract: Neural networks can be used for regression: given an input x, guess the output y. The standard optimization task is to minimize some regularized measure of mismatch between guesses and observed training outputs. Neural networks can also express their own uncertainty. For example, we can fit two functions, a guess m(x) and an "error-bar" s(x), by maximizing the total log probability of training outputs under a Gaussian model: \sum_n log N(y_n; m(x_n), s(x_n)^2). Fitting functions representing Gaussian outputs by stochastic steepest descent can be hard: the gradients of the loss with respect to the mean depend strongly on the standard deviation, making it hard to adapt step sizes. Moving beyond the Gaussian assumption, we might represent p(y|x) with a mixture of Gaussians, or with quantiles. For multivariate y we can use multivariate Gaussians or RNADE. Gaussians are also fitted in stochastic variational inference, sometimes with diagonal covariances, sometimes low-rank + diagonal. We are able to optimize all these things to some extent, but it is harder than for conventional neural networks, which hinders widespread adoption of the methods. Relevant papers: Mixture Density Networks (MDNs), Multivariate MDN, RNADE, Bayesian MDN, matrix manifold optimization for Gaussian mixtures |
November 22, 2016 | Lukasz Szpruch | An analytical framework for a consensus-based global optimization method (Carrillo, Choi, Totzek and Tse - 1/2016) |
November 15, 2016 | Dominik Csiba | Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure (Bietti and Mairal - 10/2016) |
November 8, 2016 | Aretha Teckentrup | Large-scale Gaussian process regression via doubly stochastic gradient descent (Yan, Xie, Song and Boots - 2015) |
November 1, 2016 | Filip Hanzely | Variance reduction for faster non-convex optimization (Allen-Zhu and Hazan - 3/2016) |
October 25, 2016 | Dominik Csiba | Linear coupling: an ultimate unification of gradient and mirror descent (Allen-Zhu and Orecchia - 1/2015) |
October 18, 2016 | Jakub Konečný | Train faster, generalize better: Stability of stochastic gradient descent (Hardt, Recht and Singer - 7/2016) |
October 11, 2016 | Nicolas Loizou | Convergence rates for greedy Kaczmarz algorithms, and faster randomized Kaczmarz rules using the orthogonality graph (Nutini, Sepehry, Laradji, Schmidt, Koepke, Virani - UAI 2016); supplementary material; poster |
October 4, 2016 | Jakub Konečný | Differentially private empirical risk minimization (Chaudhuri, Monteleoni, Sarwate - JMLR 2011) |
September 27, 2016 | Dominik Csiba | Online ad allocation via online optimization (Jenatton, Huang, Csiba and Archambeau - 6/2016) |
Organizers: Dominik Csiba and Peter Richtárik
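The step-size difficulty described in Iain Murray's November 29 abstract above stems from the objective \sum_n log N(y_n; m(x_n), s(x_n)^2). A minimal NumPy sketch (illustrative only; the function names are not from the talk) of the negative log-likelihood and its gradient with respect to the mean makes the coupling explicit:

```python
import numpy as np

def gaussian_nll(y, m, s):
    """Negative total log-likelihood -sum_n log N(y_n; m_n, s_n^2)
    for predicted means m and standard deviations s."""
    return 0.5 * np.sum(np.log(2.0 * np.pi * s**2) + (y - m) ** 2 / s**2)

def grad_mean(y, m, s):
    """Gradient of the NLL with respect to the mean: (m - y) / s^2.
    Its scale depends on s, which is the step-size adaptation
    difficulty mentioned in the abstract."""
    return (m - y) / s**2
```

Because the mean-gradient is scaled by 1/s^2, it blows up where s is small and vanishes where s is large, so no single global SGD step size suits both regimes when m(x) and s(x) are fitted jointly.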
All Hands Meetings on Big Data Optimization - Semester 2, 2015-2016
Venue: James Clerk Maxwell Building, ROOM: JCMB 4312 (4th floor). Time: Tuesdays 12:15 - 13:30 (lunch provided: thanks to the support of the Head of School)
Date | Speaker | Paper |
May 3, 2016 | JC Pesquet (Paris) | A stochastic majorize-minimize subspace algorithm with application to filter identification (Chouzenoux and Pesquet - 12/2015) |
April 26, 2016 | Robert M Gower | Open-ended research discussion on the topic: "Newton-type methods for solving the empirical risk minimization problem" |
April 19, 2016 | Haihao Lu (MIT) | Norm-free methods |
April 12, 2016 | Sebastian Stich (CORE) | A simple, combinatorial algorithm for solving SDD systems in nearly-linear time (Kelner, Orecchia, Sidford, Allen-Zhu - 1/2013) |
April 5, 2016 | No meeting (Easter) | |
March 29, 2016 | No meeting (Easter) | |
March 22, 2016 | Nicolas Loizou | Second order stochastic optimization in linear time (Agarwal, Bullins and Hazan - 2/2016) |
March 15, 2016 | Robert M Gower | Sub-sampled Newton methods I: globally convergent algorithms (Roosta-Khorasani and Mahoney - 1/2016) |
March 8, 2016 | No meeting (I am in Oberwolfach...) | |
March 1, 2016 | Dominik Csiba | Local smoothness in variance-reduced optimization (Vainsencher, Liu and Zhang - NIPS 2015) |
February 23, 2016 | Jaroslav Fowkes | Submodular function maximization (based on a survey of Krause and Golovin 2012) |
February 16, 2016 | Jakub Konečný | Taming the wild: a unified analysis of Hogwild!-style algorithms (De Sa, Zhang, Olukotun, Re - NIPS 2015) |
February 9, 2016 | No meeting (Dominik, Jakub, Robert and I will be in Les Houches) | |
February 2, 2016 | Nicolas Loizou | Randomized gossip algorithms (Boyd, Ghosh, Prabhakar and Shah - IEEE Transactions on Information Theory 2006; and Dimakis, Kar, Moura, Rabbat and Scaglione - Proceedings of the IEEE) |
January 26, 2016 | Jakub Konečný | On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants (Reddi, Hefny, Sra, Poczos and Smola - NIPS 2015) |
Organizers: Jakub Konečný and Peter Richtárik
All Hands Meetings on Big Data Optimization - Semester 1, 2015-2016
Venue: James Clerk Maxwell Building, ROOM: JCMB 6311 (6th floor). Time: 12:15 - 13:15 (lunch provided: thanks to the support of the Head of School)
Date | Speaker | Paper |
November 24, 2015 | Nick Polydorides | A quasi Monte Carlo method for large-scale inverse problems (Polydorides, Wang & Bertsekas - 2012); more resources: [regression, inverse, DP chapter] |
November 17, 2015 | Ran Zhang | Path-following methods (Chapter 5 of Wright's "Primal-dual interior-point methods" book) |
November 10, 2015 | Jakub Konečný | Why random reshuffling beats stochastic gradient descent (Gurbuzbalaban, Ozdaglar and Parrilo - 10/2015) |
November 3, 2015 | Nicolas Loizou | Stochastic gradient descent, weighted sampling and the randomized Kaczmarz algorithm (Needell, Srebro and Ward - 10/2013) |
October 27, 2015 | Dominik Csiba | A universal catalyst for first-order optimization (Lin, Mairal & Harchaoui - 6/2015) |
October 20, 2015 | No meeting | |
October 13, 2015 | Robert M Gower | Convergence rates of sub-sampled Newton methods (Erdogdu & Montanari - 8/2015) |
October 6, 2015 | Robert M Gower | Newton sketch (Pilanci & Wainwright - 5/2015) |
September 29, 2015 | Dominik Csiba | Beyond convexity: stochastic quasi-convex optimization (Hazan, Levy and Shalev-Shwartz - 7/2015) |
September 22, 2015 | Jakub Konečný | Communication Complexity of Distributed Convex Learning and Optimization (Arjevani and Shamir - 6/2015) |
Organizers: Jakub Konečný and Peter Richtárik
All Hands Meetings on Big Data Optimization - Semester 2, 2014-2015
Venue: James Clerk Maxwell Building, ROOM: JCMB 4312 (4th floor). Time: 12:15 - 13:15 (lunch provided: thanks to the support of the Head of School)
Date | Speaker | Paper |
May 19, 2015 | Ian Wallace | HELM: Holomorphic Embedding Load flow Method (papers: 1 and 2 ) |
May 12, 2015 | Andreas Grothey | Contingency generation for AC optimal power flow (Chiang and Grothey - 2012 [Optimization Online]) |
May 5, 2015 | No meeting due to Optimization and Big Data 2015 | |
April 28, 2015 | Zheng Qu | On lower and upper bounds for smooth and strongly convex optimization problems (Arjevani, Shalev-Shwartz and Shamir - 3/2015) |
April 21, 2015 | Alessandro Perelli | Combining ordered subsets and momentum for accelerated X-ray CT image reconstruction (Kim, Ramani and Fessler - 1/2015, IEEE link) |
April 14, 2015 | Robert Gower | Research discussion |
April 7, 2015 | No meeting due to Easter Break | |
March 31, 2015 | Dominik Csiba | Stochastic Dual Coordinate Ascent (SDCA): A Dual-Free Analysis (Shai Shalev-Shwartz - 2/2015) |
March 24, 2015 | Jakub Konečný | Greedy coordinate descent vs randomized coordinate descent |
March 17, 2015 | Tom Mayo and Guido Sanguinetti | Challenges for predictive modelling in high-throughput biology (papers: [1] and [2]) |
March 10, 2015 | Zheng Qu | Complexity bounds for primal-dual methods minimizing the model of objective function (Nesterov - 2/2015) |
March 3, 2015 | Kimon Fountoulakis | Randomized numerical linear algebra meets big data optimization (Yang, Chow, Re and Mahoney - 2/2015 and Yang, Meng and Mahoney - 2/2015) |
February 24, 2015 | Robert M. Gower | Action constrained quasi-Newton methods (Gower and Gondzio - 12/2014) |
February 17, 2015 | no meeting due to Innovative Learning Week | |
February 10, 2015 | Chris Williams | Linear dynamical systems applied to condition monitoring (papers [1] and [2]). Abstract: We develop a Hierarchical Switching Linear Dynamical System (HSLDS) for the detection of sepsis in neonates in an intensive care unit. The Factorial Switching LDS (FSLDS) of Quinn et al. (2009) is able to describe the observed vital signs data in terms of a number of discrete factors, which have either physiological or artifactual origin. We demonstrate that by adding a higher-level discrete variable with semantics sepsis/non-sepsis we can detect changes in the physiological factors that signal the presence of sepsis. We demonstrate that the performance of our model for the detection of sepsis is not statistically different from the auto-regressive HMM of Stanculescu et al. (2013), despite the fact that their model is given "ground truth" annotations of the physiological factors, while our HSLDS must infer them from the raw vital signs data. Joint work with Ioan Stanculescu and Yvonne Freer. |
February 3, 2015 | Jakub Konečný | Communication efficient distributed optimization using an approximate Newton-type method (Shamir, Srebro and Zhang - 12/2013) |
January 27, 2015 | Zheng Qu | A lower bound for the optimization of finite sums (Agarwal and Bottou - 10/2014) |
January 20, 2015 | Ilias Diakonikolas | Algorithms in Statistics (papers: long version [1] and short version [2]). Blurb: A broad class of big data – such as those collected from financial transactions, seismic measurements, neurobiological measurements, sensor nets, or network traffic records – is best modeled as samples from a probability distribution over a very large domain. One of the most basic statistical inference tasks in this setting is this: learn the underlying distribution that generated the data. |
Organizers: Jakub Konečný, Zheng Qu and Peter Richtárik
All Hands Meetings on Big Data Optimization - Semester 1, 2014-2015
Venue: James Clerk Maxwell Building, ROOM: 6311 (6th floor). Time: Tuesdays, 12:15 - 13:15 (lunch provided: thanks to NAIS)
Date | Speaker | Paper |
December 2, 2014 | Charles Sutton | Optimization in Modern Machine Learning: Four Vignettes (Exploratory data analysis: Mining transaction data, Unsupervised learning in neural networks, Signal disaggregation: Understanding household energy usage, Sampling from high dimensional distributions using continuous relaxations) (papers: [1] [2] [3] ) |
November 25, 2014 | Dominik Csiba | Iterative Hessian sketch: fast and accurate solution approximation for constrained least-squares (based on Pilanci and Wainwright - 11/2014) |
November 18, 2014 | Xavier Cabezas | Cycle bases in network synchronization problems (based on [1, 2, 3]) |
November 11, 2014 | Zheng Qu | Large-scale randomized-coordinate descent methods with non-separable linear constraints (Reddi, Hefny, Downey, Dubey and Sra - 10/2014) |
November 4, 2014 | Ademir Ribeiro | Towards a direct search method with adaptive directions/geometry (Ademir will describe some challenges of his ongoing research in the area; paper to read: Konecny and Richtarik - 09/2014) |
October 28, 2014 | Amos Storkey | Machine learning markets (abstract) |
October 21, 2014 | Dominik Csiba | A stochastic PCA algorithm with an exponential convergence rate (Shamir - 09/2014) |
October 14, 2014 | Jakub Konecny | Parallelism in optimization (this is a brainstorming session about the limits of parallelism in optimization and is not based on any papers) |
October 7, 2014 | Robert Gower | A stochastic quasi-Newton method for large-scale optimization (Byrd, Hansen, Nocedal and Singer - 2014) |
September 30, 2014 | Jakub Konecny | Trade-offs of large scale learning (papers: 1 - Bottou and Bousquet, 2 - Bottou and Bousquet, 3 - Bottou) |
September 23, 2014 | Zheng Qu | SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives (Defazio, Bach and Lacoste-Julien - 2014) |
September 16, 2014 | Kimon Fountoulakis | Robust block coordinate descent (Fountoulakis and Tappenden - 2014) |
Organizers: Jakub Konečný, Zheng Qu and Peter Richtárik
All Hands Meetings on Big Data Optimization - Semester 2, 2013-2014
Venue: James Clerk Maxwell Building, NEW ROOM: 4312 (4th floor). Time: Tuesdays, 12:15 - 13:15 (refreshments provided: thanks to NAIS)
Organizers: Jakub Konečný and Peter Richtárik