Peter Richtarik

All Hands Meetings on Big Data Optimization - Semester 1, 2018-2019

Venue: Al Khwarizmi Building, KAUST, ROOM: 2107 (2nd floor)
Time: Sundays 12:00 - 13:30 (lunch provided)

Date	Speaker	Paper
December 16, 2018
December 9, 2018	No meeting (exams)
December 2, 2018	No meeting (NIPS)
November 25, 2018
November 18, 2018
November 11, 2018
November 4, 2018
October 28, 2018
October 21, 2018
October 14, 2018
October 7, 2018
September 30, 2018	Dmitry Kovalev	Update on ongoing research work
September 23, 2018	No meeting (Saudi National Day)
September 16, 2018
September 9, 2018	Alibek Sailanbayev	Optimization of composition of functions
September 6, 2018	Samuel Horváth	Stochastic nested variance reduction for nonconvex optimization (Zhou, Xu, Gu - 6/2018)
August 30, 2018	Sarah Sachs	Generalizations of Jacobian sketching (summary of research work done during 6 months of internship at KAUST)

Organizers: Filip Hanzely, Aritra Dutta and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 2, 2017-2018

Venue: Al Khwarizmi Building, KAUST, ROOM: 2107 (2nd floor)
Time: Tuesdays 12:00 - 13:30 (lunch provided)

Date	Speaker	Paper
May 27, 2018	El Houcine Bergou	A line search algorithm inspired by the adaptive cubic regularization framework and complexity analysis (Bergou, Diouane, Gratton - 5/2018)
May 6, 2018	Matthias Mueller	Optimization for deep learning
April 29, 2018	Samuel Horváth	Second order stochastic optimization for machine learning in linear time (Agarwal, Bullins, Hazan - JMLR 2017)
April 15, 2018	Filip Hanzely	On the convergence of Adam and beyond (Reddi, Kale, Kumar - ICLR 2018)
April 8, 2018	Konstantin Mishchenko	A simple practical accelerated method for finite sums (Defazio - NIPS 2016)
March 25, 2018	Adel Bibi	Analytic expressions for probabilistic moments of PL-DNN with Gaussian input
March 18, 2018	Konstantin Mishchenko	Penalty formulation for constrained optimization
March 11, 2018	Alibek Sailanbayev	SignSGD: Compressed optimization for non-convex problems (Bernstein, Wang, Azizzadenesheli, Anandkumar - ICML 2018)
March 4, 2018	Samuel Horváth	Fast incremental method for nonconvex optimization (Reddi, Sra, Poczos, Smola - 3/2016)
February 27, 2018	El Houcine Bergou	Random direct search method for unconstrained minimization
February 20, 2018	Filip Hanzely	The implicit bias of gradient descent on separable data (Soudry, Hoffer, Nacson, Gunasekar, Srebro - 10/2017)
February 13, 2018	Nicolas Loizou	Random inexact projection methods

Organizers: Filip Hanzely, Aritra Dutta and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 1, 2017-2018

Venue: Al Khwarizmi Building, KAUST, ROOM: 2107 (2nd floor)
Time: Tuesdays 12:00 - 13:30 (lunch provided)

Date	Speaker	Paper
December 5, 2017	Konstantin Mishchenko	SARAH: A novel method for machine learning problems using stochastic recursive gradient (Nguyen, Liu, Scheinberg, Takac - ICML 2017)
November 28, 2017	Filip Hanzely	Relative continuity for non-Lipschitz non-smooth convex optimization using stochastic (or deterministic) mirror descent (Lu - 10/2017)
November 21, 2017	Robert Gower	SAGA is a variant of stochastic gradient: new view and new proof
November 14, 2017	Nicolas Loizou	First-order adaptive sample size methods to reduce complexity of empirical risk minimization (Mokhtari, Ribeiro - 9/2017)
November 7, 2017	Konstantin Mishchenko	Proximal-proximal-gradient method (Ryu, Yin - 8/2017)
October 31, 2017	Nikita Doikov	Regularized Newton methods for minimizing functions with Hölder continuous Hessians (Grapiglia, Nesterov - SIOPT 2017); Cubic regularization of Newton method and its global performance (Nesterov, Polyak - MAPR 2006)
October 24, 2017	Viktor Lukáček	Dykstra's algorithm with Bregman projections: a convergence proof (Bauschke, Lewis - Optimization 1998)
October 17, 2017	Sebastian Stich	Approximate steepest coordinate descent (Stich, Raj, Jaggi - ICML 2017)
October 10, 2017	Alibek Sailanbayev	Breaking locality accelerates block Gauss-Seidel (Tu, Venkataraman, Wilson, Gittens, Jordan, Recht - ICML 2017)
October 3, 2017	Konstantin Mishchenko	An asynchronous distributed prox-grad algorithm (Mishchenko, Iutzeler, Malick - 2017)
September 26, 2017	Konstantin Mishchenko	An asynchronous distributed prox-grad algorithm (Mishchenko, Iutzeler, Malick - 2017)
September 19, 2017	Aritra Dutta	Self-occlusion and disocclusion in causal video object segmentation (ICCV 2015)
September 12, 2017	Filip Hanzely	Randomized methods for relative smooth optimization (Hanzely, Richtarik - 2017)
August 29, 2017	Filip Hanzely	Relatively-smooth convex optimization by first-order methods, and applications (Lu, Freund and Nesterov - 10/2016)
August 22, 2017	Aritra Dutta	A Batch-Incremental Video Background Estimation Model using Weighted Low-Rank Approximation of Matrices (Dutta, Li and Richtárik - 7/2017)

Organizers: Filip Hanzely, Aritra Dutta and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 2, 2016-2017

Venue: James Clerk Maxwell Building ROOM: JCMB 5323 (5th floor)
Time: Tuesdays 12:15 - 13:30 (lunch provided)

We thankfully acknowledge support from the Head of School of Mathematics and the Center for Doctoral Training in Data Science

Date	Speaker	Paper
March 14, 2017	Filip Hanzely	Finding Approximate Local Minima for Nonconvex Optimization in Linear Time (Agarwal, Allen-Zhu, Bullins, Hazan, and Ma - 11/2016)
March 7, 2017	Marcelo Pereyra	Efficient Bayesian computation by proximal Markov chain Monte Carlo: when Langevin meets Moreau (Durmus, Moulines and Pereyra - 12/2016)
February 28, 2017	Jakub Konečný	QSGD: Randomized quantization for communication-optimal stochastic gradient descent (Alistarh, Li, Tomioka and Vojnovic - 10/2016)
February 21, 2017	Nicolas Loizou	Global convergence of the Heavy-ball method for convex optimization (Ghadimi, Feyzmahdavian and Johansson - 12/2014)
February 14, 2017	Kostas Zygalakis	A differential equation for modeling Nesterov's accelerated gradient method: theory and insights (Su, Boyd and Candes - NIPS 2014)
February 7, 2017	László A. Végh (LSE)	Rescaled first-order methods for linear programming (Dadush, Végh and Zambelli - 11/2016)
January 31, 2017	Filip Hanzely	Relatively smooth convex optimization by first-order methods, and applications (Lu, Freund and Nesterov - 10/2016)
January 24, 2017	Ion Necoara (Bucharest)	Linear convergence of first order methods for non-strongly convex optimization (Necoara, Nesterov and Glineur - 4/2015)
January 17, 2017	Armin Eftekhari (The Alan Turing Institute)	The alternating descent conditional gradient method for sparse inverse problems (Boyd, Schiebinger and Recht - 7/2015)

Organizers: Nicolas Loizou and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 1, 2016-2017

Venue: James Clerk Maxwell Building ROOM: JCMB 6207 (6th floor)
Time: Tuesdays 12:15 - 13:30 (lunch provided: thanks to the support of the Head of School)

Date	Speaker	Paper
December 13, 2016	Panos Parpas	Using variational techniques to understand accelerated methods (Wibisono, Wilson and Jordan - 3/2016)
December 6, 2016	No meeting (NIPS)
November 29, 2016	Iain Murray	Fitting real-valued conditional distributions. Abstract: Neural networks can be used for regression: given an input x, guess the output y. The standard optimization task is to minimize some regularized measure of mismatch between guesses and observed training outputs. Neural networks can also express their own uncertainty. For example, we can fit two functions, a guess m(x) and an "error-bar" s(x), by maximizing the total log probability of training outputs under a Gaussian model: \sum_n log N(y_n; m(x_n), s(x_n)^2). Fitting functions representing Gaussian outputs by stochastic steepest descent can be hard: the gradients of the loss with respect to the mean depend strongly on the standard deviation, making it hard to adapt step-sizes. Moving beyond the Gaussian assumption, we might represent p(y|x) with a mixture of Gaussians, or with quantiles. For multivariate y we can use multivariate Gaussians or RNADE. Gaussians are also fitted in stochastic variational inference, sometimes with diagonal covariances, sometimes low-rank + diagonal. We are able to optimize all these things to some extent, but it's harder than conventional neural networks, which hinders widespread adoption of the methods. Relevant papers: Mixture Density Networks (MDNs), Multivariate MDN, RNADE, Bayesian MDN, matrix manifold optimization for Gaussian mixtures
November 22, 2016	Lukasz Szpruch	An analytical framework for a consensus-based global optimization method (Carrillo, Choi, Totzek and Tse - 1/2016)
November 15, 2016	Dominik Csiba	Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure (Bietti and Mairal - 10/2016)
November 8, 2016	Aretha Teckentrup	Large-scale Gaussian process regression via doubly stochastic gradient descent (Yan, Xie, Song and Boots - 2015)
November 1, 2016	Filip Hanzely	Variance reduction for faster non-convex optimization (Allen-Zhu and Hazan - 3/2016)
October 25, 2016	Dominik Csiba	Linear coupling: an ultimate unification of gradient and mirror descent (Allen-Zhu and Orecchia - 1/2015)
October 18, 2016	Jakub Konečný	Train faster, generalize better: Stability of stochastic gradient descent (Hardt, Recht and Singer - 7/2016)
October 11, 2016	Nicolas Loizou	Convergence rates for greedy Kaczmarz algorithms, and faster randomized Kaczmarz rules using the orthogonality graph (Nutini, Sepehry, Laradji, Schmidt, Koepke, Virani - UAI 2016) supplementary material poster
October 4, 2016	Jakub Konečný	Differentially private empirical risk minimization (Chaudhuri, Monteleoni, Sarwate - JMLR 2011)
September 27, 2016	Dominik Csiba	Online ad allocation via online optimization (Jenatton, Huang, Csiba and Archambeau - 6/2016)

Organizers: Dominik Csiba and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 2, 2015-2016

Venue: James Clerk Maxwell Building ROOM: JCMB 4312 (4th floor)
Time: Tuesdays 12:15 - 13:30 (lunch provided: thanks to the support of the Head of School)

Date	Speaker	Paper
May 3, 2016	JC Pesquet (Paris)	A stochastic majorize-minimize subspace algorithm with application to filter identification (Chouzenoux and Pesquet - 12/2015)
April 26, 2016	Robert M Gower	Open-ended research discussion on the topic: "Newton-type methods for solving the empirical risk minimization problem"
April 19, 2016	Haihao Lu (MIT)	Norm-free methods
April 12, 2016	Sebastian Stich (CORE)	A simple, combinatorial algorithm for solving SDD systems in nearly-linear time (Kelner, Orecchia, Sidford, Allen-Zhu - 1/2013)
April 5, 2016	No meeting (Easter)
March 29, 2016	No meeting (Easter)
March 22, 2016	Nicolas Loizou	Second order stochastic optimization in linear time (Agarwal, Bullins and Hazan - 2/2016)
March 15, 2016	Robert M Gower	Sub-sampled Newton methods I: globally convergent algorithms (Roosta-Khorasani and Mahoney - 1/2016)
March 8, 2016	No meeting (I am in Oberwolfach...)
March 1, 2016	Dominik Csiba	Local smoothness in variance-reduced optimization (Vainsencher, Liu and Zhang - NIPS 2015)
February 23, 2016	Jaroslav Fowkes	Submodular function maximization (based on a survey of Krause and Golovin 2012)
February 16, 2016	Jakub Konečný	Taming the wild: a unified analysis of Hogwild!-style algorithms (De Sa, Zhang, Olukotun, Re - NIPS 2015)
February 9, 2016	No meeting (Dominik, Jakub, Robert and I will be in Les Houches)
February 2, 2016	Nicolas Loizou	Randomized gossip algorithms (Boyd, Ghosh, Prabhakar and Shah - IEEE Transactions on Information Theory 2006 and Dimakis, Kar, Moura, Rabbat and Scaglione - Proceedings of the IEEE)
January 26, 2016	Jakub Konečný	On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants (Reddi, Hefny, Sra, Poczos and Smola - NIPS 2015)

Organizers: Jakub Konečný and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 1, 2015-2016

Venue: James Clerk Maxwell Building ROOM: JCMB 6311 (6th floor)
Time: 12:15 - 13:15 (lunch provided: thanks to the support of the Head of School)

Date	Speaker	Paper
November 24, 2015	Nick Polydorides	A quasi Monte Carlo method for large-scale inverse problems (Polydorides, Wang & Bertsekas - 2012) more resources: [regression, inverse, DP chapter]
November 17, 2015	Ran Zhang	Path-following methods (Chapter 5 of Wright's "Primal-dual interior-point methods" book)
November 10, 2015	Jakub Konečný	Why random reshuffling beats stochastic gradient descent (Gurbuzbalaban, Ozdaglar and Parrilo - 10/2015)
November 3, 2015	Nicolas Loizou	Stochastic gradient descent, weighted sampling and the randomized Kaczmarz algorithm (Needell, Srebro and Ward - 10/2013)
October 27, 2015	Dominik Csiba	A universal catalyst for first-order optimization (Lin, Mairal & Harchaoui - 6/2015)
October 20, 2015	No meeting
October 13, 2015	Robert M Gower	Convergence rates of sub-sampled Newton methods (Erdogdu & Montanari - 8/2015)
October 6, 2015	Robert M Gower	Newton sketch (Pilanci & Wainwright - 5/2015)
September 29, 2015	Dominik Csiba	Beyond convexity: stochastic quasi-convex optimization (Hazan, Levy and Shalev-Shwartz - 7/2015)
September 22, 2015	Jakub Konečný	Communication Complexity of Distributed Convex Learning and Optimization (Arjevani and Shamir - 6/2015)

Organizers: Jakub Konečný and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 2, 2014-2015

Venue: James Clerk Maxwell Building ROOM: JCMB 4312 (4th floor)
Time: 12:15 - 13:15 (lunch provided: thanks to the support of the Head of School)

Date	Speaker	Paper
May 19, 2015	Ian Wallace	HELM: Holomorphic Embedding Load flow Method (papers: 1 and 2)
May 12, 2015	Andreas Grothey	Contingency generation for AC optimal power flow (Chiang and Grothey - 2012 [Optimization Online])
May 5, 2015	No meeting due to Optimization and Big Data 2015
April 28, 2015	Zheng Qu	On lower and upper bounds for smooth and strongly convex optimization problems (Arjevani, Shalev-Shwartz and Shamir - 3/2015)
April 21, 2015	Alessandro Perelli	Combining ordered subsets and momentum for accelerated X-ray CT image reconstruction (Kim, Ramani and Fessler - 1/2015, IEEE link)
April 14, 2015	Robert Gower	Research discussion
April 7, 2015	No meeting due to Easter Break
March 31, 2015	Dominik Csiba	Stochastic Dual Coordinate Ascent (SDCA): A Dual-Free Analysis (Shai Shalev-Shwartz - 2/2015)
March 24, 2015	Jakub Konečný	Greedy coordinate descent vs randomized coordinate descent
March 17, 2015	Tom Mayo and Guido Sanguinetti	Challenges for predictive modelling in high-throughput biology (papers: [1] and [2])
March 10, 2015	Zheng Qu	Complexity bounds for primal-dual methods minimizing the model of objective function (Nesterov - 2/2015)
March 3, 2015	Kimon Fountoulakis	Randomized numerical linear algebra meets big data optimization (Yang, Chow, Re and Mahoney - 2/2015 and Yang, Meng and Mahoney - 2/2015)
February 24, 2015	Robert M. Gower	Action constrained quasi-Newton methods (Gower and Gondzio - 12/2014)
February 17, 2015	No meeting due to Innovative Learning Week
February 10, 2015	Chris Williams	Linear dynamical systems applied to condition monitoring (papers [1] and [2]). Abstract: We develop a Hierarchical Switching Linear Dynamical System (HSLDS) for the detection of sepsis in neonates in an intensive care unit. The Factorial Switching LDS (FSLDS) of Quinn et al. (2009) is able to describe the observed vital signs data in terms of a number of discrete factors, which have either physiological or artifactual origin. We demonstrate that by adding a higher-level discrete variable with semantics sepsis/non-sepsis we can detect changes in the physiological factors that signal the presence of sepsis. We demonstrate that the performance of our model for the detection of sepsis is not statistically different from the auto-regressive HMM of Stanculescu et al. (2013), despite the fact that their model is given "ground truth" annotations of the physiological factors, while our HSLDS must infer them from the raw vital signs data. Joint work with Ioan Stanculescu and Yvonne Freer.
February 3, 2015	Jakub Konečný	Communication efficient distributed optimization using an approximate Newton-type method (Shamir, Srebro and Zhang - 12/2013)
January 27, 2015	Zheng Qu	A lower bound for the optimization of finite sums (Agarwal and Bottou - 10/2014)
January 20, 2015	Ilias Diakonikolas	Algorithms in Statistics (papers: long version [1] and short version [2]) Blurb: A broad class of big data – such as those collected from financial transactions, seismic measurements, neurobiological measurements, sensor nets, or network traffic records – is best modeled as samples from a probability distribution over a very large domain. One of the most basic statistical inference tasks in this setting is this: learn the underlying distribution that generated the data.

Organizers: Jakub Konečný, Zheng Qu and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 1, 2014-2015

Venue: James Clerk Maxwell Building ROOM: 6311 (6th floor)
Time: Tuesdays, 12:15 - 13:15 (lunch provided: thanks to NAIS)

Date	Speaker	Paper
December 2, 2014	Charles Sutton	Optimization in Modern Machine Learning: Four Vignettes (Exploratory data analysis: Mining transaction data, Unsupervised learning in neural networks, Signal disaggregation: Understanding household energy usage, Sampling from high dimensional distributions using continuous relaxations) (papers: [1] [2] [3])
November 25, 2014	Dominik Csiba	Iterative Hessian sketch: fast and accurate solution approximation for constrained least-squares (based on Pilanci and Wainwright - 11/2014)
November 18, 2014	Xavier Cabezas	Cycle bases in network synchronization problems (based on [1, 2, 3])
November 11, 2014	Zheng Qu	Large-scale randomized-coordinate descent methods with non-separable linear constraints (Reddi, Hefny, Downey, Dubey and Sra - 10/2014)
November 4, 2014	Ademir Ribeiro	Towards a direct search method with adaptive directions/geometry (Ademir will describe some challenges of his ongoing research in the area; paper to read: Konecny and Richtarik - 09/2014)
October 28, 2014	Amos Storkey	Machine learning markets (abstract)
October 21, 2014	Dominik Csiba	A stochastic PCA algorithm with an exponential convergence rate (Shamir - 09/2014)
October 14, 2014	Jakub Konečný	Parallelism in optimization (this is a brainstorming session about the limits of parallelism in optimization and is not based on any papers)
October 7, 2014	Robert Gower	A stochastic quasi-Newton method for large-scale optimization (Byrd, Hansen, Nocedal and Singer - 2014)
September 30, 2014	Jakub Konečný	Trade-offs of large scale learning (papers: 1 - Bottou and Bousquet, 2 - Bottou and Bousquet, 3 - Bottou)
September 23, 2014	Zheng Qu	SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives (Defazio, Bach and Lacoste-Julien - 2014)
September 16, 2014	Kimon Fountoulakis	Robust block coordinate descent (Fountoulakis and Tappenden - 2014)

Organizers: Jakub Konečný, Zheng Qu and Peter Richtárik

All Hands Meetings on Big Data Optimization - Semester 2, 2013-2014

Venue: James Clerk Maxwell Building NEW ROOM: 4312 (4th floor)
Time: Tuesdays, 12:15 - 13:15 (refreshments provided: thanks to NAIS)

Date	Speaker	Paper
June 17, 2014	Mojmír Mutný	Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization (Martin Jaggi - ICML 2013)
June 10, 2014	no meeting (due to this event)
June 3, 2014	Lukasz Szpruch	Multilevel Monte Carlo methods for applications in finance (Giles and Szpruch)
May 27, 2014	Jakub Konečný	Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning (Julien Mairal - 2014)
May 13, 2014	Zheng Qu	First-order methods of smooth convex optimization with inexact oracle (Devolder, Glineur and Nesterov - 2011). Preprint here.
May 6, 2014	Robert M. Gower	A Stochastic Quasi-Newton Method for Large-Scale Optimization (Byrd, Hansen, Nocedal and Singer - 2014). Plus maybe also some background from this paper.
April 30, 2014	Olivier Fercoq	Adaptive Subgradient Methods for Online Learning and Stochastic Optimization (Duchi, Hazan and Singer - 2011)
April 22, 2014	no meeting (spring break)
April 15, 2014	no meeting (spring break)
April 8, 2014	no meeting (spring break)
April 1, 2014	Martin Takáč	A Proximal Stochastic Gradient Method with Progressive Variance Reduction (Xiao and Zhang - 2014)
March 25, 2014	no meeting
March 18, 2014	Jakub Konečný	Subgradient Methods for Huge-Scale Optimization Problems (Nesterov - 2012) [Mathematical Programming 2013]
March 11, 2014	Kimon Fountoulakis	Parallel Coordinate Descent Newton for Efficient L1-Regularized Minimization (Bian, Li, Liu and Yang - 2013)
March 4, 2014	Mehrdad Yaghoobi	Efficient Projections onto the L1-Ball for Learning in High Dimensions (Duchi, Shalev-Shwartz, Singer, Chandra - 2008)
February 25, 2014	Zheng Qu	Finding the stationary states of Markov chains by iterative methods (Nesterov and Nemirovski - 2013)
February 18, 2014	no meeting as many of us will attend this event
February 11, 2014	Olivier Fercoq	Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems (Lee and Sidford - 2013)
February 4, 2014	Rachael Tappenden	Feature Clustering for Accelerating Parallel Coordinate Descent (Scherrer, Tewari, Halappanavar and Haglin - 2012)
January 28, 2014	Jakub Konečný	Minimizing Finite Sums with the Stochastic Average Gradient (Schmidt, Le Roux and Bach - 2013)

Organizers: Jakub Konečný and Peter Richtárik