Video Talks

The role of local training in federated learning (90 mins)
Better AI Meetup, Bratislava, Slovakia, 6/2022

ProxSkip: Yes! Local gradient steps provably lead to communication acceleration! Finally! (71 mins)
Federated Learning One World Seminar, 5/2022

Permutation compressors for provably faster distributed nonconvex optimization (5 mins)
ICLR video talk, 3/2022

Permutation compressors for provably faster distributed nonconvex optimization (78 mins)
Federated Learning One World Seminar, 2/2022

Permutation compressors for provably faster distributed nonconvex optimization (67 mins)
Machine Learning NeEDS Mathematical Optimization, 2/2022

EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback (62 mins)
Federated Learning One World Seminar, 7/2021

Beyond Local and Gradient Methods for Federated Learning (80 mins)
Federated Learning One World Seminar, 4/2021

Distributed Second Order Methods with Fast Rates and Compressed Communication
All-Russian Optimization Seminar, 4/2021

On Second Order Methods and Randomness (72 mins)
Montreal MLOpt Seminar, 5/2020

On Second Order Methods and Randomness (73 mins)
One World Optimization Seminar, 4/2020

A Guided Walk Through the ZOO of Stochastic Gradient Descent Methods (5 hrs)
MIPT, Moscow, Russia, 9/2019

Variance Reduction for Gradient Compression (38 mins)
Rutgers University, 9/2019

SGD: General Analysis and Improved Rates (35:40–53:11)
ICML, Long Beach, USA, 6/2019

Stochastic Quasi-Gradient Methods: Variance Reduction via Jacobian Sketching (33 mins)
Simons Institute, Berkeley, 9/2018

Empirical Risk Minimization: Complexity, Duality, Sampling, Sparsity and Big Data (85 mins)
Yandex, Russia, 12/2017

Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling (1 hr)
MIPT, Moscow, Russia, 10/2017

Introduction to Randomized Optimization, Parts 1–5 (5 hrs)
École Polytechnique, France, 8/2017

Stochastic Dual Ascent for Solving Linear Systems (31 mins)
The Alan Turing Institute, London, UK, 10/2016

Introduction to Big Data Optimization (55 mins)
Portsmouth, UK, 9/2016

Accelerated, Parallel and Proximal Coordinate Descent (90 mins)
Moscow, Russia, 2/2014

Parallel Coordinate Descent Methods for Big Data Optimization (55 mins)
Simons Institute, Berkeley, 10/2013