All papers are listed below in reverse chronological order of their appearance online.

Prepared in 2021

[169] Zhize Li and Peter Richtárik
CANITA: Faster rates for distributed convex optimization with communication compression
Federated Learning Paper
[arXiv]

[168] 50+ authors
A field guide to federated optimization
Federated Learning Paper
[arXiv]

[167] Peter Richtárik, Igor Sokolov and Ilyas Fatkhullin
EF21: A new, simpler, theoretically better, and practically faster error feedback
Federated Learning Paper
[arXiv] [slides] [code: EF21, EF21+]

[166] Dmitry Kovalev, Elnur Gasanov, Peter Richtárik and Alexander Gasnikov
Lower bounds and optimal algorithms for smooth and strongly convex decentralized optimization over time-varying networks
[arXiv] [code: ADOM+]

[165] Bokun Wang, Mher Safaryan and Peter Richtárik
Smoothness-aware quantization techniques
Federated Learning Paper
[arXiv] [code: DCGD+, DIANA+]

[164] Adil Salim, Lukang Sun and Peter Richtárik
Complexity analysis of Stein variational gradient descent under Talagrand's inequality T1
[arXiv] [code: SVGD]

[163] Laurent Condat and Peter Richtárik
MURANA: A generic framework for stochastic variance-reduced optimization
Federated Learning Paper
[arXiv] [code: MURANA, ELVIRA]

[162] Mher Safaryan, Rustem Islamov, Xun Qian and Peter Richtárik
FedNL: Making Newton-type methods applicable to federated learning
Federated Learning Paper
[arXiv] [code: FedNL, FedNL-PP, FedNL-CR, FedNL-LS, FedNL-BC, N0, NS]

[161] Grigory Malinovsky, Alibek Sailanbayev and Peter Richtárik
Random reshuffling with variance reduction: new analysis and better rates
[arXiv] [3 min video] [code: RR-SVRG, SO-SVRG, Cyclic-SVRG]

[160] Zhize Li and Peter Richtárik
ZeroSARAH: Efficient nonconvex finite-sum optimization with zero full gradient computation
Federated Learning Paper
[arXiv] [code: Zero-SARAH]

[159] Adil Salim, Laurent Condat, Dmitry Kovalev and Peter Richtárik
An optimal algorithm for strongly convex minimization under affine constraints
[arXiv]

[158] Zhen Shi, Nicolas Loizou, Peter Richtárik and Martin Takáč
AI-SARAH: Adaptive and implicit stochastic recursive gradient methods
[arXiv] [code: AI-SARAH]

[157] Dmitry Kovalev, Egor Shulgin, Peter Richtárik, Alexander Rogozin and Alexander Gasnikov
ADOM: Accelerated decentralized optimization method for time-varying networks
38th International Conference on Machine Learning (ICML 2021)
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
[arXiv] [5 min video] [poster] [code: ADOM]

[156] Konstantin Mishchenko, Bokun Wang, Dmitry Kovalev and Peter Richtárik
IntSGD: Floatless compression of stochastic gradients
Federated Learning Paper
[arXiv] [5 min video] [code: IntSGD, IntDIANA]

[155] Eduard Gorbunov, Konstantin Burlachenko, Zhize Li and Peter Richtárik
MARINA: faster non-convex distributed learning with compression
38th International Conference on Machine Learning (ICML 2021)
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
Federated Learning Paper
[arXiv] [5 min video] [70 min video] [poster] [code: MARINA, VR-MARINA, PP-MARINA]

[154] Mher Safaryan, Filip Hanzely and Peter Richtárik
Smoothness matrices beat smoothness constants: better communication compression techniques for distributed optimization
ICLR Workshop: Distributed and Private Machine Learning
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
Federated Learning Paper
[arXiv] [5 min video] [code: DCGD+, DIANA+, ADIANA+]

[153] Rustem Islamov, Xun Qian and Peter Richtárik
Distributed second order methods with fast rates and compressed communication
38th International Conference on Machine Learning (ICML 2021)
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
Federated Learning Paper
[arXiv] [5 min video] [80 min video] [slides] [poster] [code: NS, MN, NL1, NL2, CNL]

[152] Konstantin Mishchenko, Ahmed Khaled and Peter Richtárik
Proximal and federated random reshuffling
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
Federated Learning Paper
[arXiv] [8 min video] [code: ProxRR, FedRR]

Prepared in 2020

[151] Samuel Horváth, Aaron Klein, Peter Richtárik and Cedric Archambeau
Hyperparameter transfer learning with adaptive complexity
The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
[arXiv]

[150] Xun Qian, Hanze Dong, Peter Richtárik and Tong Zhang
Error compensated loopless SVRG for distributed optimization
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
Federated Learning Paper
[poster] [code: EC-LSVRG]

[149] Xun Qian, Hanze Dong, Peter Richtárik and Tong Zhang
Error compensated proximal SGD and RDA
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[poster] [code: EC-SGD, EC-RDA]

[148] Eduard Gorbunov, Filip Hanzely, and Peter Richtárik
Local SGD: unified theory and new efficient methods
The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
Federated Learning Paper
[arXiv] [5 min video] [poster] [code: S-Local-SVRG]

[147] Dmitry Kovalev, Anastasia Koloskova, Martin Jaggi, Peter Richtárik, and Sebastian U. Stich
A linearly convergent algorithm for decentralized optimization: sending less bits for free!
The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
Federated Learning Paper
[arXiv] [3 min video]

[146] Wenlin Chen, Samuel Horváth, and Peter Richtárik
Optimal client sampling for federated learning
Privacy Preserving Machine Learning (NeurIPS 2020 Workshop)
Federated Learning Paper
[arXiv] [code: OCS, AOCS]

[145] Eduard Gorbunov, Dmitry Kovalev, Dmitry Makarenko, and Peter Richtárik
Linearly converging error compensated SGD
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Federated Learning Paper
[arXiv] [5 min video] [code: EC-SGD-DIANA, EC-LSVRG-DIANA, EC-LSVRGstar, ...]

[144] Alyazeed Albasyoni, Mher Safaryan, Laurent Condat, and Peter Richtárik
Optimal gradient compression for distributed and federated learning
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [poster] [video]

[143] Filip Hanzely, Slavomír Hanzely, Samuel Horváth, and Peter Richtárik
Lower bounds and optimal algorithms for personalized federated learning
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Federated Learning Paper
[arXiv] [5 min video] [code: APGD1, APGD2, IAPGD, AL2SGD+]

[142] Laurent Condat, Grigory Malinovsky, and Peter Richtárik
Distributed proximal splitting algorithms with rates and acceleration
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
Spotlight Talk
[arXiv] [poster]

[141] Robert M. Gower, Mark Schmidt, Francis Bach and Peter Richtárik
Variance-reduced methods for machine learning
Proceedings of the IEEE 108(11):1968–1983, 2020
[arXiv]

[140] Xun Qian, Peter Richtárik, and Tong Zhang
Error compensated distributed SGD can be accelerated
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
Federated Learning Paper
[arXiv] [poster] [code: ECLK]

[139] Albert S. Berahas, Majid Jahani, Peter Richtárik, and Martin Takáč
Quasi-Newton methods for deep learning: forget the past, just sample
To appear in: Optimization Methods and Software, 2021
[arXiv] [code: S-LBFGS, S-LSR1]

[138] Zhize Li, Hongyan Bao, Xiangliang Zhang and Peter Richtárik
PAGE: A simple and optimal probabilistic gradient estimator for nonconvex optimization
38th International Conference on Machine Learning (ICML 2021)
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop) (Spotlight Talk)
[arXiv] [5 min video] [code: PAGE]

[137] Dmitry Kovalev, Adil Salim, and Peter Richtárik
Optimal and practical algorithms for smooth and strongly convex decentralized optimization
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
[arXiv] [code: APAPC, OPAPC, Algorithm 3]

[136] Ahmed Khaled, Othmane Sebbouh, Nicolas Loizou, Robert M. Gower, and Peter Richtárik
Unified analysis of stochastic gradient methods for composite convex and smooth optimization
[arXiv] [code: SGD]

[135] Samuel Horváth and Peter Richtárik
A better alternative to error feedback for communication-efficient distributed learning
9th International Conference on Learning Representations (ICLR 2021)
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
The Best Paper Award at NeurIPS-20 Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [poster] [code: DCSGD]

[134] Adil Salim and Peter Richtárik
Primal dual interpretation of the proximal stochastic gradient Langevin algorithm
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
[arXiv] [code: PGSLA]

[133] Zhize Li and Peter Richtárik
A unified analysis of stochastic gradient methods for nonconvex federated optimization
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [video]

[132] Konstantin Mishchenko, Ahmed Khaled, and Peter Richtárik
Random reshuffling: simple analysis with vast improvements
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
[arXiv] [4 min video] [code: RR, SO, IG]

[131] Motasem Alfarra, Slavomír Hanzely, Alyazeed Albasyoni, Bernard Ghanem, and Peter Richtárik
Adaptive learning of the optimal mini-batch size of SGD
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[arXiv] [poster]

[130] Adil Salim, Laurent Condat, Konstantin Mishchenko, and Peter Richtárik
Dualize, split, randomize: fast nonsmooth optimization algorithms
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[arXiv] [poster] [code: PDDY, SPDDY, SPD3O, SPAPC]

[129] Atal Narayan Sahu, Aritra Dutta, Aashutosh Tiwari, and Peter Richtárik
On the convergence analysis of asynchronous SGD for solving consistent linear systems
[arXiv] [code: DASGD]

[128] Grigory Malinovsky, Dmitry Kovalev, Elnur Gasanov, Laurent Condat, and Peter Richtárik
From local SGD to local fixed point methods for federated learning
37th International Conference on Machine Learning (ICML 2020)
Federated Learning Paper
[arXiv] [5 min video] [code: LDFPM, RDFPM]

[127] Aleksandr Beznosikov, Samuel Horváth, Peter Richtárik and Mher Safaryan
On biased compression for distributed learning
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [poster] [code: CGD, Distributed SGD with Error Feedback]

[126] Zhize Li, Dmitry Kovalev, Xun Qian and Peter Richtárik
Acceleration for compressed gradient descent in distributed and federated optimization
37th International Conference on Machine Learning (ICML 2020)
Federated Learning Paper
[arXiv] [code: ACGD, ADIANA]

[125] Dmitry Kovalev, Robert M. Gower, Peter Richtárik and Alexander Rogozin
Fast linear convergence of randomized BFGS
[arXiv] [code: RBFGS]

[124] Filip Hanzely, Nikita Doikov, Peter Richtárik and Yurii Nesterov
Stochastic subspace cubic Newton method
37th International Conference on Machine Learning (ICML 2020)
[arXiv] [code: SSCN]

[123] Mher Safaryan, Egor Shulgin and Peter Richtárik
Uncertainty principle for communication compression in distributed and federated learning and the search for an optimal compressor
Information and Inference: A Journal of the IMA, 1–24, 2021
Federated Learning Paper
[arXiv]

[122] Filip Hanzely and Peter Richtárik
Federated learning of a mixture of global and local models
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [slides] [poster] [video]

[121] Samuel Horváth, Lihua Lei, Peter Richtárik and Michael I. Jordan
Adaptivity of stochastic gradient methods for nonconvex optimization
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[arXiv] [poster]

[120] Filip Hanzely, Dmitry Kovalev and Peter Richtárik
Variance reduced coordinate descent with acceleration: new method with a surprising application to finite-sum problems
37th International Conference on Machine Learning (ICML 2020)
[arXiv]

[119] Ahmed Khaled and Peter Richtárik
Better theory for SGD in the nonconvex world
[arXiv]

Prepared in 2019

[118] Ahmed Khaled, Konstantin Mishchenko and Peter Richtárik
Tighter theory for local SGD on identical and heterogeneous data
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
Federated Learning Paper
[arXiv]

[117] Sélim Chraibi, Ahmed Khaled, Dmitry Kovalev, Adil Salim, Peter Richtárik and Martin Takáč
Distributed fixed point methods with compressed iterates
Federated Learning Paper
[arXiv]

[116] Samuel Horváth, Chen-Yu Ho, Ľudovít Horváth, Atal Narayan Sahu, Marco Canini and Peter Richtárik
IntML: Natural compression for distributed deep learning
Workshop on AI Systems at Symposium on Operating Systems Principles 2019 (SOSP'19)
[pdf]

[115] Dmitry Kovalev, Konstantin Mishchenko and Peter Richtárik
Stochastic Newton and cubic Newton methods with simple local linear-quadratic rates
NeurIPS 2019 Workshop Beyond First Order Methods in ML
[arXiv] [poster] [code: SN, SCN]

[114] Ahmed Khaled, Konstantin Mishchenko and Peter Richtárik
Better communication complexity for local SGD
NeurIPS 2019 Workshop on Federated Learning for Data Privacy and Confidentiality
Federated Learning Paper
[arXiv] [poster] [code: local SGD]

[113] Ahmed Khaled and Peter Richtárik
Gradient descent with compressed iterates
NeurIPS 2019 Workshop on Federated Learning for Data Privacy and Confidentiality
Federated Learning Paper
[arXiv] [poster] [code: GDCI]

[112] Ahmed Khaled, Konstantin Mishchenko and Peter Richtárik
First analysis of local GD on heterogeneous data
NeurIPS 2019 Workshop on Federated Learning for Data Privacy and Confidentiality
Federated Learning Paper
[arXiv] [poster] [code: local GD]

[111] Jinhui Xiong, Peter Richtárik and Wolfgang Heidrich
Stochastic convolutional sparse coding
International Symposium on Vision, Modeling and Visualization 2019
VMV Best Paper Award, 2019 [link]
[arXiv] [code: SBCSC, SOCSC]

[110] Xun Qian, Zheng Qu and Peter Richtárik
L-SVRG and L-Katyusha with arbitrary sampling
Journal of Machine Learning Research 22(112):1−47, 2021
[arXiv] [5 min video] [code: L-SVRG, L-Katyusha]

[109] Xun Qian, Alibek Sailanbayev, Konstantin Mishchenko and Peter Richtárik
MISO is making a comeback with better proofs and rates
[arXiv] [code: MISO]

[108] Eduard Gorbunov, Adel Bibi, Ozan Sener, El Houcine Bergou and Peter Richtárik
A stochastic derivative free optimization method with momentum
8th International Conference on Learning Representations (ICLR 2020)
[arXiv] [poster] [code: SMTP]

[107] Mher Safaryan and Peter Richtárik
Stochastic Sign Descent Methods: New Algorithms and Better Theory
38th International Conference on Machine Learning (ICML 2021)
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[arXiv] [poster] [code: signSGD, signSGDmaj]

[106] Adil Salim, Dmitry Kovalev and Peter Richtárik
Stochastic proximal Langevin algorithm: potential splitting and nonasymptotic rates
33rd Conference on Neural Information Processing Systems (NeurIPS 2019)
[arXiv] [poster] [code: SPLA]

[105] Aritra Dutta, El Houcine Bergou, Yunming Xiao, Marco Canini and Peter Richtárik
Direct nonlinear acceleration
[arXiv] [code: DNA]

[104] Konstantin Mishchenko and Peter Richtárik
A stochastic decoupling method for minimizing the sum of smooth and non-smooth functions
[arXiv] [code: SDM]

[103] Konstantin Mishchenko, Dmitry Kovalev, Egor Shulgin, Peter Richtárik and Yura Malitsky
Revisiting stochastic extragradient
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
NeurIPS 2019 Workshop on Smooth Games Optimization and Machine Learning
[arXiv]

[102] Filip Hanzely and Peter Richtárik
One method to rule them all: variance reduction for data, parameters and many new methods
[arXiv] [code: GJS + 17 algorithms]

[101] Eduard Gorbunov, Filip Hanzely and Peter Richtárik
A unified theory of SGD: variance reduction, sampling, quantization and coordinate descent
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
[arXiv]

[100] Samuel Horváth, Chen-Yu Ho, Ľudovít Horváth, Atal Narayan Sahu, Marco Canini and Peter Richtárik
Natural compression for distributed deep learning
Federated Learning Paper
[arXiv] [poster]

[99] Robert M. Gower, Dmitry Kovalev, Felix Lieder and Peter Richtárik
RSN: Randomized Subspace Newton
33rd Conference on Neural Information Processing Systems (NeurIPS 2019)
[arXiv] [poster]

[98] Aritra Dutta, Filip Hanzely, Jingwei Liang and Peter Richtárik
Best pair formulation & accelerated scheme for non-convex principal component pursuit
IEEE Transactions on Signal Processing 68:6128-6141, 2020
[arXiv]

[97] Nicolas Loizou and Peter Richtárik
Revisiting randomized gossip algorithms: general framework, convergence rates and novel block and accelerated protocols
To appear in: IEEE Transactions on Information Theory
[arXiv]

[96] Nicolas Loizou and Peter Richtárik
Convergence analysis of inexact randomized iterative methods
SIAM Journal on Scientific Computing 42(6), A3979–A4016, 2020
[arXiv] [code: iBasic, iSDSA, iSGD, iSPM, iRBK, iRBCD]

[95] Amedeo Sapio, Marco Canini, Chen-Yu Ho, Jacob Nelson, Panos Kalnis, Changhoon Kim, Arvind Krishnamurthy, Masoud Moshref, Dan R. K. Ports and Peter Richtárik
Scaling distributed machine learning with in-network aggregation
The 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI '21 Fall)
[arXiv] [code: SwitchML]

[94] Samuel Horváth, Dmitry Kovalev, Konstantin Mishchenko, Peter Richtárik and Sebastian Stich
Stochastic distributed learning with gradient quantization and variance reduction
Federated Learning Paper
[arXiv] [code: DIANA, VR-DIANA, SVRG-DIANA]

[93] El Houcine Bergou, Eduard Gorbunov and Peter Richtárik
Stochastic three points method for unconstrained smooth minimization
SIAM Journal on Optimization 30(4):2726-2749, 2020
[arXiv] [code: STP]

[92] Adel Bibi, El Houcine Bergou, Ozan Sener, Bernard Ghanem and Peter Richtárik
A stochastic derivative-free optimization method with importance sampling
34th AAAI Conference on Artificial Intelligence (AAAI-20)
[arXiv] [code: STP_IS]

[91] Konstantin Mishchenko, Filip Hanzely and Peter Richtárik
99% of distributed optimization is a waste of time: the issue and how to fix it
To appear in: Uncertainty in Artificial Intelligence, 2020 (UAI 2020)
Federated Learning Paper
[arXiv] [code: IBCD, ISAGA, ISGD, IASGD, ISEGA]

[90] Konstantin Mishchenko, Eduard Gorbunov, Martin Takáč and Peter Richtárik
Distributed learning with compressed gradient differences
Federated Learning Paper
[arXiv] [code: DIANA]

[89] Robert Mansel Gower, Nicolas Loizou, Xun Qian, Alibek Sailanbayev, Egor Shulgin and Peter Richtárik
SGD: general analysis and improved rates
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5200-5209, 2019
[arXiv] [poster] [code: SGD-AS]

[88] Dmitry Kovalev, Samuel Horváth and Peter Richtárik
Don’t jump through hoops and remove those loops: SVRG and Katyusha are better without the outer loop
31st International Conference on Algorithmic Learning Theory (ALT 2020)
[arXiv] [code: L-SVRG, L-Katyusha]

[87] Xun Qian, Zheng Qu and Peter Richtárik
SAGA with arbitrary sampling
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5190-5199, 2019
[arXiv] [poster] [code: SAGA-AS]

Prepared in 2018

[86] Lam M. Nguyen, Phuong Ha Nguyen, Peter Richtárik, Katya Scheinberg, Martin Takáč and Marten van Dijk
New convergence aspects of stochastic gradient algorithms
Journal of Machine Learning Research 20(176):1-49, 2019
[arXiv]

[85] Filip Hanzely, Jakub Konečný, Nicolas Loizou, Peter Richtárik and Dmitry Grishchenko
A privacy preserving randomized gossip algorithm via controlled noise insertion
NeurIPS Privacy Preserving Machine Learning Workshop, 2018
[arXiv] [poster]

[84] Konstantin Mishchenko and Peter Richtárik
A stochastic penalty model for convex and nonconvex optimization with big constraints
[arXiv]

[83] Nicolas Loizou, Michael G. Rabbat and Peter Richtárik
Provably accelerated randomized gossip algorithms
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019)
[arXiv] [code: AccGossip]

[82] Filip Hanzely and Peter Richtárik
Accelerated coordinate descent with arbitrary sampling and best rates for minibatches
Proceedings of the 22nd Int. Conf. on Artificial Intelligence and Statistics, PMLR 89:304-312, 2019
[arXiv] [poster] [code: ACD]

[81] Samuel Horváth and Peter Richtárik
Nonconvex variance reduced optimization with arbitrary sampling
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2781-2789, 2019
Horváth: Best DS3 Poster Award, Paris, 2018 (link)
[arXiv] [poster] [code: SVRG, SAGA, SARAH]

[80] Filip Hanzely, Konstantin Mishchenko and Peter Richtárik
SEGA: Variance reduction via gradient sketching
Advances in Neural Information Processing Systems 31:2082-2093, 2018
[arXiv] [poster] [slides] [code: SEGA] [video: YouTube]

[79] Filip Hanzely, Peter Richtárik and Lin Xiao
Accelerated Bregman proximal gradient methods for relatively smooth convex optimization
Computational Optimization and Applications 79:405–440, 2021
[arXiv] [code: ABPG, ABDA]

[78] Jakub Mareček, Peter Richtárik and Martin Takáč
Matrix completion under interval uncertainty: highlights
Lecture Notes in Computer Science, ECML-PKDD 2018
[pdf]

[77] Nicolas Loizou and Peter Richtárik
Accelerated gossip via stochastic heavy ball method
56th Annual Allerton Conference on Communication, Control, and Computing, 927-934, 2018
Press coverage [KAUST Discovery]
[arXiv] [poster]

[76] Adel Bibi, Alibek Sailanbayev, Bernard Ghanem, Robert Mansel Gower and Peter Richtárik
Improving SAGA via a probabilistic interpolation with gradient descent
[arXiv] [code: SAGD]

[75] Aritra Dutta, Filip Hanzely and Peter Richtárik
A nonconvex projection method for robust PCA
The Thirty-Third AAAI Conference on Artificial Intelligence, 2019 (AAAI-19)
[arXiv]

[74] Robert M. Gower, Peter Richtárik and Francis Bach
Stochastic quasi-gradient methods: variance reduction via Jacobian sketching
Mathematical Programming, 2020
[arXiv] [slides] [code: JacSketch] [video: YouTube]

[73] Aritra Dutta, Xin Li and Peter Richtárik
Weighted low-rank approximation of matrices and background modeling
[arXiv]

[72] Filip Hanzely and Peter Richtárik
Fastest rates for stochastic mirror descent methods
Computational Optimization and Applications 79:717–766, 2021
[arXiv]

[71] Lam M. Nguyen, Phuong Ha Nguyen, Marten van Dijk, Peter Richtárik, Katya Scheinberg and Martin Takáč
SGD and Hogwild! convergence without the bounded gradients assumption
Proceedings of The 35th International Conference on Machine Learning, PMLR 80:3750-3758, 2018
[arXiv]

[70] Robert M. Gower, Filip Hanzely, Peter Richtárik and Sebastian Stich
Accelerated stochastic matrix inversion: general theory and speeding up BFGS rules for faster second-order optimization
Advances in Neural Information Processing Systems 31:1619-1629, 2018
[arXiv] [poster] [code: ABFGS]

[69] Nikita Doikov and Peter Richtárik
Randomized block cubic Newton method
Proceedings of The 35th International Conference on Machine Learning, PMLR 80:1290-1298, 2018
Doikov: Best Talk Award, "Control, Information and Optimization", Voronovo, Russia, 2018
[arXiv] [bib] [code: RBCN]

[68] Dmitry Kovalev, Eduard Gorbunov, Elnur Gasanov and Peter Richtárik
Stochastic spectral and conjugate descent methods
Advances in Neural Information Processing Systems 31:3358-3367, 2018
[arXiv] [poster] [code: SSD, SconD, SSCD, mSSCD, iSconD, iSSD]

[67] Radoslav Harman, Lenka Filová and Peter Richtárik
A randomized exchange algorithm for computing optimal approximate designs of experiments
Journal of the American Statistical Association
[arXiv] [code: REX, OD_REX, MVEE_REX]

[66] Ion Necoara, Andrei Patrascu and Peter Richtárik
Randomized projection methods for convex feasibility problems: conditioning and convergence rates
SIAM Journal on Optimization 29(4):2814–2852, 2019
[arXiv] [slides]

Prepared in 2017

[65] Nicolas Loizou and Peter Richtárik
Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods
Computational Optimization and Applications 77(3):653-710, 2020
[arXiv]

[64] Aritra Dutta and Peter Richtárik
Online and batch supervised background estimation via L1 regression
IEEE Winter Conference on Applications in Computer Vision, 2019
[arXiv]

[63] Nicolas Loizou and Peter Richtárik
Linearly convergent stochastic heavy ball method for minimizing generalization error
NIPS Workshop on Optimization for Machine Learning, 2017
[arXiv] [poster]

[62] Dominik Csiba and Peter Richtárik
Global convergence of arbitrary-block gradient methods for generalized Polyak-Łojasiewicz functions
[arXiv]

[61] Ademir Alves Ribeiro and Peter Richtárik
The complexity of primal-dual fixed point methods for ridge regression
Linear Algebra and its Applications 556:342-372, 2018
[arXiv]

[60] Matthias J. Ehrhardt, Pawel Markiewicz, Antonin Chambolle, Peter Richtárik, Jonathan Schott and Carola-Bibiane Schoenlieb
Faster PET reconstruction with a stochastic primal-dual hybrid gradient method
Proceedings of SPIE, Wavelets and Sparsity XVII, Volume 10394, pages 1039410-1 - 1039410-11, 2017
[pdf] [poster] [code: SPDHG] [video: YouTube]

[59] Aritra Dutta, Xin Li and Peter Richtárik
A batch-incremental video background estimation model using weighted low-rank approximation of matrices
IEEE International Conference on Computer Vision (ICCV) Workshops, 2017
[arXiv] [code: inWLR]

[58] Filip Hanzely, Jakub Konečný, Nicolas Loizou, Peter Richtárik and Dmitry Grishchenko
Privacy preserving randomized gossip algorithms
[arXiv] [slides]

[57] Antonin Chambolle, Matthias J. Ehrhardt, Peter Richtárik and Carola-Bibiane Schoenlieb
Stochastic primal-dual hybrid gradient algorithm with arbitrary sampling and imaging applications
SIAM Journal on Optimization 28(4):2783-2808, 2018
[arXiv] [slides] [poster] [code: SPDHG] [video: YouTube]

[56] Peter Richtárik and Martin Takáč
Stochastic reformulations of linear systems: algorithms and convergence theory
SIAM Journal on Matrix Analysis and Applications 41(2):487–524, 2020
[arXiv] [slides] [code: basic, parallel and accelerated methods]

[55] Mojmír Mutný and Peter Richtárik
Parallel stochastic Newton method
Journal of Computational Mathematics 36(3):404-425, 2018
[arXiv] [code: PSNM]

Prepared in 2016

[54] Robert M. Gower and Peter Richtárik
Linearly convergent randomized iterative methods for computing the pseudoinverse
[arXiv]

[53] Jakub Konečný and Peter Richtárik
Randomized distributed mean estimation: accuracy vs communication
Frontiers in Applied Mathematics and Statistics 2018
Federated Learning Paper
[arXiv]

[52] Jakub Konečný, H. Brendan McMahan, Felix Yu, Peter Richtárik, Ananda Theertha Suresh and Dave Bacon
Federated learning: strategies for improving communication efficiency
NIPS Private Multi-Party Machine Learning Workshop, 2016
Federated Learning Paper link [selected press coverage: The Verge - Quartz - Vice - CBR - Android Authority]
[arXiv] [bib] [poster]

[51] Jakub Konečný, H. Brendan McMahan, Daniel Ramage and Peter Richtárik
Federated optimization: distributed machine learning for on-device intelligence
Federated Learning Paper link [selected press coverage: The Verge - Quartz - Vice - CBR - Android Authority]
[arXiv] [bib]

[50] Nicolas Loizou and Peter Richtárik
A new perspective on randomized gossip algorithms
IEEE Global Conference on Signal and Information Processing (GlobalSIP), 440-444, 2016
[arXiv] [bib]

[49] Sashank J. Reddi, Jakub Konečný, Peter Richtárik, Barnabás Póczos and Alex Smola
AIDE: fast and communication efficient distributed optimization
[arXiv] [poster]

[48] Dominik Csiba and Peter Richtárik
Coordinate descent face-off: primal or dual?
Proceedings of Algorithmic Learning Theory, PMLR 83:246-267, 2018
[arXiv] [bib]

[47] Olivier Fercoq and Peter Richtárik
Optimization in high dimensions via accelerated, parallel and proximal coordinate descent
SIAM Review 58(4):739-771, 2016
SIAM SIGEST Award
[arXiv] [bib]

[46] Robert M. Gower, Donald Goldfarb and Peter Richtárik
Stochastic block BFGS: squeezing more curvature out of data
Proceedings of the 33rd International Conference on Machine Learning, PMLR 48:1869-1878, 2016
[arXiv] [bib] [poster]

[45] Dominik Csiba and Peter Richtárik
Importance sampling for minibatches
Journal of Machine Learning Research 19(27):1-21, 2018
[arXiv] [bib]

[44] Robert M. Gower and Peter Richtárik
Randomized quasi-Newton updates are linearly convergent matrix inversion algorithms
SIAM Journal on Matrix Analysis and Applications 38(4):1380-1409, 2017
Most Downloaded SIMAX Paper (6th place: 2018)
[arXiv] [code: SIMI, RBFGS, AdaRBFGS, ...]

Prepared in 2015

[43] Zeyuan Allen-Zhu, Zheng Qu, Peter Richtárik and Yang Yuan
Even faster accelerated coordinate descent using non-uniform sampling
Proceedings of the 33rd International Conference on Machine Learning, PMLR 48:1110-1119, 2016
[arXiv] [bib] [code: NU_ACDM]

[42] Robert M. Gower and Peter Richtárik
Stochastic dual ascent for solving linear systems
[arXiv] [code: SDA] [video: YouTube]

[41] Chenxin Ma, Jakub Konečný, Martin Jaggi, Virginia Smith, Michael I. Jordan, Peter Richtárik and Martin Takáč
Distributed optimization with arbitrary local solvers
Optimization Methods and Software 32(4):813-848, 2017
Most-Read Paper, Optimization Methods and Software, 2017
[arXiv] [code: CoCoA+]

[40] Martin Takáč, Peter Richtárik and Nathan Srebro
Distributed mini-batch SDCA
To appear in: Journal of Machine Learning Research
[arXiv]

[39] Robert M. Gower and Peter Richtárik
Randomized iterative methods for linear systems
SIAM Journal on Matrix Analysis and Applications 36(4):1660-1690, 2015
Most Downloaded SIMAX Paper (1st place: 2017-2020)
Gower: 18th IMA Leslie Fox Prize (2nd Prize), 2017 link
[arXiv] [slides]

[38] Dominik Csiba and Peter Richtárik
Primal method for ERM with flexible mini-batching schemes and non-convex losses
[arXiv] [code: dfSDCA]

[37] Jakub Konečný, Jie Liu, Peter Richtárik and Martin Takáč
Mini-batch semi-stochastic gradient descent in the proximal setting
IEEE Journal of Selected Topics in Signal Processing 10(2): 242-255, 2016
[arXiv] [code: mS2GD]

[36] Rachael Tappenden, Martin Takáč and Peter Richtárik
On the complexity of parallel coordinate descent
Optimization Methods and Software 33(2):372-395, 2018
[arXiv]

[35] Dominik Csiba, Zheng Qu and Peter Richtárik
Stochastic dual coordinate ascent with adaptive probabilities
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:674-683, 2015
Csiba: Best Contribution Award (2nd Place), Optimization and Big Data 2015
Implemented in TensorFlow
[arXiv] [bib] [poster] [code: AdaSDCA and AdaSDCA+]

[34] Chenxin Ma, Virginia Smith, Martin Jaggi, Michael I. Jordan, Peter Richtárik and Martin Takáč
Adding vs. averaging in distributed primal-dual optimization
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1973-1982, 2015
Smith: 2015 MLconf Industry Impact Student Research Award link
CoCoA+ is now the default linear optimizer in TensorFlow link
[arXiv] [bib] [poster] [code: CoCoA+]

[33] Zheng Qu, Peter Richtárik, Martin Takáč and Olivier Fercoq
SDNA: Stochastic dual Newton ascent for empirical risk minimization
Proceedings of the 33rd International Conference on Machine Learning, PMLR 48:1823-1832, 2016
[arXiv] [bib] [slides] [poster] [code: SDNA]

Prepared in 2014

[32] Zheng Qu and Peter Richtárik
Coordinate descent with arbitrary sampling II: expected separable overapproximation
Optimization Methods and Software 31(5):858-884, 2016
[arXiv]

[31] Zheng Qu and Peter Richtárik
Coordinate descent with arbitrary sampling I: algorithms and complexity
Optimization Methods and Software 31(5):829-857, 2016
[arXiv] [code: ALPHA]

[30] Jakub Konečný, Zheng Qu and Peter Richtárik
Semi-stochastic coordinate descent
Optimization Methods and Software 32(5):993-1005, 2017
[arXiv] [code: S2CD]

[29] Zheng Qu, Peter Richtárik and Tong Zhang
Quartz: Randomized dual coordinate ascent with arbitrary sampling
Advances in Neural Information Processing Systems 28:865-873, 2015
[arXiv] [slides] [code: QUARTZ] [video: YouTube]

[28] Jakub Konečný, Jie Liu, Peter Richtárik and Martin Takáč
mS2GD: Mini-batch semi-stochastic gradient descent in the proximal setting
NIPS Workshop on Optimization for Machine Learning, 2014
[arXiv] [poster] [code: mS2GD]

[27] Jakub Konečný, Zheng Qu and Peter Richtárik
S2CD: Semi-stochastic coordinate descent
NIPS Workshop on Optimization for Machine Learning, 2014
[pdf] [poster] [code: S2CD]

[26] Jakub Konečný and Peter Richtárik
Simple complexity analysis of simplified direct search
[arXiv] [slides in Slovak] [code: SDS]

[25] Jakub Mareček, Peter Richtárik and Martin Takáč
Distributed block coordinate descent for minimizing partially separable functions
Numerical Analysis and Optimization, Springer Proceedings in Math. and Statistics 134:261-288, 2015
[arXiv]

[24] Olivier Fercoq, Zheng Qu, Peter Richtárik and Martin Takáč
Fast distributed coordinate descent for minimizing non-strongly convex losses
2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2014
[arXiv] [poster] [code: Hydra^2]

[23] Duncan Forgan and Peter Richtárik
On optimal solutions to planetesimal growth models
Technical Report ERGO 14-002, 2014
[pdf]

[22] Jakub Mareček, Peter Richtárik and Martin Takáč
Matrix completion under interval uncertainty
European Journal of Operational Research 256(1):35-42, 2017
[arXiv] [code: MACO]

Prepared in 2013

[21] Olivier Fercoq and Peter Richtárik
Accelerated, Parallel and PROXimal coordinate descent
SIAM Journal on Optimization 25(4):1997-2023, 2015
Fercoq: 17th IMA Leslie Fox Prize (2nd Prize), 2015
2nd Most Downloaded SIOPT Paper (Aug 2016 - now)
[arXiv] [poster] [code: APPROX] [video: YouTube]

[20] Jakub Konečný and Peter Richtárik
Semi-stochastic gradient descent methods
Frontiers in Applied Mathematics and Statistics 3:9, 2017
[arXiv] [poster] [slides] [code: S2GD and S2GD+]

[19] Peter Richtárik and Martin Takáč
On optimal probabilities in stochastic coordinate descent methods
Optimization Letters 10(6):1233-1243, 2016
[arXiv] [poster] [code: NSync]

[18] Peter Richtárik and Martin Takáč
Distributed coordinate descent method for learning with big data
Journal of Machine Learning Research 17(75):1-25, 2016
[arXiv] [poster] [code: Hydra]

[17] Olivier Fercoq and Peter Richtárik
Smooth minimization of nonsmooth functions with parallel coordinate descent methods
Springer Proceedings in Mathematics and Statistics 279:57-96, 2019
[arXiv] [code: SPCDM]

[16] Rachael Tappenden, Peter Richtárik and Burak Buke
Separable approximations and decomposition methods for the augmented Lagrangian
Optimization Methods and Software 30(3):643-668, 2015
[arXiv]

[15] Rachael Tappenden, Peter Richtárik and Jacek Gondzio
Inexact coordinate descent: complexity and preconditioning
Journal of Optimization Theory and Applications 170(1):144-176, 2016
[arXiv] [poster] [code: ICD]

[14] Martin Takáč, Selin Damla Ahipasaoglu, Ngai-Man Cheung and Peter Richtárik
TOP-SPIN: TOPic discovery via Sparse Principal component INterference
Springer Proceedings in Mathematics and Statistics 279:157-180, 2019
[arXiv] [poster] [code: TOP-SPIN]

[13] Martin Takáč, Avleen Bijral, Peter Richtárik and Nathan Srebro
Mini-batch primal and dual methods for SVMs
Proceedings of the 30th International Conference on Machine Learning, 2013
[arXiv] [poster] [code: minibatch SDCA and minibatch Pegasos]

Prepared in 2012 or earlier

[12] Peter Richtárik, Majid Jahani, Martin Takáč and Selin Damla Ahipasaoglu
Alternating maximization: unifying framework for 8 sparse PCA formulations and efficient parallel codes
Optimization and Engineering, 2020
[arXiv] [code: 24am]

[11] William Hulme, Peter Richtárik, Lynne McGuire and Alison Green
Optimal diagnostic tests for sporadic Creutzfeldt-Jakob disease based on SVM classification of RT-QuIC data
Technical Report, 2012
[arXiv]

[10] Peter Richtárik and Martin Takáč
Parallel coordinate descent methods for big data optimization
Mathematical Programming 156(1):433-484, 2016
Takáč: 16th IMA Leslie Fox Prize (2nd Prize), 2013 link
#1 Top Trending Article in Mathematical Programming Ser A and B (2017) link
[arXiv] [slides] [code: PCDM, AC/DC] [video: YouTube]

[9] Peter Richtárik and Martin Takáč
Efficient serial and parallel coordinate descent methods for huge-scale truss topology design
Operations Research Proceedings 2011:27-32, Springer-Verlag, 2012
[Optimization Online] [poster]

[8] Peter Richtárik and Martin Takáč
Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function
Mathematical Programming 144(2):1-38, 2014
Best Student Paper (runner-up), INFORMS Computing Society, 2012
[arXiv] [slides]

[7] Peter Richtárik and Martin Takáč
Efficiency of randomized coordinate descent methods on minimization problems with a composite objective function
Proceedings of Signal Processing with Adaptive Sparse Structured Representations, 2011
[pdf]

[6] Peter Richtárik
Finding sparse approximations to extreme eigenvectors: generalized power method for sparse PCA and extensions
Proceedings of Signal Processing with Adaptive Sparse Structured Representations, 2011
[pdf]

[5] Peter Richtárik
Approximate level method for nonsmooth convex minimization
Journal of Optimization Theory and Applications 152(2):334–350, 2012
[Optimization Online]

[4] Michel Journée, Yurii Nesterov, Peter Richtárik and Rodolphe Sepulchre
Generalized power method for sparse principal component analysis
Journal of Machine Learning Research 11:517–553, 2010
[arXiv] [slides] [poster] [code: GPower]

[3] Peter Richtárik
Improved algorithms for convex minimization in relative scale
SIAM Journal on Optimization 21(3):1141–1167, 2011
[pdf] [slides]

[2] Peter Richtárik
Simultaneously solving seven optimization problems in relative scale
Technical Report, 2009
[Optimization Online]

[1] Peter Richtárik
Some algorithms for large-scale convex and linear minimization in relative scale
PhD Dissertation, School of Operations Research and Information Engineering, Cornell University, 2007