All papers are listed below in reverse chronological order of their first appearance online.

Prepared in 2024

[246] Kai Yi, Georg Meinhardt, Laurent Condat, and Peter Richtárik
FedComLoc: Communication-efficient distributed training of sparse and quantized models
Federated Learning Paper
[arXiv] [code: FedComLoc]

[245] Yury Demidovich, Grigory Malinovsky, and Peter Richtárik
Streamlining in the Riemannian realm: Efficient Riemannian optimization with loopless variance reduction
[arXiv] [code: R-LSVRG, R-PAGE, R-MARINA]

[244] Laurent Condat, Artavazd Maranjyan, and Peter Richtárik
LoCoDL: Communication-efficient distributed learning with local training and compression
Federated Learning Paper
[arXiv] [code: LoCoDL]

[243] Kaja Gruntkowska, Alexander Tyurin, and Peter Richtárik
Improving the worst-case bidirectional communication complexity for nonconvex distributed optimization under function similarity
Federated Learning Paper
[arXiv] [code: MARINA-P, M3]

[242] Alexander Tyurin, Marta Pozzi, Ivan Ilin, and Peter Richtárik
Shadowheart SGD: Distributed asynchronous SGD with optimal time complexity under arbitrary computation and communication heterogeneity
Federated Learning Paper
[arXiv] [code: Shadowheart SGD]

[241] Andrei Panferov, Yury Demidovich, Ahmad Rammal, and Peter Richtárik
Correlated quantization for faster nonconvex distributed optimization
Federated Learning Paper
[arXiv]

Prepared in 2023

[240] Kai Yi, Nidham Gazagnadou, Peter Richtárik, and Lingjuan Lyu
FedP3: Personalized and privacy-friendly federated network pruning under model heterogeneity
12th International Conference on Learning Representations (ICLR 2024)
Federated Learning Paper
[arXiv] [code: FedP3]

[239] Peter Richtárik, Elnur Gasanov, and Konstantin Burlachenko
Error feedback reloaded: From quadratic to arithmetic mean of smoothness constants
12th International Conference on Learning Representations (ICLR 2024)
Federated Learning Paper
[arXiv] [code: EF21-W, EF21]

[238] Jihao Xin, Ivan Ilin, Shunkang Zhang, Marco Canini, and Peter Richtárik
Kimad: Adaptive gradient compression with bandwidth awareness
Proceedings of the 4th International Workshop on Distributed Machine Learning
Federated Learning Paper
[arXiv]

[237] Konstantin Burlachenko, Abdulmajeed Alrowithi, Fahad Ali Albalawi, and Peter Richtárik
Federated learning is better with non-homomorphic encryption
Proceedings of the 4th International Workshop on Distributed Machine Learning
Federated Learning Paper
[arXiv]

[236] Yury Demidovich, Grigory Malinovsky, Egor Shulgin, and Peter Richtárik
MAST: model-agnostic sparsified training
Federated Learning Paper
[arXiv]

[235] Grigory Malinovsky, Peter Richtárik, Samuel Horváth, and Eduard Gorbunov
Byzantine robustness and partial participation can be achieved simultaneously: just clip gradient differences
Federated Learning Paper
[arXiv] [code: Byz-VR-MARINA-PP]

[234] Massimo Fornasier, Peter Richtárik, Konstantin Riedl, and Lukang Sun
Consensus-based optimization with truncated noise
[arXiv] [code: CBO]

[233] Ahmad Rammal, Kaja Gruntkowska, Nikita Fedin, Eduard Gorbunov, and Peter Richtárik
Communication compression for Byzantine robust learning: New efficient algorithms and improved rates
27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)
Federated Learning Paper
[arXiv] [code: Byz-VR-MARINA, Byz-DASHA-PAGE, Byz-EF21, Byz-EF21-BC]

[232] Hanmin Li, Avetik Karagulyan, and Peter Richtárik
MARINA meets matrix stepsizes: Variance reduced distributed non-convex optimization
Federated Learning Paper
[arXiv] [code: det-MARINA]

[231] Eduard Gorbunov, Abdurakhmon Sadiev, Marina Danilova, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, and Peter Richtárik
High-probability convergence for composite and distributed stochastic minimization and variational inequalities with heavy-tailed noise
[arXiv] [code: DProx-clipped-SGD-shift, DProx-clipped-SSTM-shift]

[230] Egor Shulgin and Peter Richtárik
Towards a better theoretical understanding of independent subnetwork training
Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities (ICML 2023 Workshop)
Federated Learning Paper
[arXiv] [code: IST]

[229] Rafał Szlendak, Elnur Gasanov, and Peter Richtárik
Understanding progressive training through the framework of randomized coordinate descent
27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)
[arXiv] [code: RPT]

[228] Michał Grudzień, Grigory Malinovsky, and Peter Richtárik
Improving accelerated federated learning with compression and importance sampling
Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities (ICML 2023 Workshop)
Federated Learning Paper
[arXiv] [code: 5GCS-CC, 5GCS-AB]

[227] Sarit Khirirat, Eduard Gorbunov, Samuel Horváth, Rustem Islamov, Fakhri Karray, and Peter Richtárik
Clip21: Error feedback for gradient clipping
[arXiv] [code: Clip21-Avg, Clip21-GD, DP-Clip21-GD, Press-Clip21-GD]

[226] Jihao Xin, Marco Canini, Peter Richtárik, and Samuel Horváth
Global-QSGD: Practical floatless quantization for distributed learning with theoretical guarantees
Federated Learning Paper
[arXiv] [code: Global-QSGD]

[225] Yury Demidovich, Grigory Malinovsky, Igor Sokolov and Peter Richtárik
A guide through the zoo of biased SGD
Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
[arXiv] [code: BiasedSGD] [poster]

[224] Peter Richtárik, Elnur Gasanov and Konstantin Burlachenko
Error feedback shines when features are rare
Federated Learning Paper
[arXiv] [code: EF21]

[223] Ilyas Fatkhullin, Alexander Tyurin and Peter Richtárik
Momentum provably improves error feedback!
Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities (ICML 2023 Workshop)
Federated Learning Paper
[arXiv] [code: EF21-SGDM-ideal, EF21-SGDM, EF21-SGD2M] [poster]

[222] Kai Yi, Laurent Condat and Peter Richtárik
Explicit personalization and local training: double communication acceleration in federated learning
Federated Learning Paper
[arXiv] [code: Scafflix]

[221] Alexander Tyurin and Peter Richtárik
Optimal time complexities of parallel stochastic optimization methods under a fixed computation model
Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
[arXiv] [code: Rennala SGD, Malenia SGD]

[220] Alexander Tyurin and Peter Richtárik
2Direction: Theoretically faster distributed training with bidirectional communication compression
Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
Federated Learning Paper
[arXiv] [code: 2Direction]

[219] Hanmin Li, Avetik Karagulyan and Peter Richtárik
Det-CGD: Compressed gradient descent with matrix stepsizes for non-convex optimization
12th International Conference on Learning Representations (ICLR 2024)
Federated Learning Paper
[arXiv] [code: Det-CGD]

[218] Avetik Karagulyan and Peter Richtárik
ELF: Federated Langevin algorithms with primal, dual and bidirectional compression
Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities (ICML 2023 Workshop)
Federated Learning Paper
[arXiv] [code: ELF, P-ELF, D-ELF, B-ELF]

[217] Laurent Condat, Grigory Malinovsky and Peter Richtárik
TAMUNA: Accelerated federated learning with local training and partial participation
Federated Learning Paper
[arXiv] [code: TAMUNA]

[216] Grigory Malinovsky, Samuel Horváth, Konstantin Burlachenko and Peter Richtárik
Federated learning with regularized client participation
Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities (ICML 2023 Workshop)
Federated Learning Paper
[arXiv] [code: RR-CLI]

[215] Abdurakhmon Sadiev, Marina Danilova, Eduard Gorbunov, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov and Peter Richtárik
High-probability bounds for stochastic optimization and variational inequalities: the case of unbounded variance
40th International Conference on Machine Learning (ICML 2023)
[arXiv] [code: clipped-SGD, clipped-SSTM, R-clipped-SSTM]

[214] Xun Qian, Hanze Dong, Tong Zhang and Peter Richtárik
Catalyst acceleration of error compensated methods leads to better communication complexity
26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)
Federated Learning Paper
[arXiv] [code: ECSPDC, EC-LSVRG + Catalyst, EC-SDCA + Catalyst]

[213] Slavomír Hanzely, Konstantin Mishchenko and Peter Richtárik
Convergence of first-order algorithms for meta-learning with Moreau envelopes
Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities (ICML 2023 Workshop)
[arXiv] [code: FO-MuML]

Prepared in 2022

[212] Michał Grudzień, Grigory Malinovsky and Peter Richtárik
Can 5th generation local training methods support client sampling? Yes!
26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)
Federated Learning Paper
[arXiv] [code: 5GCS]

[211] Maksim Makarenko, Elnur Gasanov, Rustem Islamov, Abdurakhmon Sadiev and Peter Richtárik
Adaptive compression for communication-efficient distributed training
Transactions on Machine Learning Research (TMLR 2023)
Federated Learning Paper
[arXiv] [code: AdaCGD]

[210] Slavomír Hanzely, Dmitry Kamzolov, Dmitry Pasechnyuk, Alexander Gasnikov, Peter Richtárik and Martin Takáč
A damped Newton method achieves global $O(1/k^2)$ and local quadratic convergence rate
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
[arXiv] [code: AIC Newton]

[209] Artavazd Maranjyan, Mher Safaryan and Peter Richtárik
GradSkip: Communication-accelerated local gradient methods with better computational complexity
Federated Learning Paper
[arXiv] [code: GradSkip, GradSkip+]

[208] Laurent Condat, Ivan Agarský and Peter Richtárik
Provably doubly accelerated federated learning: the first theoretically successful combination of local training and compressed communication
Federated Learning Paper
[arXiv] [code: CompressedScaffnew]

[207] Lukang Sun and Peter Richtárik
Improved Stein variational gradient descent with importance weights
[arXiv] [code: beta-SVGD]

[206] Kaja Gruntkowska, Alexander Tyurin and Peter Richtárik
EF21-P and friends: Improved theoretical communication complexity for distributed optimization with bidirectional compression
40th International Conference on Machine Learning (ICML 2023)
Federated Learning Paper
[arXiv] [code: EF21-P, EF21-P + DIANA, EF21-P + DCGD] [poster]

[205] Soumia Boucherouite, Grigory Malinovsky, Peter Richtárik and El Houcine Bergou
Minibatch stochastic three points method for unconstrained smooth minimization
38th AAAI Conference on Artificial Intelligence (AAAI 2024)
[arXiv] [code: MiSTP]

[204] El Houcine Bergou, Konstantin Burlachenko, Aritra Dutta and Peter Richtárik
Personalized federated learning with communication compression
Transactions on Machine Learning Research (TMLR 2023)
Federated Learning Paper
[arXiv] [code: Compressed L2GD]

[203] Samuel Horváth, Konstantin Mishchenko and Peter Richtárik
Adaptive learning rates for faster stochastic gradient methods
[arXiv] [code: StoPS, GraDs, StoP, GraD]

[202] Laurent Condat and Peter Richtárik
RandProx: Primal-dual optimization algorithms with randomized proximal updates
11th International Conference on Learning Representations (ICLR 2023)
OPT2022: 14th Annual Workshop on Optimization for Machine Learning (NeurIPS 2022 Workshop)
Federated Learning Paper
[arXiv] [poster] [video] [code: RandProx, RandProx-FB, RandProx-LC, RandProx-CP, RandProx-ADMM, RandProx-DY]

[201] Grigory Malinovsky, Kai Yi and Peter Richtárik
Variance reduced ProxSkip: algorithm, theory and application to federated learning
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
Federated Learning Paper
[arXiv] [code: ProxSkip-VR, ProxSkip-GD, ProxSkip-SGD, ProxSkip-LSVRG, ProxSkip-HUB]

[200] Abdurakhmon Sadiev, Dmitry Kovalev and Peter Richtárik
Communication acceleration of local gradient methods via an accelerated primal-dual algorithm with inexact prox
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
Federated Learning Paper
[arXiv] [code: APDA, APDA with Inexact Prox, APDA with Inexact Prox and Accelerated Gossip]

[199] Egor Shulgin and Peter Richtárik
Shifted compression framework: generalizations and improvements
38th Conference on Uncertainty in Artificial Intelligence (UAI 2022)
Federated Learning Paper
[arXiv] [code: DCGD-SHIFT]

[198] Lukang Sun and Peter Richtárik
A note on the convergence of mirrored Stein variational gradient descent under $(L_0, L_1)$ smoothness condition
[arXiv] [code: MSVGD]

[197] Abdurakhmon Sadiev, Grigory Malinovsky, Eduard Gorbunov, Igor Sokolov, Ahmed Khaled, Konstantin Burlachenko and Peter Richtárik
Federated optimization algorithms with random reshuffling and gradient compression
Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities (ICML 2023 Workshop)
Federated Learning Paper
[arXiv] [code: Q-RR, DIANA-RR, Q-NASTYA, DIANA-NASTYA] [poster]

[196] Rustem Islamov, Xun Qian, Slavomír Hanzely, Mher Safaryan and Peter Richtárik
Distributed Newton-type methods with communication compression and Bernoulli aggregation
Transactions on Machine Learning Research (TMLR 2023)
NeurIPS Workshop 2022 (Order up! The Benefits of Higher-Order Optimization in Machine Learning)
Federated Learning Paper
[arXiv] [code: Newton-3PC, Newton-3PC-BC, Newton-3PC-BC-PP]

[195] Motasem Alfarra, Juan C. Pérez, Egor Shulgin, Peter Richtárik and Bernard Ghanem
Certified robustness in federated learning
NeurIPS Workshop 2022 (Federated Learning)
Federated Learning Paper
[arXiv]

[194] Alexander Tyurin, Lukang Sun, Konstantin Burlachenko and Peter Richtárik
Sharper rates and flexible framework for nonconvex SGD with client and data sampling
Transactions on Machine Learning Research (TMLR 2023)
Federated Learning Paper
[arXiv] [code: PAGE]

[193] Lukang Sun, Adil Salim and Peter Richtárik
Federated sampling with Langevin algorithm under isoperimetry
Transactions on Machine Learning Research (TMLR 2024)
Federated Learning Paper
[arXiv] [code: Langevin-Marina]

[192] Eduard Gorbunov, Samuel Horváth, Peter Richtárik and Gauthier Gidel
Variance reduction is an antidote to Byzantines: better rates, weaker assumptions and communication compression as a cherry on the top
11th International Conference on Learning Representations (ICLR 2023)
Federated Learning Paper
[arXiv] [poster] [code: Byz-VR-MARINA]

[191] Lukang Sun, Avetik Karagulyan and Peter Richtárik
Convergence of Stein variational gradient descent under a weaker smoothness condition
26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)
[arXiv] [code: SVGD] [poster]

[190] Alexander Tyurin and Peter Richtárik
A computation and communication efficient method for distributed nonconvex problems in the partial participation setting
Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
Federated Learning Paper
[arXiv] [code: DASHA-PP, DASHA-PP-PAGE, DASHA-PP-FINITE-MVR, DASHA-PP-MVR]

[189] Laurent Condat, Kai Yi and Peter Richtárik
EF-BV: A unified theory of error feedback and variance reduction mechanisms for biased and unbiased compression in distributed optimization
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
Federated Learning Paper
[arXiv] [code: EF-BV]

[188] Grigory Malinovsky and Peter Richtárik
Federated random reshuffling with compression and variance reduction
Federated Learning Paper
[arXiv] [code: FedCRR, FedCRR-VR, FedCRR-VR-2]

[187] Samuel Horváth, Maziar Sanjabi, Lin Xiao, Peter Richtárik and Michael Rabbat
FedShuffle: Recipes for better use of local work in federated learning
Transactions on Machine Learning Research (TMLR 2022)
Federated Learning Paper
[arXiv] [code: FedShuffle]

[186] Konstantin Mishchenko, Grigory Malinovsky, Sebastian Stich and Peter Richtárik
ProxSkip: Yes! Local gradient steps provably lead to communication acceleration! Finally!
39th International Conference on Machine Learning (ICML 2022)
Federated Learning Paper
[arXiv] [slides] [71 min video] [code: ProxSkip, Scaffnew, SProxSkip, SplitSkip, Decentralized Scaffnew]

[185] Dmitry Kovalev, Aleksandr Beznosikov, Abdurakhmon Sadiev, Michael Persiianov, Peter Richtárik and Alexander Gasnikov
Optimal algorithms for decentralized stochastic variational inequalities
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
[arXiv] [code: Algorithm 1, Algorithm 2]

[184] Alexander Tyurin and Peter Richtárik
DASHA: Distributed nonconvex optimization with communication compression and optimal oracle complexity
11th International Conference on Learning Representations (ICLR 2023)
Federated Learning Paper
[arXiv] [code: DASHA, DASHA-PAGE, DASHA-MVR]

[183] Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin, Elnur Gasanov, Zhize Li and Eduard Gorbunov
3PC: Three point compressors for communication-efficient distributed training and a better theory for lazy aggregation
39th International Conference on Machine Learning (ICML 2022)
Federated Learning Paper
[arXiv] [poster] [code: 3PC, LAG, CLAG, EF21]

[182] Haoyu Zhao, Boyue Li, Zhize Li, Peter Richtárik and Yuejie Chi
BEER: Fast $O(1/T)$ rate for decentralized nonconvex optimization with communication compression
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
Federated Learning Paper
[arXiv] [code: BEER]

[181] Grigory Malinovsky, Konstantin Mishchenko and Peter Richtárik
Server-side stepsizes and sampling without replacement provably help in federated optimization
Federated Learning Paper
[arXiv] [code: Nastya]

Prepared in 2021

[180] Dmitry Kovalev, Alexander Gasnikov and Peter Richtárik
Accelerated primal-dual gradient method for smooth and convex-concave saddle-point problems with bilinear coupling
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
[arXiv] [code: APDG]

[179] Haoyu Zhao, Konstantin Burlachenko, Zhize Li and Peter Richtárik
Faster rates for compressed federated learning with client-variance reduction
To appear in: SIAM Journal on Mathematics of Data Science
Federated Learning Paper
[arXiv] [code: COFIG, FRECON]

[178] Konstantin Burlachenko, Samuel Horváth and Peter Richtárik
FL_PyTorch: optimization research simulator for federated learning
Proceedings of the 2nd ACM International Workshop on Distributed Machine Learning
Federated Learning Paper
[arXiv] [code: FL_PyTorch]

[177] Elnur Gasanov, Ahmed Khaled, Samuel Horváth and Peter Richtárik
FLIX: A simple and communication-efficient alternative to local methods in federated learning
25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022)
Federated Learning Paper
[arXiv] [code: FLIX]

[176] Xun Qian, Rustem Islamov, Mher Safaryan and Peter Richtárik
Basis matters: better communication-efficient second order methods for federated learning
25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022)
Federated Learning Paper
[arXiv] [code: BL1, BL2, BL3]

[175] Aleksandr Beznosikov, Peter Richtárik, Michael Diskin, Max Ryabinin and Alexander Gasnikov
Distributed methods with compressed communication for solving variational inequalities, with theoretical guarantees
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
[arXiv] [code: MASHA1, MASHA2]

[174] Rafał Szlendak, Alexander Tyurin and Peter Richtárik
Permutation compressors for provably faster distributed nonconvex optimization
10th International Conference on Learning Representations (ICLR 2022)
Federated Learning Paper
[arXiv] [code: MARINA] [video]

[173] Ilyas Fatkhullin, Igor Sokolov, Eduard Gorbunov, Zhize Li and Peter Richtárik
EF21 with bells & whistles: practical algorithmic extensions of modern error feedback
Federated Learning Paper
[arXiv] [code: EF21-SGD, EF21-PAGE, EF21-PP, EF21-BC, EF21-HB, EF21-Prox] [github]

[172] Xun Qian, Hanze Dong, Peter Richtárik and Tong Zhang
Error compensated loopless SVRG, Quartz, and SDCA for distributed optimization
Federated Learning Paper
[arXiv] [code: EC-LSVRG, EC-SDCA, EC-Quartz]

[171] Majid Jahani, Sergey Rusakov, Zheng Shi, Peter Richtárik, Michael W. Mahoney and Martin Takáč
Doubly adaptive scaled algorithm for machine learning using second-order information
10th International Conference on Learning Representations (ICLR 2022)
[arXiv] [code: OASIS]

[170] Haoyu Zhao, Zhize Li and Peter Richtárik
FedPAGE: A fast local stochastic gradient method for communication-efficient federated learning
Federated Learning Paper
[arXiv] [code: FedPAGE]

[169] Zhize Li and Peter Richtárik
CANITA: Faster rates for distributed convex optimization with communication compression
Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Federated Learning Paper
[arXiv] [code: CANITA]

[168] 50+ authors
A field guide to federated optimization
Federated Learning Paper
[arXiv]

[167] Peter Richtárik, Igor Sokolov and Ilyas Fatkhullin
EF21: A new, simpler, theoretically better, and practically faster error feedback
Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Federated Learning Paper
[arXiv] [slides] [62 min video] [code: EF21, EF21+] [github]

[166] Dmitry Kovalev, Elnur Gasanov, Peter Richtárik and Alexander Gasnikov
Lower bounds and optimal algorithms for smooth and strongly convex decentralized optimization over time-varying networks
Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
[arXiv] [code: ADOM+]

[165] Bokun Wang, Mher Safaryan and Peter Richtárik
Theoretically better and numerically faster distributed optimization with smoothness-aware quantization techniques
Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
Federated Learning Paper
[arXiv] [code: DCGD+, DIANA+]

[164] Adil Salim, Lukang Sun and Peter Richtárik
A convergence theory for SVGD in the population limit under Talagrand’s inequality T1
39th International Conference on Machine Learning (ICML 2022)
[arXiv] [code: SVGD]

[163] Laurent Condat and Peter Richtárik
MURANA: A generic framework for stochastic variance-reduced optimization
Mathematical and Scientific Machine Learning 2022 (MSML 2022)
Federated Learning Paper
[arXiv] [code: MURANA, ELVIRA]

[162] Mher Safaryan, Rustem Islamov, Xun Qian and Peter Richtárik
FedNL: Making Newton-type methods applicable to federated learning
39th International Conference on Machine Learning (ICML 2022)
Federated Learning Paper
[arXiv] [poster] [code: FedNL, FedNL-PP, FedNL-CR, FedNL-LS, FedNL-BC, N0, NS]

[161] Grigory Malinovsky, Alibek Sailanbayev and Peter Richtárik
Random reshuffling with variance reduction: new analysis and better rates
39th Conference on Uncertainty in Artificial Intelligence (UAI 2023)
[arXiv] [3 min video] [code: RR-SVRG, SO-SVRG, Cyclic-SVRG]

[160] Zhize Li, Slavomír Hanzely and Peter Richtárik
ZeroSARAH: Efficient nonconvex finite-sum optimization with zero full gradient computation
Federated Learning Paper
[arXiv] [code: Zero-SARAH]

[159] Adil Salim, Laurent Condat, Dmitry Kovalev and Peter Richtárik
An optimal algorithm for strongly convex minimization under affine constraints
25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022)
[arXiv]

[158] Zheng Shi, Nicolas Loizou, Peter Richtárik and Martin Takáč
AI-SARAH: Adaptive and implicit stochastic recursive gradient methods
Transactions on Machine Learning Research (TMLR 2023)
[arXiv] [code: AI-SARAH]

[157] Dmitry Kovalev, Egor Shulgin, Peter Richtárik, Alexander Rogozin and Alexander Gasnikov
ADOM: Accelerated decentralized optimization method for time-varying networks
38th International Conference on Machine Learning (ICML 2021)
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
[arXiv] [5 min video] [poster] [code: ADOM]

[156] Konstantin Mishchenko, Bokun Wang, Dmitry Kovalev and Peter Richtárik
IntSGD: Floatless compression of stochastic gradients
10th International Conference on Learning Representations (ICLR 2022)
Federated Learning Paper
[arXiv] [5 min video] [code: IntSGD, IntDIANA]

[155] Eduard Gorbunov, Konstantin Burlachenko, Zhize Li and Peter Richtárik
MARINA: faster non-convex distributed learning with compression
38th International Conference on Machine Learning (ICML 2021)
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
Federated Learning Paper
[arXiv] [5 min video] [70 min video] [poster] [code: MARINA, VR-MARINA, PP-MARINA]

[154] Mher Safaryan, Filip Hanzely and Peter Richtárik
Smoothness matrices beat smoothness constants: better communication compression techniques for distributed optimization
Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
ICLR Workshop: Distributed and Private Machine Learning
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
Federated Learning Paper
[arXiv] [5 min video] [code: DCGD+, DIANA+, ADIANA+]

[153] Rustem Islamov, Xun Qian and Peter Richtárik
Distributed second order methods with fast rates and compressed communication
38th International Conference on Machine Learning (ICML 2021)
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
Federated Learning Paper
[arXiv] [5 min video] [80 min video] [slides] [poster] [code: NS, MN, NL1, NL2, CNL]

[152] Konstantin Mishchenko, Ahmed Khaled and Peter Richtárik
Proximal and federated random reshuffling
39th International Conference on Machine Learning (ICML 2022)
NSF-TRIPODS Workshop: Communication Efficient Distributed Optimization
Federated Learning Paper
[arXiv] [8 min video] [code: ProxRR, FedRR]

Prepared in 2020

[151] Samuel Horváth, Aaron Klein, Peter Richtárik and Cedric Archambeau
Hyperparameter transfer learning with adaptive complexity
The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
[arXiv]

[150] Xun Qian, Hanze Dong, Peter Richtárik and Tong Zhang
Error compensated loopless SVRG for distributed optimization
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
Federated Learning Paper
[poster] [code: EC-LSVRG]

[149] Xun Qian, Hanze Dong, Peter Richtárik and Tong Zhang
Error compensated proximal SGD and RDA
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[poster] [code: EC-SGD, EC-RDA]

[148] Eduard Gorbunov, Filip Hanzely, and Peter Richtárik
Local SGD: unified theory and new efficient methods
The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
Federated Learning Paper
[arXiv] [5 min video] [poster] [code: S-Local-SVRG]

[147] Dmitry Kovalev, Anastasia Koloskova, Martin Jaggi, Peter Richtárik, and Sebastian U. Stich
A linearly convergent algorithm for decentralized optimization: sending less bits for free!
The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
Federated Learning Paper
[arXiv] [3 min video]

[146] Wenlin Chen, Samuel Horváth, and Peter Richtárik
Optimal client sampling for federated learning
Transactions on Machine Learning Research (TMLR 2022)
Privacy Preserving Machine Learning (NeurIPS 2020 Workshop)
Federated Learning Paper
[arXiv] [code: OCS, AOCS]

[145] Eduard Gorbunov, Dmitry Kovalev, Dmitry Makarenko, and Peter Richtárik
Linearly converging error compensated SGD
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Federated Learning Paper
[arXiv] [5 min video] [code: EC-SGD-DIANA, EC-LSVRG-DIANA, EC-LSVRGstar, ...]

[144] Alyazeed Albasyoni, Mher Safaryan, Laurent Condat, and Peter Richtárik
Optimal gradient compression for distributed and federated learning
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [poster] [video]

[143] Filip Hanzely, Slavomír Hanzely, Samuel Horváth, and Peter Richtárik
Lower bounds and optimal algorithms for personalized federated learning
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Federated Learning Paper
[arXiv] [5 min video] [code: APGD1, APGD2, IAPGD, AL2SGD+]

[142] Laurent Condat, Grigory Malinovsky, and Peter Richtárik
Distributed proximal splitting algorithms with rates and acceleration
Frontiers in Signal Processing, section Signal Processing for Communications, 2022
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
Spotlight Talk
[arXiv] [poster]

[141] Robert M. Gower, Mark Schmidt, Francis Bach and Peter Richtárik
Variance-reduced methods for machine learning
Proceedings of the IEEE 108 (11):1968--1983, 2020
[arXiv]

[140] Xun Qian, Peter Richtárik, and Tong Zhang
Error compensated distributed SGD can be accelerated
Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
Federated Learning Paper
[arXiv] [poster] [code: ECLK]

[139] Albert S. Berahas, Majid Jahani, Peter Richtárik, and Martin Takáč
Quasi-Newton methods for deep learning: forget the past, just sample
Optimization Methods and Software, 2021
[arXiv] [code: S-LBFGS, S-LSR1]

[138] Zhize Li, Hongyan Bao, Xiangliang Zhang and Peter Richtárik
PAGE: A simple and optimal probabilistic gradient estimator for nonconvex optimization
38th International Conference on Machine Learning (ICML 2021)
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop) (Spotlight Talk)
[arXiv] [5 min video] [code: PAGE]

[137] Dmitry Kovalev, Adil Salim, and Peter Richtárik
Optimal and practical algorithms for smooth and strongly convex decentralized optimization
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
[arXiv] [code: APAPC, OPAPC, Algorithm 3]

[136] Ahmed Khaled, Othmane Sebbouh, Nicolas Loizou, Robert M. Gower, and Peter Richtárik
Unified analysis of stochastic gradient methods for composite convex and smooth optimization
To appear in: Journal of Optimization Theory and Applications
[arXiv] [code: SGD]

[135] Samuel Horváth and Peter Richtárik
A better alternative to error feedback for communication-efficient distributed learning
9th International Conference on Learning Representations (ICLR 2021)
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
The Best Paper Award at NeurIPS-20 Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [poster] [code: DCSGD]

[134] Adil Salim and Peter Richtárik
Primal dual interpretation of the proximal stochastic gradient Langevin algorithm
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
[arXiv] [code: PSGLA]

[133] Zhize Li and Peter Richtárik
A unified analysis of stochastic gradient methods for nonconvex federated optimization
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [video]

[132] Konstantin Mishchenko, Ahmed Khaled, and Peter Richtárik
Random reshuffling: simple analysis with vast improvements
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
[arXiv] [4 min video] [code: RR, SO, IG]

[131] Motasem Alfarra, Slavomír Hanzely, Alyazeed Albasyoni, Bernard Ghanem, and Peter Richtárik
Adaptive learning of the optimal mini-batch size of SGD
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[arXiv] [poster]

[130] Adil Salim, Laurent Condat, Konstantin Mishchenko, and Peter Richtárik
Dualize, split, randomize: fast nonsmooth optimization algorithms
Journal of Optimization Theory and Applications, 2022
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[arXiv] [poster] [code: PDDY, SPDDY, SPD3O, SPAPC]

[129] Atal Narayan Sahu, Aritra Dutta, Aashutosh Tiwari, and Peter Richtárik
On the convergence analysis of asynchronous SGD for solving consistent linear systems
Linear Algebra and its Applications, 2022
[arXiv] [code: DASGD]

[128] Grigory Malinovsky, Dmitry Kovalev, Elnur Gasanov, Laurent Condat, and Peter Richtárik
From local SGD to local fixed point methods for federated learning
37th International Conference on Machine Learning (ICML 2020)
Federated Learning Paper
[arXiv] [5 min video] [code: LDFPM, RDFPM]

[127] Aleksandr Beznosikov, Samuel Horváth, Peter Richtárik and Mher Safaryan
On biased compression for distributed learning
Accepted to Journal of Machine Learning Research, 2022
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [poster] [code: CGD, Distributed SGD with Error Feedback]

[126] Zhize Li, Dmitry Kovalev, Xun Qian and Peter Richtárik
Acceleration for compressed gradient descent in distributed and federated optimization
37th International Conference on Machine Learning (ICML 2020)
Federated Learning Paper
[arXiv] [code: ACGD, ADIANA]

[125] Dmitry Kovalev, Robert M. Gower, Peter Richtárik and Alexander Rogozin
Fast linear convergence of randomized BFGS
[arXiv] [code: RBFGS]

[124] Filip Hanzely, Nikita Doikov, Peter Richtárik and Yurii Nesterov
Stochastic subspace cubic Newton method
37th International Conference on Machine Learning (ICML 2020)
[arXiv] [code: SSCN]

[123] Mher Safaryan, Egor Shulgin and Peter Richtárik
Uncertainty principle for communication compression in distributed and federated learning and the search for an optimal compressor
Information and Inference: A Journal of the IMA, 1--24, 2021
Federated Learning Paper
[arXiv]

[122] Filip Hanzely and Peter Richtárik
Federated learning of a mixture of global and local models
SpicyFL 2020: NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning
Federated Learning Paper
[arXiv] [slides] [poster] [video]

[121] Samuel Horváth, Lihua Lei, Peter Richtárik and Michael I. Jordan
Adaptivity of stochastic gradient methods for nonconvex optimization
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
SIAM Journal on Mathematics of Data Science 4(2):634--648, 2022
[arXiv] [poster]

[120] Filip Hanzely, Dmitry Kovalev and Peter Richtárik
Variance reduced coordinate descent with acceleration: new method with a surprising application to finite-sum problems
37th International Conference on Machine Learning (ICML 2020)
[arXiv]

[119] Ahmed Khaled and Peter Richtárik
Better theory for SGD in the nonconvex world
Transactions on Machine Learning Research (TMLR 2022)
[arXiv]

Prepared in 2019

[118] Ahmed Khaled, Konstantin Mishchenko and Peter Richtárik
Tighter theory for local SGD on identical and heterogeneous data
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
Federated Learning Paper
[arXiv]

[117] Sélim Chraibi, Ahmed Khaled, Dmitry Kovalev, Adil Salim, Peter Richtárik and Martin Takáč
Distributed fixed point methods with compressed iterates
Federated Learning Paper
[arXiv] [preprint]

[116] Samuel Horváth, Chen-Yu Ho, Ľudovít Horváth, Atal Narayan Sahu, Marco Canini and Peter Richtárik
IntML: Natural compression for distributed deep learning
Workshop on AI Systems at Symposium on Operating Systems Principles 2019 (SOSP'19)
[pdf]

[115] Dmitry Kovalev, Konstantin Mishchenko and Peter Richtárik
Stochastic Newton and cubic Newton methods with simple local linear-quadratic rates
NeurIPS 2019 Workshop Beyond First Order Methods in ML
[arXiv] [poster] [code: SN, SCN]

[114] Ahmed Khaled, Konstantin Mishchenko and Peter Richtárik
Better communication complexity for local SGD
NeurIPS 2019 Workshop on Federated Learning for Data Privacy and Confidentiality
Federated Learning Paper
[arXiv] [poster] [code: local SGD]

[113] Ahmed Khaled and Peter Richtárik
Gradient descent with compressed iterates
NeurIPS 2019 Workshop on Federated Learning for Data Privacy and Confidentiality
Federated Learning Paper
[arXiv] [poster] [code: GDCI]

[112] Ahmed Khaled, Konstantin Mishchenko and Peter Richtárik
First analysis of local GD on heterogeneous data
NeurIPS 2019 Workshop on Federated Learning for Data Privacy and Confidentiality
Federated Learning Paper
[arXiv] [poster] [code: local GD]

[111] Jinhui Xiong, Peter Richtárik and Wolfgang Heidrich
Stochastic convolutional sparse coding
International Symposium on Vision, Modeling and Visualization 2019
VMV Best Paper Award, 2019 [link]
[arXiv] [code: SBCSC, SOCSC]

[110] Xun Qian, Zheng Qu and Peter Richtárik
L-SVRG and L-Katyusha with arbitrary sampling
Journal of Machine Learning Research 22(112):1-47, 2021
[arXiv] [5 min video] [code: L-SVRG, L-Katyusha]

[109] Xun Qian, Alibek Sailanbayev, Konstantin Mishchenko and Peter Richtárik
MISO is making a comeback with better proofs and rates
[arXiv] [code: MISO]

[108] Eduard Gorbunov, Adel Bibi, Ozan Sener, El Houcine Bergou and Peter Richtárik
A stochastic derivative free optimization method with momentum
8th International Conference on Learning Representations (ICLR 2020)
[arXiv] [poster] [code: SMTP]

[107] Mher Safaryan and Peter Richtárik
Stochastic Sign Descent Methods: New Algorithms and Better Theory
38th International Conference on Machine Learning (ICML 2021)
OPT2020: 12th Annual Workshop on Optimization for Machine Learning (NeurIPS 2020 Workshop)
[arXiv] [poster] [code: signSGD, signSGDmaj]

[106] Adil Salim, Dmitry Kovalev and Peter Richtárik
Stochastic proximal Langevin algorithm: potential splitting and nonasymptotic rates
33rd Conference on Neural Information Processing Systems (NeurIPS 2019)
[arXiv] [poster] [code: SPLA]

[105] Aritra Dutta, El Houcine Bergou, Yunming Xiao, Marco Canini and Peter Richtárik
Direct nonlinear acceleration
EURO Journal on Computational Optimization 10, 2022, 100047
[arXiv] [code: DNA]

[104] Konstantin Mishchenko and Peter Richtárik
A stochastic decoupling method for minimizing the sum of smooth and non-smooth functions
[arXiv] [code: SDM]

[103] Konstantin Mishchenko, Dmitry Kovalev, Egor Shulgin, Peter Richtárik and Yura Malitsky
Revisiting stochastic extragradient
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
NeurIPS 2019 Workshop on Smooth Games Optimization and Machine Learning
[arXiv]

[102] Filip Hanzely and Peter Richtárik
One method to rule them all: variance reduction for data, parameters and many new methods
[arXiv] [code: GJS + 17 algorithms]

[101] Eduard Gorbunov, Filip Hanzely and Peter Richtárik
A unified theory of SGD: variance reduction, sampling, quantization and coordinate descent
The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
[arXiv]

[100] Samuel Horváth, Chen-Yu Ho, Ľudovít Horváth, Atal Narayan Sahu, Marco Canini and Peter Richtárik
Natural compression for distributed deep learning
Mathematical and Scientific Machine Learning 2022 (MSML 2022)
Federated Learning Paper
[arXiv] [poster]

[99] Robert M. Gower, Dmitry Kovalev, Felix Lieder and Peter Richtárik
RSN: Randomized Subspace Newton
33rd Conference on Neural Information Processing Systems (NeurIPS 2019)
[arXiv] [poster]

[98] Aritra Dutta, Filip Hanzely, Jingwei Liang and Peter Richtárik
Best pair formulation & accelerated scheme for non-convex principal component pursuit
IEEE Transactions on Signal Processing 68:6128-6141, 2020
[arXiv]

[97] Nicolas Loizou and Peter Richtárik
Revisiting randomized gossip algorithms: general framework, convergence rates and novel block and accelerated protocols
IEEE Transactions on Information Theory 67(12):8300--8324, 2021
[arXiv]

[96] Nicolas Loizou and Peter Richtárik
Convergence analysis of inexact randomized iterative methods
SIAM Journal on Scientific Computing 42(6), A3979–A4016, 2020
[arXiv] [code: iBasic, iSDSA, iSGD, iSPM, iRBK, iRBCD]

[95] Amedeo Sapio, Marco Canini, Chen-Yu Ho, Jacob Nelson, Panos Kalnis, Changhoon Kim, Arvind Krishnamurthy, Masoud Moshref, Dan R. K. Ports and Peter Richtárik
Scaling distributed machine learning with in-network aggregation
The 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI '21 Fall)
[arXiv] [code: SwitchML]

[94] Samuel Horváth, Dmitry Kovalev, Konstantin Mishchenko, Peter Richtárik and Sebastian Stich
Stochastic distributed learning with gradient quantization and double variance reduction
Optimization Methods and Software 38(1), 2023
Federated Learning Paper
[arXiv] [code: DIANA, VR-DIANA, SVRG-DIANA]

[93] El Houcine Bergou, Eduard Gorbunov and Peter Richtárik
Stochastic three points method for unconstrained smooth minimization
SIAM Journal on Optimization 30(4):2726-2749, 2020
[arXiv] [code: STP]

[92] Adel Bibi, El Houcine Bergou, Ozan Sener, Bernard Ghanem and Peter Richtárik
A stochastic derivative-free optimization method with importance sampling
34th AAAI Conference on Artificial Intelligence (AAAI 2020)
[arXiv] [code: STP_IS]

[91] Konstantin Mishchenko, Filip Hanzely and Peter Richtárik
99% of distributed optimization is a waste of time: the issue and how to fix it
36th Conference on Uncertainty in Artificial Intelligence (UAI 2020)
Federated Learning Paper
[arXiv] [code: IBCD, ISAGA, ISGD, IASGD, ISEGA]

[90] Konstantin Mishchenko, Eduard Gorbunov, Martin Takáč and Peter Richtárik
Distributed learning with compressed gradient differences
Federated Learning Paper
[arXiv] [code: DIANA]

[89] Robert Mansel Gower, Nicolas Loizou, Xun Qian, Alibek Sailanbayev, Egor Shulgin and Peter Richtárik
SGD: general analysis and improved rates
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5200-5209, 2019
[arXiv] [poster] [code: SGD-AS]

[88] Dmitry Kovalev, Samuel Horváth and Peter Richtárik
Don’t jump through hoops and remove those loops: SVRG and Katyusha are better without the outer loop
31st International Conference on Algorithmic Learning Theory (ALT 2020)
[arXiv] [code: L-SVRG, L-Katyusha]

[87] Xun Qian, Zheng Qu and Peter Richtárik
SAGA with arbitrary sampling
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5190-5199, 2019
[arXiv] [poster] [code: SAGA-AS]

Prepared in 2018

[86] Lam M. Nguyen, Phuong Ha Nguyen, Peter Richtárik, Katya Scheinberg, Martin Takáč and Marten van Dijk
New convergence aspects of stochastic gradient algorithms
Journal of Machine Learning Research 20(176):1-49, 2019
[arXiv]

[85] Filip Hanzely, Jakub Konečný, Nicolas Loizou, Peter Richtárik and Dmitry Grishchenko
A privacy preserving randomized gossip algorithm via controlled noise insertion
NeurIPS Privacy Preserving Machine Learning Workshop, 2018
[arXiv] [poster]

[84] Konstantin Mishchenko and Peter Richtárik
A stochastic penalty model for convex and nonconvex optimization with big constraints
[arXiv]

[83] Nicolas Loizou, Michael G. Rabbat and Peter Richtárik
Provably accelerated randomized gossip algorithms
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019)
[arXiv] [code: AccGossip]

[82] Filip Hanzely and Peter Richtárik
Accelerated coordinate descent with arbitrary sampling and best rates for minibatches
22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019)
[arXiv] [poster] [code: ACD]

[81] Samuel Horváth and Peter Richtárik
Nonconvex variance reduced optimization with arbitrary sampling
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2781-2789, 2019
Horváth: Best DS3 Poster Award, Paris, 2018 (link)
[arXiv] [poster] [code: SVRG, SAGA, SARAH]

[80] Filip Hanzely, Konstantin Mishchenko and Peter Richtárik
SEGA: Variance reduction via gradient sketching
Advances in Neural Information Processing Systems 31:2082-2093, 2018
[arXiv] [poster] [slides] [code: SEGA] [video: YouTube]

[79] Filip Hanzely, Peter Richtárik and Lin Xiao
Accelerated Bregman proximal gradient methods for relatively smooth convex optimization
Computational Optimization and Applications 79:405–440, 2021
[arXiv] [code: ABPG, ABDA]

[78] Jakub Mareček, Peter Richtárik and Martin Takáč
Matrix completion under interval uncertainty: highlights
Lecture Notes in Computer Science, ECML-PKDD 2018
[pdf]

[77] Nicolas Loizou and Peter Richtárik
Accelerated gossip via stochastic heavy ball method
56th Annual Allerton Conference on Communication, Control, and Computing, 927-934, 2018
Press coverage [KAUST Discovery]
[arXiv] [poster]

[76] Adel Bibi, Alibek Sailanbayev, Bernard Ghanem, Robert Mansel Gower and Peter Richtárik
Improving SAGA via a probabilistic interpolation with gradient descent
[arXiv] [code: SAGD]

[75] Aritra Dutta, Filip Hanzely and Peter Richtárik
A nonconvex projection method for robust PCA
33rd AAAI Conference on Artificial Intelligence (AAAI 2019)
[arXiv]

[74] Robert M. Gower, Peter Richtárik and Francis Bach
Stochastic quasi-gradient methods: variance reduction via Jacobian sketching
Mathematical Programming 188:135–192, 2021
[arXiv] [slides] [code: JacSketch] [video: YouTube]

[73] Aritra Dutta, Xin Li and Peter Richtárik
Weighted low-rank approximation of matrices and background modeling
[arXiv]

[72] Filip Hanzely and Peter Richtárik
Fastest rates for stochastic mirror descent methods
Computational Optimization and Applications 79:717–766, 2021
[arXiv]

[71] Lam M. Nguyen, Phuong Ha Nguyen, Marten van Dijk, Peter Richtárik, Katya Scheinberg and Martin Takáč
SGD and Hogwild! convergence without the bounded gradients assumption
Proceedings of The 35th International Conference on Machine Learning, PMLR 80:3750-3758, 2018
[arXiv]

[70] Robert M. Gower, Filip Hanzely, Peter Richtárik and Sebastian Stich
Accelerated stochastic matrix inversion: general theory and speeding up BFGS rules for faster second-order optimization
Advances in Neural Information Processing Systems 31:1619-1629, 2018
[arXiv] [poster] [code: ABFGS]

[69] Nikita Doikov and Peter Richtárik
Randomized block cubic Newton method
Proceedings of The 35th International Conference on Machine Learning, PMLR 80:1290-1298, 2018
Doikov: Best Talk Award, "Control, Information and Optimization", Voronovo, Russia, 2018
[arXiv] [bib] [code: RBCN]

[68] Dmitry Kovalev, Eduard Gorbunov, Elnur Gasanov and Peter Richtárik
Stochastic spectral and conjugate descent methods
32nd Conference on Neural Information Processing Systems (NeurIPS 2018)
[arXiv] [poster] [code: SSD, SconD, SSCD, mSSCD, iSconD, iSSD]

[67] Radoslav Harman, Lenka Filová and Peter Richtárik
A randomized exchange algorithm for computing optimal approximate designs of experiments
Journal of the American Statistical Association, 2020
[arXiv] [code: REX, OD_REX, MVEE_REX]

[66] Ion Necoara, Andrei Patrascu and Peter Richtárik
Randomized projection methods for convex feasibility problems: conditioning and convergence rates
SIAM Journal on Optimization 29(4):2814–2852, 2019
[arXiv] [slides]

Prepared in 2017

[65] Nicolas Loizou and Peter Richtárik
Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods
Computational Optimization and Applications 77(3):653-710, 2020
[arXiv]

[64] Aritra Dutta and Peter Richtárik
Online and batch supervised background estimation via L1 regression
IEEE Winter Conference on Applications of Computer Vision, 2019
[arXiv]

[63] Nicolas Loizou and Peter Richtárik
Linearly convergent stochastic heavy ball method for minimizing generalization error
NIPS Workshop on Optimization for Machine Learning, 2017
[arXiv] [poster]

[62] Dominik Csiba and Peter Richtárik
Global convergence of arbitrary-block gradient methods for generalized Polyak-Łojasiewicz functions
[arXiv]

[61] Ademir Alves Ribeiro and Peter Richtárik
The complexity of primal-dual fixed point methods for ridge regression
Linear Algebra and its Applications 556:342-372, 2018
[arXiv]

[60] Matthias J. Ehrhardt, Pawel Markiewicz, Antonin Chambolle, Peter Richtárik, Jonathan Schott and Carola-Bibiane Schoenlieb
Faster PET reconstruction with a stochastic primal-dual hybrid gradient method
Proceedings of SPIE, Wavelets and Sparsity XVII, Volume 10394, pages 1039410-1 - 1039410-11, 2017
[pdf] [poster] [code: SPDHG] [video: YouTube]

[59] Aritra Dutta, Xin Li and Peter Richtárik
A batch-incremental video background estimation model using weighted low-rank approximation of matrices
IEEE International Conference on Computer Vision (ICCV) Workshops, 2017
[arXiv] [code: inWLR]

[58] Filip Hanzely, Jakub Konečný, Nicolas Loizou, Peter Richtárik and Dmitry Grishchenko
Privacy preserving randomized gossip algorithms
[arXiv] [slides]

[57] Antonin Chambolle, Matthias J. Ehrhardt, Peter Richtárik and Carola-Bibiane Schoenlieb
Stochastic primal-dual hybrid gradient algorithm with arbitrary sampling and imaging applications
SIAM Journal on Optimization 28(4):2783-2808, 2018
[arXiv] [slides] [poster] [code: SPDHG] [video: YouTube]

[56] Peter Richtárik and Martin Takáč
Stochastic reformulations of linear systems: algorithms and convergence theory
SIAM Journal on Matrix Analysis and Applications 41(2):487–524, 2020
[arXiv] [slides] [code: basic, parallel and accelerated methods]

[55] Mojmír Mutný and Peter Richtárik
Parallel stochastic Newton method
Journal of Computational Mathematics 36(3):404-425, 2018
[arXiv] [code: PSNM]

Prepared in 2016

[54] Robert M. Gower and Peter Richtárik
Linearly convergent randomized iterative methods for computing the pseudoinverse
[arXiv]

[53] Jakub Konečný and Peter Richtárik
Randomized distributed mean estimation: accuracy vs communication
Frontiers in Applied Mathematics and Statistics 2018
Federated Learning Paper
[arXiv]

[52] Jakub Konečný, H. Brendan McMahan, Felix Yu, Peter Richtárik, Ananda Theertha Suresh and Dave Bacon
Federated learning: strategies for improving communication efficiency
NIPS Private Multi-Party Machine Learning Workshop, 2016
Federated Learning Paper link [selected press coverage: The Verge - Quartz - Vice - CBR - Android Authority]
[arXiv] [bib] [poster]

[51] Jakub Konečný, H. Brendan McMahan, Daniel Ramage and Peter Richtárik
Federated optimization: distributed machine learning for on-device intelligence
Federated Learning Paper link [selected press coverage: The Verge - Quartz - Vice - CBR - Android Authority]
[arXiv] [bib]

[50] Nicolas Loizou and Peter Richtárik
A new perspective on randomized gossip algorithms
IEEE Global Conference on Signal and Information Processing (GlobalSIP), 440-444, 2016
[arXiv] [bib]

[49] Sashank J. Reddi, Jakub Konečný, Peter Richtárik, Barnabás Póczos and Alex Smola
AIDE: fast and communication efficient distributed optimization
[arXiv] [poster]

[48] Dominik Csiba and Peter Richtárik
Coordinate descent face-off: primal or dual?
Proceedings of Algorithmic Learning Theory, PMLR 83:246-267, 2018
[arXiv] [bib]

[47] Olivier Fercoq and Peter Richtárik
Optimization in high dimensions via accelerated, parallel and proximal coordinate descent
SIAM Review 58(4):739-771, 2016
SIAM SIGEST Award
[arXiv] [bib]

[46] Robert M. Gower, Donald Goldfarb and Peter Richtárik
Stochastic block BFGS: squeezing more curvature out of data
Proceedings of the 33rd International Conference on Machine Learning, PMLR 48:1869-1878, 2016
[arXiv] [bib] [poster]

[45] Dominik Csiba and Peter Richtárik
Importance sampling for minibatches
Journal of Machine Learning Research 19(27):1-21, 2018
[arXiv] [bib]

[44] Robert M. Gower and Peter Richtárik
Randomized quasi-Newton updates are linearly convergent matrix inversion algorithms
SIAM Journal on Matrix Analysis and Applications 38(4):1380-1409, 2017
Most Downloaded SIMAX Paper (6th place: 2018)
[arXiv] [code: SIMI, RBFGS, AdaRBFGS, ...]

Prepared in 2015

[43] Zeyuan Allen-Zhu, Zheng Qu, Peter Richtárik and Yang Yuan
Even faster accelerated coordinate descent using non-uniform sampling
Proceedings of the 33rd International Conference on Machine Learning, PMLR 48:1110-1119, 2016
[arXiv] [bib] [code: NU_ACDM]

[42] Robert M. Gower and Peter Richtárik
Stochastic dual ascent for solving linear systems
[arXiv] [code: SDA] [video: YouTube]

[41] Chenxin Ma, Jakub Konečný, Martin Jaggi, Virginia Smith, Michael I. Jordan, Peter Richtárik and Martin Takáč
Distributed optimization with arbitrary local solvers
Optimization Methods and Software 32(4):813-848, 2017
Most-Read Paper, Optimization Methods and Software, 2017
[arXiv] [code: CoCoA+]

[40] Martin Takáč, Peter Richtárik and Nathan Srebro
Distributed mini-batch SDCA
To appear in: Journal of Machine Learning Research
[arXiv]

[39] Robert M. Gower and Peter Richtárik
Randomized iterative methods for linear systems
SIAM Journal on Matrix Analysis and Applications 36(4):1660-1690, 2015
Most Downloaded SIMAX Paper (1st place: 2017-2020)
Gower: 18th IMA Leslie Fox Prize (2nd Prize), 2017 link
[arXiv] [slides]

[38] Dominik Csiba and Peter Richtárik
Primal method for ERM with flexible mini-batching schemes and non-convex losses
[arXiv] [code: dfSDCA]

[37] Jakub Konečný, Jie Liu, Peter Richtárik and Martin Takáč
Mini-batch semi-stochastic gradient descent in the proximal setting
IEEE Journal of Selected Topics in Signal Processing 10(2): 242-255, 2016
[arXiv] [code: mS2GD]

[36] Rachael Tappenden, Martin Takáč and Peter Richtárik
On the complexity of parallel coordinate descent
Optimization Methods and Software 33(2):372-395, 2018
[arXiv]

[35] Dominik Csiba, Zheng Qu and Peter Richtárik
Stochastic dual coordinate ascent with adaptive probabilities
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:674-683, 2015
Csiba: Best Contribution Award (2nd Place), Optimization and Big Data 2015
Implemented in TensorFlow
[arXiv] [bib] [poster] [code: AdaSDCA and AdaSDCA+]

[34] Chenxin Ma, Virginia Smith, Martin Jaggi, Michael I. Jordan, Peter Richtárik and Martin Takáč
Adding vs. averaging in distributed primal-dual optimization
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1973-1982, 2015
Smith: 2015 MLconf Industry Impact Student Research Award link
CoCoA+ is now the default linear optimizer in TensorFlow link
[arXiv] [bib] [poster] [code: CoCoA+]

[33] Zheng Qu, Peter Richtárik, Martin Takáč and Olivier Fercoq
SDNA: Stochastic dual Newton ascent for empirical risk minimization
Proceedings of the 33rd International Conference on Machine Learning, PMLR 48:1823-1832, 2016
[arXiv] [bib] [slides] [poster] [code: SDNA]

Prepared in 2014

[32] Zheng Qu and Peter Richtárik
Coordinate descent with arbitrary sampling II: expected separable overapproximation
Optimization Methods and Software 31(5):858-884, 2016
[arXiv]

[31] Zheng Qu and Peter Richtárik
Coordinate descent with arbitrary sampling I: algorithms and complexity
Optimization Methods and Software 31(5):829-857, 2016
[arXiv] [code: ALPHA]

[30] Jakub Konečný, Zheng Qu and Peter Richtárik
Semi-stochastic coordinate descent
Optimization Methods and Software 32(5):993-1005, 2017
[arXiv] [code: S2CD]

[29] Zheng Qu, Peter Richtárik and Tong Zhang
Quartz: Randomized dual coordinate ascent with arbitrary sampling
Advances in Neural Information Processing Systems 28:865-873, 2015
[arXiv] [slides] [code: QUARTZ] [video: YouTube]

[28] Jakub Konečný, Jie Liu, Peter Richtárik and Martin Takáč
mS2GD: Mini-batch semi-stochastic gradient descent in the proximal setting
NIPS Workshop on Optimization for Machine Learning, 2014
[arXiv] [poster] [code: mS2GD]

[27] Jakub Konečný, Zheng Qu and Peter Richtárik
S2CD: Semi-stochastic coordinate descent
NIPS Workshop on Optimization for Machine Learning, 2014
[pdf] [poster] [code: S2CD]

[26] Jakub Konečný and Peter Richtárik
Simple complexity analysis of simplified direct search
[arXiv] [slides in Slovak] [code: SDS]

[25] Jakub Mareček, Peter Richtárik and Martin Takáč
Distributed block coordinate descent for minimizing partially separable functions
Numerical Analysis and Optimization, Springer Proceedings in Math. and Statistics 134:261-288, 2015
[arXiv]

[24] Olivier Fercoq, Zheng Qu, Peter Richtárik and Martin Takáč
Fast distributed coordinate descent for minimizing non-strongly convex losses
2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2014
[arXiv] [poster] [code: Hydra^2]

[23] Duncan Forgan and Peter Richtárik
On optimal solutions to planetesimal growth models
Technical Report ERGO 14-002, 2014
[pdf]

[22] Jakub Mareček, Peter Richtárik and Martin Takáč
Matrix completion under interval uncertainty
European Journal of Operational Research 256(1):35-42, 2017
[arXiv] [code: MACO]

Prepared in 2013

[21] Olivier Fercoq and Peter Richtárik
Accelerated, Parallel and PROXimal coordinate descent
SIAM Journal on Optimization 25(4):1997-2023, 2015
Fercoq: 17th IMA Leslie Fox Prize (Second Prize), 2015
2nd Most Downloaded SIOPT Paper (Aug 2016 - now)
[arXiv] [poster] [code: APPROX] [video: YouTube]

[20] Jakub Konečný and Peter Richtárik
Semi-stochastic gradient descent methods
Frontiers in Applied Mathematics and Statistics 3:9, 2017
[arXiv] [poster] [slides] [code: S2GD and S2GD+]

[19] Peter Richtárik and Martin Takáč
On optimal probabilities in stochastic coordinate descent methods
Optimization Letters 10(6):1233-1243, 2016
[arXiv] [poster] [code: NSync]

[18] Peter Richtárik and Martin Takáč
Distributed coordinate descent method for learning with big data
Journal of Machine Learning Research 17(75):1-25, 2016
[arXiv] [poster] [code: Hydra]

[17] Olivier Fercoq and Peter Richtárik
Smooth minimization of nonsmooth functions with parallel coordinate descent methods
Springer Proceedings in Mathematics and Statistics 279:57-96, 2019
[arXiv] [code: SPCDM]

[16] Rachael Tappenden, Peter Richtárik and Burak Buke
Separable approximations and decomposition methods for the augmented Lagrangian
Optimization Methods and Software 30(3):643-668, 2015
[arXiv]

[15] Rachael Tappenden, Peter Richtárik and Jacek Gondzio
Inexact coordinate descent: complexity and preconditioning
Journal of Optimization Theory and Applications 170(1):144-176, 2016
[arXiv] [poster] [code: ICD]

[14] Martin Takáč, Selin Damla Ahipasaoglu, Ngai-Man Cheung and Peter Richtárik
TOP-SPIN: TOPic discovery via Sparse Principal component INterference
Springer Proceedings in Mathematics and Statistics 279:157-180, 2019
[arXiv] [poster] [code: TOP-SPIN]

[13] Martin Takáč, Avleen Bijral, Peter Richtárik and Nathan Srebro
Mini-batch primal and dual methods for SVMs
Proceedings of the 30th International Conference on Machine Learning, 2013
[arXiv] [poster] [code: minibatch SDCA and minibatch Pegasos]

Prepared in 2012 or earlier

[12] Peter Richtárik, Majid Jahani, Martin Takáč and Selin Damla Ahipasaoglu
Alternating maximization: unifying framework for 8 sparse PCA formulations and efficient parallel codes
Optimization and Engineering 22:1493--1519, 2021
[arXiv] [code: 24am]

[11] William Hulme, Peter Richtárik, Lynne McGuire and Alison Green
Optimal diagnostic tests for sporadic Creutzfeldt-Jakob disease based on SVM classification of RT-QuIC data
Technical Report, 2012
[arXiv]

[10] Peter Richtárik and Martin Takáč
Parallel coordinate descent methods for big data optimization
Mathematical Programming 156(1):433-484, 2016
Takáč: 16th IMA Leslie Fox Prize (2nd Prize), 2013 link
#1 Top Trending Article in Mathematical Programming Ser A and B (2017) link
[arXiv] [slides] [code: PCDM, AC/DC] [video: YouTube]

[9] Peter Richtárik and Martin Takáč
Efficient serial and parallel coordinate descent methods for huge-scale truss topology design
Operations Research Proceedings 2011:27-32, Springer-Verlag, 2012
[Optimization Online] [poster]

[8] Peter Richtárik and Martin Takáč
Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function
Mathematical Programming 144(2):1-38, 2014
Best Student Paper (runner-up), INFORMS Computing Society, 2012
[arXiv] [slides]

[7] Peter Richtárik and Martin Takáč
Efficiency of randomized coordinate descent methods on minimization problems with a composite objective function
Proceedings of Signal Processing with Adaptive Sparse Structured Representations, 2011
[pdf]

[6] Peter Richtárik
Finding sparse approximations to extreme eigenvectors: generalized power method for sparse PCA and extensions
Proceedings of Signal Processing with Adaptive Sparse Structured Representations, 2011
[pdf]

[5] Peter Richtárik
Approximate level method for nonsmooth convex minimization
Journal of Optimization Theory and Applications 152(2):334–350, 2012
[Optimization Online]

[4] Michel Journée, Yurii Nesterov, Peter Richtárik and Rodolphe Sepulchre
Generalized power method for sparse principal component analysis
Journal of Machine Learning Research 11:517–553, 2010
[arXiv] [slides] [poster] [code: GPower]

[3] Peter Richtárik
Improved algorithms for convex minimization in relative scale
SIAM Journal on Optimization 21(3):1141–1167, 2011
[pdf] [slides]

[2] Peter Richtárik
Simultaneously solving seven optimization problems in relative scale
Technical Report, 2009
[Optimization Online]

[1] Peter Richtárik
Some algorithms for large-scale convex and linear minimization in relative scale
PhD Dissertation, School of Operations Research and Information Engineering, Cornell University, 2007