Communication-efficient distributed cubic Newton with compressed lazy Hessian.
Zhang, Zhen; Che, Keqin; Yang, Shaofu; Xu, Wenying.
Affiliation
  • Zhang Z; School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, PR China. Electronic address: zhang_zhen@seu.edu.cn.
  • Che K; School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, PR China. Electronic address: kqche@seu.edu.cn.
  • Yang S; School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, PR China. Electronic address: sfyang@seu.edu.cn.
  • Xu W; School of Mathematics, Southeast University, Nanjing, Jiangsu, PR China. Electronic address: wyxu@seu.edu.cn.
Neural Netw ; 174: 106212, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38479185
ABSTRACT
Recently, second-order distributed optimization algorithms have become a research hotspot in distributed learning, owing to their faster convergence than first-order algorithms. However, second-order algorithms typically suffer from a serious communication bottleneck. To overcome this challenge, we propose communication-efficient second-order distributed optimization algorithms in the parameter-server framework by combining cubic Newton methods with compressed lazy Hessians. Specifically, our algorithms require each worker to communicate compressed Hessians with the server only at certain iterations, which saves both communication bits and communication rounds. For non-convex problems, we theoretically prove that our algorithms reduce the communication cost compared with state-of-the-art second-order algorithms, while maintaining the same iteration complexity order O(ϵ^{-3/2}) as centralized cubic Newton methods. By further applying a gradient regularization technique, our algorithms achieve global convergence for convex problems. Moreover, for strongly convex problems, our algorithms achieve a local superlinear convergence rate without any requirement on the initial conditions. Finally, numerical experiments demonstrate the high efficiency of the proposed algorithms.
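To make the abstract's idea concrete, below is a minimal NumPy sketch of a parameter-server cubic Newton loop with lazily communicated, compressed Hessians. It is an illustration under our own assumptions, not the paper's algorithm: the names (top_k_compress, cubic_subproblem, distributed_cubic_newton), the top-k compressor, the Hessian-communication period p, and the cubic regularization constant M are all hypothetical choices for demonstration.

```python
# Illustrative sketch only (assumed names and parameters, not the authors' code).
import numpy as np

def top_k_compress(mat, k):
    """Keep the k largest-magnitude entries of a matrix (one common compressor)."""
    flat = mat.flatten()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    out = np.zeros_like(flat)
    out[idx] = flat[idx]
    return out.reshape(mat.shape)

def cubic_subproblem(g, H, M):
    """Approximately solve min_s g's + 0.5 s'Hs + (M/6)||s||^3 by bisection on r = ||s||."""
    lam_min = np.linalg.eigvalsh(H)[0]
    r_lo = max(0.0, -2.0 * lam_min / M) + 1e-12
    def step(r):
        # Optimality condition: (H + (M/2) r I) s = -g with r = ||s||.
        return -np.linalg.solve(H + 0.5 * M * r * np.eye(len(g)), g)
    r_hi = r_lo + 1.0
    while np.linalg.norm(step(r_hi)) > r_hi:   # grow the bracket
        r_hi *= 2.0
    for _ in range(100):                       # bisection on the radius
        r = 0.5 * (r_lo + r_hi)
        if np.linalg.norm(step(r)) > r:
            r_lo = r
        else:
            r_hi = r
    return step(r_hi)

def distributed_cubic_newton(grad_fns, hess_fns, x0, M=10.0, T=50, p=5, k=None):
    """Workers send gradients every round; compressed Hessian corrections only every p rounds."""
    d = len(x0)
    k = k if k is not None else d * d          # default: keep all entries (no compression)
    x = x0.copy()
    H_server = np.eye(d)                       # server's lazily updated Hessian estimate
    for t in range(T):
        g = np.mean([gf(x) for gf in grad_fns], axis=0)   # cheap per-round traffic
        if t % p == 0:                          # "lazy" round: compressed Hessian update
            H_local = np.mean([hf(x) for hf in hess_fns], axis=0)
            H_server += top_k_compress(H_local - H_server, k)
        x = x + cubic_subproblem(g, H_server, M)
    return x
```

For example, passing per-worker gradient and Hessian callables of local quadratic losses to distributed_cubic_newton drives x toward the global minimizer while Hessian matrices cross the network only every p-th iteration, and even then only k of their entries; the gradient regularization used in the paper for the convex case is not reproduced here.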
Subject(s)
Keywords

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Algorithms / Learning Language: En Journal: Neural Netw Journal subject: NEUROLOGY Year: 2024 Document type: Article
