Purdue University Graduate School
thesis_saa.pdf (5.2 MB)

Efficient Decentralized Learning Methods for Deep Neural Networks

Download (5.2 MB)
posted on 2024-03-26, 20:07 authored by Sai Aparna AketiSai Aparna Aketi

Decentralized learning is the key to training deep neural networks (DNNs) over large distributed datasets generated at different devices and locations, without the need for a central server. They enable next-generation applications that require DNNs to interact and learn from their environment continuously. The practical implementation of decentralized algorithms brings about its unique set of challenges. In particular, these algorithms should be (a) compatible with time-varying graph structures, (b) compute and communication efficient, and (c) resilient to heterogeneous data distributions. The objective of this thesis is to enable efficient decentralized learning in deep neural networks addressing the abovementioned challenges. Towards this, firstly a communication-efficient decentralized algorithm (Sparse-Push) that supports directed and time-varying graphs with error-compensated communication compression is proposed. Second, a low-precision decentralized training that aims to reduce memory requirements and computational complexity is proposed. Here, we design ”Range-EvoNorm” as the normalization activation layer which is better suited for low-precision decentralized training. Finally, addressing the problem of data heterogeneity, three impactful advancements namely Neighborhood Gradient Mean (NGM), Global Update Tracking (GUT), and Cross-feature Contrastive Loss (CCL) are proposed. NGM utilizes extra communication rounds to obtain cross-agent gradient information whereas GUT tracks global update information with no communication overhead, improving the performance on heterogeneous data. CCL explores an orthogonal direction of using a data-free knowledge distillation approach to handle heterogeneous data in decentralized setups. All the algorithms are evaluated on computer vision tasks using standard image-classification datasets. We conclude this dissertation by presenting a summary of the proposed decentralized methods and their trade-offs for heterogeneous data distributions. Overall, the methods proposed in this thesis address the critical limitations of training deep neural networks in a decentralized setup and advance the state-of-the-art in this domain.


Degree Type

  • Doctor of Philosophy


  • Electrical and Computer Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Kaushik Roy

Additional Committee Member 2

Anand Raghunathan

Additional Committee Member 3

Sumeet K. Gupta

Additional Committee Member 4

Vijay Raghunathan

Usage metrics



    Ref. manager