Resource-Aware Decentralized Federated Learning over Heterogeneous Networks
A recent emphasis of distributed learning research has been on federated learning (FL), in which model training is conducted by the data-collecting devices. In traditional FL algorithms, trained models at the edge are periodically sent to a central server for aggregation, utilizing a star topology as the underlying communication graph. However, assuming access to a central coordinator is not always practical, e.g., in ad hoc wireless network settings, motivating efforts to fully decentralize FL. Decentralized federated learning (DFL) thus captures FL settings in which both (i) model updates and (ii) model aggregations are carried out exclusively by the clients, without a central server. The challenges inherent to the distributed nature of FL training, namely data heterogeneity and resource heterogeneity, become even more pronounced in DFL because there is no central server to act as a coordinator. In this thesis, we present two algorithms for resource-aware DFL that achieve a desired overall performance across the clients in a shorter amount of time than conventional DFL algorithms, which do not factor the resource availability of clients into their designs.
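For reference, the sketch below illustrates the basic building block that replaces server-side aggregation in DFL: a gossip-style mixing step in which each client averages models with its graph neighbors. The mixing matrix `W` and the example weights are placeholders chosen for illustration, not the specific aggregation schemes analyzed in this thesis.

```python
import numpy as np

def decentralized_aggregation(models, W):
    """One gossip-style aggregation round over a network graph.

    models: (n_clients, dim) array of local model parameters.
    W:      (n_clients, n_clients) mixing matrix; W[i, j] > 0 only if
            clients i and j are neighbors, and rows sum to 1.
    Each client replaces its model with a weighted average of its
    neighbors' models -- no central server is involved.
    """
    return W @ models

# Example with 3 clients on a fully connected graph (illustrative weights only).
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
models = np.random.randn(3, 10)   # each client's 10-dimensional model
models = decentralized_aggregation(models, W)
```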
In the first project, we propose EF-HC, a novel methodology for distributed model aggregations via asynchronous, event-triggered consensus iterations over the network graph topology. We consider personalized/heterogeneous communication event thresholds at each device that weigh the change in local model parameters against the available local resources when deciding whether an aggregation would be beneficial enough to justify the communication delay it incurs on the system. In the second project, we propose Decentralized Sporadic Federated Learning (DSpodFL), a DFL methodology built on a generalized notion of sporadicity in both the local gradient and aggregation processes. DSpodFL subsumes many existing decentralized optimization methods under a unified algorithmic framework by modeling the per-iteration (i) occurrence of gradient descent at each client and (ii) exchange of models between client pairs as arbitrary indicator random variables, thus capturing heterogeneous and time-varying computation/communication scenarios. We analytically characterize the convergence behavior of both algorithms for strongly convex models under both constant and diminishing learning rates, with mild assumptions on the communication graph connectivity, data heterogeneity across clients, and gradient noise; for DSpodFL, we do the same for non-convex models as well. Our numerical experiments demonstrate that both EF-HC and DSpodFL consistently achieve faster training than baseline methods across a variety of system settings.
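To make the notion of sporadicity concrete, the following is a minimal sketch of one iteration in the spirit of DSpodFL, assuming (purely for illustration) that the computation and communication indicators are Bernoulli draws with probabilities `p_comp` and `p_comm`; in EF-HC, the communication indicators would instead be set by each device's event-triggered threshold, which weighs its local parameter change against available resources. The update form, weights, and sampling probabilities here are assumptions made for the sketch, not the exact formulations analyzed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def sporadic_dfl_step(x, W, grads, alpha, p_comp, p_comm):
    """One sporadic decentralized update (illustrative form).

    x:      (n, dim) current local models.
    W:      (n, n) mixing weights over the communication graph.
    grads:  function mapping x to an (n, dim) array of local gradients.
    alpha:  learning rate (constant or diminishing).
    p_comp: per-client probability of performing a local gradient step.
    p_comm: per-link probability of exchanging models this iteration.
    """
    n = x.shape[0]
    d = rng.random(n) < p_comp              # gradient indicators, one per client
    b = rng.random((n, n)) < p_comm         # exchange indicators, one per link
    b = np.triu(b, 1)
    b = b | b.T                             # symmetric: both endpoints exchange
    g = grads(x)
    x_new = x.copy()
    for i in range(n):
        # sporadic aggregation: mix only with neighbors that communicated
        for j in range(n):
            if b[i, j] and W[i, j] > 0:
                x_new[i] += W[i, j] * (x[j] - x[i])
        # sporadic computation: gradient step only if the indicator fired
        if d[i]:
            x_new[i] -= alpha * g[i]
    return x_new
```

Setting `p_comp = p_comm = 1` recovers a standard decentralized gradient step, which is one way to see how a framework of this form can subsume conventional decentralized optimization methods as special cases.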
Degree Type
- Master of Science
Department
- Electrical and Computer Engineering
Campus location
- West Lafayette