Purdue University Graduate School

Enhancing the Reliability of Federated Inference: From Bias Mitigation to Model Calibration

thesis
posted on 2025-02-10, 15:56 authored by Yun-Wei Chu

The increasing prevalence of decentralized data in real-world applications, characterized by natural distributions and clustered patterns, has amplified the need for distributed machine learning techniques. This thesis delves into the challenges and opportunities of developing innovative distributed solutions, focusing on real-world applications in social network scenarios, such as student modeling and natural language processing.


This thesis first investigates the limitations of traditional centralized machine learning, which often generalizes poorly to underrepresented patterns due to biases in data availability. Motivated by Federated Learning (FL), we propose a personalized distributed machine learning framework that models student behavior and generates fair predictions across distinct demographic attributes, addressing biases inherent in education data. Next, we tackle communication constraints, a fundamental challenge in distributed systems, particularly when applying large language models to natural language tasks. We focus on federated multilingual translation and develop an efficient communication method that selectively exchanges only crucial parameters during FL rounds, significantly reducing overhead while maintaining performance.
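This page does not spell out how "crucial" parameters are identified, so the following is only a minimal sketch of the selective-exchange idea under an assumed magnitude-based top-k rule; the function names (select_topk_updates, apply_sparse_update), the sparsification criterion, and the toy data are illustrative assumptions, not the thesis's actual algorithm.

    import numpy as np

    def select_topk_updates(old_params, new_params, k_fraction=0.1):
        """Keep only the fraction of parameter updates with the largest
        magnitude of change; everything else is omitted from the upload."""
        delta = new_params - old_params
        k = max(1, int(k_fraction * delta.size))
        # indices of the k largest absolute changes
        top_idx = np.argpartition(np.abs(delta), -k)[-k:]
        # sparse update sent to the server as index -> value pairs
        return {int(i): float(delta[i]) for i in top_idx}

    def apply_sparse_update(global_params, sparse_update):
        """Server side: apply a client's sparse update to the global model."""
        updated = global_params.copy()
        for i, v in sparse_update.items():
            updated[i] += v
        return updated

    # toy example: a flattened parameter vector before and after one local round
    rng = np.random.default_rng(0)
    w_old = rng.normal(size=1000)
    w_new = w_old + rng.normal(scale=0.01, size=1000)

    update = select_topk_updates(w_old, w_new, k_fraction=0.05)
    w_global = apply_sparse_update(w_old, update)
    print(f"sent {len(update)} of {w_old.size} parameters")

In this reading, each client would sparsify its flattened model update before upload, and the server would aggregate only the sparse contributions it receives, which is what reduces per-round communication.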


Building upon these applications, we explore two broader challenges in FL that remain under-explored: model initialization and reliability. Unlike centralized approaches, FL often suffers from limited performance gains and imbalanced predictions across distributed participants when initialized with pre-trained models, raising fairness concerns. To address this, we propose a meta-learning-based pre-training strategy that delivers robust and fair initialization across heterogeneous FL tasks. Finally, we examine model calibration in FL, which is essential for confidence alignment and decision reliability. We introduce a similarity-based calibration method that adapts seamlessly to diverse FL algorithms, enhancing both performance and calibration.
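The abstract names the similarity-based calibration method without detailing it, so the sketch below illustrates only one plausible reading, assuming per-client temperature scaling (a standard post-hoc calibration technique) combined with cosine-similarity weighting to assign a temperature to a client that lacks labeled validation data; all function names, the embedding representation, and the similarity choice are assumptions rather than the thesis's method.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def nll_with_temperature(T, logits, labels):
        """Negative log-likelihood of labels under temperature-scaled softmax."""
        z = logits / T
        z -= z.max(axis=1, keepdims=True)  # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(labels)), labels].mean()

    def fit_temperature(logits, labels):
        """Fit a single temperature on held-out logits (temperature scaling)."""
        res = minimize_scalar(nll_with_temperature, bounds=(0.05, 10.0),
                              args=(logits, labels), method="bounded")
        return res.x

    def similarity_weighted_temperature(client_embeddings, client_temps, query_embedding):
        """Assign a temperature to a client without labeled data by averaging
        other clients' fitted temperatures, weighted by cosine similarity."""
        sims = np.array([
            e @ query_embedding / (np.linalg.norm(e) * np.linalg.norm(query_embedding))
            for e in client_embeddings
        ])
        weights = np.exp(sims) / np.exp(sims).sum()  # softmax over similarities
        return float(weights @ np.array(client_temps))

    # toy example: three clients with fitted temperatures and feature embeddings
    embs = [np.array([1.0, 0.0]), np.array([0.7, 0.7]), np.array([0.0, 1.0])]
    temps = [1.3, 1.1, 0.9]
    print(similarity_weighted_temperature(embs, temps, np.array([0.9, 0.1])))

The appeal of this kind of scheme is that it is agnostic to the underlying FL algorithm: calibration is applied after aggregation, so it can sit on top of different training pipelines.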


In conclusion, this thesis highlights the importance of fairness, efficiency, and reliability in distributed machine learning, offering novel solutions to key challenges and advancing its applicability across various domains.

History

Degree Type

  • Doctor of Philosophy

Department

  • Electrical and Computer Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Christopher G. Brinton

Additional Committee Member 2

David J. Love

Additional Committee Member 3

Jing Gao

Additional Committee Member 4

Seyyedali Hosseinalipour
