Data-based Explanations of Random Forest using Machine Unlearning

Surve, Tanmay Laxman

doi:10.25394/PGS.24712758.v1

thesis

posted on 2023-12-03, 07:23, authored by Tanmay Laxman Surve

Tree-based machine learning models, such as decision trees and random forests, are among the most widely used machine learning models, primarily because of their predictive power in supervised learning tasks and their ease of interpretation. Despite their popularity and power, these models have been found to produce unexpected or discriminatory behavior. Given their success on most tasks, it is of interest to identify the root causes of this behavior. However, little work has addressed understanding and debugging tree-based classifiers in the context of fairness. We introduce FairDebugger, a system that uses recent advances in machine unlearning research to determine which training data subsets are responsible for model unfairness. Given a tree-based model learned on a training dataset, FairDebugger identifies the top-k training data subsets responsible for model unfairness, or bias, by measuring the change in model parameters when parts of the underlying training data are removed. We describe the architecture of FairDebugger and walk through real-world use cases to demonstrate how it detects these subsets and produces the corresponding explanations.
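The removal-based idea in the abstract can be illustrated with a minimal sketch. It is not FairDebugger itself, and it makes several assumptions that do not come from the thesis: a one-split decision stump stands in for the random forest, retraining from scratch stands in for machine unlearning, the change is measured on a fairness metric (statistical parity difference) rather than on model parameters, and all data, feature names, and helper functions are hypothetical.

```python
# Toy stand-ins throughout: a one-split "stump" replaces the random forest, and
# retraining from scratch replaces machine unlearning. All data are made up.

def make_rows():
    """Each row is (skill, proxy, group, label); proxy equals group by construction."""
    spec = [  # (skill, group, label, count)
        (1, 1, 1, 4), (0, 1, 1, 4), (0, 1, 0, 1),
        (1, 0, 1, 3), (0, 0, 0, 4), (1, 0, 0, 1),
    ]
    return [(s, g, g, y) for s, g, y, n in spec for _ in range(n)]

def train_stump(rows):
    """Pick the (feature, polarity) pair that classifies the most training rows correctly."""
    best, best_correct = None, -1
    for j in (0, 1):  # feature 0 is skill, feature 1 is the proxy
        for polarity in (1, 0):
            correct = sum((r[j] if polarity else 1 - r[j]) == r[3] for r in rows)
            if correct > best_correct:
                best, best_correct = (j, polarity), correct
    return best

def predict(model, row):
    j, polarity = model
    return row[j] if polarity else 1 - row[j]

def parity_gap(model, rows):
    """Statistical parity difference: P(pred=1 | group=1) - P(pred=1 | group=0)."""
    def rate(g):
        members = [r for r in rows if r[2] == g]
        return sum(predict(model, r) for r in members) / len(members)
    return rate(1) - rate(0)

def rank_subsets(rows, k=2):
    """Rank candidate subsets by how much removing them shrinks |parity gap|."""
    baseline = abs(parity_gap(train_stump(rows), rows))
    scores = []
    for g in (1, 0):
        for s in (1, 0):
            # Drop one (group, skill) subset, retrain, and evaluate on the full set.
            kept = [r for r in rows if not (r[2] == g and r[0] == s)]
            gap = abs(parity_gap(train_stump(kept), rows))
            scores.append((f"group={g}, skill={s}", baseline - gap))
    return sorted(scores, key=lambda t: t[1], reverse=True)[:k]

rows = make_rows()
top = rank_subsets(rows)
for name, reduction in top:
    print(f"{name}: reduction in |parity gap| = {reduction:.3f}")
```

In this toy data, the low-skill rows of group 1 rank first: once they are removed, the stump splits on skill instead of the proxy feature and the parity gap nearly vanishes. Retraining once per candidate subset is only practical at this scale, which is the cost that an unlearning-based approach aims to avoid.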

Degree Type

Master of Science

Department

Computer and Information Technology

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Romila Pradhan

Additional Committee Member 2

Julia Rayz

Additional Committee Member 3

John A Springer

Keywords

Model Debugging; Example-based explanations; Algorithmic Fairness; Data Cleaning; Data Analytics; Fairness in ML; Random Forest Debugging

Licence

CC BY 4.0
