Explainable Federated Learning: Detecting Drift with Feature-Level Attribution Using SAGE
World Conference on Explainable Artificial Intelligence 2026
Abstract
Loss curves alone provide limited insight into feature attributions and model behavior in Federated Learning (FL), making performance degradation hard to diagnose. To address this, we propose a novel framework that leverages Shapley Additive Global Explanations (SAGE) to attribute model performance to individual features by analyzing their impact on the loss. By extending SAGE to a federated setting, we gain a more detailed understanding of feature attributions across decentralized clients. Building on this capability, our approach detects drift by analyzing shifts over time in the distributions of loss and feature attributions contributed by each client. This lets us pinpoint which features exhibit significant drift and identify those contributing most to model degradation, thereby facilitating targeted interventions. Across six datasets spanning synthetic and real-world domains, our framework achieves an average detection F1-score of 0.97 with an average delay of 1.9 rounds, consistently matching or outperforming standard drift detection methods while retaining fine-grained, feature-level drift interpretability. Furthermore, Federated SAGE approximates centralized SAGE with a mean absolute error typically below 0.01, enabling accurate drift explanation in federated settings.
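To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a permutation-sampling SAGE-style estimator with marginal imputation and squared-error loss, a sample-size-weighted average as the federated aggregation rule, and a plain per-feature threshold as a stand-in for the distribution-shift test. The names `sage_values`, `federated_sage`, `drifted_features`, and the tolerance `tol` are hypothetical.

```python
import numpy as np

def sage_values(predict, X, y, n_perm=16, seed=0):
    """SAGE-style Monte Carlo estimate (assumed setup, squared-error loss).

    Each feature's value is the average drop in loss when that feature is
    revealed, taken over random feature orderings. Hidden features are filled
    with values from a randomly chosen row (marginal imputation).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    phi = np.zeros(d)
    for _ in range(n_perm):
        # Start with all features hidden: every row borrows another row's values.
        X_masked = X[rng.permutation(n)].copy()
        prev_loss = np.mean((predict(X_masked) - y) ** 2)
        for j in rng.permutation(d):
            X_masked[:, j] = X[:, j]  # reveal feature j
            loss = np.mean((predict(X_masked) - y) ** 2)
            phi[j] += prev_loss - loss
            prev_loss = loss
    return phi / n_perm

def federated_sage(client_values, client_sizes):
    """Assumed aggregation rule: average client attributions weighted by sample count."""
    weights = np.asarray(client_sizes, dtype=float)
    return np.average(np.stack(client_values), axis=0, weights=weights)

def drifted_features(reference, current, tol=0.05):
    """Hypothetical drift flag: indices whose attribution moved by more than tol."""
    diff = np.abs(np.asarray(current) - np.asarray(reference))
    return np.flatnonzero(diff > tol)

# Toy data: the target depends only on feature 0, so feature 1 should get ~0 credit.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(500, 2))
y_demo = 2.0 * X_demo[:, 0]
phi_demo = sage_values(lambda X: 2.0 * X[:, 0], X_demo, y_demo)
```

In this sketch each client would call `sage_values` on its local data, send only the resulting vector and its sample count to the server, and the server would compare the output of `federated_sage` across rounds with `drifted_features`. The threshold comparison is only illustrative; the abstract describes analyzing shifts in the distributions of loss and attributions, which a fixed tolerance does not capture.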
BibTeX
@inproceedings{jagtani2024explainable,
title={Explainable Federated Learning: Detecting Drift with Feature-Level Attribution Using SAGE},
author={Jagtani, Rohit and Ropinski, Timo and Kasneci, Gjergji},
booktitle={Proceedings of World Conference on Explainable Artificial Intelligence},
year={2026}
}