Abstract
There are synergies of research interests and industrial efforts in modeling fairness and correcting algorithmic bias in machine learning. In this paper, we present a scalable algorithm for spectral clustering (SC) with group fairness constraints. Group fairness is also known as statistical parity where in each cluster, each protected group is represented with the same proportion as in the entirety. While FairSC algorithm (Kleindessner et al., 2019) is able to find the fairer clustering, it is compromised by high computational costs due to the algorithm's kernels of computing nullspaces and the square roots of dense matrices explicitly. We present a new formulation of the underlying spectral computation of FairSC by incorporating nullspace projection and Hotelling's deflation such that the resulting algorithm, called s-FairSC, only involves the sparse matrix-vector products and is able to fully exploit the sparsity of the fair SC model. The experimental results on the modified stochastic block model demonstrate that while it is comparable with FairSC in recovering fair clustering, s-FairSC is 12× faster than FairSC for moderate model sizes. s-FairSC is further demonstrated to be scalable in the sense that the computational costs of s-FairSC only increase marginally compared to the SC without fairness constraints.
Original language | English |
---|---|
Pages (from-to) | 6613-6629 |
Number of pages | 17 |
Journal | Proceedings of Machine Learning Research |
Volume | 206 |
State | Published - 2023 |
Event | 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023 - Valencia, Spain Duration: Apr 25 2023 → Apr 27 2023 |
Bibliographical note
Publisher Copyright:Copyright © 2023 by the author(s)
Funding
Wang and Bai was supported in part by NSF grant 1913364. Lu was supported by NSF grant 2110731. David-son was supported in part by NSF grant 1910306, and a gift from Google. We would like to thank anonymous reviewers for their constructive comments that have significantly improved the presentation.
Funders | Funder number |
---|---|
National Science Foundation (NSF) | 1913364, 2110731, 1910306 |
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability