A Gibbs Posterior Framework for Fair Clustering.
Chakraborty, Abhisek; Bhattacharya, Anirban; Pati, Debdeep.
Affiliation
  • Chakraborty A; Department of Statistics, Texas A&M University, College Station, TX 77843, USA.
  • Bhattacharya A; Department of Statistics, Texas A&M University, College Station, TX 77843, USA.
  • Pati D; Department of Statistics, Texas A&M University, College Station, TX 77843, USA.
Entropy (Basel). 2024 Jan 11;26(1).
Article in En | MEDLINE | ID: mdl-38248188
ABSTRACT
The rise of machine learning-driven decision-making has sparked a growing emphasis on algorithmic fairness. Within the realm of clustering, the notion of balance is utilized as a criterion for attaining fairness: a clustering mechanism is deemed fair when the resulting clusters maintain a consistent proportion of observations from the distinct groups delineated by protected attributes. Building on this idea, the literature has rapidly incorporated a myriad of extensions, devising fair versions of existing frequentist clustering algorithms, e.g., k-means, k-medoids, etc., that aim at minimizing specific loss functions. These approaches lack uncertainty quantification for the optimal clustering configuration: they provide only clustering boundaries, without quantifying the probability of each observation belonging to the different clusters. In this article, we offer a novel probabilistic formulation of the fair clustering problem that facilitates valid uncertainty quantification even under mild model misspecification, without incurring substantial computational overhead. Mixture model-based fair clustering frameworks provide automatic uncertainty quantification, but tend to be brittle under model misspecification and involve significant computational challenges. To circumvent such issues, we propose a generalized Bayesian fair clustering framework that inherently enjoys a decision-theoretic interpretation. Moreover, we devise efficient computational algorithms that crucially leverage techniques from the existing literature on optimal transport and loss-based clustering. The gains from the proposed methodology are showcased via numerical experiments and real data examples.
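The balance criterion described in the abstract can be illustrated with a short sketch. The function below is a hypothetical helper (not the paper's code): for each cluster, it takes the minimum ratio of counts between any two protected groups, and the overall balance is the minimum over clusters. A balance near the dataset-level group ratio indicates a fair clustering; a balance of zero means some cluster contains no members of some group.

```python
from collections import Counter

def balance(labels, groups):
    """Balance of a clustering under the Chierichetti-style criterion.

    labels: cluster assignment of each observation
    groups: protected-attribute group of each observation
    Returns the minimum, over clusters, of the smallest ratio of
    group counts within that cluster (in [0, 1]; higher is fairer).
    """
    # counts[(c, g)] = number of observations in cluster c from group g
    counts = Counter(zip(labels, groups))
    group_ids = sorted(set(groups))
    bal = 1.0
    for c in set(labels):
        per_group = [counts.get((c, g), 0) for g in group_ids]
        if min(per_group) == 0:
            return 0.0  # some protected group is absent from cluster c
        bal = min(bal, min(per_group) / max(per_group))
    return bal
```

For instance, a clustering that splits each protected group evenly across clusters attains balance 1.0, while one that segregates the groups attains balance 0.0.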
Full text: 1 Collections: 01-international Database: MEDLINE Study type: Prognostic_studies Language: En Journal: Entropy (Basel) Publication year: 2024 Document type: Article Country of affiliation: United States