=========
Metrics
=========

FairLangProc provides fairness metrics to measure discrimination in NLP models.

Supported Metrics
-----------------

The supported metrics fall into three broad categories, according to what they inspect:

- **Embedding metrics** (WEAT, SEAT): measure bias by examining the model's hidden representations of input text.
- **Probability metrics** (LPBS, CBS, CPS, AUL): measure bias by computing the probabilities the model assigns to certain tokens or sentences.
- **Generated text metrics** (DR, SA, HONEST): measure bias by examining text generated by the model, looking for harmful or stereotypical words.

The implemented metrics are:

- :ref:`Generalized association tests (WEAT) <weat>` (Caliskan et al., 2016).
- :ref:`Log Probability Bias Score (LPBS) <metric-lpbs>` (Kurita et al., 2019).
- :ref:`Categorical Bias Score (CBS) <metric-cbs>` (Ahn et al., 2021).
- :ref:`CrowS-Pairs Score (CPS) <metric-cps>` (Nangia et al., 2020).
- :ref:`All Unmasked Score (AUL) <metric-aul>` (Kaneko et al., 2021).
- :ref:`Demographic Representation (DR) <metric-dr>` (Liang et al., 2022).
- :ref:`Stereotypical Association (SA) <metric-sa>` (Liang et al., 2022).
- :ref:`HONEST <metric-honest>` (Nozza et al., 2021).

Embedding Metrics
-----------------

.. _weat:

WEAT
~~~~

The best-known embedding metric is the Word Embedding Association Test (WEAT) (Caliskan et al., 2016), which measures associations between demographic and neutral attributes. Demographic attributes are usually binary and denoted by :math:`A_1, A_2`, representing two societal groups (for example male and female, or Christians and atheists). Neutral attributes are denoted by :math:`W_1, W_2` and represent two stereotypes whose demographic association is of interest. The association of a single demographic word :math:`a` with the two stereotype sets is the difference of its mean cosine similarities:

.. math::

   s(a, W_1, W_2) = \sum_{w_1 \in W_1} \frac{\cos(a, w_1)}{|W_1|} - \sum_{w_2 \in W_2} \frac{\cos(a, w_2)}{|W_2|}.

The WEAT effect size is the difference between the mean associations of the two demographic groups, normalized by the standard deviation over all demographic words:

.. math::

   \text{WEAT}(A_1, A_2, W_1, W_2) = \frac{\sum_{a_1 \in A_1} s(a_1, W_1, W_2) / |A_1| - \sum_{a_2 \in A_2} s(a_2, W_1, W_2) / |A_2|}{\text{std}_{a \in A_1 \cup A_2}\, s(a, W_1, W_2)}.

.. autoclass:: FairLangProc.metrics.embedding.WEAT
   :members: __init__, _get_embedding, metric
   :no-index:

Probability Metrics
-------------------

.. _metric-lpbs:

LPBS
~~~~

LPBS (Kurita et al., 2019) measures bias for a binary demographic group. It compares the probability :math:`p_i` that the model assigns to the word of group :math:`i` in a masked template sentence, normalized by the prior probability :math:`p_{\text{prior}, i}` obtained when the attribute word is masked as well:

.. math::

   \text{LPBS} = \log\frac{p_1}{p_{\text{prior}, 1}} - \log\frac{p_2}{p_{\text{prior}, 2}}.

.. autofunction:: FairLangProc.metrics.probability.LPBS
   :no-index:

.. _metric-cbs:

CBS
~~~

CBS (Ahn et al., 2021) generalizes LPBS to non-binary demographic groups by taking the variance of the normalized log-probabilities across all groups :math:`a \in \mathbb{A}`:

.. math::

   \text{CBS} = \text{Var}_{a \in \mathbb{A}} \log\frac{p_a}{p_{\text{prior}, a}}.

.. autofunction:: FairLangProc.metrics.probability.CBS
   :no-index:

.. _metric-cps:

CPS
~~~

CPS (Nangia et al., 2020) uses pairs of sentences that share a series of unmodified tokens :math:`U` and differ only in the modified tokens :math:`A`. Each sentence :math:`S` is scored by the log-probability of every unmodified token given the remaining unmodified tokens and the modified ones:

.. math::

   \text{CPS}(S) = \sum_{u \in U} \log \mathbb{P}(u \mid U_{\backslash u}, A).

.. autofunction:: FairLangProc.metrics.probability.CPS
   :no-index:

.. _metric-aul:

AUL
~~~

AUL (Kaneko et al., 2021) averages the log-probability of every token in the sentence, without masking any of them:

.. math::

   \text{AUL}(S) = \frac{1}{|S|} \sum_{s \in S} \log \mathbb{P}(s \mid S).

.. autofunction:: FairLangProc.metrics.probability.AUL
   :no-index:

Generated Text Metrics
----------------------

.. _metric-dr:

Demographic Representation
~~~~~~~~~~~~~~~~~~~~~~~~~~

DR (Liang et al., 2022) counts how often the words :math:`w_i` associated with a demographic group :math:`a` appear in the generated texts, where :math:`C(w, \hat{Y})` is the number of occurrences of word :math:`w` in the generation :math:`\hat{Y}`:

.. math::

   \text{DR}(a) = \sum_{w_i \in \mathbb{A}} \sum_{\hat{Y} \in \hat{\mathbb{Y}}} C(w_i, \hat{Y}).

.. autofunction:: FairLangProc.metrics.generated_text.DemRep
   :no-index:

.. _metric-sa:

Stereotypical Association
~~~~~~~~~~~~~~~~~~~~~~~~~

SA (Liang et al., 2022) restricts the same count to the generations in which a given target word :math:`w` occurs, so that it measures how often the words of group :math:`a` co-occur with :math:`w`:

.. math::

   \text{SA}(w)_a = \sum_{w_i \in \mathbb{A}} \sum_{\hat{Y} \in \hat{\mathbb{Y}}} C(w_i, \hat{Y})\, \mathbf{1}\left(C(w, \hat{Y}) > 0\right).

.. autofunction:: FairLangProc.metrics.generated_text.StereoAsoc
   :no-index:

.. _metric-honest:

HONEST
~~~~~~

HONEST (Nozza et al., 2021) measures how many of the top :math:`k` completions contain harmful words:

.. math::

   \text{HONEST}(\hat{\mathbb{Y}}) = \frac{\sum_{\hat{Y}_k \in \hat{\mathbb{Y}}_k} \sum_{\hat{y} \in \hat{Y}_k} \mathbf{1}(\hat{y} \in \mathbb{Y}_{\text{hurt}})}{|\hat{\mathbb{Y}}| \, k}.

.. autofunction:: FairLangProc.metrics.generated_text.HONEST
   :no-index:

.. seealso::

   - :doc:`tutorials` - Interactive Jupyter notebooks (``DemoMetrics.ipynb``) for bias measurement
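
As a rough illustration of the HONEST formula above, the following plain-Python sketch counts the fraction of top-:math:`k` completions that contain at least one word from a harmful-word lexicon. It is not the signature or implementation of ``FairLangProc.metrics.generated_text.HONEST``: the function name ``honest_score_sketch``, its parameters, the whitespace tokenization, and the toy lexicon are all assumptions made for this example.

```python
from typing import Iterable, List, Set


def honest_score_sketch(completions: List[List[str]], hurt_lexicon: Iterable[str]) -> float:
    """Fraction of completions containing at least one lexicon word.

    `completions` holds one inner list per prompt, each with the same
    number k of generated completions (hypothetical input layout).
    """
    lexicon: Set[str] = {word.lower() for word in hurt_lexicon}
    if not completions:
        raise ValueError("at least one prompt is required")
    k = len(completions[0])
    if k == 0 or any(len(per_prompt) != k for per_prompt in completions):
        raise ValueError("every prompt needs the same non-zero number of completions")

    harmful = 0
    for per_prompt in completions:
        for text in per_prompt:
            # Assumption: lowercase whitespace tokens with edge punctuation stripped.
            tokens = (token.strip(".,!?;:") for token in text.lower().split())
            if any(token in lexicon for token in tokens):
                harmful += 1

    # Denominator |Y| * k: number of prompts times completions per prompt.
    return harmful / (len(completions) * k)


# Toy data, invented for illustration only.
example_completions = [
    ["she is a nurse", "she is a fool", "she is kind"],
    ["he is a doctor", "he is an engineer", "he is a fool."],
]
example_score = honest_score_sketch(example_completions, {"fool"})
print(example_score)
```

Here two of the six completions contain the lexicon word, so the sketch returns one third. A real evaluation would take its completions from a language model and its lexicon from a curated resource rather than from hand-written lists.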