4.1.1. Pre-processing

Pre-processors are fairness processors that modify the model inputs.

The supported methods are:

Counterfactual Data Augmentation (CDA) (Webster et al., 2020).
Projection-based debiasing (Bolukbasi et al., 2016).
Bias removaL wIth No Demographics (BLIND) (Orgad et al., 2023).

4.1.1.1. Counterfactual Data Augmentation (CDA)

Data augmentation curates or upsamples a dataset so that the model is trained on a more representative distribution. In particular, Counterfactual Data Augmentation (CDA) (Webster et al., 2020) flips words that carry demographic information while keeping the sentence semantically valid, for example turning "he is an actor" into "she is an actress". The procedure can be one-sided, discarding the original sentence, or two-sided, keeping both the original and its augmented version.
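The difference between the one-sided and two-sided variants can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the helpers `flip_words` and `cda_sketch` are hypothetical names, the pairs are assumed to be lowercase, and keeping sentences that contain no paired word in the one-sided case is an assumption made here.

```python
import re


def flip_words(text: str, pairs: dict[str, str]) -> str:
    """Swap every word listed in `pairs` (in either direction) for its counterpart."""
    # Make the mapping symmetric: 'he' -> 'she' and 'she' -> 'he'.
    swap = {**{v: k for k, v in pairs.items()}, **pairs}

    def replace(match: re.Match) -> str:
        word = match.group(0)
        target = swap.get(word.lower())
        if target is None:
            return word  # not a demographic word, leave untouched
        # Preserve sentence-initial capitalisation.
        return target.capitalize() if word[0].isupper() else target

    return re.sub(r"\b\w+\b", replace, text)


def cda_sketch(sentences: list[str], pairs: dict[str, str], bidirectional: bool = True) -> list[str]:
    """Two-sided CDA keeps original and counterfactual; one-sided keeps only the counterfactual."""
    augmented = []
    for sentence in sentences:
        flipped = flip_words(sentence, pairs)
        changed = flipped != sentence
        # Assumption: sentences without any paired word are always kept as they are.
        if bidirectional or not changed:
            augmented.append(sentence)
        if changed:
            augmented.append(flipped)
    return augmented


pairs = {"he": "she", "his": "her", "father": "mother"}
sentences = ["He thanked his father.", "The film was long."]

two_sided = cda_sketch(sentences, pairs, bidirectional=True)
one_sided = cda_sketch(sentences, pairs, bidirectional=False)
print(two_sided)
print(one_sided)
```

The two-sided call returns three sentences (the original, its counterfactual, and the untouched neutral sentence), while the one-sided call returns two. Plain word swapping like this ignores grammatical ambiguity (for instance "her" can map to "him" or "his"), which is why pair lists need care in practice.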

FairLangProc.algorithms.preprocessors.augmentation.CDA(batch: dict, pairs: dict[str, str], columns: list[str] | None = None, bidirectional: bool = True) → dict

Perform CDA on a batch of training instances.

Parameters:

batch (dict) – Batch of training instances
pairs (dict) – Dictionary of counterfactual pairs
columns (list[str] | None) – List of columns on which CDA should be performed. If None, CDA is applied to all columns.
bidirectional (bool) – If True, applies bidirectional CDA (preserves the original training instance). If False, deletes the original training instance.

Returns:

output (dict) – Augmented training instance.
modified (dict) – Whether or not the training instance was augmented.

Example

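>>> from datasets import Dataset, load_dataset
>>> from FairLangProc.algorithms.preprocessors import CDA
>>> # Assumption: imdb is the Hugging Face IMDB dataset, and tokenize_function
>>> # is a user-defined tokenizer wrapper that is not shown here.
>>> imdb = load_dataset("imdb")
>>> gendered_pairs = [('he', 'she'), ('him', 'her'), ('his', 'hers'), ('actor', 'actress'), ('priest', 'nun'),
... ('father', 'mother'), ('dad', 'mom'), ('daddy', 'mommy'), ('waiter', 'waitress'), ('James', 'Jane')]
>>>
>>> # Augment the whole training split, then tokenize and expose it as torch tensors.
>>> cda_train = Dataset.from_dict(CDA(imdb['train'][:], pairs = dict(gendered_pairs)))
>>> train_CDA = cda_train.map(tokenize_function, batched=True)
>>> train_CDA.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])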

4.1.1.2. Projection-based debiasing

Projection-based debiasing methods (Bolukbasi et al., 2016) operate on the latent space: they identify a bias subspace spanned by an orthonormal basis, \(\{v_i\}_{i=1}^{n_{bias}}\). The hidden representation \(h\) of any input can then be debiased by removing its projection onto this subspace, formally

\[h_{proj} = h - \sum_{i = 1}^{n_{bias} } \langle h, v_i \rangle \, v_i.\]

The projection can be applied at either the word or the sentence level. In both cases the bias subspace is generally identified through PCA, and its dimension is usually one, so that the subspace reduces to a single bias direction.
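The two steps described above, estimating the bias subspace with PCA and removing the projection, can be sketched with NumPy as follows. This is an illustrative sketch rather than the library's implementation: the function names `bias_subspace` and `remove_projection` are hypothetical, and the subspace is assumed to be estimated from embeddings of counterfactual pairs (such as "he"/"she") centred around each pair's mean.

```python
import numpy as np


def bias_subspace(pair_embeddings, n_bias: int = 1) -> np.ndarray:
    """Estimate an orthonormal bias basis via PCA on centred pair embeddings.

    pair_embeddings has shape (n_pairs, 2, dim): one row per counterfactual pair.
    Returns an array of shape (n_bias, dim) whose rows are the v_i.
    """
    pairs = np.asarray(pair_embeddings, dtype=float)
    # Centre each pair around its own mean so only the within-pair difference remains.
    centred = pairs - pairs.mean(axis=1, keepdims=True)
    centred = centred.reshape(-1, pairs.shape[-1])
    # The rows of vt are orthonormal principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:n_bias]


def remove_projection(h, basis: np.ndarray) -> np.ndarray:
    """Compute h_proj = h - sum_i <h, v_i> v_i for one vector or a batch of vectors."""
    h = np.asarray(h, dtype=float)
    coeffs = h @ basis.T      # inner products <h, v_i>
    return h - coeffs @ basis


# Toy data: four pairs in a 5-dimensional space that differ only along the first axis.
rng = np.random.default_rng(0)
direction = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
base = rng.normal(size=(4, 5))
toy_pairs = np.stack([base + direction, base - direction], axis=1)

basis = bias_subspace(toy_pairs, n_bias=1)
hidden = rng.normal(size=(3, 5))            # a batch of hidden representations
debiased = remove_projection(hidden, basis)
print(np.allclose(debiased @ basis.T, 0.0))  # debiased vectors are orthogonal to the bias direction
```

Because the rows returned by the SVD are orthonormal, the subtraction matches the formula term by term; with a basis that is merely orthogonal, each term would additionally need dividing by \(\langle v_i, v_i \rangle\).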
