2. Datasets

The FairLangProc datasets module provides access to standard benchmarks for evaluating gender, racial, religious, and other social biases in NLP models.

2.1. Overview

The BiasDataLoader is the main entry point for loading bias evaluation datasets. It supports multiple output formats and dataset configurations.

Supported Benchmark Datasets

============  =======  ======================================  ============================
Data Set      Size     Bias target                             Reference
============  =======  ======================================  ============================
BBQ           58,492   Gender, race, religion,…                (Parrish et al., 2021)
BEC-Pro       5,400    Gender                                  (Bartl et al., 2020)
BOLD          23,679   Gender, race, religion,…                (Dhamala et al., 2021)
BUG           108,419  Gender                                  (Levy et al., 2021)
CrowS-Pairs   1,508    Age, disability, gender, nationality,…  (Nangia et al., 2020)
GAP           8,908    Gender                                  (Webster et al., 2018)
HolisticBias  460,000  Age, disability, gender, nationality,…  (Smith et al., 2022)
HONEST        420      Gender                                  (Nozza et al., 2021)
StereoSet     16,995   Gender, race, religion,…                (Nadeem et al., 2020)
UnQover       30       Gender, nationality, race,…             (Li et al., 2020)
WinoBias+     1,367    Gender                                  (Vanmassenhove et al., 2021)
WinoBias      3,160    Gender                                  (Zhao et al., 2018)
Winogender    720      Gender                                  (Rudinger et al., 2018)
============  =======  ======================================  ============================

2.2. API Reference and usage examples

FairLangProc.datasets.fairness_datasets.BiasDataLoader(dataset: str | None = None, config: str | None = None, format: str = 'hf', benchmark_path: str | None = None) -> Dict[str, pandas.DataFrame | List[str] | Dataset | datasets.Dataset] | None

Load the specified bias evaluation dataset.

Requires a local copy of the Fair-LLM-Benchmark repository (https://github.com/i-gallegos/Fair-LLM-Benchmark, credits to Isabel O. Gallegos et al.).

Parameters:
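  • dataset (str) – name of the dataset. If omitted, the names of the available datasets are printed.

  • config (str) – dataset configuration, if applicable. If omitted for a dataset that has configurations, the available configurations are printed.

  • format (str) – output format: 'raw', 'hf' (Hugging Face), or 'pt' (PyTorch). Defaults to 'hf'.

  • benchmark_path (str) – path to the local copy of the Fair-LLM-Benchmark repository. If None, the loader looks for it in FairLangProc/FairLangProc/datasets/Fair-LLM-Benchmark.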

Returns:

dataDict – Dictionary with datasets in the appropriate format.

Return type:

dict
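
Since format accepts only the three values listed above and benchmark_path must point to a local copy of Fair-LLM-Benchmark, it can help to check both before calling the loader. The following is a minimal sketch and not part of FairLangProc: check_loader_args and load_checked are hypothetical helper names, and whether the loader itself performs similar checks is not stated here.

```python
from pathlib import Path
from typing import Optional

# The three output formats documented for the `format` parameter.
VALID_FORMATS = ("raw", "hf", "pt")


def check_loader_args(fmt: str = "hf", benchmark_path: Optional[str] = None) -> dict:
    """Validate the arguments and return them as keyword arguments.

    Hypothetical helper: raises early instead of relying on the loader.
    """
    if fmt not in VALID_FORMATS:
        raise ValueError(f"format must be one of {VALID_FORMATS}, got {fmt!r}")
    if benchmark_path is not None and not Path(benchmark_path).is_dir():
        raise FileNotFoundError(
            f"Fair-LLM-Benchmark directory not found: {benchmark_path!r}"
        )
    return {"format": fmt, "benchmark_path": benchmark_path}


def load_checked(dataset: str, config: Optional[str] = None,
                 fmt: str = "hf", benchmark_path: Optional[str] = None):
    """Check the arguments, then call BiasDataLoader (hypothetical wrapper)."""
    kwargs = check_loader_args(fmt, benchmark_path)
    # Imported lazily so the checks above work even without FairLangProc installed.
    from FairLangProc.datasets import BiasDataLoader
    return BiasDataLoader(dataset=dataset, config=config, **kwargs)
```

With FairLangProc installed and the benchmark repository downloaded, load_checked('BBQ', config='Age', fmt='raw') would forward to BiasDataLoader with the same arguments.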

Example

>>> from FairLangProc.datasets import BiasDataLoader
>>> BiasDataLoader()
Available datasets:
====================
BBQ
BEC-Pro
BOLD
BUG
CrowS-Pairs
GAP
HolisticBias
StereoSet
WinoBias+
WinoBias
Winogender
>>> BiasDataLoader(dataset = 'BBQ')
Available configurations:
====================
Age
Disability_Status
Gender_identity
Nationality
Physical_appearance
Race_ethnicity
Race_x_gender
Race_x_SES
Religion
SES
Sexual_orientation
all
>>> ageBBQ = BiasDataLoader(dataset = 'BBQ', config = 'Age')
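
According to the signature above, the loader returns either a dictionary whose values are DataFrames, lists of strings, or dataset objects, or None. One way to inspect such a result is sketched below. summarize is a hypothetical helper, and the stand-in dictionary with its key name is made up for illustration; the actual keys depend on the dataset and configuration.

```python
def summarize(data_dict):
    """Return (key, value type name, number of items) for each entry.

    Hypothetical helper. Assumes every value supports len(), which holds
    for lists and DataFrames; dataset objects are assumed to define __len__.
    """
    if data_dict is None:  # the documented return type allows None
        return []
    return [(name, type(value).__name__, len(value))
            for name, value in data_dict.items()]


# Stand-in for a 'raw' result; the key "data" is illustrative only.
example = {"data": ["first sentence", "second sentence"]}
for name, kind, size in summarize(example):
    print(f"{name}: {kind} with {size} items")
```

The same call can be applied to a real result such as ageBBQ from the example above.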

See also

  • Tutorials: the interactive Jupyter notebook DemoDatasets.ipynb (https://github.com/arturo-perez-peralta/FairLangProc/blob/main/notebooks/DemoDatasets.ipynb) demonstrates dataset usage.