=========================
Citation and bibliography
=========================

Citation
--------

If you use FairLangProc in your research, please cite our paper, **FairLangProc: A Python package for fairness in NLP**.

BibTeX
~~~~~~

.. code-block:: bibtex

   @misc{pérezperalta2025fairlangprocpythonpackagefairness,
     title={FairLangProc: A Python package for fairness in NLP},
     author={Arturo Pérez-Peralta and Sandra Benítez-Peña and Rosa E. Lillo},
     year={2025},
     eprint={2508.03677},
     archivePrefix={arXiv},
     primaryClass={cs.CL},
     url={https://arxiv.org/abs/2508.03677},
   }

APA Style
~~~~~~~~~

Pérez-Peralta, A., Benítez-Peña, S., & Lillo, R. E. (2025). FairLangProc: A Python package for fairness in NLP. arXiv preprint arXiv:2508.03677.

IEEE Style
~~~~~~~~~~

A. Pérez-Peralta, S. Benítez-Peña, and R. E. Lillo, "FairLangProc: A Python package for fairness in NLP," arXiv preprint arXiv:2508.03677, 2025.

MLA Style
~~~~~~~~~

Pérez-Peralta, Arturo, et al. "FairLangProc: A Python package for fairness in NLP." arXiv preprint arXiv:2508.03677 (2025).

Related Publications
--------------------

If you use specific datasets, metrics, or algorithms from FairLangProc, please also cite the original papers:

Datasets
~~~~~~~~

- **BBQ** (Bias Benchmark for Question Answering): Parrish, A., et al. (2021). BBQ: A hand-built bias benchmark for question answering. arXiv:2110.08193.
- **BEC-Pro** (Bias Evaluation Corpus with Professions): Bartl, M., et al. (2020). Unmasking Contextual Stereotypes: Measuring and Mitigating BERT's Gender Bias. arXiv:2010.14534.
- **BOLD** (Bias in Open-Ended Language Generation): Dhamala, J., et al. (2021). BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation. ACM FAccT.
- **BUG** (Bias evaluation Universal Benchmark): Levy, S., et al. (2021). Collecting a Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation. arXiv:2109.03858.
- **CrowS-Pairs**: Nangia, N., et al. (2020). CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models. EMNLP.
- **GAP** (Gendered Ambiguous Pronouns): Webster, K., et al. (2018). Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns. TACL.
- **HolisticBias**: Smith, E. M., et al. (2022). "I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset. arXiv:2205.09209.
- **HONEST** (HOlingistic Neutrality Evaluation for gender-based Stereotypes): Nozza, D., et al. (2021). HONEST: Measuring Hurtful Sentence Completion in Language Models. NAACL.
- **StereoSet**: Nadeem, M., et al. (2020). StereoSet: Measuring stereotypical bias in pretrained language models. arXiv:2004.09456.
- **UnQover**: Li, T., et al. (2020). UNQOVERing Stereotyping Biases via Underspecified Questions. arXiv:2010.02428.
- **WinoBias+**: Vanmassenhove, E., et al. (2021). NeuTral Rewriter: A Rule-Based and Neural Approach to Automatic Rewriting into Gender-Neutral Alternatives. arXiv:2109.06105.
- **WinoBias**: Zhao, J., et al. (2018). Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods. NAACL.
- **WinoGender**: Rudinger, R., et al. (2018). Gender Bias in Coreference Resolution. NAACL.

Metrics
~~~~~~~

- **WEAT** (Word Embedding Association Test): Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science.
- **LPBS** (Log Probability Bias Score): Kurita, K., et al. (2019). Measuring bias in contextualized word representations. Workshop on Gender Bias in Natural Language Processing, ACL.
- **CBS** (Categorical Bias Score): Ahn, J., & Oh, A. (2021). Mitigating Language-Dependent Ethnic Bias in BERT. EMNLP.
- **CPS** (CrowS-Pairs Score): Nangia, N., et al. (2020). CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models. EMNLP.
- **AUL** (All Unmasked Likelihood): Kaneko, M., & Bollegala, D. (2021). Unmasking the Mask - Evaluating Social Biases in Masked Language Models. arXiv:2104.07496.
- **DR** (Demographic Representation): Liang, P., et al. (2022). Holistic Evaluation of Language Models. arXiv:2211.09110.
- **SA** (Stereotypical Association): Liang, P., et al. (2022). Holistic Evaluation of Language Models. arXiv:2211.09110.
- **HONEST** (HOlingistic Neutrality Evaluation for gender-based Stereotypes): Nozza, D., Bianchi, F., & Hovy, D. (2021). HONEST: Measuring Hurtful Sentence Completion in Language Models. NAACL.

Algorithms
~~~~~~~~~~

Pre-processing:

- **CDA** (Counterfactual Data Augmentation): Webster, K., et al. (2020). Measuring and Reducing Gendered Correlations in Pre-trained Models. arXiv:2010.06032.
- **Projection-based Debiasing**: Bolukbasi, T., et al. (2016). Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. NeurIPS.
- **BLIND** (Bias removaL wIth No Demographics): Orgad, H., & Belinkov, Y. (2023). BLIND: Bias Removal With No Demographics. ACL.

In-processing:

- **ADELE** (Adapter-based DEbiasing of LanguagE models): Lauscher, A., et al. (2021). Sustainable Modular Debiasing of Language Models. arXiv:2109.03646.
- **EAR** (Entropy Attention Regularizer): Attanasio, G., et al. (2022). Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists. arXiv:2203.09192.
- **Embedding-based Regularizer**: Liu, H., et al. (2020). Does Gender Matter? Towards Fairness in Dialogue Systems. arXiv:1910.10486.
- **Selective Unfreezing**: Gira, M., et al. (2022). Debiasing Pre-Trained Language Models via Efficient Fine-Tuning. Workshop on Language Technology for Equality, Diversity and Inclusion, ACL.

Intra-processing:

- **MoDDiffy** (Modular Debiasing with Diff Subnetworks): Hauzenberger, L., et al. (2023). Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks. ACL Findings.
- **EAT** (Entropy Attention Temperature scaling): Zayed, A., et al. (2023). Should We Attend More or Less? Modulating Attention for Fairness. arXiv:2305.13088.