MSR-LIT/MultilingualBias

Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer

Introduction

This repository contains the code and data for replicating the results from Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer (Zhao et al., ACL 2020).

Intrinsic Bias

- Prerequisite

  • Download or generate aligned fastText embeddings from fastText
  • Generate bias-reduced EN embeddings (ENDEB) using Hard-Debias
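
The two prerequisites can be sketched as follows, assuming the aligned embeddings are in fastText's text `.vec` format (a header line with the word count and dimension, then one word and its vector per line). The snippet parses such data and applies the neutralize step of hard debiasing to a single word. The names `load_vec` and `neutralize`, the toy vectors, and the use of one he/she difference as the gender direction are illustrative assumptions; Hard-Debias itself derives the direction from several definitional pairs and also has an equalize step that is not shown here.

```python
import io
import numpy as np

def load_vec(lines, max_words=None):
    """Parse fastText text-format vectors into a {word: vector} dict."""
    it = iter(lines)
    # Header line: "<word count> <dimension>"
    _count, dim = map(int, next(it).split())
    vectors = {}
    for line in it:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip malformed lines
        vectors[word] = np.array([float(v) for v in values], dtype=np.float32)
        if max_words is not None and len(vectors) >= max_words:
            break
    return vectors

def neutralize(vec, direction):
    """Remove the component of vec along direction, then rescale to unit length."""
    g = direction / np.linalg.norm(direction)
    out = vec - np.dot(vec, g) * g
    return out / np.linalg.norm(out)

# Toy data standing in for a real .vec file (hypothetical values).
sample = io.StringIO("3 2\nhe 1.0 0.0\nshe -1.0 0.0\nnurse -0.6 0.8\n")
emb = load_vec(sample)

# Simplified gender direction from a single pair (an assumption).
gender_direction = emb["he"] - emb["she"]
nurse_debiased = neutralize(emb["nurse"], gender_direction)
```

For a real file, pass an open file handle instead of the `io.StringIO` object, for example `load_vec(open(path, encoding="utf-8"), max_words=200000)`, where `path` points to the downloaded `.vec` file.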

- Multilingual Intrinsic Bias Dataset:

We include all the occupations as well as the gender seed words for each language under the intrinsic folder.

- Code:

To evaluate intrinsic bias in each language, see inBias.ipynb, which contains the bias analysis and results.
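
To give a feel for what an intrinsic bias score looks like, here is a minimal sketch that compares an occupation word's cosine similarity to male seed words against its similarity to female seed words. This is one simple formulation and not necessarily the metric computed in inBias.ipynb; the function names and toy vectors are hypothetical.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_gap(word_vec, male_seeds, female_seeds):
    """Mean similarity to male seeds minus mean similarity to female seeds.

    Positive values lean male, negative values lean female, 0 is balanced.
    """
    male = np.mean([cosine(word_vec, s) for s in male_seeds])
    female = np.mean([cosine(word_vec, s) for s in female_seeds])
    return float(male - female)

# Toy 2-d vectors (hypothetical); real vectors come from the aligned embeddings.
male_seeds = [[1.0, 0.0]]
female_seeds = [[-1.0, 0.0]]
occupations = {"engineer": [0.6, 0.8], "teacher": [0.0, 1.0]}

gaps = {occ: gender_gap(vec, male_seeds, female_seeds)
        for occ, vec in occupations.items()}
```

In this toy setup, `engineer` leans toward the male seed while `teacher` is orthogonal to both seeds and scores zero.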

Extrinsic Bias

- Multilingual BiosBias (MLBs) Dataset:

To replicate the MLBs dataset, please refer to the replicateMLBs folder. For the EN dataset, please refer to biosbias.

- Code:

The code for the downstream task is under the bios_codes folder.
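
As an illustration of how extrinsic bias can be quantified, the sketch below assumes the downstream task is occupation classification from biographies, as in Bias in Bios, and computes the per-occupation gap in true positive rate between female and male examples. Whether bios_codes reports exactly this quantity is not confirmed here; the function name, the `"F"`/`"M"` labels, and the toy data are hypothetical.

```python
from collections import defaultdict

def tpr_gap_by_occupation(y_true, y_pred, genders):
    """Return {occupation: TPR(female) - TPR(male)}.

    Occupations lacking examples of either gender are skipped.
    """
    totals = defaultdict(lambda: defaultdict(int))
    hits = defaultdict(lambda: defaultdict(int))
    for truth, pred, gender in zip(y_true, y_pred, genders):
        totals[truth][gender] += 1
        hits[truth][gender] += int(truth == pred)

    gaps = {}
    for occ, counts in totals.items():
        if counts.get("F") and counts.get("M"):
            tpr_f = hits[occ]["F"] / counts["F"]
            tpr_m = hits[occ]["M"] / counts["M"]
            gaps[occ] = tpr_f - tpr_m
    return gaps

# Toy predictions (hypothetical).
y_true = ["nurse", "nurse", "nurse", "nurse", "surgeon", "surgeon"]
y_pred = ["nurse", "nurse", "nurse", "surgeon", "nurse", "surgeon"]
genders = ["F", "F", "M", "M", "F", "M"]

gaps = tpr_gap_by_occupation(y_true, y_pred, genders)
```

A positive gap means the classifier recovers the occupation more often for female examples, and a negative gap means it does so more often for male examples.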

If you use this code or the EN MLBs dataset, please also cite Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting:

@inproceedings{de2019bias,
  title={Bias in bios: A case study of semantic representation bias in a high-stakes setting},
  author={De-Arteaga, Maria and Romanov, Alexey and Wallach, Hanna and Chayes, Jennifer and Borgs, Christian and Chouldechova, Alexandra and Geyik, Sahin and Kenthapadi, Krishnaram and Kalai, Adam Tauman},
  booktitle={Proceedings of the Conference on Fairness, Accountability, and Transparency},
  pages={120--128},
  year={2019}
}

Citation

@inproceedings{zhao-etal-2020-gender,
    title = "Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer",
    author = "Zhao, Jieyu  and
      Mukherjee, Subhabrata  and
      Hosseini, Saghar  and
      Chang, Kai-Wei  and
      Hassan Awadallah, Ahmed",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    pages = "2896--2907",
}
