In this repo, we tried to reproduce the results claimed in Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation (ACL 2020) as part of the Reproducibility Challenge 2020 hosted by Papers with Code.
You can find our report here. Unfortunately, it was rejected by the ReScience journal.
Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models. Some commonly adopted debiasing approaches, including the seminal Hard Debias algorithm, apply post-processing procedures that project pre-trained word embeddings into a subspace orthogonal to an inferred gender subspace. We discover that semantic-agnostic corpus regularities such as word frequency captured by the word embeddings negatively impact the performance of these algorithms. We propose a simple but effective technique, Double Hard Debias, which purifies the word embeddings against such corpus regularities prior to inferring and removing the gender subspace. Experiments on three bias mitigation benchmarks show that our approach preserves the distributional semantics of the pre-trained word embeddings while reducing gender bias to a significantly larger degree than prior approaches.
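The two-stage idea in the abstract can be illustrated with a small NumPy sketch. This is a toy under stated assumptions, not the authors' code or ours: the "corpus regularity" direction is assumed to be the top principal component of the mean-centered embeddings, the gender direction is assumed to be the difference of a single word pair, and the data is random. The helper names `project_out` and `top_principal_components` are hypothetical.

```python
import numpy as np

def project_out(vectors, direction):
    """Remove the component of each row of `vectors` along `direction`."""
    unit = direction / np.linalg.norm(direction)
    return vectors - np.outer(vectors @ unit, unit)

def top_principal_components(vectors, k):
    """Return the top-k principal directions (as rows) of the mean-centered vectors."""
    centered = vectors - vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 8))  # toy stand-in for pre-trained embeddings

# Stage 1 (assumption): treat one dominant principal direction as the
# corpus-regularity direction and remove it from the centered embeddings.
freq_dir = top_principal_components(emb, k=1)[0]
purified = project_out(emb - emb.mean(axis=0), freq_dir)

# Stage 2: infer a gender direction from a hypothetical definitional pair
# (rows 0 and 1 play the role of "he" and "she") and project it out.
gender_dir = purified[0] - purified[1]
debiased = project_out(purified, gender_dir)
```

After both stages, every row of `debiased` is orthogonal to both removed directions. Which direction to remove in the first stage is the substantive part of the method; see the paper and the notebooks for how it is selected.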
Despite widespread use in natural language processing (NLP) tasks, word embeddings have been criticized for inheriting unintended gender bias from training corpora. Bolukbasi et al. (2016) highlight that in word2vec embeddings trained on the Google News dataset (Mikolov et al., 2013a), programmer is more closely associated with man and homemaker is more closely associated with woman. Such gender bias also propagates to downstream tasks. Studies have shown that coreference resolution systems exhibit gender bias in predictions due to the use of biased word embeddings (Zhao et al., 2018a; Rudinger et al., 2018).
- Python >= 3.6
- Word Embeddings Benchmarks. Install them following the instructions here.
Clone the repo:
git clone https://github.com/hassiahk/Double-Hard-Debias.git
Install the dependencies:
pip install -r requirements.txt
Install the package in develop mode. This step is required if you are just running our notebooks without changing anything:
python setup.py develop
Please download the data below and keep it in the data folder.
- Word Embeddings: You can find the authors' debiased embeddings and ours here.
- Special Word Lists: You can find them in the data folder.
- Google Word Analogy: Word Analogy dataset by Google. You can find it here.
- MSR Word Analogy: MSR Word Analogy dataset. You can find it here.
You can find all the external data used in our experiments here.
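As a quick sanity check after downloading, the Google Word Analogy file can be parsed with a few lines of Python. The sketch below assumes the commonly distributed layout, where a line starting with `:` opens a section and every other line holds four space-separated words. The function name `parse_analogies` is hypothetical, and the sample lines are inlined so the snippet runs without the file. The MSR dataset uses a different layout and is not covered here.

```python
def parse_analogies(lines):
    """Group four-word analogy questions under their ': section' header."""
    sections = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(":"):
            # A new section header, e.g. ": family"
            current = line[1:].strip()
            sections[current] = []
        else:
            words = line.split()
            # Keep only well-formed questions that belong to a section.
            if len(words) == 4 and current is not None:
                sections[current].append(tuple(words))
    return sections

# Inlined sample in the assumed format; with the real file, pass an open
# file handle instead of this list.
sample = [
    ": capital-common-countries",
    "Athens Greece Baghdad Iraq",
    ": family",
    "boy girl brother sister",
]
analogies = parse_analogies(sample)
```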
You can find the detailed procedure to implement Double-Hard Debias in GloVe_Double_Hard_Debias.ipynb. (PyPI package coming soon)
We had to make minor changes because the authors' code did not include the code to apply Double-Hard Debias to the original GloVe embeddings and store them in a file.
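Storing debiased embeddings can be as simple as writing one word per line followed by its vector components, which is the plain-text layout GloVe vectors are usually distributed in. The following is a minimal round-trip sketch under that assumption. The function names and the file name `toy_debiased.txt` are hypothetical and do not necessarily match what the notebook does.

```python
import os
import tempfile

import numpy as np

def save_embeddings(path, words, vectors):
    """Write one 'word v1 v2 ... vd' line per embedding."""
    with open(path, "w", encoding="utf-8") as f:
        for word, vec in zip(words, vectors):
            f.write(word + " " + " ".join(f"{x:.6f}" for x in vec) + "\n")

def load_embeddings(path):
    """Read the text format written by save_embeddings."""
    words, rows = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            words.append(parts[0])
            rows.append([float(x) for x in parts[1:]])
    return words, np.array(rows)

# Toy vectors standing in for debiased embeddings.
words = ["doctor", "nurse"]
vectors = np.array([[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_debiased.txt")  # hypothetical file name
    save_embeddings(path, words, vectors)
    loaded_words, loaded_vectors = load_embeddings(path)
```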
- In Normalized_Unnormalized_GloVe_Evaluate.ipynb, we experimented with both normalized and unnormalized embeddings to see which one gives better results.
- You can find the benchmark results for Double-Hard Debias and other debiasing approaches on GloVe in GloVe_Evaluate.ipynb.
- We also did some qualitative analysis by computing the bias of some highly biased words before and after debiasing. You can find the analysis in Qualitative_Analysis.ipynb.
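To make the notebook descriptions above concrete, here is a rough sketch of unit-length normalization and of a per-word bias score compared before and after a toy debiasing step. It assumes the bias of a word is measured as the difference between its cosine similarity to "he" and to "she". That definition, the helper names, and the three-dimensional toy vectors are illustrative assumptions, so check the notebooks for what is actually computed.

```python
import numpy as np

def normalize(vectors):
    """Scale each row to unit Euclidean length."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / norms

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_bias(word_vec, he_vec, she_vec):
    """Assumed bias score: cos(word, he) - cos(word, she)."""
    return cosine(word_vec, he_vec) - cosine(word_vec, she_vec)

# Toy vectors; in practice these come from the GloVe embeddings.
he = np.array([1.0, 0.0, 0.2])
she = np.array([-1.0, 0.0, 0.2])
word = np.array([0.6, 0.5, 0.1])

bias_before = gender_bias(word, he, she)

# Toy debiasing step: remove the word's component along the he-she direction.
g = (he - she) / np.linalg.norm(he - she)
word_debiased = word - (word @ g) * g
bias_after = gender_bias(word_debiased, he, she)

# Normalized copy of all three vectors, as used in the normalized setting.
unit_vectors = normalize(np.stack([he, she, word]))
```

In this toy setup the word leans towards "he" before the projection and is equidistant from both afterwards. Real embeddings will generally not reach exactly zero.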