WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes bias measurement and mitigation in word embedding models. Please feel welcome to open an issue if you have any questions, or a pull request if you want to contribute to the project!
Previously, it considered only n-1 words for the two target sets, omitting the last word of each target, which was incorrect with respect to the original definition in the paper. Now it correctly uses all the target words.
Full Changelog: https://github.com/dccuchile/wefe/compare/0.4.0...0.4.1
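The off-by-one described above can be illustrated with a minimal sketch (the variable names are illustrative; this is not WEFE's actual code):

```python
# Minimal sketch of the off-by-one described above.
# The variable names are illustrative; this is not WEFE's actual code.
target_set = ["science", "technology", "physics", "chemistry"]

# Buggy behaviour: iterating over only n-1 words drops the last target word.
buggy = target_set[: len(target_set) - 1]

# Fixed behaviour: all n target words are considered.
fixed = target_set[:]

print(buggy)  # ['science', 'technology', 'physics']
print(fixed)  # ['science', 'technology', 'physics', 'chemistry']
```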
Version 0.4.0 Changelog
This version mainly fixes a bug in RNSB and updates the case study scores in the examples.
This new version includes a new debias module as well as a complete refactoring of word embedding preprocessing. Changelog:

- Word preprocessing is now specified through a list of preprocessors (`preprocessors`), together with the parameter `strategy`, indicating whether to consider all the transformed words (`'all'`) or only the first one encountered (`'first'`).
- Renamed the WordEmbeddingModel attributes `model` and `model_name` to `wv` and `name`, respectively.
- Renamed the `word_embedding` argument to `model` in every metric.

Some bug fixes and RIPA integration prior to the release of the new version.
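The `'first'`/`'all'` strategy described above can be sketched with a standalone illustration (the vocabulary and helper names here are hypothetical stand-ins, not WEFE's actual implementation):

```python
# Standalone sketch of the 'first' vs 'all' lookup strategy described above.
# `vocab`, `lookup`, and the preprocessors are hypothetical, not WEFE internals.
vocab = {"dog": [0.1, 0.2], "Dog": [0.3, 0.4]}

def identity(word):
    return word

def lowercase(word):
    return word.lower()

def lookup(word, preprocessors, strategy):
    """Apply each preprocessor to `word` and collect vocabulary hits.

    strategy='first' keeps only the first transformed word found in the
    vocabulary; strategy='all' keeps every transformed word found.
    """
    found = [p(word) for p in preprocessors if p(word) in vocab]
    if strategy == "first":
        return found[:1]
    return found  # strategy == 'all'

print(lookup("Dog", [identity, lowercase], strategy="first"))  # ['Dog']
print(lookup("Dog", [identity, lowercase], strategy="all"))    # ['Dog', 'dog']
```

With `'first'`, only the earliest preprocessor whose output exists in the vocabulary contributes a word; with `'all'`, every successful transformation does.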
The main change in this version is an improvement in how WEFE transforms word sets into embeddings. Now all this work is contained in the WordEmbeddingModel class. It also contains several improvements, both to this process and to the library in general. See the changelog for more information.
Note: Contains changes that may not be compatible with previous versions.
- Renamed the `run_query` parameter `warn_filtered_words` to `warn_not_found_words`.
- Added a `word_preprocessor_args` parameter to `run_query` that allows specifying transformations prior to searching for words in word embeddings.
- Added a `secondary_preprocessor_args` parameter to `run_query`, which allows specifying a second pre-processor transformation applied to words before searching for them in word embeddings. It is not necessary to specify the first preprocessor to use this one.
- Implemented the `__getitem__` method in WordEmbeddingModel, which allows obtaining a word's embedding from the model stored in the instance using indexers.
- Added the `corr` method.
- Added `random_state` to RNSB to allow replication of the experiments.
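The `__getitem__` indexing pattern mentioned above can be sketched with a minimal stand-in class (an illustration of the pattern only, not WEFE's actual WordEmbeddingModel):

```python
# Minimal stand-in illustrating word-to-embedding indexing via __getitem__.
# This mirrors the pattern described above; it is not WEFE's actual class.
class TinyEmbeddingModel:
    def __init__(self, vectors, name):
        self.wv = vectors   # mapping: word -> embedding vector
        self.name = name    # model name, per the attribute renames above

    def __getitem__(self, word):
        # Indexing the model returns the word's embedding,
        # or raises KeyError for out-of-vocabulary words.
        return self.wv[word]

model = TinyEmbeddingModel({"queen": [0.5, 0.1], "king": [0.4, 0.2]}, "toy-model")
print(model["queen"])  # [0.5, 0.1]
```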