WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes bias measurement and mitigation in word embedding models. Please feel welcome to open an issue if you have any questions, or a pull request if you want to contribute to the project!
.. -*- mode: rst -*-
|License|_ |GithubActions|_ |ReadTheDocs|_ |Downloads|_ |Pypy|_ |CondaVersion|_
.. |License| image:: https://img.shields.io/github/license/dccuchile/wefe
.. _License: https://github.com/dccuchile/wefe/blob/master/LICENSE

.. |ReadTheDocs| image:: https://readthedocs.org/projects/wefe/badge/?version=latest
.. _ReadTheDocs: https://wefe.readthedocs.io/en/latest/?badge=latest

.. |GithubActions| image:: https://github.com/dccuchile/wefe/actions/workflows/ci.yaml/badge.svg?branch=master
.. _GithubActions: https://github.com/dccuchile/wefe/actions

.. |Downloads| image:: https://pepy.tech/badge/wefe
.. _Downloads: https://pepy.tech/project/wefe

.. |Pypy| image:: https://badge.fury.io/py/wefe.svg
.. _Pypy: https://pypi.org/project/wefe/

.. |CondaVersion| image:: https://anaconda.org/pbadilla/wefe/badges/version.svg
.. _CondaVersion: https://anaconda.org/pbadilla/wefe
.. image:: ./docs/logos/WEFE_2.png
   :width: 300
   :alt: WEFE Logo
   :align: center
Word Embedding Fairness Evaluation (WEFE) is an open source library for measuring and mitigating bias in word embedding models. It generalizes many existing fairness metrics into a unified framework and provides a standard interface for encapsulating metrics, queries, and word embedding models.
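To illustrate the unified-interface idea, here is a minimal, self-contained sketch in plain Python. The function ``toy_metric`` and the two-dimensional toy model are hypothetical illustrations, not WEFE's actual API: a query bundles target and attribute word sets, and a metric maps a query plus an embedding model to a score.

```python
def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)

def toy_metric(targets, attr_a, attr_b, model):
    """Per-target mean similarity to attr_a minus mean similarity to attr_b."""
    def mean_sim(word, attrs):
        return sum(cosine(model[word], model[a]) for a in attrs) / len(attrs)
    return {t: mean_sim(t, attr_a) - mean_sim(t, attr_b) for t in targets}

# Toy two-dimensional "embedding model" (hypothetical data).
model = {"doctor": [1.0, 0.0], "nurse": [0.0, 1.0],
         "he": [1.0, 0.0], "she": [0.0, 1.0]}

# Query: targets ["doctor", "nurse"], attribute sets ["he"] vs. ["she"].
scores = toy_metric(["doctor", "nurse"], ["he"], ["she"], model)
# "doctor" leans toward "he" (+1.0); "nurse" leans toward "she" (-1.0).
```

Because every metric shares this shape, the same query can be run against different models and different metrics without changing the experiment code.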
WEFE also standardizes the process of mitigating bias through an interface similar to scikit-learn's ``fit-transform`` pattern. This standardization separates the mitigation process into two stages: computing the transformation to be applied to the model (``fit``) and executing the mitigation action on the model (``transform``).

The official documentation can be found at `this link <https://wefe.readthedocs.io/>`_.
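To make the two-stage ``fit``/``transform`` separation concrete, the following is a minimal sketch of a fit/transform-style mitigator in plain Python. The class ``SimpleDebias`` and the toy vectors are hypothetical illustrations, not WEFE's actual debiasing API:

```python
class SimpleDebias:
    """Removes the projection of each vector onto a 'bias direction'."""

    def fit(self, embeddings, pair):
        # Stage 1 (fit): compute the transformation to perform, here a
        # unit bias direction from the difference of a definitional pair.
        a, b = (embeddings[w] for w in pair)
        direction = [x - y for x, y in zip(a, b)]
        norm = sum(c * c for c in direction) ** 0.5
        self.direction_ = [c / norm for c in direction]
        return self

    def transform(self, embeddings):
        # Stage 2 (transform): apply the mitigation to every vector by
        # subtracting its projection onto the learned direction.
        d = self.direction_
        out = {}
        for word, vec in embeddings.items():
            proj = sum(v * c for v, c in zip(vec, d))
            out[word] = [v - proj * c for v, c in zip(vec, d)]
        return out

# Toy two-dimensional "model": after fit/transform, every vector is
# orthogonal to the he-she direction.
toy = {"he": [1.0, 0.0], "she": [0.0, 1.0], "doctor": [0.8, 0.2]}
debiased = SimpleDebias().fit(toy, ("he", "she")).transform(toy)
# debiased["doctor"] -> [0.5, 0.5]
```

Separating the stages this way lets the fitted transformation be inspected, reused, or applied to a different model, which is the point of the scikit-learn-style interface.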
There are two different ways to install WEFE.

To install the package with pip, run::

    pip install wefe

To install the package with conda, run::

    conda install -c pbadilla wefe
Any required dependencies that have not already been installed will be installed along with the package.
You can download the code by executing::

    git clone https://github.com/dccuchile/wefe
To contribute, visit the `Contributing <https://wefe.readthedocs.io/en/latest/user_guide/contribute.html>`_ section in the documentation.
To install the dependencies needed for development, testing, and compiling the WEFE documentation, run::

    pip install -r requirements-dev.txt
All unit tests are located in the ``wefe/tests`` folder and are run with ``pytest``.

To run the tests, execute::

    pytest tests
To check the coverage, run::

    pytest tests --cov-report xml:cov.xml --cov wefe

and then::

    coverage report -m
The documentation is built with Sphinx and can be found in the ``docs`` folder at the root of the project. To compile the documentation, run:

.. code-block:: bash

    cd docs
    make html

Then, you can open the documentation at ``docs/_build/html/index.html``.
- Added a ``preprocessors`` parameter together with the parameter ``strategy``, which indicates whether to consider all the transformed words (``'all'``) or only the first one encountered (``'first'``).
- Renamed the ``WordEmbeddingModel`` attributes ``model`` and ``model_name`` to ``wv`` and ``name``, respectively.
- Renamed the ``word_embedding`` argument to ``model`` in every metric.
- Renamed the ``run_query`` parameter ``warn_filtered_words`` to ``warn_not_found_words``.
- Added a ``word_preprocessor_args`` parameter to ``run_query`` that allows specifying transformations prior to searching for words in the word embeddings.
- Added a ``secondary_preprocessor_args`` parameter to ``run_query``, which allows specifying a second preprocessing transformation applied to words before searching for them in the word embeddings. It is not necessary to specify the first preprocessor to use this one.
- Implemented the ``__getitem__`` method in ``WordEmbeddingModel``, which allows obtaining the embedding of a word from the model stored in the instance using indexers.
- Added a ``corr`` method.
- Added ``random_state`` in RNSB to allow replication of the experiments.

Please cite the following paper if you use this package in an academic publication:
P. Badilla, F. Bravo-Marquez, and J. Pérez.
`WEFE: The Word Embeddings Fairness Evaluation Framework. In Proceedings of the 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence (IJCAI-PRICAI 2020), Yokohama, Japan. <https://www.ijcai.org/Proceedings/2020/60>`_
Bibtex:
.. code-block:: latex

    @InProceedings{wefe2020,
      title     = {WEFE: The Word Embeddings Fairness Evaluation Framework},
      author    = {Badilla, Pablo and Bravo-Marquez, Felipe and Pérez, Jorge},
      booktitle = {Proceedings of the Twenty-Ninth International Joint Conference on
                   Artificial Intelligence, {IJCAI-20}},
      publisher = {International Joint Conferences on Artificial Intelligence Organization},
      pages     = {430--436},
      year      = {2020},
      month     = {7},
      doi       = {10.24963/ijcai.2020/60},
      url       = {https://doi.org/10.24963/ijcai.2020/60},
    }
- `Pablo Badilla <https://github.com/pbadillatorrealba/>`_.
- `Felipe Bravo-Marquez <https://felipebravom.com/>`_.
- `Jorge Pérez <https://users.dcc.uchile.cl/~jperez/>`_.
- `María José Zambrano <https://github.com/mzambrano1/>`_.

We thank all our contributors who have allowed WEFE to grow, especially
`stolenpyjak <https://github.com/stolenpyjak/>`_ and
`mspl13 <https://github.com/mspl13/>`_ for implementing new metrics.

We also thank `alan-cueva <https://github.com/alan-cueva/>`_ for initiating the development
of metrics for contextualized embedding models and
`harshvr15 <https://github.com/harshvr15/>`_ for the examples of multi-language bias measurement.
Thank you very much 😊!