A Python package to assess and improve fairness of machine learning models.
- Added bootstrapping to :class:`MetricFrame`, along with a new section in the user guide.
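Bootstrapping here refers to resampling-based confidence intervals for metric estimates. As a rough illustration of the underlying idea, here is a percentile bootstrap in pure Python — this is a sketch of the concept only, not :class:`MetricFrame`'s actual API (whose parameter names are not shown here):

```python
import random

def selection_rate(preds):
    """Fraction of positive (1) predictions."""
    return sum(preds) / len(preds)

def bootstrap_ci(preds, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for a metric estimate."""
    rng = random.Random(seed)
    stats = sorted(
        metric([rng.choice(preds) for _ in preds]) for _ in range(n_boot)
    )
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

preds = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # toy binary predictions
low, high = bootstrap_ci(preds, selection_rate)
assert low <= selection_rate(preds) <= high  # point estimate inside the CI
```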
- Added intersectionality in mental health care example notebook.
- Added user guide for :class:`fairlearn.postprocessing.ThresholdOptimizer`.
- Updated :code:`fairlearn.metrics.selection_rate` to handle :code:`TypeError` when input is scalar.
- :code:`fairlearn.reductions`
- Updated :code:`sklearn.preprocessing.OneHotEncoder` in :code:`fairlearn.adversarial` to ensure compatibility with scikit-learn 1.2.
- Changed the :code:`parser` of :code:`sklearn.datasets.fetch_openml` in :code:`fairlearn.datasets` to :code:`liac-arff` to match behavior before scikit-learn 1.2.
- Sped up :code:`fairlearn.reductions`
by matrix multiplication, which can lead to substantial speed-ups for :code:`fairlearn.reductions.ExponentiatedGradient` for simple estimators like logistic regression.
- Added :code:`as_frame`
(with default :code:`True`) argument to :code:`fairlearn.datasets.fetch_diabetes_hospital`. Changed the :code:`as_frame` default to :code:`True` for all remaining datasets.
- Reworked :code:`fairlearn.metrics.MetricFrame`
so that all results (including aggregations) are computed in the constructor and then cached.
- Fixed a bug in :code:`fairlearn.adversarial`
that changes the loss function :code:`NLLLoss` to :code:`CrossEntropyLoss`.
- Removed validation of :code:`X`
in :code:`_validate_and_reformat_input()`, since that is the concern of the underlying estimator and not Fairlearn.
- Added error handling in :code:`MetricFrame`
. Methods :code:`group_max`, :code:`group_min`, :code:`difference` and :code:`ratio` now accept :code:`errors` as a parameter, which can be either :code:`raise` or :code:`coerce`.
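The :code:`errors` parameter governs what happens when a per-group metric value is invalid. A minimal sketch of the two behaviours for a :code:`group_min`-style aggregation — this illustrates the semantics only and is not the library code (in :code:`MetricFrame` itself, :code:`coerce` converts problem values to NaN):

```python
import math

def group_min(by_group, errors="raise"):
    """Smallest metric value across groups.

    errors="raise"  -> raise if any per-group value is invalid
    errors="coerce" -> treat invalid values as NaN and skip them
    """
    if errors not in ("raise", "coerce"):
        raise ValueError("errors must be 'raise' or 'coerce'")
    values = []
    for group, value in by_group.items():
        if not isinstance(value, (int, float)) or math.isnan(value):
            if errors == "raise":
                raise ValueError(f"invalid metric value for group {group!r}")
            continue  # errors="coerce": skip this group
        values.append(value)
    return min(values) if values else math.nan

by_group = {"a": 0.5, "b": 0.25, "c": math.nan}
print(group_min(by_group, errors="coerce"))  # 0.25
```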
- Fixed a bug where passing a :code:`grid` object to a :code:`GridSearch` reduction would result in a :code:`KeyError` if the column names were not ordered integers.
- :code:`fairlearn.preprocessing.CorrelationRemover`
now exposes :code:`n_features_in_` and :code:`feature_names_in_`.
- Added the :code:`sphinxcontrib-bibtex`
extension to manage citations in documentation using bibtex.
- :code:`fairlearn.reductions.ExponentiatedGradient`
. Added support for cost sensitive classification in :code:`fairlearn.reductions.ErrorRate`.
- Updated :code:`fairlearn.metrics.MetricFrame`
. Some results may now have a more appropriate type than :code:`object`, but otherwise the only visible difference should be a substantial speed increase.
- Added :code:`fairlearn.metrics.plot_model_comparison`
to create scatter plots for comparing multiple models along two metrics.
- Added :code:`fairlearn.adversarial.AdversarialFairnessClassifier`
and :code:`fairlearn.adversarial.AdversarialFairnessRegressor`.
- Added the :code:`count()`
metric, so that the number of data points in each group is noted when using :code:`MetricFrame`.
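Conceptually, :code:`MetricFrame` evaluates each metric separately within each sensitive-feature group, and :code:`count` is just another metric in that scheme. A pure-Python sketch of the grouping (illustrative only; the helper names here are hypothetical, not fairlearn's):

```python
from collections import defaultdict

def by_group(metric, y_true, y_pred, sensitive):
    """Evaluate `metric` separately within each sensitive-feature group."""
    buckets = defaultdict(lambda: ([], []))
    for yt, yp, s in zip(y_true, y_pred, sensitive):
        buckets[s][0].append(yt)
        buckets[s][1].append(yp)
    return {s: metric(yt, yp) for s, (yt, yp) in sorted(buckets.items())}

def count(y_true, y_pred):
    return len(y_true)

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
sex = ["f", "f", "f", "m", "m", "m"]
print(by_group(count, y_true, y_pred, sex))     # {'f': 3, 'm': 3}
print(by_group(accuracy, y_true, y_pred, sex))  # both groups score 2/3
```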
- Changed the :code:`MetricFrame` constructor API, so that the :code:`metric` argument is now :code:`metrics` and all positional arguments are now keyword arguments. The previous call format still works (until v0.10.0), but issues a deprecation warning.
- :code:`postprocessing.ThresholdOptimizer`
now accepts :code:`predict_method` as a parameter, which allows users to define which estimator method should be used to get the prediction values: :code:`"predict_proba"` and :code:`"decision_function"` for soft values and :code:`"predict"` for hard values from classifiers.
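The soft/hard distinction matters because :code:`ThresholdOptimizer` works by thresholding scores, potentially with a different threshold per group. A sketch of that final thresholding step (the thresholds below are hypothetical inputs for illustration; the real class derives them from the chosen fairness constraint rather than accepting them directly):

```python
def threshold_predictions(scores, groups, thresholds):
    """Turn soft scores into hard 0/1 predictions with per-group thresholds."""
    return [int(score >= thresholds[group])
            for score, group in zip(scores, groups)]

scores = [0.9, 0.4, 0.6, 0.3, 0.8, 0.55]  # e.g. soft values from predict_proba
groups = ["a", "a", "a", "b", "b", "b"]
# Hypothetical per-group thresholds, e.g. chosen to equalize selection rates.
thresholds = {"a": 0.5, "b": 0.5}
print(threshold_predictions(scores, groups, thresholds))  # [1, 0, 1, 0, 1, 1]
```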
- Removed the :code:`fairlearn.widgets` module including the :code:`FairlearnDashboard`. Instead, :code:`fairlearn.metrics.MetricFrame` supports plotting, as explained in the corresponding user guide section.
- Added a return value (:code:`self`
) to :code:`fairlearn.reductions.ExponentiatedGradient`.
- Fixed a bug in :code:`_merge_columns()`
when using multiple sensitive features with long names. This previously caused groups to get merged if the concatenation of their string representations was identical up to the cutoff limit.
- Added the :code:`CorrelationRemover`
preprocessing technique. This removes correlations between sensitive and non-sensitive features while retaining as much information as possible.
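The idea behind :code:`CorrelationRemover` is linear residualization: subtract from each non-sensitive feature the component linearly explainable by the sensitive feature. A single-feature sketch of that idea (the real transformer handles multiple features and exposes an :code:`alpha` parameter controlling how much correlation is removed):

```python
def remove_correlation(z, x):
    """Residualize feature z on sensitive feature x.

    Returns z' = z - beta * (x - mean(x)) with beta = cov(x, z) / var(x),
    so the sample covariance between x and z' is zero.
    """
    n = len(x)
    mx = sum(x) / n
    mz = sum(z) / n
    cov = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / n
    var = sum((xi - mx) ** 2 for xi in x) / n
    beta = cov / var
    return [zi - beta * (xi - mx) for xi, zi in zip(x, z)]

x = [0, 0, 1, 1]          # binary sensitive feature
z = [1.0, 2.0, 3.0, 4.0]  # non-sensitive feature correlated with x
print(remove_correlation(z, x))  # [2.0, 3.0, 2.0, 3.0] -- no longer correlated
```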
- Added :code:`control_features` to the classification moments. These allow for data stratification, with fairness constraints enforced within each stratum, but not between strata.
- Changed :code:`make_derived_metric()`
to use :code:`MetricFrame`.
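A derived metric packages a base metric together with a group-wise aggregation into a single callable. A sketch of that factory pattern (the real :code:`make_derived_metric()` has a different signature; the names here are illustrative):

```python
def make_difference_metric(base_metric):
    """Build a callable returning the max-min gap of base_metric across groups."""
    def difference(y_true, y_pred, sensitive):
        values = []
        for g in sorted(set(sensitive)):
            idx = [i for i, s in enumerate(sensitive) if s == g]
            values.append(base_metric([y_true[i] for i in idx],
                                      [y_pred[i] for i in idx]))
        return max(values) - min(values)
    return difference

def selection_rate(y_true, y_pred):
    return sum(y_pred) / len(y_pred)

sr_difference = make_difference_metric(selection_rate)
y_true = [1, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0]
sensitive = ["a", "a", "a", "b", "b", "b"]
print(sr_difference(y_true, y_pred, sensitive))  # |2/3 - 1/3| ~ 0.333
```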
- Changed the :code:`ExponentiatedGradient` signature by renaming argument :code:`T` to :code:`max_iter`, :code:`eta_mul` to :code:`eta0`, and by adding :code:`run_linprog_step`.
- Changed the usage of :code:`eps`
within :code:`ExponentiatedGradient`. It is now solely responsible for setting the L1 norm bound in the optimization (which controls the excess constraint violation beyond what is allowed by the :code:`constraints` object). The other usage of :code:`eps` as the right-hand side of constraints is now captured directly in the moment classes as follows:

  - :code:`ConditionalSelectionRate` renamed to :code:`UtilityParity`, and its subclasses have new arguments on the constructor:

    - :code:`difference_bound` - for difference-based constraints such as demographic parity difference
    - :code:`ratio_bound_slack` - for ratio-based constraints such as demographic parity ratio
    - a :code:`ratio_bound` argument, which represents the argument previously called :code:`ratio`

  - :code:`ConditionalLossMoment` and its subclasses have a new argument :code:`upper_bound` with the same purpose for newly enabled regression scenarios on :code:`ExponentiatedGradient`.

  For a comprehensive overview of available constraints refer to the new user guide on fairness constraints for reductions methods.
- :code:`ErrorRateRatio`
renamed to :code:`ErrorRateParity`, and :code:`TruePositiveRateDifference` renamed to :code:`TruePositiveRateParity`, since the desired pattern is :code:`<metric name>Parity`, with the exception of :code:`EqualizedOdds` and :code:`DemographicParity`.
- :code:`ConditionalSelectionRate`
renamed to :code:`UtilityParity`.
- :code:`GroupLossMoment`
renamed to :code:`BoundedGroupLoss` in order to have a descriptive name and for consistency with the paper. Similarly, :code:`AverageLossMoment` renamed to :code:`MeanLoss`.

  For a comprehensive overview of available constraints refer to the new user guide on fairness constraints for reductions methods.
- Added :code:`TrueNegativeRateParity`
to provide the opposite constraint of :code:`TruePositiveRateParity`, to be used with reductions techniques and :code:`ThresholdOptimizer`.
- Added :code:`InterpolatedThresholder`
to represent the fitted :code:`ThresholdOptimizer`.
- Added the :code:`fairlearn.datasets` module.
- Changed estimator copying in :code:`ExponentiatedGradient`
from :code:`pickle.dump` to :code:`sklearn.clone`.
- Added :code:`sample_weight_name`
to :code:`GridSearch` and :code:`ExponentiatedGradient` to control how :code:`sample_weight` is supplied to :code:`estimator.fit`.
- :code:`MetricFrame`
has been introduced, and :code:`make_group_summary()` removed (along with related functions). Please see the documentation and examples for more information.
- Removed the :code:`GroupMetricResult`
type in favor of a :code:`Bunch`.
- :code:`metric_by_group` changed to :code:`group_summary`
- :code:`make_group_metric` changed to :code:`make_metric_group_summary`
- :code:`{difference,ratio,group_min,group_max}_from_group_summary`
- :code:`make_derived_metric`
- :code:`{true,false}_{positive,negative}_rate`
- :code:`<metric>_group_summary`
- :code:`<metric>_{difference,ratio,group_min,group_max}`
- :code:`{demographic_parity,equalized_odds}_{difference,ratio}`
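The disparity metrics above reduce per-group values to a single number; for demographic parity, the relevant per-group value is the selection rate. A pure-Python sketch of the difference and ratio forms (the real functions also take :code:`y_true` and further options; these are illustrations, not the library implementations):

```python
def group_selection_rates(y_pred, sensitive):
    """Selection rate (fraction of predicted 1s) for each group."""
    rates = {}
    for g in set(sensitive):
        preds = [p for p, s in zip(y_pred, sensitive) if s == g]
        rates[g] = sum(preds) / len(preds)
    return rates

def demographic_parity_difference(y_pred, sensitive):
    rates = group_selection_rates(y_pred, sensitive).values()
    return max(rates) - min(rates)

def demographic_parity_ratio(y_pred, sensitive):
    rates = group_selection_rates(y_pred, sensitive).values()
    return min(rates) / max(rates)

y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(y_pred, sensitive))  # 0.75 - 0.25 = 0.5
print(demographic_parity_ratio(y_pred, sensitive))       # 0.25 / 0.75 ~ 0.333
```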
- :code:`fallout_rate` in favor of :code:`false_positive_rate`
- :code:`miss_rate` in favor of :code:`false_negative_rate`
- :code:`specificity_score` in favor of :code:`true_negative_rate`
- :code:`mean_{over,under}prediction` and :code:`{balanced_,}root_mean_squared_error` changed to the versions with a leading underscore
- Specified the :code:`dtype`
when creating an empty :code:`pandas.Series`.
- Fixed :code:`GridSearch`
for more than two sensitive feature values.
- Additions to :code:`fairlearn.reductions`, including :code:`TruePositiveRateDifference`.
- :code:`ExponentiatedGradient` now requires 0-1 labels for classification problems, pending a better solution for Issue 339.
- :code:`ThresholdOptimizer`:
  - Separated the plotting of :code:`ThresholdOptimizer` into its own plotting function.
  - :code:`ThresholdOptimizer` now performs validations during :code:`fit`, and not during :code:`__init__`. It also stores the fitted estimator in the :code:`estimator_` attribute.
  - :code:`ThresholdOptimizer`
is now a scikit-learn meta-estimator, and accepts an estimator through the :code:`estimator` parameter. To use a pre-fitted estimator, pass :code:`prefit=True`.
- Made :code:`_create_group_metric_set_()`
private by prepending with :code:`_`. Also changed the arguments, so that this routine requires dictionaries for the predictions and sensitive features. This is a breaking change.
- Removed the :code:`Reduction`
base class for reductions methods and replaced it with :code:`sklearn.base.BaseEstimator` and :code:`sklearn.base.MetaEstimatorMixin`.
- Removed :code:`ExponentiatedGradientResult`
and :code:`GridSearchResult` in favor of storing the values and objects resulting from fitting the meta-estimator directly in the :code:`ExponentiatedGradient` and :code:`GridSearch` objects, respectively.
- :code:`X`
if it is provided as a :code:`pandas.DataFrame`.