Python Library for Model Interpretation/Explanations
flip_orientation
Credits: To all contributors who have helped in moving the project forward.
More improvements are planned for the subsequent release. Stay tuned.
Skater has so far been an interpretation engine for post-hoc model evaluation and interpretation. With this PR, Skater starts its journey toward supporting interpretable models. Rule-list algorithms are highly popular in the space of interpretable models because the trained models are represented as simple decision lists. In the latest release, we enable support for Bayesian Rule Lists (BRL). The probabilistic classifier (estimating P(Y=1|X) for each X) optimizes the posterior of a Bayesian hierarchical model over the pre-mined rules.
Usage Example:
from skater.core.global_interpretation.interpretable_models.brlc import BRLC
import pandas as pd
from sklearn.datasets.mldata import fetch_mldata
input_df = fetch_mldata("diabetes")
...
Xtrain, Xtest, ytrain, ytest = train_test_split(input_df, y, test_size=0.20, random_state=0)
sbrl_model = BRLC(min_rule_len=1, max_rule_len=10, iterations=10000, n_chains=20, drop_features=True)
# Train a model. The discretizer is enabled by default; if you wish to exclude
# features from discretization, list them using the undiscretize_feature_list parameter
model = sbrl_model.fit(Xtrain, ytrain, bin_labels="default")
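For intuition about the model form BRL produces, a decision list can be sketched in plain Python: rules are checked in order, and the first matching rule's probability estimate is returned. The rules and probabilities below are hypothetical illustrations, not output from a trained model or Skater's implementation.

```python
# A toy decision list in the spirit of BRL: check rules in order, return the
# probability estimate of the first rule whose condition matches the record.
def predict_decision_list(rules, default_p, record):
    """rules: list of (condition, p_positive) pairs; condition is a predicate."""
    for condition, p_positive in rules:
        if condition(record):
            return p_positive
    return default_p  # no rule matched: fall through to the default estimate

# Hypothetical pre-mined rules for a diabetes-style dataset
rules = [
    (lambda r: r["glucose"] > 140, 0.85),  # IF glucose > 140 THEN P(Y=1) = 0.85
    (lambda r: r["bmi"] > 30, 0.60),       # ELSE IF bmi > 30 THEN P(Y=1) = 0.60
]

p = predict_decision_list(rules, default_p=0.20, record={"glucose": 120, "bmi": 32})
# The record misses the first rule but matches the second, so p is 0.60.
```

Because the whole model is just this ordered list of readable rules, it can be inspected and audited directly, which is what makes rule lists attractive as interpretable models.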
Other minor bug fixes and documentation updates.
Credits: Special thanks to Professor Cynthia Rudin, Hongyu Yang, and @tmadl (Tamas Madl) for helping enable this feature.
This release includes:
Now, after you create a Skater model with:
model = InMemoryModel(predict_fn, examples=examples, model_type="classifier")
The model object now provides a .scorers API, which allows you to score predictions against training labels. Depending on whether your model is a regressor, a classifier that returns labels, or a classifier that returns probabilities, scorers will automatically expose the scoring algorithms appropriate to your model. For instance, in the example above, we could do:
model.scorers.f1(labels, model(X))
model.scorers.cross_entropy(labels, model(X))
If it were a regressor, we could do:
model.scorers.mse(labels, model(X))
Calling model.scorers.default(labels, model(X)), or simply model.scorers(labels, model(X)), will execute the default scorer for your model type:

regression: mean absolute error
classifier (probabilities): cross entropy
classifier (labels): f1
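For reference, these three metrics have standard definitions, written out below in plain Python purely for illustration; this is not Skater's internal code, just the textbook formulas the scorers correspond to.

```python
import math

def mean_absolute_error(y_true, y_pred):
    # Average of |true - predicted| over all samples
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_entropy(y_true, p_pred):
    # Binary cross entropy over predicted probabilities of the positive class
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for t, p in zip(y_true, p_pred)) / len(y_true)

def f1(y_true, y_pred):
    # Harmonic mean of precision and recall for binary labels
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

mae = mean_absolute_error([1.0, 2.0], [1.5, 2.5])  # 0.5
score = f1([1, 0, 1, 1], [1, 0, 0, 1])
```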
Let us know if you'd like more scorers, or even better, feel free to make a PR to add more yourself!
The default method of computing feature importance is done by perturbing each feature, and observing how much those perturbations affect predictions.
With the addition of model scoring, we now also provide a method based on observing changes in model scoring functions: the less accurate your model becomes when a feature is perturbed, the more important that feature is.
To enable scoring based feature importance, you must load training labels into your interpretation object, like:
interpreter = Interpretation(training_data=training_data, training_labels=training_labels)
interpreter.feature_importance.plot_feature_importance(model, method='model-scoring')
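As a minimal sketch of the underlying idea (not Skater's internal algorithm), scoring-based importance can be illustrated by shuffling one feature's column and measuring the resulting drop in a score; the toy model, data, and accuracy score below are assumptions chosen for illustration.

```python
import random

def accuracy(model, X, y):
    # Fraction of samples where the model's label matches the true label
    return sum(1 for row, t in zip(X, y) if model(row) == t) / len(y)

def permutation_importance(model, X, y, feature_idx, seed=0):
    # Shuffle one feature column, re-score, and report the drop in score
    rng = random.Random(seed)
    column = [row[feature_idx] for row in X]
    rng.shuffle(column)
    X_perturbed = [row[:feature_idx] + [v] + row[feature_idx + 1:]
                   for row, v in zip(X, column)]
    return accuracy(model, X, y) - accuracy(model, X_perturbed, y)

def model(row):
    # Toy classifier that only looks at feature 0 and ignores feature 1
    return 1 if row[0] > 0.5 else 0

X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
y = [1, 0, 1, 0]

drop0 = permutation_importance(model, X, y, 0)  # feature the model depends on
drop1 = permutation_importance(model, X, y, 1)  # ignored feature: drop is 0.0
```

Since the toy model never reads feature 1, shuffling that column cannot change its predictions, so its importance comes out as exactly zero; only features the model actually uses can produce a score drop.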
Originally, Skater tried to infer the type of your model based on the types of predictions it made. Now when you create a model, you can define these explicitly with the model_type and probability keyword arguments to Skater model types:
model = InMemoryModel(predict_fn, model_type='classifier', probability=True)
or
model = InMemoryModel(predict_fn, model_type='regressor')
Bug fixes
Includes changes to support distributing the library through conda-forge.