A lightweight implementation of removal-based explanations for ML models.
This repository implements a wide range of removal-based explanations, a class of model explanation approaches that unifies many existing methods (e.g., SHAP, LIME, Meaningful Perturbations, L2X, permutation tests). Our paper presents a framework that lets us implement many of these methods in a lightweight, modular codebase.
Our implementation does not take advantage of the approximations that make these methods fast in practice, so you may prefer to continue using the original implementations (e.g., SHAP, LIME, SAGE). We also haven't implemented every method: for example, we do not support image blurring or feature selection approaches.
To begin, you need to clone the repository and install the library into your Python environment:
pip install .
Our code is designed around the framework described in the paper. Each model explanation method is specified by three choices:

1) How the method removes features (e.g., marginalizing them out with a background distribution)

2) What model behavior the method analyzes (e.g., an individual prediction)

3) How the method summarizes each feature's influence (e.g., Shapley values)
The general use pattern looks like this:
import numpy as np
import matplotlib.pyplot as plt
from rexplain import removal, behavior, summary

# Get model and data
x, y = ...
model = ...

# 1) Feature removal
extension = removal.MarginalExtension(x[:512], model)

# 2) Model behavior
game = behavior.PredictionGame(x[0], extension)

# 3) Summary technique
attr = summary.ShapleyValue(game)

# Plot feature attributions
plt.bar(np.arange(len(attr)), attr)
plt.show()
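To make the three steps concrete, here is a minimal end-to-end sketch of the framework in plain NumPy. This is an illustration, not the rexplain API: all names below (`predict_with_subset`, `shapley_values`, the toy linear model) are our own, and exact Shapley values are computed by enumerating subsets, which is only feasible for a handful of features.

```python
import itertools
import math
import numpy as np

# Toy setup: a linear model over 3 features, plus background data
# used to marginalize out removed features.
rng = np.random.default_rng(0)
weights = np.array([1.0, -2.0, 0.5])

def model(X):
    return X @ weights

background = rng.normal(size=(512, 3))  # background samples for removal
x = np.array([1.0, 1.0, 1.0])           # instance to explain

def predict_with_subset(S):
    """Steps 1 and 2: the cooperative game. Features in the subset S keep
    their values from x; features outside S are filled in with background
    samples (marginal removal), and we average the model's prediction."""
    X = background.copy()
    X[:, S] = x[S]
    return model(X).mean()

def shapley_values(d):
    """Step 3: summarize each feature's influence with exact Shapley values,
    averaging its marginal contribution over all subsets of other features."""
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            for S in itertools.combinations(others, size):
                S = list(S)
                w = (math.factorial(size) * math.factorial(d - size - 1)
                     / math.factorial(d))
                phi[i] += w * (predict_with_subset(S + [i])
                               - predict_with_subset(S))
    return phi

attr = shapley_values(3)
```

A handy sanity check: for a linear model with marginal removal, each feature's Shapley value reduces to `weights[i] * (x[i] - background[:, i].mean())`, and the attributions sum to the difference between the prediction at `x` and the average prediction on the background data.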
For usage examples, see the following notebooks:
Ian Covert, Scott Lundberg, Su-In Lee. "Explaining by Removing: A Unified Framework for Model Explanation." arXiv preprint arXiv:2011.14878, 2020.