Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠(ICLR 2019)
Produces hierarchical interpretations for a single prediction made by a PyTorch neural network. Official code for the paper "Hierarchical interpretations for neural network predictions" (ICLR 2019).
Documentation • Demo notebooks
Note: this repo is actively maintained. For any questions please file an issue.
Install with `pip install acd` (or clone the repository and run `python setup.py install`).

Inspecting NLP sentiment models | Detecting adversarial examples | Analyzing ImageNet models |
---|---|---|
If your network has layers that are not accessible through `net.modules()`, you may need to write a custom function to iterate through those layers (for examples, see `cd.py`).

`scores/score_funcs.py` also contains simple PyTorch implementations of integrated gradients and the simple interpretation technique gradient * input.
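
To make the two baseline techniques mentioned above concrete, here is a minimal, framework-free sketch of gradient * input and integrated gradients. It is not the code in `scores/score_funcs.py`: the toy logistic model, its weights, and every function name below are hypothetical and chosen only for illustration. The gradient is written by hand so the example runs without PyTorch.

```python
import math

# Hypothetical toy model (an assumption for illustration only):
# logistic regression with fixed weights, f(x) = sigmoid(w . x + b).
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = -0.25


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x):
    """Model output for a single input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS)


def gradient(x):
    """Analytic gradient of predict with respect to each input feature."""
    p = predict(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def gradient_times_input(x):
    """Attribution of feature i = (d f / d x_i) * x_i."""
    return [g * xi for g, xi in zip(gradient(x), x)]


def integrated_gradients(x, baseline=None, steps=200):
    """Approximate integrated gradients along the straight path
    from baseline to x, using a midpoint Riemann sum."""
    if baseline is None:
        baseline = [0.0] * len(x)  # all-zeros baseline, a common default
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            total[i] += g
    # Scale the averaged gradient by the input's distance from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


if __name__ == "__main__":
    x = [1.0, 2.0, -1.0]
    print("gradient * input:    ", gradient_times_input(x))
    print("integrated gradients:", integrated_gradients(x))
```

A useful sanity check for any integrated-gradients implementation is the completeness property: the attributions should sum, up to discretization error, to the difference between the model output at the input and at the baseline.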
@inproceedings{
singh2019hierarchical,
title={Hierarchical interpretations for neural network predictions},
author={Chandan Singh and W. James Murdoch and Bin Yu},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=SkEqro0ctQ},
}