Interpret Versions

Fit interpretable models. Explain blackbox machine learning.

v0.3.1

1 year ago

[v0.3.1] - 2023-03-13

Added

Mac m1 support in conda-forge
SPOTGreedy prototype selection (PR #392)

Fixed

fix visualization when both cloud and non-cloud environments are detected (PR #210)
fix ShapTree bug where it was treating classifiers as regressors
resolve scikit-learn warnings occurring when models were trained using Pandas DataFrames
change the defaults to prefer 'continuous' over 'nominal' when a feature has 1 or 2 unique float64 values

Breaking Changes

in the blackbox and greybox explainers, change from accepting a predict_fn to accepting either a model or a predict_fn
feature type 'categorical' has been renamed to 'nominal' for the remaining feature_type parameters in the package (EBMs were already using 'nominal')
removed the unused sampler parameters to the Explainer classes
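
As a minimal sketch of what accepting "either a model or a predict_fn" can look like from the caller's side, the helper below normalizes both inputs to a plain prediction callable. The helper name as_predict_fn, the ToyModel class, and the preference for predict_proba over predict are assumptions made for illustration; this is not interpret's implementation.

```python
def as_predict_fn(model_or_fn):
    """Return a prediction callable from a model object or a function.

    Hypothetical helper for illustration; not interpret's implementation.
    """
    # Assumption: prefer class probabilities when the model offers them.
    for method_name in ("predict_proba", "predict"):
        method = getattr(model_or_fn, method_name, None)
        if callable(method):
            return method
    if callable(model_or_fn):
        return model_or_fn
    raise TypeError("expected a model with predict/predict_proba or a callable")


class ToyModel:
    """Stand-in for a fitted model with a scikit-learn style predict()."""

    def predict(self, rows):
        return [sum(row) for row in rows]


# Both forms yield a callable that can be applied to the same data.
from_model = as_predict_fn(ToyModel())
from_fn = as_predict_fn(lambda rows: [sum(row) for row in rows])
```

Code that previously passed model.predict_proba explicitly can keep doing so; the change only widens what is accepted.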

v0.3.0

1 year ago

[v0.3.0] - 2022-11-16

Added

Full-complexity EBMs with higher-order interactions are supported: GA3M, GA4M, GA5M, etc. 3-way and higher-level interactions lose exact global interpretability but retain exact local explanations. Higher-level interactions need to be explicitly specified; there is no automatic FAST detection yet.
Mac m1 support
support for ordinals
merge_ebms now supports merging models with interactions, including higher-level interactions
added classic composition option during Differentially Private binning
support for different kinds of feature importances (avg_weight, min_max)
exposed interaction detection API (FAST algorithm)
API to calculate and show the importances of groups of features and terms.
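
To make the two importance types concrete, here is a minimal sketch for a single term. It assumes that avg_weight means the average of absolute term scores weighted by the bin weights, and that min_max means the spread between the largest and smallest score. Both definitions and both function names are assumptions for illustration rather than the library's code.

```python
def avg_weight_importance(scores, weights):
    """Weighted mean of |score| over a term's bins (assumed definition)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(abs(s) * w for s, w in zip(scores, weights)) / total


def min_max_importance(scores):
    """Spread between the largest and smallest bin score (assumed definition)."""
    return max(scores) - min(scores)


# Toy term with three bins: per-bin scores and the sample weight in each bin.
scores = [-1.0, 0.5, 2.0]
weights = [2.0, 6.0, 2.0]

avg_w = avg_weight_importance(scores, weights)
spread = min_max_importance(scores)
```

Under these definitions, a sparsely populated bin with an extreme score moves min_max much more than avg_weight, which is one reason to offer both.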

Changed

memory efficiency: About 20x less memory is required during fitting
predict time speed improvements. About 50x faster for pandas CategoricalDtype, and varying levels of improvement for other data types
handling of the differential privacy DPOther bin, and non-DP unknowns has been unified by having a universal unknown bin
bin weights have been changed from per-feature to per-term and are now multi-dimensional
improved scikit-learn compliance: We now conform to the scikit-learn 1.0 feature names API by using self.feature_names_in_ for the X column names and self.n_features_in_. We use the matching self.feature_types_in_ for feature types, and self.term_names_ for the additive term names.

Fixed

merge_ebms now distributes bin weights proportionally according to volume when splitting bins
DP-EBMs now use sample weights instead of bin counts, which preserves privacy budget
improved scikit-learn compliance: The following init attributes are no longer overwritten during calls to fit: self.interactions, self.feature_names, self.feature_types
better handling of floating point overflows when calculating gain and validation metrics
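
The proportional redistribution of bin weights can be illustrated in one dimension, where a bin's "volume" is simply its width. This is a sketch under that assumption, with a hypothetical function name; it is not the merge_ebms implementation.

```python
def split_bin_weight(low, high, weight, cut):
    """Split one bin's weight at `cut`, proportionally to each piece's width."""
    if not low < cut < high:
        raise ValueError("cut must fall strictly inside the bin")
    left_fraction = (cut - low) / (high - low)
    left = weight * left_fraction
    # The remainder goes to the right piece so the total weight is preserved.
    return left, weight - left


# A bin covering [0, 10) with weight 6.0, split at 4: the left piece spans
# 40% of the width and so receives roughly 40% of the weight.
left_w, right_w = split_bin_weight(0.0, 10.0, 6.0, 4.0)
```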

Breaking Changes

EBMUtils.merge_models function has been renamed to merge_ebms
renamed binning type 'quantile_humanized' to 'rounded_quantile'
feature type 'categorical' has been specialized into separate 'nominal' and 'ordinal' types

EBM models have changed public attributes:

feature_groups_ -> term_features_
global_selector -> n_samples_, unique_val_counts_, and zero_val_counts_
domain_size_ -> min_target_, max_target_
additive_terms_ -> term_scores_
bagged_models_ -> BaseCoreEBM has been deprecated and the only useful attribute has been moved
                  into the main EBM class (bagged_models_.model_ -> bagged_scores_)
feature_importances_ -> has been changed into the function term_importances(), which can now also 
                        generate different types of importances
preprocessor_ & pair_preprocessor_ -> attributes have been moved into the main EBM model class (details below)

EBMPreprocessor attributes have been moved to the main EBM model class

col_names_ -> feature_names_in_
col_types_ -> feature_types_in_
col_min_ -> feature_bounds_
col_max_ -> feature_bounds_
col_bin_edges_ -> bins_
col_mapping_ -> bins_
hist_counts_ -> histogram_counts_
hist_edges_ -> histogram_edges_
col_bin_counts_ -> bin_weights_ (and is now a per-term tensor)
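
When migrating code that read the old preprocessor attributes, the renames listed above can be kept in a lookup table. The sketch below only restates that list; the dictionary and function names are hypothetical.

```python
# Old EBMPreprocessor attribute -> new attribute on the main EBM model,
# transcribed from the list above.
PREPROCESSOR_RENAMES = {
    "col_names_": "feature_names_in_",
    "col_types_": "feature_types_in_",
    "col_min_": "feature_bounds_",
    "col_max_": "feature_bounds_",
    "col_bin_edges_": "bins_",
    "col_mapping_": "bins_",
    "hist_counts_": "histogram_counts_",
    "hist_edges_": "histogram_edges_",
    "col_bin_counts_": "bin_weights_",
}


def new_attribute_name(old_name):
    """Return the v0.3.0 name for an old preprocessor attribute."""
    try:
        return PREPROCESSOR_RENAMES[old_name]
    except KeyError:
        raise KeyError(f"no known rename for {old_name!r}") from None
```

Several old attributes map to the same new one (for example col_min_ and col_max_ both become feature_bounds_), so the mapping cannot be inverted, and values read from the merged attributes may be shaped differently from the old ones.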

v0.2.7

2 years ago

v0.2.7 - 2021-09-23

Added

Synapse cloud support for visualizations.

Fixed

All category names in bar charts now visible for inline rendering (used in cloud environments).
Joblib preference was previously being overridden. This has been reverted to honor the user's preference.
Bug in categorical binning for differentially privatized EBMs has been fixed.

v0.2.6

2 years ago

v0.2.6 - 2021-07-20

Added

Differential-privacy augmented EBMs now available as interpret.privacy.{DPExplainableBoostingClassifier,DPExplainableBoostingRegressor}.
Packages interpret and interpret-core now distributed via docker.

Changed

Sampling code including stratification within EBM now performed in native code.

Fixed

Compute provider with joblib can now support multiple engines with serialization support.
Labels are now all shown for inline rendering of horizontal bar charts.
JS dependencies updated.

v0.2.5

2 years ago

v0.2.5 - 2021-06-21

Added

Sample weight support added for EBM.
Joint predict_and_contrib added to EBM where both predictions and feature contributions are generated in one call.
EBM predictions are now substantially faster for categorical features.
Preliminary documentation for all of interpret now public at https://interpret.ml/docs.
Decision trees now work in cloud environments (InlineRenderer support).
Packages interpret and interpret-core now distributed via sdist.
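
For an additive model, a prediction is the intercept plus the sum of the per-feature contributions, which is why producing both in one call is natural. The sketch below uses a toy additive model with hypothetical names and hand-picked scores; it does not reproduce the signature or internals of interpret's predict_and_contrib.

```python
def toy_predict_and_contrib(intercept, term_lookups, rows):
    """Return (predictions, contributions) for a toy additive model.

    term_lookups[j] maps the value of feature j to its additive score.
    """
    predictions, contributions = [], []
    for row in rows:
        contrib = [lookup[value] for lookup, value in zip(term_lookups, row)]
        contributions.append(contrib)
        # The prediction is the intercept plus the sum of the contributions.
        predictions.append(intercept + sum(contrib))
    return predictions, contributions


# Two categorical features with hand-picked scores (illustrative only).
lookups = [{"a": 0.5, "b": -0.5}, {"x": 1.0, "y": 0.0}]
preds, contribs = toy_predict_and_contrib(0.25, lookups, [("a", "x"), ("b", "y")])
```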

Fixed

EBM uniform binning bug fixed where empty bins can raise exceptions.
Users can no longer include duplicate interaction terms for EBM.
CSS adjusted for inline rendering such that it does not interfere with its hosting environment.
JS dependencies updated.

Experimental

Ability to merge multiple EBM models into one. Found in interpret.glassbox.ebm.utils.

v0.2.4

3 years ago

v0.2.4 - 2021-01-19

Fixed

Bug fix on global EBM plots.
Rendering fix for AzureML notebooks.

Changed

JavaScript dependencies for inline renderers updated.

v0.2.3

3 years ago

v0.2.3 - 2021-01-13

Major upgrades to EBM in this release. Automatic interaction detection is now included by default. This will increase accuracy substantially in most cases. Numerous optimizations to support this, especially around binary classification. Expect similar or slightly slower training times due to interactions.

Fixed

Automated interaction detection uses low-resolution binning for both FAST and pairwise training.

Changed

The EBM default for outer_bags has been reduced from 16 to 8.
EBM now includes interactions by default: the default has changed from interactions=0 to interactions=10.
Algorithm treeinterpreter is now unstable due to upstream dependencies.
Automated interaction detection now runs in one pass instead of two.
Numeric approximations are now used in boosting (i.e. approx log / exp).
Some arguments have been re-ordered for EBM initialization.

v0.2.2

3 years ago

v0.2.2 - 2020-10-19

Fixed

Fixed bug on predicting unknown categories with EBM.
Fixed bug on max value being placed in its own bin for EBM pre-processing.
Numerous native fixes and optimizations.

Added

Added max_interaction_bins as argument to EBM learners for different sized bins on interactions, separate to mains.
New binning method 'quantile_humanized' for EBM.

Changed

Interactions in EBM now use their own pre-processing, separate to mains.
Python 3.5 no longer supported.
Switched from Python to native code for binning.
Switched from Python to native code for PRNG in EBM.

v0.2.1

3 years ago

v0.2.1 - 2020-08-07

Added

Python 3.8 support.

Changed

Dash-based visualizations now always try to listen on port 7001 first; if that attempt fails, a random port between 7002 and 7999 is tried.
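
The port fallback described above can be sketched as follows. The function names, the bind-based availability check, and the bounded number of random retries are assumptions for illustration; the release note does not say how many random ports are tried or how availability is tested.

```python
import random
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def choose_port(is_free=port_is_free, first=7001, low=7002, high=7999, attempts=20):
    """Try `first`, then random ports in [low, high] until one is free."""
    if is_free(first):
        return first
    # Assumption: retry a bounded number of times rather than forever.
    for _ in range(attempts):
        candidate = random.randint(low, high)
        if is_free(candidate):
            return candidate
    raise RuntimeError("no free port found")
```

Passing the availability check in as a parameter keeps the selection logic testable without opening real sockets.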

Experimental (WIP)

Further cloud environment support.
Improvements for multiclass EBM global graphs.

v0.2.0

3 years ago

v0.2.0 - 2020-07-21

Breaking Changes

With warning, EBM classifier adapts internal validation size when there are too few instances relative to number of unique classes. This ensures that there is at least one instance of each class in the validation set.
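
One way to picture this adjustment: a validation set can only hold one instance of every class if it has at least as many rows as there are classes. The sketch below raises the validation count to that minimum and emits a warning. The function name and the exact rule are assumptions for illustration, not the library's logic, and the split itself would still need to be stratified for the guarantee to hold.

```python
import math
import warnings


def adapt_validation_count(n_samples, n_classes, validation_size):
    """Return a validation-set size with room for one sample of every class."""
    requested = math.ceil(n_samples * validation_size)
    if requested >= n_classes:
        return requested
    # Too few validation rows for the number of classes: grow and warn.
    warnings.warn(
        f"validation set too small for {n_classes} classes; "
        f"using {n_classes} samples instead of {requested}"
    )
    return n_classes
```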

Cloud Jupyter environments now use a CDN to fix major rendering bugs and performance.

CDN currently used is https://unpkg.com

If you want to specify your own CDN, add the following as the top cell

from interpret import set_visualize_provider
from interpret.provider import InlineProvider
from interpret.version import __version__

# Change this to your custom CDN.
JS_URL = "https://unpkg.com/@interpretml/interpret-inline@{}/dist/interpret-inline.js".format(__version__)
set_visualize_provider(InlineProvider(js_url=JS_URL))

EBM has changed initialization parameters:

schema -> DROPPED
n_estimators -> outer_bags
holdout_size -> validation_size
scoring -> DROPPED
holdout_split -> DROPPED
main_attr -> mains
data_n_episodes -> max_rounds
early_stopping_run_length -> early_stopping_rounds
feature_step_n_inner_bags -> inner_bags
training_step_episodes -> DROPPED
max_tree_splits -> max_leaves
min_cases_for_splits -> DROPPED
min_samples_leaf -> ADDED (Minimum number of samples that are in a leaf)
binning_strategy -> binning
max_n_bins -> max_bins

EBM has changed public attributes:

n_estimators -> outer_bags
holdout_size -> validation_size
scoring -> DROPPED
holdout_split -> DROPPED
main_attr -> mains
data_n_episodes -> max_rounds
early_stopping_run_length -> early_stopping_rounds
feature_step_n_inner_bags -> inner_bags
training_step_episodes -> DROPPED
max_tree_splits -> max_leaves
min_cases_for_splits -> DROPPED
min_samples_leaf -> ADDED (Minimum number of samples that are in a leaf)
binning_strategy -> binning
max_n_bins -> max_bins

attribute_sets_ -> feature_groups_
attribute_set_models_ -> additive_terms_ (Pairs are now transposed)
model_errors_ -> term_standard_deviations_

main_episode_idxs_ -> breakpoint_iteration_[0]
inter_episode_idxs_ -> breakpoint_iteration_[1]

mean_abs_scores_ -> feature_importances_

Fixed

Internal fixes and refactor for native code.
Updated dependencies for JavaScript layer.
Fixed rendering bugs and performance issues around cloud Jupyter notebooks.
Logging flushing bug fixed.
Labels that are shaped as nx1 matrices now automatically transform to vectors for training.
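
The last fix concerns labels passed as an n x 1 column instead of a flat vector of length n. A minimal pure-Python sketch of that conversion is shown below; the function name is hypothetical, and the library itself presumably works on arrays rather than nested lists.

```python
def as_label_vector(labels):
    """Flatten an n x 1 nested list of labels into a flat list of n labels."""
    flat = []
    for item in labels:
        if isinstance(item, (list, tuple)):
            if len(item) != 1:
                raise ValueError("expected exactly one label per row")
            flat.append(item[0])
        else:
            flat.append(item)  # already a scalar label
    return flat


column = [[0], [1], [1]]          # shape n x 1
vector = as_label_vector(column)  # flat list of n labels
```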

Experimental (WIP)

Added support for AzureML notebook VM.
Added local explanation visualizations for multiclass EBM.