Mlxtend Versions Save

A library of extension and helper modules for Python's data analysis and machine learning libraries.

v0.23.1

4 months ago

Version 0.23.1 (5 Jan 2024)

Changes

v0.23.0

7 months ago
Downloads
Changes
New Features and Enhancements
  • Document how to use SequentialFeatureSelector and multiclass ROC AUC.

v0.22.0

1 year ago
Changes
New Features and Enhancements

v0.21.0

1 year ago
New Features and Enhancements
  • The mlxtend.evaluate.feature_importance_permutation function has a new feature_groups argument to treat user-specified feature groups as single features, which is useful for one-hot encoded features. (#955)
  • The mlxtend.feature_selection.ExhaustiveFeatureSelector and SequentialFeatureSelector also gained support for feature_groups with a behavior similar to the one described above. (#957 and #965 via Nima Sarajpoor)
Changes
  • The custom_feature_names parameter was removed from the ExhaustiveFeatureSelector due to redundancy and to simplify the code base. The ExhaustiveFeatureSelector documentation illustrates how the same behavior and outcome can be achieved using pandas DataFrames. (#957)
Bug Fixes
  • None

v0.20.0

1 year ago

New Features and Enhancements

Downloads
New Features and Enhancements
  • The mlxtend.evaluate.bootstrap_point632_score now supports fit_params. (#861)
  • The mlxtend/plotting/decision_regions.py function now has a contourf_kwargs for matplotlib to change the look of the decision boundaries if desired. (#881 via [pbloem])
  • Add a norm_colormap parameter to mlxtend.plotting.plot_confusion_matrix, to allow normalizing the colormap, e.g., using matplotlib.colors.LogNorm() (#895)
  • Add new GroupTimeSeriesSplit class for evaluation in time series tasks with support of custom groups and additional parameters in comparison with scikit-learn's TimeSeriesSplit. (#915 via Dmitry Labazkin)
Changes
  • Due to compatibility issues with newer package versions, certain functions from six.py have been removed so that mlxtend may not work anymore with Python 2.7.
  • As an internal change to speed up unit testing, unit testing is now faciliated by GitHub workflows, and Travis CI and Appveyor hooks have been removed.
  • Improved axis label rotation in mlxtend.plotting.heatmap and mlxtend.plotting.plot_confusion_matrix (#872)
  • Fix various typos in McNemar guides.
  • Raises a warning if non-bool arrays are used in the frequent pattern functions apriori, fpmax, and fpgrowth. (#934 via NimaSarajpoor)
Bug Fixes
  • Fix unreadable labels in heatmap for certain colormaps. (#852)
  • Fix an issue in mlxtend.plotting.plot_confusion_matrix when string class names are passed (#894)

v0.19.0

2 years ago

Version 0.19.0 (09/02/2021)

New Features
  • Adds a second "balanced accuracy" interpretation ("balanced") to evaluate.accuracy_score in addition to the existing "average" option to compute the scikit-learn-style balanced accuracy. (#764)
  • Adds new scatter_hist function to mlxtend.plotting for generating a scattered histogram. (#757 via Maitreyee Mhasaka)
  • The evaluate.permutation_test function now accepts a paired argument to specify to support paired permutation/randomization tests. (#768)
  • The StackingCVRegressor now also supports multi-dimensional targets similar to StackingRegressor via StackingCVRegressor(..., multi_output=True). (#802 via Marco Tiraboschi)
Changes
  • Updates unit tests for scikit-learn 0.24.1 compatibility. (#774)
  • StackingRegressor now requires setting StackingRegressor(..., multi_output=True) if the target is multi-dimensional; this allows for better input validation. (#802)
  • Removes deprecated res argument from plot_decision_regions. (#803)
  • Adds a title_fontsize parameter to plot_learning_curves for controlling the title font size; also the plot style is now the matplotlib default. (#818)
  • Internal change using 'c': 'none' instead of 'c': '' in mlxtend.plotting.plot_decision_regions's scatterplot highlights to stay compatible with Matplotlib 3.4 and newer. (#822)
  • Adds a fontcolor_threshold parameter to the mlxtend.plotting.plot_confusion_matrix function as an additional option for determining the font color cut-off manually. (#827)
  • The frequent_patterns.association_rules now raises a ValueError if an empty frequent itemset DataFrame is passed. (#843)
  • The .632 and .632+ bootstrap method implemented in the mlxtend.evaluate.bootstrap_point632_score function now use the whole training set for the resubstitution weighting term instead of the internal training set that is a new bootstrap sample in each round. (#844)
Bug Fixes

0.18.0

3 years ago
New Features
  • The bias_variance_decomp function now supports optional fit_params for the estimators that are fit on bootstrap samples. (#748)
  • The bias_variance_decomp function now supports Keras estimators. (#725 via @hanzigs)
  • Adds new mlxtend.classifier.OneRClassifier (One Rule Classfier) class, a simple rule-based classifier that is often used as a performance baseline or simple interpretable model. (#726
  • Adds new create_counterfactual method for creating counterfactuals to explain model predictions. (#740)
Changes
  • permutation_test (mlxtend.evaluate.permutation) ìs corrected to give the proportion of permutations whose statistic is at least as extreme as the one observed. (#721 via Florian Charlier)
  • Fixes the McNemar confusion matrix layout to match the convention (and documentation), swapping the upper left and lower right cells. (#744 via mmarius)
Bug Fixes
  • The loss in LogisticRegression for logging purposes didn't include the L2 penalty for the first weight in the weight vector (this is not the bias unit). However, since this loss function was only used for logging purposes, and the gradient remains correct, this does not have an effect on the main code. (#741)
  • Fixes a bug in bias_variance_decomp where when the mse loss was used, downcasting to integers caused imprecise results for small numbers. (#749)

0.17.3

3 years ago
New Features
  • Add predict_proba kwarg to bootstrap methods, to allow bootstrapping of scoring functions that take in probability values. (#700 via Adam Li)
  • Add a cell_values parameter to mlxtend.plotting.heatmap() to optionally suppress cell annotations by setting cell_values=False. (#703
Changes
  • Implemented both use_clones and fit_base_estimators (previously refit in EnsembleVoteClassifier) for EnsembleVoteClassifier and StackingClassifier. (#670 via Katrina Ni)
  • Switched to using raw strings for regex in mlxtend.text to prevent deprecation warning in Python 3.8 (#688)
  • Slice data in sequential forward selection before sending to parallel backend, reducing memory consumption.
Bug Fixes
  • Fixes axis DeprecationWarning in matplotlib v3.1.0 and newer. (#673)
  • Fixes an issue with using meshgrid in no_information_rate function used by the bootstrap_point632_score function for the .632+ estimate. (#688)
  • Fixes an issue in fpmax that could lead to incorrect support values. (#692 via Steve Harenberg)

v0.17.2

4 years ago
New Features
Changes
Bug Fixes

v0.17.1

4 years ago
New Features
  • The SequentialFeatureSelector now supports using pre-specified feature sets via the fixed_features parameter. (#578)
  • Adds a new accuracy_score function to mlxtend.evaluate for computing basic classifcation accuracy, per-class accuracy, and average per-class accuracy. (#624 via Deepan Das)
  • StackingClassifier and StackingCVClassifiernow have a decision_function method, which serves as a preferred choice over predict_proba in calculating roc_auc and average_precision scores when the meta estimator is a linear model or support vector classifier. (#634 via Qiang Gu)
Changes
  • Improve the runtime performance for the apriori frequent itemset generating function when low_memory=True. Setting low_memory=False (default) is still faster for small itemsets, but low_memory=True can be much faster for large itemsets and requires less memory. Also, input validation for apriori, ̀ fpgrowthandfpmaxtakes a significant amount of time when input pandas DataFrame is large; this is now dramatically reduced when input contains boolean values (and not zeros/ones), which is the case when usingTransactionEncoder`. (#619 via Denis Barbier)
  • Add support for newer sparse pandas DataFrame for frequent itemset algorithms. Also, input validation for apriori, ̀ fpgrowthandfpmax` runs much faster on sparse DataFrame when input pandas DataFrame contains integer values. (#621 via Denis Barbier)
  • Let fpgrowth and fpmax directly work on sparse DataFrame, they were previously converted into dense Numpy arrays. (#622 via Denis Barbier)
Bug Fixes
  • Fixes a bug in mlxtend.plotting.plot_pca_correlation_graph that caused the explaind variances not summing up to 1. Also, improves the runtime performance of the correlation computation and adds a missing function argument for the explained variances (eigenvalues) if users provide their own principal components. (#593 via Gabriel Azevedo Ferreira)
  • Behavior of fpgrowth and apriori consistent for edgecases such as min_support=0. (#573 via Steve Harenberg)
  • fpmax returns an empty data frame now instead of raising an error if the frequent itemset set is empty. (#573 via Steve Harenberg)
  • Fixes and issue in mlxtend.plotting.plot_confusion_matrix, where the font-color choice for medium-dark cells was not ideal and hard to read. #588 via sohrabtowfighi)
  • The svd mode of mlxtend.feature_extraction.PrincipalComponentAnalysis now also n-1 degrees of freedom instead of n d.o.f. when computing the eigenvalues to match the behavior of eigen. #595
  • Disable input validation for StackingCVClassifier because it causes issues if pipelines are used as input. #606