Mlxtend Versions

A library of extension and helper modules for Python's data analysis and machine learning libraries.

v0.23.1

4 months ago

Version 0.23.1 (5 Jan 2024)

Changes

Updated dependency on distutils for Python 3.12 and above ([#1072](https://github.com/rasbt/mlxtend/issues/1072) via [peanutsee](https://github.com/peanutsee))

v0.23.0

7 months ago

Changes

Address NumPy deprecations to make mlxtend compatible with NumPy 1.24
Changed the signature of scikit-learn's LinearRegression model in the tests, removing the normalize parameter because it is deprecated. ([#1036](https://github.com/rasbt/mlxtend/issues/1036))
Add pyproject.toml to support PEP 518 builds ([#1065](https://github.com/rasbt/mlxtend/issues/1065) via [jmahlik](https://github.com/jmahlik))
Fixed installation from sdist failing ([#1065](https://github.com/rasbt/mlxtend/issues/1065) via [jmahlik](https://github.com/jmahlik))
Converted configuration to pyproject.toml ([#1065](https://github.com/rasbt/mlxtend/issues/1065) via [jmahlik](https://github.com/jmahlik))
Remove mlxtend.image submodule with face recognition functions due to poor dlib support in modern environments.

New Features and Enhancements

Document how to use SequentialFeatureSelector and multiclass ROC AUC.

v0.22.0

1 year ago

Changes

When ExhaustiveFeatureSelector is run with n_jobs == 1, joblib is now disabled, which enables more immediate (live) feedback when the verbose mode is enabled. (#985 via Nima Sarajpoor)
Disabled unnecessary warning in EnsembleVoteClassifier (#941)
Fixed various documentation issues (#849 and #951 via Lekshmanan Natarajan)
Fixed "Edit on GitHub" button (#1024)

New Features and Enhancements

The mlxtend.frequent_patterns.association_rules function has a new metric - Zhang's Metric, which measures both association and dissociation. (#980)
Internal mlxtend.frequent_patterns.fpmax code improvement that avoids casting a sparse DataFrame into a dense NumPy array. (#1000 via Tim Kellogg)
The plot_decision_regions function now has an n_jobs parameter to parallelize the computation. In one particular use case on a small dataset, there was a 21x speed-up (449 seconds vs. 21 seconds on a local HPC instance with 36 cores). (#998 via Khalid ElHaj)
Added mlxtend.frequent_patterns.hmine algorithm and documentation for mining frequent itemsets using the H-Mine algorithm. (#1020 via Fatih Sen)
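
To make the Zhang's metric entry above more concrete, here is a minimal sketch that computes the metric for a rule A -> C directly from support values. The function name and argument names are hypothetical and this is not the mlxtend implementation; it only follows the commonly cited definition of the metric, and returning 0 for a zero denominator is an assumption made for this sketch.

```python
def zhangs_metric(support_a, support_c, support_ac):
    """Zhang's metric for a rule A -> C (hypothetical helper, not mlxtend's API).

    support_a:  fraction of transactions containing the antecedent A
    support_c:  fraction of transactions containing the consequent C
    support_ac: fraction of transactions containing both A and C
    """
    numerator = support_ac - support_a * support_c
    denominator = max(
        support_ac * (1.0 - support_a),
        support_a * (support_c - support_ac),
    )
    # Assumption for this sketch: degenerate rules with a zero denominator score 0.
    if denominator == 0:
        return 0.0
    return numerator / denominator
```

Values near +1 indicate association, values near -1 indicate dissociation, and 0 corresponds to statistical independence of A and C.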

v0.21.0

1 year ago

New Features and Enhancements

The mlxtend.evaluate.feature_importance_permutation function has a new feature_groups argument to treat user-specified feature groups as single features, which is useful for one-hot encoded features. (#955)
The mlxtend.feature_selection.ExhaustiveFeatureSelector and SequentialFeatureSelector also gained support for feature_groups with a behavior similar to the one described above. (#957 and #965 via Nima Sarajpoor)
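
The idea behind feature_groups can be sketched in plain Python: all columns of a group are shuffled with the same row permutation, so a one-hot encoded variable is treated as a single feature. The helper below is a hypothetical illustration with its own signature and an accuracy-based score; it is not the mlxtend function.

```python
import random


def grouped_permutation_importance(predict, X, y, feature_groups,
                                   num_rounds=10, seed=0):
    """Mean accuracy drop when each group of columns is permuted together.

    predict: callable mapping one row (list of values) to a predicted label
    X: list of rows, y: list of labels
    feature_groups: list of lists of column indices, e.g. [[0, 1], [2]]
    """
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for group in feature_groups:
        drops = []
        for _ in range(num_rounds):
            order = list(range(len(X)))
            rng.shuffle(order)
            shuffled = [list(row) for row in X]
            # Use the same source row for every column in the group,
            # so the columns of the group stay consistent with each other.
            for i, src in enumerate(order):
                for col in group:
                    shuffled[i][col] = X[src][col]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / num_rounds)
    return importances
```

Permuting the columns of a one-hot group independently would create rows with zero or several active categories; moving them together avoids that.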

Changes

The custom_feature_names parameter was removed from the ExhaustiveFeatureSelector due to redundancy and to simplify the code base. The ExhaustiveFeatureSelector documentation illustrates how the same behavior and outcome can be achieved using pandas DataFrames. (#957)

Bug Fixes

None

v0.20.0

1 year ago

New Features and Enhancements

The mlxtend.evaluate.bootstrap_point632_score now supports fit_params. (#861)
The plot_decision_regions function in mlxtend/plotting/decision_regions.py now has a contourf_kwargs parameter that is passed to matplotlib to change the look of the decision boundaries if desired. (#881 via [pbloem])
Add a norm_colormap parameter to mlxtend.plotting.plot_confusion_matrix, to allow normalizing the colormap, e.g., using matplotlib.colors.LogNorm() (#895)
Add new GroupTimeSeriesSplit class for evaluation in time series tasks with support of custom groups and additional parameters in comparison with scikit-learn's TimeSeriesSplit. (#915 via Dmitry Labazkin)

Changes

Due to compatibility issues with newer package versions, certain functions from six.py have been removed, so mlxtend may no longer work with Python 2.7.
As an internal change to speed up unit testing, unit testing is now facilitated by GitHub workflows, and Travis CI and Appveyor hooks have been removed.
Improved axis label rotation in mlxtend.plotting.heatmap and mlxtend.plotting.plot_confusion_matrix (#872)
Fix various typos in McNemar guides.
Raises a warning if non-bool arrays are used in the frequent pattern functions apriori, fpmax, and fpgrowth. (#934 via NimaSarajpoor)

Bug Fixes

Fix unreadable labels in heatmap for certain colormaps. (#852)
Fix an issue in mlxtend.plotting.plot_confusion_matrix when string class names are passed (#894)

v0.19.0

2 years ago

Version 0.19.0 (09/02/2021)

New Features

Adds a second "balanced accuracy" interpretation ("balanced") to evaluate.accuracy_score in addition to the existing "average" option to compute the scikit-learn-style balanced accuracy. (#764)
Adds new scatter_hist function to mlxtend.plotting for generating a scattered histogram. (#757 via Maitreyee Mhasaka)
The evaluate.permutation_test function now accepts a paired argument to support paired permutation/randomization tests. (#768)
The StackingCVRegressor now also supports multi-dimensional targets similar to StackingRegressor via StackingCVRegressor(..., multi_output=True). (#802 via Marco Tiraboschi)
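
The two accuracy interpretations mentioned in the accuracy_score entry above can be contrasted with a small sketch. Both helper names are hypothetical. The "balanced" variant is the mean per-class recall, as in scikit-learn; the "average" variant is written here under the assumption that it averages one-vs-rest binary accuracies per class.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (scikit-learn-style balanced accuracy)."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def average_per_class_accuracy(y_true, y_pred):
    """Mean of one-vs-rest binary accuracies (assumed 'average' interpretation)."""
    accuracies = []
    for c in sorted(set(y_true)):
        agree = sum((t == c) == (p == c) for t, p in zip(y_true, y_pred))
        accuracies.append(agree / len(y_true))
    return sum(accuracies) / len(accuracies)
```

The one-vs-rest average also rewards correct "not this class" decisions, so it tends to be higher than the recall-based value on the same predictions.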

Changes

Updates unit tests for scikit-learn 0.24.1 compatibility. (#774)
StackingRegressor now requires setting StackingRegressor(..., multi_output=True) if the target is multi-dimensional; this allows for better input validation. (#802)
Removes deprecated res argument from plot_decision_regions. (#803)
Adds a title_fontsize parameter to plot_learning_curves for controlling the title font size; also the plot style is now the matplotlib default. (#818)
Internal change using 'c': 'none' instead of 'c': '' in mlxtend.plotting.plot_decision_regions's scatterplot highlights to stay compatible with Matplotlib 3.4 and newer. (#822)
Adds a fontcolor_threshold parameter to the mlxtend.plotting.plot_confusion_matrix function as an additional option for determining the font color cut-off manually. (#827)
The frequent_patterns.association_rules now raises a ValueError if an empty frequent itemset DataFrame is passed. (#843)
The .632 and .632+ bootstrap method implemented in the mlxtend.evaluate.bootstrap_point632_score function now use the whole training set for the resubstitution weighting term instead of the internal training set that is a new bootstrap sample in each round. (#844)
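
As a reminder of how the resubstitution term enters the .632 estimate discussed above, the following sketch combines per-round out-of-bag scores with per-round resubstitution scores using the standard 0.632 / 0.368 weighting. The function name is hypothetical and the .632+ correction is deliberately left out.

```python
def point632_estimate(oob_scores, resubstitution_scores):
    """Average the .632-weighted score over bootstrap rounds.

    oob_scores: score on the out-of-bag samples, one value per round
    resubstitution_scores: score on the training data, one value per round
    """
    weight = 0.632
    per_round = [
        weight * oob + (1.0 - weight) * resub
        for oob, resub in zip(oob_scores, resubstitution_scores)
    ]
    return sum(per_round) / len(per_round)
```

Under the change described above, each resubstitution score would be computed on the whole training set rather than on that round's bootstrap sample.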

Bug Fixes

Fixes a typo in the SequentialFeatureSelector documentation (#835 via João Pedro Zanlorensi Cardoso)

0.18.0

3 years ago

New Features

The bias_variance_decomp function now supports optional fit_params for the estimators that are fit on bootstrap samples. (#748)
The bias_variance_decomp function now supports Keras estimators. (#725 via @hanzigs)
Adds new mlxtend.classifier.OneRClassifier (One Rule Classifier) class, a simple rule-based classifier that is often used as a performance baseline or simple interpretable model. (#726)
Adds new create_counterfactual method for creating counterfactuals to explain model predictions. (#740)

Changes

permutation_test (mlxtend.evaluate.permutation) is corrected to give the proportion of permutations whose statistic is at least as extreme as the one observed. (#721 via Florian Charlier)
Fixes the McNemar confusion matrix layout to match the convention (and documentation), swapping the upper left and lower right cells. (#744 via mmarius)
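
The corrected p-value definition for permutation_test ("at least as extreme") can be illustrated with an exact two-sample test on tiny inputs. This is a hypothetical sketch using the absolute difference of means as the statistic; it is not the mlxtend code.

```python
from itertools import combinations


def exact_permutation_p_value(x, y):
    """Share of all group assignments whose |mean difference| is at
    least as large as the observed one (two-sided, exact enumeration)."""
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(pooled) - n_x
    total = sum(pooled)

    def abs_mean_diff(sum_x):
        return abs(sum_x / n_x - (total - sum_x) / n_y)

    observed = abs_mean_diff(sum(x))
    at_least_as_extreme = 0
    n_permutations = 0
    for idx in combinations(range(len(pooled)), n_x):
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        # ">=" rather than ">" so the observed arrangement itself counts;
        # the small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            at_least_as_extreme += 1
        n_permutations += 1
    return at_least_as_extreme / n_permutations
```

For x = [1, 2, 3] and y = [4, 5, 6], only 2 of the 20 possible assignments are as extreme as the observed one, so the p-value is 0.1; a strict ">" comparison would incorrectly report 0.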

Bug Fixes

The loss in LogisticRegression for logging purposes didn't include the L2 penalty for the first weight in the weight vector (this is not the bias unit). However, since this loss function was only used for logging purposes, and the gradient remains correct, this does not have an effect on the main code. (#741)
Fixes a bug in bias_variance_decomp where when the mse loss was used, downcasting to integers caused imprecise results for small numbers. (#749)
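
For the LogisticRegression logging-loss fix above, the sketch below shows a cross-entropy loss with an L2 penalty that covers every weight but not the bias unit. The function name and the lambda/2 scaling of the penalty are assumptions made for illustration, not a copy of the mlxtend code.

```python
import math


def logistic_loss_with_l2(weights, bias, X, y, l2_lambda):
    """Summed cross-entropy plus an L2 penalty on all weights (bias excluded)."""
    loss = 0.0
    for row, target in zip(X, y):
        z = bias + sum(w * v for w, v in zip(weights, row))
        p = 1.0 / (1.0 + math.exp(-z))
        loss -= target * math.log(p) + (1 - target) * math.log(1.0 - p)
    # The penalty includes the first weight; only the bias is left out.
    penalty = 0.5 * l2_lambda * sum(w * w for w in weights)
    return loss + penalty
```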

0.17.3

3 years ago

New Features

Add predict_proba kwarg to bootstrap methods, to allow bootstrapping of scoring functions that take in probability values. (#700 via Adam Li)
Add a cell_values parameter to mlxtend.plotting.heatmap() to optionally suppress cell annotations by setting cell_values=False. (#703)

Changes

Implemented both use_clones and fit_base_estimators (previously refit in EnsembleVoteClassifier) for EnsembleVoteClassifier and StackingClassifier. (#670 via Katrina Ni)
Switched to using raw strings for regex in mlxtend.text to prevent deprecation warning in Python 3.8 (#688)
Slice data in sequential forward selection before sending to parallel backend, reducing memory consumption.

Bug Fixes

Fixes axis DeprecationWarning in matplotlib v3.1.0 and newer. (#673)
Fixes an issue with using meshgrid in no_information_rate function used by the bootstrap_point632_score function for the .632+ estimate. (#688)
Fixes an issue in fpmax that could lead to incorrect support values. (#692 via Steve Harenberg)

v0.17.2

4 years ago

New Features

Changes

The previously deprecated OnehotTransactions has been removed in favor of the TransactionEncoder.
Removed SparseDataFrame support in frequent pattern mining functions in favor of the new way of working with sparse data in pandas >=1.0. If you used SparseDataFrame formats, please see pandas' migration guide at https://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating (#667)

Bug Fixes

v0.17.1

4 years ago

New Features

The SequentialFeatureSelector now supports using pre-specified feature sets via the fixed_features parameter. (#578)
Adds a new accuracy_score function to mlxtend.evaluate for computing basic classification accuracy, per-class accuracy, and average per-class accuracy. (#624 via Deepan Das)
StackingClassifier and StackingCVClassifier now have a decision_function method, which serves as a preferred choice over predict_proba in calculating roc_auc and average_precision scores when the meta estimator is a linear model or support vector classifier. (#634 via Qiang Gu)

Changes

Improve the runtime performance for the apriori frequent itemset generating function when low_memory=True. Setting low_memory=False (default) is still faster for small itemsets, but low_memory=True can be much faster for large itemsets and requires less memory. Also, input validation for apriori, fpgrowth, and fpmax takes a significant amount of time when the input pandas DataFrame is large; this is now dramatically reduced when the input contains boolean values (and not zeros/ones), which is the case when using TransactionEncoder. (#619 via Denis Barbier)
Add support for newer sparse pandas DataFrame for frequent itemset algorithms. Also, input validation for apriori, fpgrowth, and fpmax runs much faster on sparse DataFrames when the input pandas DataFrame contains integer values. (#621 via Denis Barbier)
Let fpgrowth and fpmax work directly on sparse DataFrames; they were previously converted into dense NumPy arrays. (#622 via Denis Barbier)
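
The boolean input format referred to above can be pictured with a small pure-Python sketch: each transaction becomes a row of True/False flags, one per item, with items in sorted order. The helper names are hypothetical; this only mimics the kind of one-hot boolean table that the frequent pattern functions expect and is not TransactionEncoder itself.

```python
def encode_transactions(transactions):
    """Return (sorted item names, boolean rows) for a list of transactions."""
    columns = sorted({item for t in transactions for item in t})
    rows = [[item in set(t) for item in columns] for t in transactions]
    return columns, rows


def support(columns, rows, itemset):
    """Fraction of rows in which every item of `itemset` is present."""
    idx = [columns.index(item) for item in itemset]
    return sum(all(row[i] for i in idx) for row in rows) / len(rows)
```

Because the cells are already booleans, a consumer does not need to check that every value is 0 or 1, which is the validation cost the entry above describes.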

Bug Fixes

Fixes a bug in mlxtend.plotting.plot_pca_correlation_graph that caused the explained variances not to sum up to 1. Also, improves the runtime performance of the correlation computation and adds a missing function argument for the explained variances (eigenvalues) if users provide their own principal components. (#593 via Gabriel Azevedo Ferreira)
The behavior of fpgrowth and apriori is now consistent for edge cases such as min_support=0. (#573 via Steve Harenberg)
fpmax returns an empty data frame now instead of raising an error if the frequent itemset set is empty. (#573 via Steve Harenberg)
Fixes an issue in mlxtend.plotting.plot_confusion_matrix, where the font-color choice for medium-dark cells was not ideal and hard to read. (#588 via sohrabtowfighi)
The svd mode of mlxtend.feature_extraction.PrincipalComponentAnalysis now also uses n-1 degrees of freedom instead of n d.o.f. when computing the eigenvalues to match the behavior of eigen. (#595)
Disable input validation for StackingCVClassifier because it causes issues if pipelines are used as input. (#606)