modAL Versions

A modular active learning framework for Python

0.4.2

11 months ago

0.4.1

3 years ago

Release notes

This release includes a fix for a new feature added in 0.4.0.

Fixes

  • #108: when the data transformation is learned, the transformed data cannot be cached and has to be re-calculated every time a query is made. This was fixed by @BoyanH in #113.

0.4.0

3 years ago

Release notes

modAL 0.4.0 is finally here! This new release is made possible by the contributions of @BoyanH, @damienlancry, and @OskarLiew, many thanks to them!

New features

  • pandas.DataFrame support, thanks to @BoyanH! This was a frequently requested feature which I was unable to properly implement, but @BoyanH has found a solution for this in #105.
  • Support for scikit-learn pipelines, also by @BoyanH. Now learners support querying on the transformed data by setting on_transformed=True upon initialization.
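
A minimal sketch of the new option, assuming a scikit-learn Pipeline with a learned scaler (the toy data and the particular pipeline below are only illustrative):

    # Sketch of querying on transformed data, assuming modAL >= 0.4.0 and scikit-learn.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from modAL.models import ActiveLearner

    X_pool = np.random.rand(100, 5)
    y_initial = np.array([0, 1] * 5)

    pipeline = Pipeline([
        ('scaler', StandardScaler()),   # learned transformation
        ('clf', LogisticRegression()),
    ])

    learner = ActiveLearner(
        estimator=pipeline,
        X_training=X_pool[:10],
        y_training=y_initial,
        on_transformed=True,            # query strategies see the scaled features
    )

    query_idx, query_inst = learner.query(X_pool)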

Changes

  • Query strategies should no longer return the selected instances, only the indices for the queried objects. (See #104 by @BoyanH.)
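
For illustration, a hypothetical strategy written against the new convention might look like the sketch below (the name random_sampling and its exact signature are assumptions, not part of modAL):

    # Hypothetical query strategy following the 0.4.0 convention: it returns only
    # the indices of the queried instances; the learner retrieves the rows itself.
    import numpy as np

    def random_sampling(learner, X_pool, n_instances=1):
        # indices only -- no need to return X_pool[query_idx] anymore
        return np.random.choice(X_pool.shape[0], size=n_instances, replace=False)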

Fixes

  • Committee now sets the known classes when fitting, which resolves the error that occurred when no training data was provided during initialization. This fix was contributed in #100 by @OskarLiew, thanks for that! (See the sketch after this list.)
  • Some typos in the ranked batch mode sampling example were fixed by @damienlancry.
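
A minimal sketch of the pattern that this fix enables, with arbitrary estimators and toy data:

    # A Committee whose members receive no training data at initialization;
    # the known classes are picked up on the first fit/teach.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from modAL.models import ActiveLearner, Committee

    committee = Committee(learner_list=[
        ActiveLearner(estimator=LogisticRegression()),
        ActiveLearner(estimator=RandomForestClassifier(n_estimators=10)),
    ])

    X_initial = np.random.rand(10, 3)
    y_initial = np.array([0, 1] * 5)
    committee.teach(X_initial, y_initial)   # classes are set here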

0.3.6

3 years ago

Fixes

  • Updating of known classes for Committee.teach() (#63)

0.3.5

4 years ago

Changes

  • ActiveLearner now supports np.nan and np.inf in the data by setting force_all_finite=False upon initialization (see the sketch after this list). #58
  • Bayesian optimization fixed for multidimensional functions.
  • Calls to check_X_y no longer convert between datatypes. #49
  • Expected error reduction implementation error fixed. #45
  • modAL.utils.data_vstack now falls back to numpy.concatenate if possible.
  • Multidimensional data for ranked batch sampling and expected error reduction fixed. #41
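
A minimal sketch of the force_all_finite=False option, assuming an estimator that tolerates missing values (HistGradientBoostingClassifier from scikit-learn >= 1.0 here; the toy data is illustrative):

    import numpy as np
    from sklearn.ensemble import HistGradientBoostingClassifier
    from modAL.models import ActiveLearner

    X = np.array([[1.0, 2.0], [np.nan, 3.0], [4.0, 5.0],
                  [np.nan, 1.0], [2.0, np.nan], [3.0, 0.0]])
    y = np.array([0, 1, 0, 1, 0, 1])

    learner = ActiveLearner(
        estimator=HistGradientBoostingClassifier(),
        X_training=X,
        y_training=y,
        force_all_finite=False,   # skip the finiteness check on the data
    )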

Fixes by @zhangyu94:

  • modAL.selection.shuffled_argmax #32
  • Cold start instance in modAL.batch.ranked_batch fixed. #30
  • Best instance index in modAL.batch.select_instance fixed. #29

0.3.4

5 years ago

New features

  • To handle the case when the maximum utility score is not unique, a random tie break option was introduced. From this version, passing random_tie_break=True to a query strategy first shuffles the pool and then uses a stable sort to find the instances to query. When the maximum utility score is not unique, this is equivalent to sampling randomly from the top-scoring instances.
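
A minimal sketch of turning on the random tie break, assuming uncertainty sampling with an arbitrary estimator and toy data:

    from functools import partial
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from modAL.models import ActiveLearner
    from modAL.uncertainty import uncertainty_sampling

    X_pool = np.random.rand(100, 4)
    y_initial = np.array([0, 1] * 5)

    learner = ActiveLearner(
        estimator=RandomForestClassifier(n_estimators=10),
        query_strategy=partial(uncertainty_sampling, random_tie_break=True),
        X_training=X_pool[:10],
        y_training=y_initial,
    )

    # ties between equally uncertain instances are now broken at random
    query_idx, query_inst = learner.query(X_pool)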

Changes

  • modAL.expected_error.expected_error_reduction runtime improved by omitting unnecessary cloning of the estimator for every instance in the pool.

0.3.3

5 years ago

New features

In this small release, the expected error and log loss reduction algorithms (Roy and McCallum, 2001) were added.
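
A minimal sketch of plugging expected error reduction in as the query strategy (the estimator and the toy data are arbitrary; note that the strategy is computationally heavy, since it evaluates candidate labelings over the pool):

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from modAL.models import ActiveLearner
    from modAL.expected_error import expected_error_reduction

    X_pool = np.random.rand(50, 2)
    y_initial = np.array([0, 1] * 5)

    learner = ActiveLearner(
        estimator=GaussianNB(),
        query_strategy=expected_error_reduction,
        X_training=X_pool[:10],
        y_training=y_initial,
    )

    query_idx, query_inst = learner.query(X_pool)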

0.3.2

5 years ago

New features

In this release, the focus was on multilabel active learning; several new multilabel query strategies were added.

0.3.1

5 years ago

Release notes

The new release of modAL is here! This is a milestone in its evolution, because it has just received its first contributions from the open source community! :) Thanks to @dataframing and @nikolay-bushkov for their work! Hoping to see many more contributions from the community, because modAL still has a long way to go! :)

New features

  • Ranked batch mode queries by @dataframing. With this query strategy, several instances can be queried for labeling at once, which alleviates many of the problems of plain uncertainty sampling (see the sketch after this list). For details, see Ranked batch mode learning by Cardoso et al.
  • Sparse matrix support by @nikolay-bushkov. From now on, if the estimator can handle sparse matrices, you can use them to fit the active learning models!
  • Cold start support has been added to all the models. This means that learner.query() can now be used without training the model first.
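
A minimal sketch combining the two features above: ranked batch mode sampling on a learner that has not been trained yet (the estimator, the batch size, and the toy data are arbitrary choices):

    from functools import partial
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from modAL.models import ActiveLearner
    from modAL.batch import uncertainty_batch_sampling

    X_pool = np.random.rand(200, 4)

    # thanks to cold start support, the learner can be created without
    # training data and queried right away
    learner = ActiveLearner(
        estimator=KNeighborsClassifier(n_neighbors=3),
        query_strategy=partial(uncertainty_batch_sampling, n_instances=5),
    )

    query_idx, query_inst = learner.query(X_pool)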

Changes

  • The documentation has undergone a major refactoring thanks to @nikolay-bushkov! Type annotations have been added and the docstrings were rewritten to follow the Google docstring style. The website has changed accordingly: instead of GitHub Pages, Read the Docs is now used, and the old website has been merged with the API reference. As for the examples, Jupyter notebooks were added by @dataframing. For details, check it out at https://modAL-python.github.io/!
  • The .query() methods of BaseLearner and BaseCommittee were changed to allow more general arguments for query strategies. They now accept any arguments as long as the query_strategy function supports them (see the sketch after this list).
  • .score() method was added for Committee. Fixes #6.
  • The modAL.density module was refactored using functions from sklearn.metrics.pairwise. This resulted in a major increase in performance as well as a more sustainable codebase for the module.
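
A minimal sketch of the more general .query() call, forwarding n_instances to uncertainty_sampling (toy data only):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from modAL.models import ActiveLearner
    from modAL.uncertainty import uncertainty_sampling

    X_pool = np.random.rand(50, 3)
    y_initial = np.array([0, 1] * 5)

    learner = ActiveLearner(
        estimator=LogisticRegression(),
        query_strategy=uncertainty_sampling,
        X_training=X_pool[:10],
        y_training=y_initial,
    )

    # any keyword accepted by the query strategy can be given here
    query_idx, query_inst = learner.query(X_pool, n_instances=3)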

Bugfixes

  • 1D array handling issues fixed; numpy.vstack calls were replaced with numpy.concatenate. Fixes #15.
  • np.sum(generator) calls were replaced with np.sum(np.fromiter(generator)) because the former is deprecated in NumPy.

0.3.0

6 years ago

Release notes

New features

  • Bayesian optimization. Bayesian optimization is a method for optimizing black-box functions whose evaluation may be expensive and whose derivatives may not be available. It uses a query loop very similar to active learning, which makes it possible to implement with an API identical to that of the ActiveLearner. Values to sample are chosen by strategies estimating the possible gain for each point; three such strategies are currently implemented: probability of improvement, expected improvement, and upper confidence bound.
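
A minimal sketch of the Bayesian optimization loop, assuming a Gaussian process regressor and the expected improvement acquisition (the objective function below is purely illustrative):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from modAL.models import BayesianOptimizer
    from modAL.acquisition import max_EI

    def black_box(x):
        # toy objective standing in for an expensive black-box function
        return np.sin(x) - 0.1 * x ** 2

    X_pool = np.linspace(-2, 2, 200).reshape(-1, 1)

    optimizer = BayesianOptimizer(
        estimator=GaussianProcessRegressor(),
        query_strategy=max_EI,
        X_training=X_pool[:5],
        y_training=black_box(X_pool[:5]).ravel(),
    )

    # one step of the optimization loop: query a promising point, evaluate, teach
    query_idx, query_inst = optimizer.query(X_pool)
    optimizer.teach(query_inst.reshape(1, -1), black_box(query_inst).ravel())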

Changes

  • modAL.models.BaseLearner abstract base class implemented. ActiveLearner and BayesianOptimizer both inherit from it.
  • modAL.models.ActiveLearner.query() now passes the ActiveLearner object to the query function instead of just the estimator.

Fixes

  • modAL.utils.selection.multi_argmax() now works for arrays with shape (-1, ) as well as (-1, 1).