Kmodes Versions

Python implementations of the k-modes and k-prototypes clustering algorithms, for clustering categorical data

0.12.2

1 year ago

What's changed

Full Changelog: https://github.com/nicodv/kmodes/compare/0.12.1...0.12.2

0.12.1

2 years ago

What's changed

Full Changelog: https://github.com/nicodv/kmodes/compare/0.12.0...0.12.1

0.12.0

2 years ago

What's changed

Full Changelog: https://github.com/nicodv/kmodes/compare/0.11.1...0.12.0

0.11.1

2 years ago

What's Changed

Full Changelog: https://github.com/nicodv/kmodes/compare/0.11.0...0.11.1

0.11.0

3 years ago
  • Python 3.9 support
  • Minimum sklearn version raised to 0.22
  • Default init method for k-prototypes is now the Cao method (same as k-modes and in line with documentation), courtesy of @larroy
  • Optimizations

0.10.2

4 years ago
  • Added Jaccard dissimilarity function, courtesy of @BikashPandey17 (#129)
  • Return the costs per epoch after training, courtesy of @daffidwilde (#79)
  • Python 3.8 now supported
  • Python 3.4 no longer supported because sklearn dropped it too
  • Various bugfixes and improvements
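
For readers unfamiliar with the Jaccard dissimilarity mentioned above, here is a minimal pure-Python sketch of the general measure on two sets of labels. The function name `jaccard_dissim` and the handling of two empty inputs are assumptions made for this illustration; it is not the library's own implementation.

```python
def jaccard_dissim(a, b):
    """Jaccard dissimilarity between two label collections: 1 - |A & B| / |A | B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption of this sketch: two empty collections count as identical.
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


# Two of the four distinct labels are shared, so the dissimilarity is 0.5.
print(jaccard_dissim(["red", "small", "round"], ["red", "large", "round"]))
```

A value of 0 means the two label sets are identical, and 1 means they share nothing.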

0.9

4 years ago
  • Bugfixes

0.7

4 years ago
  • Categorical variables are now automatically encoded and decoded between original data values and integers (used internally by k-modes). Users no longer have to use the categorical variable mapping when looking at the cluster centroids.
  • Support for custom dissimilarity measures
  • Python 3.6 support
  • More robust manual initialization
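
The automatic encoding and decoding described above can be pictured with a small pure-Python sketch. The helper names (`fit_encoders`, `encode`, `decode`) and the choice of sorted order for assigning integer codes are hypothetical, picked only for this illustration; the library's internals may differ.

```python
def fit_encoders(rows):
    """Build one value -> integer mapping per column (codes follow sorted order)."""
    encoders = []
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        encoders.append({value: code for code, value in enumerate(values)})
    return encoders


def encode(rows, encoders):
    """Replace each original value with its integer code."""
    return [[encoders[j][v] for j, v in enumerate(row)] for row in rows]


def decode(rows, encoders):
    """Map integer codes (e.g. from cluster centroids) back to original values."""
    decoders = [{code: value for value, code in enc.items()} for enc in encoders]
    return [[decoders[j][c] for j, c in enumerate(row)] for row in rows]


data = [["red", "small"], ["blue", "large"], ["red", "large"]]
encoders = fit_encoders(data)
encoded = encode(data, encoders)
print(encoded)                    # [[1, 1], [0, 0], [1, 0]]
print(decode(encoded, encoders))  # round-trips to the original values
```

Decoding is what lets centroids be read in terms of the original categories rather than internal integers.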

0.8

4 years ago
  • Huge speedup for k-prototypes, especially for large numbers of samples (#45). A k-prototypes benchmark script is included in examples now.
  • Offer an implementation of Ng's dissimilarity measure, which could improve convergence (#37).
  • Allow pandas DataFrames to be presented to the algorithm, instead of just numpy arrays (#40).
  • Improved handling of dependencies (#49, #53).
  • Various small bugfixes and improvements.
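
As a rough illustration of the idea behind Ng's dissimilarity measure, the sketch below weights a matching attribute by how common that category is inside the cluster, instead of counting every match as zero. It follows the commonly cited formulation (a mismatch costs 1, a match costs 1 minus the category's relative frequency in the cluster). The function name and signature are assumptions for this example and do not reflect the library's API.

```python
def ng_dissim(mode, point, cluster_members):
    """Sketch of Ng-style dissimilarity between a cluster mode and a point.

    Per attribute: 1 on a mismatch; on a match, 1 minus the fraction of
    cluster members that share the point's category for that attribute.
    """
    n = len(cluster_members)
    if n == 0:
        raise ValueError("cluster_members must not be empty")
    total = 0.0
    for j, (m, x) in enumerate(zip(mode, point)):
        if m != x:
            total += 1.0
        else:
            count = sum(1 for member in cluster_members if member[j] == x)
            total += 1.0 - count / n
    return total


cluster = [["a", "x"], ["a", "y"], ["a", "x"], ["b", "x"]]
mode = ["a", "x"]
print(ng_dissim(mode, ["a", "y"], cluster))  # 0.25 (match on 'a') + 1.0 (mismatch) = 1.25
print(ng_dissim(mode, ["a", "x"], cluster))  # 0.25 + 0.25 = 0.5
```

Under simple matching the second point would have dissimilarity 0. Here it stays positive because the cluster is not pure in either attribute.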

0.10.0

4 years ago
  • Support for more than 256 clusters
  • Optional parallel execution of the multiple initialization runs (courtesy of @rphes)
  • Enhanced error checking when using pandas DataFrames as inputs to the algorithms
  • Various bug fixes and improvements
  • Semantic versioning from now on
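
To show what running multiple initializations concurrently and keeping the best one can look like, here is a self-contained toy sketch. It is not the library's code. The simplified k-modes loop, the random initialization, the function names, and the use of a thread pool are all assumptions chosen to keep the example short and runnable. A real implementation would more likely use worker processes for CPU-bound runs.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def matching_dissim(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent value in each column."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*rows)]


def run_once(data, k, seed, max_iter=20):
    """One toy k-modes run from a seeded random initialization."""
    rng = random.Random(seed)
    centroids = [list(row) for row in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: matching_dissim(row, centroids[c]))
            for row in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(data, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = column_modes(members)
    cost = sum(matching_dissim(row, centroids[lab]) for row, lab in zip(data, labels))
    return cost, labels, centroids


def best_of_n(data, k, n_init=4, max_workers=2):
    """Run n_init seeded initializations concurrently; keep the lowest-cost run."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda seed: run_once(data, k, seed), range(n_init)))
    return min(results, key=lambda r: r[0])


data = [["a", "x"], ["a", "x"], ["b", "y"], ["b", "y"]]
cost, labels, centroids = best_of_n(data, k=2)
print(cost, labels, centroids)
```

Because each run gets its own seeded random generator, the runs are independent and the selected result is reproducible regardless of how the workers are scheduled.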