A fast, scalable, high-performance gradient boosting on decision trees library, used for ranking, classification, regression and other machine learning tasks in Python, R, Java and C++. Supports computation on CPU and GPU.
- Fixed `datetime.timedelta` conversion.
- Fixed `is_min_optimal`, `is_max_optimal` for `BuiltinMetrics`. #1890
- Use `libcatboostr-darwin.dylib` instead of `libcatboostr-darwin.so` on macOS. #1834
- Fixed `CatBoostError: (No such file or directory) bad new file name` when using `grid_search`. #1893

:warning: PySpark support is broken in this release. Please use release 1.0.3 instead.
- Fixed `rsm` < 1.
- Fixed `calc_feature_statistics` for cat features. #1882
- Fixed behaviour when `metric_period` has been specified.
- Fixed `eval_metric` for Multitarget training.

In this release we decided to increment the major version, as we think that CatBoost is pretty stable and production-ready. We know that CatBoost is used a lot in many different companies and individual projects, and we think that all the features we added in the last year are worth incrementing the major version. And of course, like many programmers, we love the magic of binary numbers and want to celebrate the 100₂ anniversary of CatBoost's first release on GitHub 🥳
We've improved training speed on numeric datasets.
`use_best_model` and early stopping work independently on each fold, as we try to make single-fold training as close to regular training as possible. If one model stops at iteration i, we use the last value of the metric in the mean-score plot for points in [i+1; last iteration).
- Added the `MultiRMSEWithMissingValues` loss function.
- Renamed the first parameter of the `predict_proba` function from `X` to `data`, fixes #1785.
- `eval_metrics` improvements. Thanks to @ebalukova.
- Use `numba` (if available).
- Fixed `use_weights` for some eval metrics on GPU: `use_weights=False` is always respected now.
- Added `EvalMetricsResult.get_metric()`, by @Roffild.

This release includes the CatBoost for Apache Spark package, which supports training, model application and feature evaluation on the Apache Spark platform. We've prepared the "CatBoost for Apache Spark Introduction" and "CatBoost for Apache Spark Architecture" videos as an introduction. More details are available on the CatBoost for Apache Spark home page.
CatBoost supports a recursive feature elimination procedure: when you have lots of feature candidates and want to keep only the most influential ones, it trains models and selects the strongest features by feature importance. You can find the details in our tutorial.
- You can enable Exact leaf estimation by setting `leaf_estimation_method=Exact` explicitly; in the next releases we are planning to set it by default.
- Support `pathlib.Path` in the Python package.
- NDCG: previously we used `dcg==1` when there are no relevant objects in a group (when the ideal DCG equals zero); now we use `score==0` in that case.
- Fixed `boost_from_average` for the `MultiRMSE` loss. Issue #1515
- Fixed `feature_importances_` for fstr with texts.
- Fixed the `score()` method for `RMSEWithUncertainty`. Issue #1482
- Support `prediction_type` in `score()`.
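The NDCG convention change above (groups whose ideal DCG is zero now score 0 rather than 1) can be illustrated with a plain-Python toy implementation; this is our own sketch, not CatBoost's code:

```python
import math

def dcg(relevances):
    # Standard discounted cumulative gain over a ranked list.
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg(ranked, zero_ideal_score=0.0):
    # NDCG = DCG / ideal DCG. When the group has no relevant objects,
    # the ideal DCG is zero and a convention decides the score:
    # the older behaviour used 1, the newer behaviour uses 0.
    ideal = dcg(sorted(ranked, reverse=True))
    if ideal == 0:
        return zero_ideal_score
    return dcg(ranked) / ideal

print(ndcg([0, 0, 0]))                        # new convention: 0.0
print(ndcg([0, 0, 0], zero_ideal_score=1.0))  # old convention: 1.0
print(ndcg([3, 2, 0]))                        # ideally ordered group: 1.0
```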