Scikit Survival Versions

Survival analysis built on top of scikit-learn

v0.17.0

2 years ago

This release adds support for scikit-learn 1.0, which includes support for feature names. If you pass a pandas dataframe to fit, the estimator will set a feature_names_in_ attribute containing the feature names. When a dataframe is passed to predict, its column names are checked for consistency with those passed to fit. See the scikit-learn release highlights for details.

Bug fixes

Fix a variety of build problems with LLVM (#243).

Enhancements

Add support for feature_names_in_ and n_features_in_ to all estimators and transforms.
Add sksurv.preprocessing.OneHotEncoder.get_feature_names_out.
Update bundled version of Eigen to 3.3.9.

Backwards incompatible changes

Drop min_impurity_split parameter from sksurv.ensemble.GradientBoostingSurvivalAnalysis.
base_estimators and meta_estimator attributes of sksurv.meta.Stacking do not contain fitted models anymore, use estimators_ and final_estimator_, respectively.

Deprecations

The normalize parameter of sksurv.linear_model.IPCRidge is deprecated and will be removed in a future version. Instead, use a scikit-learn pipeline: make_pipeline(StandardScaler(with_mean=False), IPCRidge()).

v0.16.0

2 years ago

This release adds support for changing the evaluation metric that is used in estimators’ score method. This is particularly useful for hyper-parameter optimization using scikit-learn’s GridSearchCV. You can now use sksurv.metrics.as_concordance_index_ipcw_scorer, sksurv.metrics.as_cumulative_dynamic_auc_scorer, or sksurv.metrics.as_integrated_brier_score_scorer to adjust the score method to your needs. A detailed example is available in the User Guide.

Moreover, this release adds sksurv.ensemble.ExtraSurvivalTrees to fit an ensemble of randomized survival trees, and improves the speed of sksurv.compare.compare_survival() significantly. The documentation has been extended by a section on the time-dependent Brier score.

Bug fixes

Fix columns being dropped in sksurv.column.encode_categorical() despite allow_drop=False (#199).
Ensure sksurv.column.categorical_to_numeric() always returns series with int64 dtype.

Enhancements

Add sksurv.ensemble.ExtraSurvivalTrees ensemble (#195).
Speed up sksurv.compare.compare_survival() (#215).
Add wrapper classes sksurv.metrics.as_concordance_index_ipcw_scorer, sksurv.metrics.as_cumulative_dynamic_auc_scorer, and sksurv.metrics.as_integrated_brier_score_scorer to override the default score method of estimators (#192).
Remove use of deprecated numpy dtypes.
Remove use of inplace in pandas’ set_categories.

Documentation

Remove comments and code suggesting log-transforming times prior to training Survival SVM (#203).
Add documentation for max_samples parameter to sksurv.ensemble.ExtraSurvivalTrees and sksurv.ensemble.RandomSurvivalForest (#217).
Add section on time-dependent Brier score (#220).
Add section on using alternative metrics for hyper-parameter optimization.

v0.15.0

3 years ago

This release adds support for scikit-learn 0.24 and Python 3.9. scikit-survival now requires at least pandas 0.25 and scikit-learn 0.24. Moreover, if sksurv.ensemble.GradientBoostingSurvivalAnalysis or sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis are fit with loss='coxph', predict_cumulative_hazard_function and predict_survival_function are now available. sksurv.metrics.cumulative_dynamic_auc now supports evaluating time-dependent predictions, for instance for a sksurv.ensemble.RandomSurvivalForest as illustrated in the User Guide.

Bug fixes

Allow passing pandas data frames to all fit and predict methods (#148).
Allow sparse matrices to be passed to sksurv.ensemble.GradientBoostingSurvivalAnalysis.predict.
Fix example in user guide using GridSearchCV to determine alphas for CoxnetSurvivalAnalysis (#186).

Enhancements

Add score method to sksurv.meta.Stacking, sksurv.meta.EnsembleSelection, and sksurv.meta.EnsembleSelectionRegressor (#151).
Add support for predict_cumulative_hazard_function and predict_survival_function to sksurv.ensemble.GradientBoostingSurvivalAnalysis and sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis if the model was fit with loss='coxph'.
Add support for time-dependent predictions to sksurv.metrics.cumulative_dynamic_auc. See the User Guide for an example (#134).

Backwards incompatible changes

The score method of sksurv.linear_model.IPCRidge, sksurv.svm.FastSurvivalSVM, and sksurv.svm.FastKernelSurvivalSVM (if rank_ratio is smaller than 1) now converts predictions on log(time) scale to risk scores prior to computing the concordance index.
Support for cvxpy and cvxopt solver in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM has been dropped. The default solver is now ECOS, which was used by cvxpy (the previous default) internally. Therefore, results should be identical.
Dropped the presort argument from sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
The X_idx_sorted argument in sksurv.tree.SurvivalTree.fit has been deprecated in scikit-learn 0.24 and has no effect now.
predict_cumulative_hazard_function and predict_survival_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree now return an array of sksurv.functions.StepFunction objects by default. Use return_array=True to get the old behavior.
Support for Python 3.6 has been dropped.
Increase minimum supported versions of dependencies. We now require:

Package	Minimum Version
Pandas	0.25.0
scikit-learn	0.24.0
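
The change to the score method of IPCRidge and the survival SVMs listed above comes down to a sign convention: a model that predicts on the log(time) scale assigns larger values to samples that survive longer, whereas the concordance index expects larger values for samples at higher risk. The sketch below illustrates this with a hypothetical pure-Python helper, harrell_c, which is an assumption made for illustration and not the implementation used by sksurv. It ignores tied times.

```python
import math


def harrell_c(event, time, risk):
    """Share of comparable pairs whose risk ordering matches the observed ordering.

    Hypothetical helper for illustration only; tied times are not handled.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


event = [True, True, False, True]
time = [2.0, 5.0, 7.0, 11.0]

# A perfect prediction on the log(time) scale.
log_time_pred = [math.log(t) for t in time]

# Used directly as a risk score, the ordering is inverted.
c_raw = harrell_c(event, time, log_time_pred)

# Negating the prediction turns it into a risk score.
c_risk = harrell_c(event, time, [-p for p in log_time_pred])
```

In this toy data set all five comparable pairs are ordered wrongly before the conversion and correctly after it.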

v0.14.0

3 years ago

This release features a complete overhaul of the documentation. It features a new visual design, and the inclusion of several interactive notebooks in the User Guide.

In addition, it includes important bug fixes. It fixes several bugs in sksurv.linear_model.CoxnetSurvivalAnalysis where predict, predict_survival_function, and predict_cumulative_hazard_function returned wrong values if features of the training data were not centered. Moreover, the score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis will now correctly compute the concordance index if loss='ipcwls' or loss='squared'.

Bug fixes

sksurv.column.standardize() modified data in-place. Data is now always copied.
sksurv.column.standardize() works with integer numpy arrays now.
sksurv.column.standardize() used biased standard deviation for numpy arrays (ddof=0), but unbiased standard deviation for pandas objects (ddof=1). It always uses ddof=1 now. Therefore, the output, if the input is a numpy array, will differ from that of previous versions.
Fixed sksurv.linear_model.CoxnetSurvivalAnalysis.predict_survival_function() and sksurv.linear_model.CoxnetSurvivalAnalysis.predict_cumulative_hazard_function(), which returned wrong values if features of training data were not already centered. This adds an offset_ attribute that accounts for non-centered data and is added to the predicted risk score. Therefore, the outputs of predict, predict_survival_function, and predict_cumulative_hazard_function will be different to previous versions for non-centered data (#139).
Rescale coefficients of sksurv.linear_model.CoxnetSurvivalAnalysis if normalize=True.
Fix score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis if loss='ipcwls' or loss='squared' is used. Previously, it returned 1.0 - true_cindex.
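
The effect of the ddof change in sksurv.column.standardize() noted above can be reproduced with the standard library alone. This is a minimal sketch of the arithmetic, not the library's code: statistics.pstdev divides by n (ddof=0) and statistics.stdev divides by n - 1 (ddof=1).

```python
import statistics

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = statistics.mean(values)

std_biased = statistics.pstdev(values)   # divides by n (ddof=0)
std_unbiased = statistics.stdev(values)  # divides by n - 1 (ddof=1)

# Standardized values under the old numpy-array behavior and the new behavior.
z_biased = [(v - mean) / std_biased for v in values]
z_unbiased = [(v - mean) / std_unbiased for v in values]
```

Because the unbiased estimate is larger, standardized values computed with ddof=1 are slightly closer to zero than those computed with ddof=0.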

Enhancements

Add sksurv.show_versions() that prints the version of all dependencies.
Add support for pandas 1.1.
Include interactive notebooks in documentation on readthedocs.
Add user guide on penalized Cox models.
Add user guide on gradient boosted models.

v0.13.1

3 years ago

This release fixes warnings that were introduced with 0.13.0.

Bug fixes

Explicitly pass return_array=True in sksurv.tree.SurvivalTree.predict to avoid FutureWarning.
Fix error when fitting sksurv.tree.SurvivalTree with non-float dtype for time (#127).
Fix RuntimeWarning: invalid value encountered in true_divide in sksurv.nonparametric.kaplan_meier_estimator.
Fix PendingDeprecationWarning about use of matrix when fitting sksurv.svm.FastSurvivalSVM if optimizer is PRSVM or simple.

v0.13.0

3 years ago

The highlights of this release include the addition of sksurv.metrics.brier_score and sksurv.metrics.integrated_brier_score and compatibility with scikit-learn 0.23.

predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree can now return an array of sksurv.functions.StepFunction, similar to sksurv.linear_model.CoxPHSurvivalAnalysis by specifying return_array=False. This will be the default behavior starting with 0.14.0.
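
To illustrate what a step-function return value offers over a plain array, here is a minimal sketch of a right-continuous step function that can be evaluated at one or several time points. SimpleStepFunction is a hypothetical stand-in written for this example; it is not sksurv.functions.StepFunction, whose handling of points outside the observed range may differ.

```python
from bisect import bisect_right


class SimpleStepFunction:
    """Right-continuous step function: f(t) = y[i] for x[i] <= t < x[i + 1].

    Hypothetical illustration only. Beyond the last breakpoint the last
    value is carried forward, which is an assumption of this sketch.
    """

    def __init__(self, x, y):
        if len(x) != len(y):
            raise ValueError("x and y must have the same length")
        self.x = list(x)
        self.y = list(y)

    def _at(self, t):
        if t < self.x[0]:
            raise ValueError("t lies before the first breakpoint")
        return self.y[bisect_right(self.x, t) - 1]

    def __call__(self, t):
        # Accept a single number or a sequence of numbers.
        if isinstance(t, (int, float)):
            return self._at(t)
        return [self._at(v) for v in t]


surv = SimpleStepFunction([1.0, 3.0, 6.0], [0.9, 0.6, 0.2])
single = surv(3.0)
several = surv([1.0, 2.5, 6.0, 10.0])
```

The object keeps the breakpoints together with the values, so the caller does not need to track which time point each array column refers to.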

Note that this release fixes a bug in estimating inverse probability of censoring weights (IPCW), which will affect all estimators relying on IPCW.

Enhancements

Make build system compatible with PEP-517/518.
Added sksurv.metrics.brier_score and sksurv.metrics.integrated_brier_score (#101).
sksurv.functions.StepFunction can now be evaluated at multiple points in a single call.
Update documentation on usage of predict_survival_function and predict_cumulative_hazard_function (#118).
The default value of alpha_min_ratio of sksurv.linear_model.CoxnetSurvivalAnalysis will now depend on the n_samples/n_features ratio. If n_samples > n_features, the default value is 0.0001. If n_samples <= n_features, the default value is 0.01.
Add support for scikit-learn 0.23 (#119).
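
The rule for the default alpha_min_ratio stated in the list above can be written as a one-line function. The helper name default_alpha_min_ratio is hypothetical and only restates the rule; it is not part of the sksurv API.

```python
def default_alpha_min_ratio(n_samples, n_features):
    """Return the default described in the release notes (hypothetical helper)."""
    return 0.0001 if n_samples > n_features else 0.01


more_samples = default_alpha_min_ratio(n_samples=500, n_features=20)
more_features = default_alpha_min_ratio(n_samples=50, n_features=2000)
square = default_alpha_min_ratio(n_samples=100, n_features=100)
```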

Deprecations

predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree will return an array of sksurv.functions.StepFunction in the future (as sksurv.linear_model.CoxPHSurvivalAnalysis does). For the old behavior, use return_array=True.

Bug fixes

Fix deprecation of importing joblib via sklearn.
Fix estimation of censoring distribution for tied times with events. When estimating the censoring distribution, by specifying reverse=True when calling sksurv.nonparametric.kaplan_meier_estimator, we now consider events to occur before censoring. For tied time points with an event, those with an event are not considered at risk anymore and subtracted from the denominator of the Kaplan-Meier estimator. The change affects all functions relying on inverse probability of censoring weights.
Throw an exception when trying to estimate c-index from uncomparable data (#117).
Estimators in sksurv.svm will now throw an exception when trying to fit a model to data with uncomparable pairs.
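
The fix for tied times in the censoring distribution described above can be sketched in plain Python. The function censoring_survival below is a hypothetical illustration of the stated rule (events at a tied time leave the risk set before censored samples are counted); it is not the code of sksurv.nonparametric.kaplan_meier_estimator.

```python
def censoring_survival(event, time):
    """Kaplan-Meier estimate of the censoring distribution.

    Hypothetical sketch: at each time point, samples with an event are
    removed from the risk set before the censored samples are counted.
    Returns a list of (time, probability of remaining uncensored) pairs.
    """
    n_at_risk = len(time)
    prob = 1.0
    result = []
    for t in sorted(set(time)):
        n_events = sum(1 for e, u in zip(event, time) if u == t and e)
        n_censored = sum(1 for e, u in zip(event, time) if u == t and not e)
        denominator = n_at_risk - n_events  # events happen before censoring
        if n_censored > 0:
            prob *= 1.0 - n_censored / denominator
        result.append((t, prob))
        n_at_risk -= n_events + n_censored
    return result


# One event and one censored sample are tied at time 2.
event = [True, True, False, False]
time = [1.0, 2.0, 2.0, 3.0]
estimate = censoring_survival(event, time)
```

At time 2, three samples are still under observation, but the one with an event is removed first, so the censored sample is one out of two and the estimate drops to 0.5 rather than 2/3.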

v0.12.0

4 years ago

This release adds support for scikit-learn 0.22, thereby dropping support for older versions. Moreover, the regularization strength of the ridge penalty in sksurv.linear_model.CoxPHSurvivalAnalysis can now be set per feature. If you want one or more features to enter the model unpenalized, set the corresponding penalty weights to zero. Finally, sklearn.pipeline.Pipeline will now be automatically patched to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
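
As a rough illustration of per-feature regularization, the sketch below computes a ridge penalty with either one shared strength or one weight per coefficient. The function ridge_penalty is hypothetical, and the factor of 0.5 is an assumption; the exact scaling used inside sksurv.linear_model.CoxPHSurvivalAnalysis may differ.

```python
def ridge_penalty(coef, alpha):
    """Return 0.5 * sum(alpha_j * coef_j ** 2) (hypothetical helper).

    alpha is either a single number applied to every coefficient
    or a sequence with one penalty weight per coefficient.
    """
    if isinstance(alpha, (int, float)):
        alpha = [alpha] * len(coef)
    if len(alpha) != len(coef):
        raise ValueError("need one penalty weight per feature")
    return 0.5 * sum(a * b * b for a, b in zip(alpha, coef))


coef = [2.0, -1.0, 0.5]

# Same strength for all features.
uniform = ridge_penalty(coef, 1.0)

# A weight of zero lets the first feature enter the model unpenalized.
first_free = ridge_penalty(coef, [0.0, 1.0, 1.0])
```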

Deprecations

Add scikit-learn's deprecation of presort in sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
Add warning that default alpha_min_ratio in sksurv.linear_model.CoxnetSurvivalAnalysis will depend on the ratio of the number of samples to the number of features in the future (#41).

Enhancements

Add references to API doc of sksurv.ensemble.GradientBoostingSurvivalAnalysis (#91).
Add support for pandas 1.0 (#100).
Add ccp_alpha parameter for Minimal Cost-Complexity Pruning to sksurv.ensemble.GradientBoostingSurvivalAnalysis.
Patch sklearn.pipeline.Pipeline to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
Allow per-feature regularization for sksurv.linear_model.CoxPHSurvivalAnalysis (#102).
Clarify API docs of sksurv.metrics.concordance_index_censored (#96).

v0.11

4 years ago

This release adds sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest, which are based on the log-rank split criterion. It also adds the OSQP solver as option to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM, which will replace the now deprecated cvxpy and cvxopt options in a future release.

This release removes support for sklearn 0.20 and requires sklearn 0.21.

Deprecations

The cvxpy and cvxopt options for solver in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM are deprecated and will be removed in a future version. Choosing osqp is the preferred option now.

Enhancements

Add support for pandas 0.25.
Add OSQP solver option, which has no additional dependencies, to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM.
Fix issue when using cvxpy 1.0.16 or later.
Explicitly specify utf-8 encoding when reading README.rst (#89).
Add sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest (#90).

Bug fixes

Exclude Cython-generated files from source distribution because they are not forward compatible.

v0.10

4 years ago

This release adds the ties argument to sksurv.linear_model.CoxPHSurvivalAnalysis to choose between Breslow’s and Efron’s likelihood in the presence of tied event times. Moreover, sksurv.compare.compare_survival() has been added, which implements the log-rank hypothesis test for comparing the survival function of 2 or more groups.
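
For readers unfamiliar with the log-rank test, the following is a minimal standard-library sketch for the two-group case. The function logrank_two_groups is a hypothetical helper that follows the textbook formula; it is not the implementation behind sksurv.compare.compare_survival(), which also handles more than two groups.

```python
import math


def logrank_two_groups(event, time, group):
    """Two-group log-rank test (hypothetical helper; group labels are 0 and 1).

    Returns the chi-squared statistic and its p-value for 1 degree of freedom.
    """
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({u for e, u in zip(event, time) if e})
    for t in event_times:
        n = sum(1 for u in time if u >= t)  # at risk overall
        n1 = sum(1 for u, g in zip(time, group) if u >= t and g == 1)
        d = sum(1 for e, u in zip(event, time) if e and u == t)
        d1 = sum(
            1 for e, u, g in zip(event, time, group) if e and u == t and g == 1
        )
        obs_minus_exp += d1 - d * n1 / n  # observed minus expected in group 1
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chisq = obs_minus_exp ** 2 / variance
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chisq / 2.0))
    return chisq, p_value


group = [0, 0, 0, 0, 1, 1, 1, 1]
all_events = [True] * 8

# Identical survival times in both groups: no evidence of a difference.
chisq_same, p_same = logrank_two_groups(all_events, [1, 2, 3, 4, 1, 2, 3, 4], group)

# Group 1 fails strictly earlier than group 0.
chisq_diff, p_diff = logrank_two_groups(all_events, [5, 6, 7, 8, 1, 2, 3, 4], group)
```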

Enhancements

Update API doc of predict function of boosting estimators (#75).
Clarify documentation for GradientBoostingSurvivalAnalysis (#78).
Implement Efron’s likelihood for handling tied event times.
Implement log-rank test for comparing survival curves.
Add support for scipy 1.3.1 (#66).

Bug fixes

Re-add baseline_survival_ and cum_baseline_hazard_ attributes to sksurv.linear_model.CoxPHSurvivalAnalysis (#76).

v0.9

4 years ago

This release adds support for sklearn 0.21 and pandas 0.24.

Enhancements

Add reference to IPCRidge (#65).
Use scipy.special.comb instead of deprecated scipy.misc.comb.
Add support for pandas 0.24 and drop support for 0.20.
Add support for scikit-learn 0.21 and drop support for 0.20 (#71).
Explain use of intercept in ComponentwiseGradientBoostingSurvivalAnalysis (#68).
Bump Eigen to 3.3.7.

Bug fixes

Disallow scipy 1.3.0 due to scipy regression (#66).