Survival analysis built on top of scikit-learn
This release adds support for scikit-learn 1.0, which includes support for feature names. If you pass a pandas dataframe to fit, the estimator will set a feature_names_in_ attribute containing the feature names. When a dataframe is passed to predict, it is checked that the column names are consistent with those passed to fit. See the scikit-learn release highlights for details.
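The feature-name check described above can be pictured with a small, library-free sketch. The class FeatureNameCheckingModel and its fit/predict signatures are hypothetical and exist only for illustration; they are not part of scikit-survival or scikit-learn. Whether a mismatch produces a warning or an error depends on the scikit-learn version, so raising ValueError here is an assumption made for simplicity.

```python
class FeatureNameCheckingModel:
    """Hypothetical stand-in for an estimator; not part of scikit-survival."""

    def fit(self, columns, rows):
        # Remember the column names (and their count) seen during fit.
        self.feature_names_in_ = list(columns)
        self.n_features_in_ = len(self.feature_names_in_)
        return self

    def predict(self, columns, rows):
        # Assumption: reject input whose column names or order differ from fit.
        if list(columns) != self.feature_names_in_:
            raise ValueError(
                f"expected columns {self.feature_names_in_}, got {list(columns)}"
            )
        # Dummy prediction: one constant value per row.
        return [0.0 for _ in rows]


model = FeatureNameCheckingModel().fit(["age", "tumor_size"], [[61, 2.5], [48, 1.0]])
predictions = model.predict(["age", "tumor_size"], [[55, 3.0]])

try:
    model.predict(["tumor_size", "age"], [[3.0, 55]])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```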
Add feature_names_in_ and n_features_in_ to all estimators and transforms.
sksurv.preprocessing.OneHotEncoder.get_feature_names_out.
Remove the min_impurity_split parameter from sksurv.ensemble.GradientBoostingSurvivalAnalysis.
The base_estimators and meta_estimator attributes of sksurv.meta.Stacking do not contain fitted models anymore; use estimators_ and final_estimator_, respectively.
The normalize parameter of sksurv.linear_model.IPCRidge is deprecated and will be removed in a future version. Instead, use a scikit-learn pipeline: make_pipeline(StandardScaler(with_mean=False), IPCRidge()).
This release adds support for changing the evaluation metric that is used in estimators’ score method. This is particularly useful for hyper-parameter optimization using scikit-learn’s GridSearchCV. You can now use sksurv.metrics.as_concordance_index_ipcw_scorer, sksurv.metrics.as_cumulative_dynamic_auc_scorer, or sksurv.metrics.as_integrated_brier_score_scorer to adjust the score method to your needs. A detailed example is available in the User Guide.
Moreover, this release adds sksurv.ensemble.ExtraSurvivalTrees to fit an ensemble of randomized survival trees, and improves the speed of sksurv.compare.compare_survival() significantly. The documentation has been extended by a section on the time-dependent Brier score.
allow_drop=False (#199).
score method of estimators (#192).
inplace in pandas’ set_categories.
Add max_samples parameter to sksurv.ensemble.ExtraSurvivalTrees and sksurv.ensemble.RandomSurvivalForest (#217).
This release adds support for scikit-learn 0.24 and Python 3.9. scikit-survival now requires at least pandas 0.25 and scikit-learn 0.24. Moreover, if sksurv.ensemble.GradientBoostingSurvivalAnalysis or sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis is fit with loss='coxph', predict_cumulative_hazard_function and predict_survival_function are now available. sksurv.metrics.cumulative_dynamic_auc now supports evaluating time-dependent predictions, for instance for a sksurv.ensemble.RandomSurvivalForest as illustrated in the User Guide.
fit and predict methods (#148).
loss='coxph'.
The score method of sksurv.linear_model.IPCRidge, sksurv.svm.FastSurvivalSVM, and sksurv.svm.FastKernelSurvivalSVM (if rank_ratio is smaller than 1) now converts predictions on log(time) scale to risk scores prior to computing the concordance index.
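Why the conversion matters can be seen with a simplified, library-free concordance index. The function below is a sketch of Harrell's estimator, not scikit-survival's implementation: it skips pairs with tied times, and it assumes that a log(time) prediction is turned into a risk score by negating it (a longer predicted time means a lower risk).

```python
from itertools import combinations


def concordance_index(event, time, risk):
    """Simplified Harrell's concordance index for right-censored data."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(time)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if time[i] < time[j] else (j, i)
        # Skip tied times, and pairs whose shorter time is censored.
        if time[a] == time[b] or not event[a]:
            continue
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0  # higher risk failed earlier: concordant
        elif risk[a] == risk[b]:
            concordant += 0.5  # tied risk scores count as half
    return concordant / comparable


event = [True, True, False, True]
time = [2.0, 5.0, 6.0, 9.0]
log_time_pred = [0.5, 1.5, 2.0, 2.5]  # perfectly ordered log(time) predictions

# Treating log(time) as if it were a risk score inverts the ranking ...
c_raw = concordance_index(event, time, log_time_pred)
# ... whereas negating it first gives the intended value.
c_risk = concordance_index(event, time, [-p for p in log_time_pred])
```

With no tied predictions, the two values add up to 1, which is how a perfect model can end up with a score of 0 when the conversion is missing.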
Support for the cvxpy and cvxopt solvers in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM has been dropped. The default solver is now ECOS, which was used by cvxpy (the previous default) internally. Therefore, results should be identical.
Dropped the presort argument from sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
The X_idx_sorted argument in sksurv.tree.SurvivalTree.fit has been deprecated in scikit-learn 0.24 and has no effect now.
predict_cumulative_hazard_function and predict_survival_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree now return an array of sksurv.functions.StepFunction objects by default. Use return_array=True to get the old behavior.
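The difference between the two return types can be sketched without the library. StepFn below is a hypothetical, minimal right-continuous step function written for illustration; it is not sksurv.functions.StepFunction and makes no claim about that class's behavior outside the shown cases. The array form is then simply every function evaluated on a shared time grid.

```python
from bisect import bisect_right


class StepFn:
    """Hypothetical right-continuous step function (illustration only)."""

    def __init__(self, x, y):
        self.x = list(x)  # sorted time points at which the value changes
        self.y = list(y)  # value held from x[i] up to, not including, x[i + 1]

    def __call__(self, t):
        if t < self.x[0]:
            raise ValueError("t lies before the first time point")
        # Index of the last change point that is <= t.
        return self.y[bisect_right(self.x, t) - 1]


# One survival function per sample: callable at arbitrary time points.
functions = [
    StepFn([1, 3, 5], [0.9, 0.6, 0.2]),
    StepFn([1, 3, 5], [0.8, 0.4, 0.1]),
]

# Array-style output: rows are samples, columns are a fixed grid of times.
grid = [1, 3, 5]
as_array = [[fn(t) for t in grid] for fn in functions]
```

A callable can be evaluated between grid points, for example functions[0](2.5), which the plain array cannot express without knowing the grid.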
Support for Python 3.6 has been dropped.
Increase minimum supported versions of dependencies. We now require:
Package       Minimum Version
Pandas        0.25.0
scikit-learn  0.24.0
This release features a complete overhaul of the documentation, with a new visual design and several interactive notebooks in the User Guide.
In addition, it includes important bug fixes. It fixes several bugs in sksurv.linear_model.CoxnetSurvivalAnalysis where predict, predict_survival_function, and predict_cumulative_hazard_function returned wrong values if features of the training data were not centered. Moreover, the score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis will now correctly compute the concordance index if loss='ipcwls' or loss='squared'.
ddof=0), but unbiased standard deviation for pandas objects (ddof=1). It always uses ddof=1 now. Therefore, the output for numpy array input will differ from that of previous versions.
predict, predict_survival_function, and predict_cumulative_hazard_function will be different from previous versions for non-centered data (#139).
normalize=True.
loss='ipcwls' or loss='squared' is used. Previously, it returned 1.0 - true_cindex.
sksurv.show_versions() that prints the version of all dependencies.
This release fixes warnings that were introduced with 0.13.0.
return_array=True in sksurv.tree.SurvivalTree.predict to avoid FutureWarning.
PRSVM or simple.
The highlights of this release include the addition of sksurv.metrics.brier_score and sksurv.metrics.integrated_brier_score and compatibility with scikit-learn 0.23.
predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree can now return an array of sksurv.functions.StepFunction, similar to sksurv.linear_model.CoxPHSurvivalAnalysis, by specifying return_array=False. This will be the default behavior starting with 0.14.0.
Note that this release fixes a bug in estimating inverse probability of censoring weights (IPCW), which will affect all estimators relying on IPCW.
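To make the role of IPCW concrete, here is a library-free sketch of how such weights are commonly derived from a Kaplan-Meier estimate of the censoring distribution G(t). It is not scikit-survival's code. It assumes that at tied time points events happen before censoring, matching the tie handling these notes describe for reverse=True, and that the weight of an event at time t is 1/G(t) while censored samples get weight 0, which is one common convention.

```python
def censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring distribution.

    Returns a list of (t, G(t)) pairs, one per distinct observed time.
    """
    estimate = []
    g = 1.0
    for t in sorted(set(time)):
        at_risk = sum(1 for x in time if x >= t)
        n_events = sum(1 for x, e in zip(time, event) if x == t and e)
        n_censored = sum(1 for x, e in zip(time, event) if x == t and not e)
        # Assumption: events at t precede censoring at t, so they leave the
        # risk set before the censoring step is applied.
        denominator = at_risk - n_events
        if n_censored > 0:
            g *= 1.0 - n_censored / denominator
        estimate.append((t, g))
    return estimate


def ipcw(time, event):
    """Weights 1/G(t_i) for events and 0 for censored samples."""
    # Note: G can reach zero if the largest observed time is censored;
    # this sketch does not guard against that case.
    g_at = dict(censoring_survival(time, event))
    return [1.0 / g_at[t] if e else 0.0 for t, e in zip(time, event)]


time = [1, 2, 2, 3, 4]
event = [True, True, False, False, True]
g_hat = censoring_survival(time, event)
weights = ipcw(time, event)
```

At time 2, one event and one censored sample are tied: the event is removed first, so the censoring step divides by 3 rather than 4.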
predict_survival_function and predict_cumulative_hazard_function (#118).
alpha_min_ratio of sksurv.linear_model.CoxnetSurvivalAnalysis will now depend on the n_samples/n_features ratio. If n_samples > n_features, the default value is 0.0001. If n_samples <= n_features, the default value is 0.01.
predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree will return an array of sksurv.functions.StepFunction in the future (as sksurv.linear_model.CoxPHSurvivalAnalysis does). For the old behavior, use return_array=True.
reverse=True when calling sksurv.nonparametric.kaplan_meier_estimator, we now consider events to occur before censoring. For tied time points with an event, those with an event are not considered at risk anymore and are subtracted from the denominator of the Kaplan-Meier estimator. The change affects all functions relying on inverse probability of censoring weights, namely:
sksurv.svm will now throw an exception when trying to fit a model to data with uncomparable pairs.
This release adds support for scikit-learn 0.22, thereby dropping support for older versions. Moreover, the regularization strength of the ridge penalty in sksurv.linear_model.CoxPHSurvivalAnalysis can now be set per feature. If you want one or more features to enter the model unpenalized, set the corresponding penalty weights to zero. Finally, sklearn.pipeline.Pipeline will now be automatically patched to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
presort in sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
alpha_min_ratio in sksurv.linear_model.CoxnetSurvivalAnalysis will depend on the ratio of the number of samples to the number of features in the future (#41).
Add ccp_alpha parameter for Minimal Cost-Complexity Pruning to sksurv.ensemble.GradientBoostingSurvivalAnalysis.
predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
This release adds sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest, which are based on the log-rank split criterion. It also adds the OSQP solver as an option to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM, which will replace the now deprecated cvxpy and cvxopt options in a future release.
This release removes support for sklearn 0.20 and requires sklearn 0.21.
The cvxpy and cvxopt options for solver in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM are deprecated and will be removed in a future version. Choosing osqp is the preferred option now.
This release adds the ties argument to sksurv.linear_model.CoxPHSurvivalAnalysis to choose between Breslow’s and Efron’s likelihood in the presence of tied event times. Moreover, sksurv.compare.compare_survival() has been added, which implements the log-rank hypothesis test for comparing the survival function of 2 or more groups.
This release adds support for sklearn 0.21 and pandas 0.24.
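For readers unfamiliar with the log-rank test mentioned above, the following is a minimal sketch of the textbook two-group version using only the standard library. The function logrank_test is hypothetical and is not the implementation or signature of sksurv.compare.compare_survival(), which also handles more than two groups. The p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(time, event, group):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    labels = sorted(set(group))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two groups")
    first = labels[0]
    observed_minus_expected = 0.0
    variance = 0.0
    # Only time points with at least one event contribute.
    for t in sorted({x for x, e in zip(time, event) if e}):
        n = sum(1 for x in time if x >= t)  # at risk, both groups
        n1 = sum(1 for x, g in zip(time, group) if x >= t and g == first)
        d = sum(1 for x, e in zip(time, event) if x == t and e)
        d1 = sum(
            1 for x, e, g in zip(time, event, group)
            if x == t and e and g == first
        )
        # Observed events in the first group minus their expected number.
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:  # hypergeometric variance is zero when one sample remains
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    statistic = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Two groups with clearly separated event times, all events observed.
time = [1, 2, 3, 4, 5, 6]
event = [True] * 6
group = ["a", "a", "a", "b", "b", "b"]
statistic, p_value = logrank_test(time, event, group)
```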