xgbse._meta.XGBSEBootstrapEstimator

Bootstrap meta-estimator for XGBSE models:

  • allows for confidence interval estimation for XGBSEDebiasedBCE and XGBSEStackedWeibull
  • provides variance stabilization for all models, especially for XGBSEKaplanTree

Performs a simple bootstrap: each estimator is fit on a sample drawn with replacement whose size equals the training set size.
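The resampling step can be illustrated with a minimal sketch using only the standard library. The helper name bootstrap_indices and the per-replicate seeding scheme (random_state + i) are assumptions made for illustration, not the library's documented behaviour; the only property taken from the description above is that each replicate has as many rows as the training set.

```python
import random


def bootstrap_indices(n_samples, n_estimators=10, random_state=42):
    """Yield one list of row indices per bootstrap replicate.

    Each replicate draws n_samples indices with replacement, so its
    size equals the training set size.
    """
    for i in range(n_estimators):
        # Assumption: a distinct, reproducible seed for every replicate.
        rng = random.Random(random_state + i)
        yield rng.choices(range(n_samples), k=n_samples)


replicates = list(bootstrap_indices(n_samples=8, n_estimators=3))
for idx in replicates:
    # Some rows appear more than once and others not at all.
    print(sorted(idx))
```

Each index list would then select the rows of X and y used to fit one copy of the base estimator.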

__init__(self, base_estimator, n_estimators=10, random_state=42)

Parameters:

Name | Type | Description | Default
base_estimator | XGBSEBaseEstimator | Base estimator for bootstrap procedure | required
n_estimators | int | Number of estimators to fit in bootstrap procedure | 10
random_state | int | Random state for resampling function | 42

fit(self, X, y, **kwargs)

Fit several (base) estimators and store them.

Parameters:

Name | Type | Description | Default
X | [pd.DataFrame, np.array] | Features to be used while fitting XGBoost model | required
y | structured array(numpy.bool_, numpy.number) | Binary event indicator as first field, and time of event or time of censoring as second field. | required
**kwargs | | Keyword arguments to be passed to .fit() method of base_estimator | {}

Returns:

Type | Description
XGBSEBootstrapEstimator | Trained instance of XGBSEBootstrapEstimator
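The structured-array format expected for y can be built directly with NumPy. In this sketch the field names "event" and "time" are hypothetical; the description of y only fixes the order of the fields (event indicator first, time second) and their types.

```python
import numpy as np

# Hypothetical field names: only the field order and dtypes come from
# the description of y (boolean event indicator, then numeric time).
y = np.array(
    [(True, 5.0), (False, 12.5), (True, 3.2)],
    dtype=[("event", np.bool_), ("time", np.float64)],
)

print(y.dtype.names)  # field names in declaration order
print(y["event"])     # True where the event was observed, False where censored
print(y["time"])      # time of event or time of censoring
```

A record with event set to False is a censored observation, so its time value is the time of censoring rather than the time of the event.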

predict(self, X, return_ci=False, ci_width=0.683, return_interval_probs=False)

Predicts survival as given by the base estimator. A survival function, together with its upper and lower confidence bounds, can be returned for each sample in the dataframe X.

Parameters:

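Name | Type | Description | Default
X | pd.DataFrame | Data frame with the samples to generate predictions for | required
return_ci | bool | Whether to include confidence intervals | False
ci_width | float | Width of the confidence interval | 0.683
return_interval_probs | bool | Whether to return interval probabilities instead of cumulative survival probabilities | False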

Returns:

Type | Description
([(pd.DataFrame, np.array, np.array), pd.DataFrame]) | preds_df alone, or (preds_df, upper_ci, lower_ci) when return_ci is True

  • preds_df: A dataframe of survival probabilities for all times (columns), from a time_bins array, for all samples of X (rows). If return_interval_probs is True, the interval probabilities are returned instead of the cumulative survival probabilities.
  • upper_ci: Upper bound of the confidence interval for the survival probability values
  • lower_ci: Lower bound of the confidence interval for the survival probability values
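One plausible reading of ci_width is sketched below, under the assumption that the bounds are symmetric quantiles of the per-estimator predictions and that the point prediction is their mean. The documentation only states that ci_width is the width of the interval, so the helper interval_bounds and its aggregation rule are illustrative, not the library's confirmed method. Under this reading, the default of 0.683 is the probability mass within one standard deviation of the mean of a normal distribution.

```python
import numpy as np


def interval_bounds(boot_preds, ci_width=0.683):
    """Aggregate bootstrap predictions into a mean and interval bounds.

    boot_preds has shape (n_estimators, n_samples, n_times).
    Assumption: bounds are central quantiles across the estimator axis.
    """
    low_p = 0.5 - ci_width / 2   # 0.1585 for the default width
    high_p = 0.5 + ci_width / 2  # 0.8415 for the default width
    preds = boot_preds.mean(axis=0)
    lower = np.quantile(boot_preds, low_p, axis=0)
    upper = np.quantile(boot_preds, high_p, axis=0)
    return preds, upper, lower


# Synthetic stand-in for the predictions of 10 fitted estimators
# on 2 samples and 3 time bins.
rng = np.random.default_rng(42)
boot_preds = rng.uniform(0.4, 0.9, size=(10, 2, 3))

preds, upper_ci, lower_ci = interval_bounds(boot_preds)
print(preds.shape, upper_ci.shape, lower_ci.shape)
```

The return order mirrors the one documented for predict: predictions first, then the upper bound, then the lower bound.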